PhD Defence Weiwei Wang

Supervisor: Prof. Dr. Michel Dumontier

Co-supervisor: Dr. Stefano Bromuri

Keywords: Categorical Data, Data Representation Learning, Knowledge Graph, Graph Embedding
 

"Categorical Data Embedding"


This PhD thesis presents new methods for representing categorical data, such as country names or job titles. Such data are common across many domains but often pose challenges for data analysis and machine learning models. The research introduces three approaches that transform categorical data in tables into graph structures, which help uncover hidden relationships between categories. Graph embedding techniques are then applied to these graphs to generate vector representations of the categories. The proposed methods were evaluated on multiple datasets and consistently outperformed traditional categorical data embedding techniques. In addition, the thesis includes a comprehensive review of existing approaches and provides practical guidance for researchers and practitioners. Finally, it explores potential applications in real-world domains, such as the pension industry, where understanding complex categorical data is crucial.
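
To make the table-to-graph idea concrete, here is a minimal Python sketch. It is an illustration only and not one of the thesis's three methods: the toy table, the co-occurrence weighting, and the function names are all assumptions. A simple normalised neighbourhood vector stands in for a real graph embedding technique.

```python
from collections import defaultdict
from itertools import combinations
import math

# Toy table (hypothetical data): each row is a record,
# each key a categorical column.
rows = [
    {"country": "NL", "job": "engineer"},
    {"country": "NL", "job": "teacher"},
    {"country": "BE", "job": "engineer"},
    {"country": "BE", "job": "teacher"},
    {"country": "JP", "job": "chef"},
]

def build_cooccurrence_graph(rows):
    """Build a weighted undirected graph from a table.

    Nodes are (column, value) pairs. The weight of an edge counts
    the rows in which both categories appear together.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for row in rows:
        for a, b in combinations(sorted(row.items()), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def neighbourhood_embedding(graph):
    """Stand-in embedding: each node's L2-normalised edge-weight vector.

    A real graph embedding method would replace this step.
    """
    nodes = sorted(graph)
    index = {node: i for i, node in enumerate(nodes)}
    vectors = {}
    for node in nodes:
        vec = [0.0] * len(nodes)
        for neighbour, weight in graph[node].items():
            vec[index[neighbour]] = float(weight)
        norm = math.sqrt(sum(x * x for x in vec))
        vectors[node] = [x / norm for x in vec] if norm else vec
    return vectors

def cosine(u, v):
    """Cosine similarity of two already-normalised vectors."""
    return sum(a * b for a, b in zip(u, v))

graph = build_cooccurrence_graph(rows)
emb = neighbourhood_embedding(graph)

# NL and BE share the same job neighbours; NL and JP share none.
sim_nl_be = cosine(emb[("country", "NL")], emb[("country", "BE")])
sim_nl_jp = cosine(emb[("country", "NL")], emb[("country", "JP")])
print(f"NL vs BE: {sim_nl_be:.2f}, NL vs JP: {sim_nl_jp:.2f}")
```

In this toy example, two countries that never appear in the same row still receive similar vectors because they connect to the same job titles. This is the kind of hidden relationship between categories that a graph view can expose.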

A live stream of the defence is available.