This scholarship has closed.

Benchmarking LLM-Based Synthetic Data Generators for Structural Coherence in Behavioral and Human-Centered Datasets

Scholarship details

Study levels: Degree
Close date: Monday, 22 September 2025
Domestic/international: Domestic only

About the scholarship

This project benchmarks large language model (LLM)-based synthetic data generators, such as GReaT and TabulaLLM, with a focus on their ability to preserve structural coherence in behavioural and human-centred datasets. These datasets encompass psychological, educational, and user behaviour data that often include ordinal scales, categorical variables, logical constraints, and complex theory-driven relationships unique to human-centred research. The project will evaluate how effectively current LLM-based models generate synthetic data that maintains these important structural and semantic properties.
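The structural checks described above can be illustrated with a minimal sketch. The column names, response ranges, and logical rule below are hypothetical examples chosen for illustration, not part of the project description; the sketch assumes the real and synthetic data are available as pandas DataFrames.

import pandas as pd

# Hypothetical schema for a behavioural survey dataset (illustrative only).
ORDINAL_COLUMNS = {"satisfaction": range(1, 6)}  # 5-point Likert scale
CATEGORICAL_COLUMNS = {"employment": {"student", "employed", "unemployed"}}

def check_structural_coherence(synthetic: pd.DataFrame) -> dict:
    """Return the fraction of synthetic rows violating each structural rule."""
    n = len(synthetic)
    report = {}

    # Ordinal scales: values must stay within the original response range.
    for col, valid in ORDINAL_COLUMNS.items():
        report[f"{col}_out_of_range"] = (~synthetic[col].isin(valid)).sum() / n

    # Categorical variables: no invented category labels.
    for col, valid in CATEGORICAL_COLUMNS.items():
        report[f"{col}_unknown_category"] = (~synthetic[col].isin(valid)).sum() / n

    # Logical constraint (hypothetical): respondents recorded as students
    # should report at most 40 weekly work hours.
    violates = (synthetic["employment"] == "student") & (synthetic["work_hours"] > 40)
    report["student_hours_violation"] = violates.sum() / n

    return report

if __name__ == "__main__":
    synthetic = pd.DataFrame({
        "satisfaction": [1, 3, 7, 5],  # 7 falls outside the 1-5 scale
        "employment": ["student", "retired", "employed", "student"],  # "retired" is unseen
        "work_hours": [10, 20, 38, 55],  # the last student exceeds 40 hours
    })
    print(check_structural_coherence(synthetic))

Checks of this kind complement standard fidelity metrics: they measure whether generated records respect the ordinal ranges, category sets, and logical dependencies that give human-centred data its meaning, rather than only matching marginal distributions.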

Entry requirements

A completed online application must be submitted by 4:30 pm on the closing date. Any required supporting documentation (including references) must also be received by the closing date for the application to be considered.