Benchmarking LLM-Based Synthetic Data Generators for Structural Coherence in Behavioral and Human-Centered Datasets
Scholarship details
| Study levels | Degree | 
|---|---|
| Close date | Monday, 22 September 2025 | 
| Domestic/international | Domestic Only | 
About the scholarship
This project benchmarks large language model (LLM)-based synthetic data generators, such as GReaT and TabulaLLM, with a focus on their ability to preserve structural coherence in behavioural and human-centred datasets. These datasets span psychological, educational, and user behaviour data, and often include ordinal scales, categorical variables, logical constraints, and complex theory-driven relationships unique to human-centred research. The project will evaluate how well current LLM-based generators produce synthetic data that preserves these structural and semantic properties.
Entry requirements
A completed online application must be submitted by 4:30 pm on the closing date. Any required supporting documentation (including references) must also be received by the closing date for the application to be considered.