Investigating the Role of Data Heterogeneity in AI Safety and Fairness
| Title: | Investigating the Role of Data Heterogeneity in AI Safety and Fairness |
| DNr: | Berzelius-2026-163 |
| Project Type: | LiU Berzelius |
| Principal Investigator: | Stefano Sarao Mannelli <s.saraomannelli@chalmers.se> |
| Affiliation: | Chalmers tekniska högskola |
| Duration: | 2026-06-01 – 2026-12-01 |
| Classification: | 10210 |
| Homepage: | https://stefsmlab.github.io/research/ |
| Keywords: | |
Abstract
Modern generative AI systems, including diffusion models and transformers, exhibit unprecedented capabilities but are increasingly scrutinised for safety and fairness vulnerabilities. Empirically, these models frequently underrepresent or overfit minority demographics, a failure commonly described as bias. While these failure modes are widely documented, the underlying mechanisms linking structural data heterogeneity to representation learning dynamics remain poorly understood.
The primary goal of this project is to rigorously investigate how intrinsic data structures (specifically class variance, centroid geometry, and sampling imbalance) dictate the temporal evolution of generalisation and memorisation during training. By bridging high-dimensional analytical frameworks with large-scale empirical experiments, we aim to formally characterise the mechanisms that cause models to prioritise majority or high-variance classes while delaying generalisation to minority populations, or failing to generalise to them at all.
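As an illustration of the three structural factors named above, the following minimal sketch draws labelled points from a two-class isotropic Gaussian mixture. It is not the project's actual setup: the function name and every parameter (`minority_fraction` for sampling imbalance, `centroid_distance` for centroid geometry, `std_majority` and `std_minority` for class variance) are assumptions introduced here for illustration only.

```python
import random


def sample_gaussian_mixture(n, dim, minority_fraction, centroid_distance,
                            std_majority, std_minority, seed=0):
    """Draw n labelled points from a two-class isotropic Gaussian mixture.

    Hypothetical helper; all names are illustrative assumptions.
    Class 0 (majority) is centred at +centroid_distance / 2 on the first axis,
    class 1 (minority) at -centroid_distance / 2.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n):
        # Sampling imbalance: each point is minority with this probability.
        label = 1 if rng.random() < minority_fraction else 0
        # Class variance: each class has its own isotropic standard deviation.
        std = std_minority if label == 1 else std_majority
        x = [rng.gauss(0.0, std) for _ in range(dim)]
        # Centroid geometry: shift the class mean along the first axis.
        sign = -1.0 if label == 1 else 1.0
        x[0] += sign * centroid_distance / 2.0
        points.append(x)
        labels.append(label)
    return points, labels
```

Varying one argument at a time while holding the others fixed gives a simple way to separate the effect of imbalance from that of variance or centroid separation.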
Utilising the Berzelius infrastructure, we will train and evaluate generative architectures on both controlled synthetic datasets (e.g. Gaussian mixtures) and standard public benchmarks. We will systematically ablate training distributions to isolate the effects of structural heterogeneity, continuously tracking sample-level memorisation and class-specific learning hierarchies across the training and reverse-diffusion trajectories.
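One possible way to track sample-level memorisation per class is sketched below. It uses a nearest-neighbour distance ratio, a heuristic chosen here for illustration and not necessarily the metric the project will adopt: a generated sample is flagged as memorised when it lies much closer to its nearest training point than to its second nearest. The function names and the `threshold` value of 1/3 are assumptions.

```python
import math


def nn_ratio(sample, train_points):
    """Return (ratio, index of nearest training point).

    ratio = distance to nearest / distance to second-nearest training point.
    Requires at least two training points.
    """
    dists = sorted((math.dist(sample, t), i) for i, t in enumerate(train_points))
    (d1, i1), (d2, _) = dists[0], dists[1]
    # An exact copy of a training point gets ratio 0; otherwise d2 >= d1 > 0.
    ratio = 0.0 if d1 == 0 else d1 / d2
    return ratio, i1


def memorised_fraction_by_class(generated, train_points, train_labels,
                                threshold=1 / 3):
    """Fraction of generated samples flagged as memorised, grouped by the
    label of their nearest training point (hypothetical helper)."""
    counts = {}
    for g in generated:
        ratio, idx = nn_ratio(g, train_points)
        label = train_labels[idx]
        hits, total = counts.get(label, (0, 0))
        counts[label] = (hits + int(ratio < threshold), total + 1)
    return {label: hits / total for label, (hits, total) in counts.items()}
```

Calling `memorised_fraction_by_class` on samples generated at successive training checkpoints would give one curve per class, which is one way to compare when majority and minority classes shift from generalisation to memorisation. The brute-force neighbour search is quadratic and is meant only for small synthetic experiments.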
The impact of meeting this goal is twofold. Theoretically, it will establish a rigorous, mathematically grounded understanding of how data structure biases representation learning. Practically, identifying the exact stage-wise dynamics of memorisation and feature emergence will inform the development of principled interventions—such as targeted curriculum learning or dynamically adjusted sampling rates—to mitigate algorithmic bias and prevent unintended memorisation (e.g. copyright infringement or privacy leaks). Ultimately, this research provides the foundational insights required to design and deploy trustworthy and safe generative AI systems.