Dataset Assessment Tools for the AI Act Compliance of High-Risk AI Systems
DOI: https://doi.org/10.13135/2785-7867/12840

Abstract
The European Union’s Artificial Intelligence Act (AI Act), effective as of August 2024, introduces stringent data governance requirements for high-risk AI systems. Central to these obligations is the mandate for data used in training, validation, and testing to be subject to rigorous quality controls and governance practices aligned with the system’s intended purpose. This paper examines the AI Act’s novel emphasis on responsible data handling as a foundation for mitigating bias, ensuring safety, and protecting fundamental rights. It argues that while existing scholarship largely focuses on algorithmic fairness at the model level, insufficient attention has been given to fairness embedded within the data itself. Addressing this gap, the paper explores the complex ethical and epistemological tensions inherent in defining “fairness”, and critiques current policy frameworks for their narrow focus on technical fixes. The study highlights the FAnFAIR tool as a pioneering approach to data fairness evaluation, offering a hybrid methodology that integrates automated statistical diagnostics with qualitative, context-aware assessments. FAnFAIR aligns its features with the AI Act’s legal mandates and facilitates compliance through continuous bias monitoring, data quality enhancement, and informed decision-making about dataset suitability. This paper provides a socio-legal and technical analysis of data governance under the AI Act, advocating for a shift toward embedded ethical oversight throughout the AI lifecycle and, in particular, in the data processing phase.

