[LUM#22] Governing Data

Artificial intelligence relies on the availability of data in sufficient quantity and quality. But how can we ensure that research data is made available in compliance with the law? This is the focus of the task entrusted to Agnès Robin by the Ministry of Higher Education and Research.

“In a scientific context, whether it involves AI research or other fields, the challenge is to clearly identify the legal and regulatory framework applicable to the data we wish to reuse,” explains Agnès Robin, a faculty researcher at the Innovation, Communication, and Market Laboratory (Licem*). While the principle of open research data has been established since 2012, it nevertheless has limitations. The first is justified by the protection of personal data. A historian sifting through archives, for example, might wish to use civil registry data. “The researcher will only be able to disseminate the data resulting from their research after anonymization,” the legal expert notes.

Legal Resources Platform

If the data is protected by copyright or the sui generis database right, the user must obtain the consent of the rights holders, “unless the activity involves text and data mining, as this operation was granted an exception in 2019.” Finally, data covered by confidentiality (defense secrets, medical confidentiality, trade secrets, etc.) “enjoys absolute protection and may not be made available, except through a trusted third party.”

These three types of constraints have different implications that require close attention. As Agnès Robin explains, “The purpose of my role is to provide researchers and research support services with a platform that helps them analyze the datasets they use or produce by supplying them with accurate legal information.”

Data Sharing

“While the open science policy clearly aims to make as much data as possible available to researchers, it requires that this be done in strict compliance with the rules: ‘as open as possible, no more closed than necessary.’ At the heart of open science is the idea that data shouldn’t remain on hard drives when it could advance research by being used by others,” explains the researcher.

To ensure data sharing, the European Union has established a common data space for open science known as the European Open Science Cloud. Health data, meanwhile, is subject to specific regulations that are still in the process of being adopted and will eventually enable its sharing via the European Health Data Space. “The stakes are enormous for both research and patients,” concludes Agnès Robin.

AI Law

On May 17, 2024, the European Union adopted a regulation aimed at ensuring that AI respects fundamental rights. “This is the very first regulation on AI in the world,” emphasizes Agnès Robin. “Before this, there was nothing!” In particular, it establishes the liability that may arise from the implementation of generative AI systems. “It is now mandatory to ensure what is known as algorithmic explainability and transparency. In other words, one must be able to explain how AI analyzes data, what data the analysis is based on, and what rules may be used to make decisions affecting individuals.” This is far from always being the case.
