[LUM#22] Governing data

Artificial intelligence relies on the existence of sufficient data in terms of both quantity and quality. But how can we ensure that research data is made available in accordance with the law? This is the task entrusted to Agnès Robin by the Ministry of Higher Education and Research.

"In a scientific context, whether it's AI research or something else, the challenge is to clearly identify the legislative and regulatory framework applicable to the data you want to reuse," explains Agnès Robin, a lecturer and researcher at the Innovation, Communication, and Market Laboratory (Licem*). Although the principle of open research data has been established since 2012, it does have its limitations. The first of these is justified by the protection of personal data. A historian researching archives, for example, may want to use civil registry data. "The researcher will only be able to disseminate the data from their research after it has been anonymized," explains the legal scholar.

Legal resources platform

If the data is protected by copyright or sui generis database rights, the researcher must obtain the consent of the rights holders, "except in the case of text and data mining, as this operation was exempted in 2019." Finally, data that is covered by secrecy (defense secrecy, medical secrecy, business secrecy, etc.) "enjoys absolute protection and cannot be made available, except through a trusted third party."

These three types of constraints have different effects that require close attention. "The purpose of my mission is to provide researchers and research support services with a platform that helps them analyze the data sets they use or produce by providing them with accurate legal information," explains Agnès Robin.

Data sharing

While open science policy clearly aims to make as much data as possible available for research, it requires that this be done in strict compliance with the rule of "as open as possible, no more closed than necessary." "The idea behind open science is that data should not remain on hard drives when it could advance research by being used by others," explains the researcher.

To ensure sharing, the European Union has created a common data space for open science called the European Open Science Cloud. Health data is subject to specific regulations that are still in the process of being adopted, which will ultimately enable it to be shared via the European Health Data Space. "The stakes are enormous for both research and patients," concludes Agnès Robin.

AI law

On May 17, 2024, the European Union adopted a regulation aimed at ensuring that AI respects fundamental rights. "This is the very first regulation on AI in the world," emphasizes Agnès Robin. "Before, there was nothing!" In particular, it introduces the liability that may result from the implementation of generative AI systems. "It is now mandatory to organize what is known as algorithmic explainability and transparency. In other words, it must be possible to explain how AI analyzes data, on what data the analysis is based, and what rules may be used to make decisions that affect individuals." This is far from always the case.
