[LUM#22] Toward Better-Protected Data

While data is the raw material of artificial intelligence, some of it is particularly sensitive, especially in the healthcare sector. How can we reconcile AI, privacy, and sovereignty? According to Aurélien Bellet, a researcher at the Desbrest Institute of Epidemiology and Public Health¹, part of the solution lies in federated learning.

To include a large number of patients in clinical trials and thereby make them more meaningful, researchers rely on multicenter studies. These studies involve multiple hospitals or clinics at the same time, sometimes across several countries. The advantage is that this approach allows for large-scale studies involving patients from diverse social and geographic backgrounds.

Sharing without sharing

This approach also has a drawback: multicenter studies require health data from multiple institutions to be consolidated on a single server, “which makes it impossible to maintain control over the data and could also jeopardize its confidentiality,” explains Aurélien Bellet, a researcher at Idesp. How can medical research carry out these collaborations while reducing the risk of disclosing sensitive information? One solution is to share… without sharing. This is federated learning. “It allows data from each institution to be processed on-site, without having to exchange, transfer, or transmit it,” explains the federated learning specialist.

To meet this challenge, researchers are developing machine learning algorithms that can operate on data stored locally rather than centralized on a server, as is the case with conventional machine learning methods. “It is the intermediate results of this learning process that are exchanged as the process unfolds, rather than the data itself. We thus alternate between local learning and aggregating the results,” explains Aurélien Bellet, who collaborates with university hospitals in Lille, Caen, Amiens, and Rouen, among others.
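The alternation between local learning and aggregation can be illustrated with a minimal sketch. The article does not say which algorithm the researchers use; the example below assumes a simple weighted averaging of model parameters (the scheme commonly called federated averaging), a toy linear model, and synthetic data. All function names are hypothetical and chosen for illustration only.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Local learning: a few gradient-descent steps on one site's own data.

    Only the updated parameters (w, b) leave the site, never the data.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def aggregate(updates, sizes):
    """Aggregation: average the sites' parameters, weighted by sample count."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)

def federated_training(sites, rounds=50):
    """Alternate between local learning at each site and central aggregation."""
    weights = (0.0, 0.0)  # shared starting model
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in sites]
        weights = aggregate(updates, [len(data) for data in sites])
    return weights

def make_site(n, rng):
    """Synthetic stand-in for one institution's data: y = 2x + 1 plus noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

rng = random.Random(0)
sites = [make_site(n, rng) for n in (30, 50, 20)]  # three hypothetical sites
w, b = federated_training(sites)
print(f"learned w={w:.2f}, b={b:.2f}")
```

In this sketch each site's data points stay in its own list and only the pair of parameters is exchanged in each round. The sketch includes no additional privacy protection for those exchanged parameters.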

Democratization

To promote confidentiality and adherence to medical ethics, federated learning “is part of the solution, even if it is often not enough to guarantee data confidentiality,” explains the researcher, whose team is also collaborating with the French Data Protection Authority (CNIL) on the complex issue of data protection.

The benefits of federated learning extend far beyond the medical field. “This may also be of interest to companies that want to collaborate with competitors without giving them access to certain sensitive information,” adds Aurélien Bellet, who also sees federated learning as an opportunity to make AI more accessible. “It’s a form of democratization of artificial intelligence and machine learning, because it doesn’t require investment in a large infrastructure, thereby paving the way for collaborative uses, for example by citizen groups.”

See also:

Aurélien Bellet's presentation on federated learning


  1. Idesp (Inserm, UM)