What is it about?
IDEA4RC is a Horizon Europe project led by Istituto Nazionale dei Tumori (Milan, Italy). The project started in September 2022 and will run for four years.
Its objective is to build an intelligent ecosystem that improves the governance, sharing, and re-use of health data for rare cancers.
Why do we need such an ecosystem?
Every year in Europe 650,000 people receive a rare cancer diagnosis. Taken together, these cases represent nearly 25% of all cancer diagnoses on the continent.
Analyzing large and diverse datasets collected by different clinical centers would greatly advance knowledge of rare cancers.
Current hurdles include a lack of interoperability and the difficulty of complying with EU data protection requirements when sharing health data.
The principles
IDEA4RC plans to develop a new IT infrastructure that implements data protection and privacy by design and by default, as required by EU regulations, and that complies with the FAIR principles of scientific data management (Findability, Accessibility, Interoperability, and Reusability).
IDEA4RC will develop the data ecosystem taking into account the incentives and obstacles that different stakeholders have in sharing and re-using health data, from healthcare professionals to researchers, clinicians and patients.
It will do this through a Responsible Research and Innovation approach, which will involve and engage the different actors through dedicated co-creation workshops run by a group of social scientists with long experience in this area. During the workshops, participants will have the chance to share their values and their expectations of the ecosystem. These activities will continue throughout the life of the project, so that the infrastructure can be adapted while it is conceived, built and refined.
The ecosystem will be tested in pilot projects involving 11 centers of excellence of the EURACAN network (EUropean reference network for Rare Adult solid CANcers) that are partners in the project.
The ecosystem
The ecosystem will be designed to take into account the values of the different stakeholders involved in rare cancer care. Social scientists will organize workshops to engage with these stakeholders and discuss their values and expectations.
The ecosystem will follow a federated learning approach.
- Data from different centers will be analysed without leaving their original location.
- The data of each center will be translated into a common data format and stored in “data capsules”, secure processing environments.
- Unstructured data, such as clinicians' notes, will be exploited through dedicated machine learning algorithms that extract useful information in a standard format so that it can be analysed. Besides English, different European languages will be taken into account.
- Access to each dataset will be regulated by a data governance layer implementing the principles of data protection and privacy by design and by default, as required by EU regulations. Through this layer each center in the ecosystem will also be able to express its values and its willingness to share data.
- The metadata layer will clarify which data each center shares with the ecosystem, so that the user who is setting up the analysis knows which capsules can contribute.
- The algorithm needed for the analyses will be defined in the federated AI layer and executed locally and iteratively over each dataset until it converges. This iterative procedure returns results very close to those the user would obtain if they ran the algorithm over the aggregated dataset.
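The iterative procedure described in the last point can be illustrated with a minimal sketch. It is not the IDEA4RC implementation: it assumes a toy linear regression, plain gradient aggregation as the federated scheme, and hypothetical names (`local_gradient`, `federated_fit`, `pooled_fit`) and made-up numbers. Each capsule returns only summed gradients and a sample count, never its records, and the coordinator repeats the round until the parameters stop changing.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held by one center


def local_gradient(data: Dataset, w: float, b: float) -> Tuple[float, float, int]:
    """Runs inside a capsule: returns summed gradients and a count, not the data."""
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data)
    return grad_w, grad_b, len(data)


def federated_fit(capsules: List[Dataset], lr: float = 0.05,
                  tol: float = 1e-10, max_rounds: int = 100_000) -> Tuple[float, float]:
    """Coordinator: aggregates capsule gradients and updates until convergence."""
    w, b = 0.0, 0.0
    for _ in range(max_rounds):
        results = [local_gradient(data, w, b) for data in capsules]
        n = sum(count for _, _, count in results)
        grad_w = sum(gw for gw, _, _ in results) / n
        grad_b = sum(gb for _, gb, _ in results) / n
        w_new, b_new = w - lr * grad_w, b - lr * grad_b
        if abs(w_new - w) < tol and abs(b_new - b) < tol:
            return w_new, b_new
        w, b = w_new, b_new
    return w, b


def pooled_fit(data: Dataset) -> Tuple[float, float]:
    """Reference: closed-form least squares on the aggregated dataset."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    w = sxy / sxx
    return w, mean_y - w * mean_x


# Illustrative values only: three centers with different amounts of data.
capsules = [
    [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2)],
    [(3.0, 7.1), (4.0, 8.8)],
    [(1.5, 4.0), (2.5, 6.1), (3.5, 7.9)],
]

w_fed, b_fed = federated_fit(capsules)
w_pool, b_pool = pooled_fit([pair for data in capsules for pair in data])
```

In this toy case the federated and pooled estimates agree to within the convergence tolerance, because the aggregated gradient equals the gradient over the pooled data. Real federated algorithms may only approximate the pooled result, as the text above notes.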
The workflow
Here is a simplified representation of the contribution of each work package (WP) to the realisation of the project.
The 25 IDEA4RC partners participate in one or more WPs, and many activities are intertwined.
Here is the list of the WP leader institutions.
- WP1 Coordination: Fondazione IRCCS Istituto Nazionale dei Tumori
- WP2 Ecosystem conception: Universiteit Utrecht
- WP3 FHIR capsules, data federation and AI layer: Universidad Politecnica de Madrid
- WP4 Virtual registry layer: Stichting Integraal Kankercentrum Nederland
- WP5 Natural EU Languages Processing toolkit: Universidad de La Iglesia de Deusto Entidad Religiosa
- WP6 Augmented Analytics and Multimodal AI Data Navigator: Universidad Politecnica de Madrid
- WP7 Data Governance Layer: Ethniko Kentro Erevnas Kai Technologikis Anaptyxis
- WP8 Pilot data sharing for rare cancer epidemiology, research and care: Fondazione IRCCS Istituto Nazionale dei Tumori
- WP9 Pilot technical deployment and measurement: Multimed Engineers srl
- WP10 Rare Cancer Data Ecosystem enlargement and liaison: Digital Institute for Cancer Outcomes Research
- WP11 Exploitation, dissemination and communication activities: Engineering – Ingegneria Informatica spa
- WP12 Ethics requirements: Fondazione IRCCS Istituto Nazionale dei Tumori