
Building SHARE: the Synthetic Health dAta REpository
Project description
The SHARE initiative aims to create a platform for synthetic health data that is publicly accessible, data protection-compliant and standardised. The data focuses primarily on rare diseases.
Synthetic data is becoming increasingly important in the healthcare sector. It offers a solution to one of the biggest challenges of medical AI applications: secure and legally compliant access to large, diverse health datasets. Such data makes it possible to develop and test AI models under realistic conditions without relying on sensitive patient data. This creates data protection-compliant access to medical information, which opens up new opportunities for research and development.
Synthetic data can be specifically generated for rare diseases or underrepresented patient groups in order to close data gaps and minimise bias in AI systems. This improves the quality and fairness of AI models and can contribute to fairer medical care in the long term.
Another key aspect of SHARE is the creation of an internationally accessible repository, which is intended to facilitate collaboration between different groups. As synthetic data cannot be traced back to individuals, it can be shared and used in a legally secure manner, without the legal hurdles and data protection risks associated with real patient data.
The interdisciplinary SHARE team brings together experts from various institutions with the common goal of creating an international infrastructure for synthetic health data. In doing so, SHARE aims to make an important contribution to the further development of AI in the healthcare sector – openly, responsibly and with a focus on the future.
The SHARE project is led by Richard Noll and Jannik Schaaf from the Institute of Medical Informatics in Frankfurt.
SHARE Sandpit Workshop

From June 2 to 4, 2025, the Sandpit “Building SHARE: the Synthetic Health dAta REpository” took place in Frankfurt am Main on the campus of the University Hospital. The aim of the event was to develop innovative solutions for building an international synthetic health data repository (SHARE) that facilitates research and development, particularly in artificial intelligence (AI) and data-driven science. Rare diseases served as the use case, as real patient data is often scarce in this area and synthetic data opens up new possibilities. The Sandpit provided an experimental space in which experts from different disciplines, including patient advocacy groups, AI and machine learning researchers, computer scientists, data management experts and clinicians, worked together to identify key challenges, develop solutions and define concrete next steps for future collaboration.
