Use Case 2 - From Data to Knowledge

Clinico-Molecular Predictive Knowledge Tool

The MIRACUM consortium’s use case “From Data to Knowledge – Clinico-Molecular Predictive Knowledge Tool” aims to develop and establish methods for the cross-site analysis of patient data at the participating university hospitals. These methods will be used to generate knowledge that can be applied directly in clinical practice.

The gradual expansion of the data integration centers at the medical university sites of the Medical Informatics Initiative will create a basis for identifying patient cohorts based on clinical parameters, biomarkers and molecular/genomic studies and dividing them into subgroups. In Use Case 2 of the MIRACUM consortium, predictive models are to be developed on this basis, which can contribute to medical knowledge and potentially support physicians in their diagnostic and therapeutic decisions. In the clinical area, the use case focuses on patients with lung diseases (asthma and COPD) and brain tumors.

Video: MIRACUM – Gemeinsam gegen Asthma und COPD ("MIRACUM: Together against asthma and COPD"; in German, source: BMBF)

A concrete example: alpha-1-antitrypsin deficiency (AATD, abbreviated AATM in German) is a hereditary disease in which the protein alpha-1-antitrypsin, a protease inhibitor, is lacking in the body. As a result, tissue damage to the lungs and liver can occur, leading to chronic obstructive pulmonary disease (COPD) at a young age. COPD patients with and without AATD therefore often differ fundamentally, both in age and in smoking history, the biggest risk factors for COPD. The problem is that COPD with AATD is rather rare, which is why prognostic factors for complications and emerging comorbidities have usually been established from the records of COPD patients without AATD. The use case “From Data to Knowledge” investigates whether these factors can also be used for COPD patients with AATD despite the fundamental differences.

From a data protection perspective, the corresponding data in MIRACUM are considered particularly sensitive. Collecting them centrally across all sites would potentially pose too great a risk. The goal is therefore not to bring the data to the analysis, but to bring the analysis to the data. More precisely, only aggregated and anonymous data should leave the sites. This principle is implemented by the software DataSHIELD, which was developed at Newcastle University (UK). The software is published under an open source license and can be used freely. DataSHIELD offers a range of procedures from the standard statistical toolkit, from simple summary statistics such as averages or frequencies to more complex regression models like those used in the clinical application described above. In addition to these existing analysis procedures, DataSHIELD provides a flexible and extensible infrastructure for developing new artificial intelligence methods and applying them to distributed data. To this end, the MIRACUM consortium works closely with the development team and the DataSHIELD community.
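The principle of bringing the analysis to the data can be illustrated with a minimal Python sketch. It is not DataSHIELD code and does not use the DataSHIELD API (DataSHIELD itself is R-based). The names `SiteAggregate`, `local_aggregate`, `pooled_mean` and the threshold `MIN_COUNT` are hypothetical, and the patient ages are made up. Each site releases only a count and a sum, and refuses to release anything if too few patients are involved.

```python
from dataclasses import dataclass

# Hypothetical disclosure threshold: a site releases no aggregate
# that is based on fewer patients than this.
MIN_COUNT = 5


@dataclass(frozen=True)
class SiteAggregate:
    """The only information that leaves a site: a count and a sum."""
    n: int
    total: float


def local_aggregate(values, min_count=MIN_COUNT):
    """Runs inside a site. Individual values are never returned."""
    if len(values) < min_count:
        raise ValueError("too few patients to release an aggregate")
    return SiteAggregate(n=len(values), total=float(sum(values)))


def pooled_mean(aggregates):
    """Runs at the analyst's side and sees only the per-site aggregates."""
    n = sum(a.n for a in aggregates)
    return sum(a.total for a in aggregates) / n


# Made-up patient ages held at two sites.
site_a = [61.0, 58.0, 70.0, 66.0, 73.0]
site_b = [49.0, 55.0, 62.0, 68.0, 57.0, 64.0]

aggregates = [local_aggregate(site_a), local_aggregate(site_b)]
mean_age = pooled_mean(aggregates)
print(round(mean_age, 2))
```

The pooled mean is identical to the mean that a centralized analysis would give, although no individual record has left its site. More complex models, such as regressions, follow the same pattern but typically need several rounds of aggregate exchange.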

In addition to anonymous aggregated data, the use case is investigating synthetic data as a way to meet data protection requirements. Synthetic data contain no real observations or patient information, but replicate the general characteristics and statistical relationships of real data. For research, this means that virtual patient data are created at each site that are not tied to any individual patient. Such data can then be shared and support different analysis concepts, such as standard statistical analyses or artificial intelligence techniques. Generating synthetic data from real data requires machine learning approaches. Specifically, so-called generative models are used, which capture the systematic and random variability of the original data. These models rely on artificial intelligence techniques, in particular from the field of deep learning. The generation of virtual patient data is distributed across the MIRACUM sites, and the DataSHIELD infrastructure is used for this purpose as well. In this way, the analysis of the data with established procedures and the development of new methods for the data protection-compliant analysis of distributed patient data can be advanced together.
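The idea of a generative model can be sketched in a few lines of Python. This is a deliberately simplified stand-in: instead of the deep learning models mentioned above, it fits a single normal distribution to one variable and samples from it. The function names `fit_gaussian` and `sample_synthetic` are hypothetical, and the lung function values are made up.

```python
import random
import statistics


def fit_gaussian(values):
    """Learn the 'model' at a site: here just the mean and standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(params, n, seed=0):
    """Draw n virtual observations from the fitted distribution."""
    mu, sigma = params
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Made-up lung function values (FEV1, percent of predicted) for one site.
real_fev1 = [45.0, 52.0, 38.0, 60.0, 71.0, 49.0, 55.0, 63.0]

params = fit_gaussian(real_fev1)
synthetic_fev1 = sample_synthetic(params, n=1000, seed=42)

# The synthetic values follow the fitted distribution but are not
# copies of any real patient's record.
print(round(params[0], 3), round(statistics.mean(synthetic_fev1), 3))
```

A realistic generative model would also have to capture relationships between many variables, which is where deep learning approaches come in. The sketch only shows the basic pattern: fit a model on real data at the site, then share samples from the model rather than the data itself.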

