Welcome to MOLGENIS/connect
MOLGENIS/connect is a semi-automatic data integration system built in MOLGENIS that assists researchers in finding, matching and pooling data from different biobanks. During data integration, MOLGENIS/connect suggests relevant biobank data elements for the research variables of interest and also generates data transformation algorithms for integrating them. In addition, users can interact with the system to improve the suggested mappings and algorithms.
The demo is created using data from the Healthy Obese Project. The target schema consists of approximately 90 core data elements (e.g. Body Mass Index, History of Hypertension) that represent the research question. Three biobanks (LifeLines, Prevend and Mitchelstown), for which we have the data, are used as the source datasets in this demo. The task is to harmonize the 90 core data elements in each of the three biobanks separately and then pool the harmonized results into one dataset.
Although the demo version does not have full functionality, it lets you view all the mappings and algorithms that were generated in advance in the demo mapping project. You can also try out the semantic search and the algorithm generator functions. To get access to MOLGENIS/connect, please contact the administrator for login credentials. To see results directly, click one of the three example links below.
First, we developed a semantic search that uses ontology-based query expansion to find relevant data elements in biobanks, irrespective of variations in the terminology used. Second, we created an algorithm generator that automatically generates data transformation algorithms to convert these data elements to the target schema, including unit conversion, category mapping, and more complex recurring conversion patterns such as the calculation of BMI.
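To make these two ideas concrete, here is a minimal Python sketch. It is an illustration only and not the MOLGENIS/connect implementation: the synonym table stands in for a real ontology, and every name in it (SYNONYMS, expand_query, search, cm_to_m, bmi, CATEGORY_MAP, map_category), as well as the category codes, is a hypothetical chosen for this example.

```python
# Illustrative sketch only; not the MOLGENIS/connect code or API.

# Hypothetical synonym table standing in for an ontology lookup.
SYNONYMS = {
    "hypertension": {"hypertension", "high blood pressure"},
    "body mass index": {"body mass index", "bmi"},
}


def expand_query(query):
    """Return the normalized query plus any synonyms listed for it."""
    key = query.strip().lower()
    return SYNONYMS.get(key, {key})


def search(query, labels):
    """Return the labels that contain any expanded term, ignoring case."""
    terms = expand_query(query)
    return [label for label in labels
            if any(term in label.lower() for term in terms)]


def cm_to_m(height_cm):
    """Unit conversion: centimetres to metres."""
    return height_cm / 100.0


def bmi(weight_kg, height_cm):
    """Recurring pattern: BMI = weight (kg) / height (m) squared."""
    height_m = cm_to_m(height_cm)
    return weight_kg / (height_m * height_m)


# Hypothetical category mapping from source codes to target categories.
CATEGORY_MAP = {"1": "yes", "2": "no"}


def map_category(code):
    """Category mapping; returns None for codes without a known target."""
    return CATEGORY_MAP.get(code)


if __name__ == "__main__":
    source_labels = [
        "History of high blood pressure",
        "Body weight (kg)",
        "Hypertension diagnosed",
    ]
    print(search("Hypertension", source_labels))
    print(bmi(81.0, 180.0))
    print(map_category("1"))
```

In this sketch the expanded query matches source labels that use a different term for the same concept, which is the point of query expansion. The transformation functions show the three kinds of conversion mentioned above in their simplest form.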