Process-enhanced similarity search has great potential to improve knowledge discovery and decision support in a number of disciplines, especially in clinical medicine. Currently, this potential cannot be exploited because two kinds of methods are lacking: algorithms for creating annotated process representations from unstructured content, and methods for effectively comparing such annotated processes. In the simpatix project, we focus on the medical domain, where the central concept is a patient’s case, recorded in an (electronic) health record (EHR). Consisting mostly of unstructured or semi-structured data, such as clinical notes from examinations and treatments, tabular data from quantitative tests (such as blood screenings), and discharge summaries, each case encodes a process describing the individual patient’s disease history. The project’s main objectives are to a) develop methods for constructing structured, process-oriented case representations from large data sets that include unstructured documents; b) research algorithms for process-enhanced similarity search over richly annotated case collections; and c) design and implement a generic repository for process-enhanced case collections that allows scalable, effective similarity search.

A three-minute overview of the simpatix project was recorded at the Future Medicine Science Match event organized by the Berlin Institute of Health and Tagesspiegel.


Funding: DFG, “Eigene Stelle”
Period: 2016 – 2019
Partnering Institutions: Humboldt-Universität zu Berlin,
Charité Universitätsmedizin Berlin