HIStream-Import: A Generic ETL Framework for Processing Arbitrary Patient Data Collections or Hospital Information Systems into HL7 FHIR Bundles

Stud Health Technol Inform. 2021 May 24:278:75-79. doi: 10.3233/SHTI210053.


Data integration is a necessary and important step to perform translational research and increase the sample size beyond single data collections. For health information, the most recently established communication standard is HL7 FHIR. To bridge the concepts of "minimal invasive" data integration and open standards, we propose a generic ETL framework to process arbitrary patient-related data collections into HL7 FHIR, which in turn can then be used for loading into target data warehouses. The proposed algorithm is able to read any relational delimited text export and produce a standard HL7 FHIR bundle collection. We evaluated an implementation of the algorithm using different lung research registries and used the resulting FHIR resources to fill our i2b2-based data warehouse as well as an OMOP common data model repository.
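To illustrate the general idea of turning a delimited text export into a FHIR bundle, the following is a minimal Python sketch. It is not the HIStream-Import implementation or its configuration format. The function name `rows_to_bundle`, the column names (`patient_id`, `gender`, `birth_date`, `concept`, `value`, `unit`), the semicolon delimiter, and the sample rows are all assumptions made for this example; only the basic FHIR resource layout (`Bundle`, `Patient`, `Observation`) follows the standard.

```python
import csv
import io

# Hypothetical sample export; the column names are assumptions for this sketch.
SAMPLE_EXPORT = (
    "patient_id;gender;birth_date;concept;value;unit\n"
    "p1;female;1970-03-12;FEV1;2.8;L\n"
    "p1;female;1970-03-12;FVC;3.6;L\n"
    "p2;male;1958-11-02;FEV1;2.1;L\n"
)


def rows_to_bundle(text, delimiter=";"):
    """Map rows of a delimited export to a FHIR Bundle of type 'collection'.

    Each distinct patient_id yields one Patient resource, and each row
    yields one Observation that references its patient.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    entries = []
    seen_patients = set()

    for row in reader:
        pid = row["patient_id"]

        # Emit the Patient resource only once per patient.
        if pid not in seen_patients:
            seen_patients.add(pid)
            entries.append({
                "resource": {
                    "resourceType": "Patient",
                    "id": pid,
                    "gender": row["gender"],
                    "birthDate": row["birth_date"],
                }
            })

        # Emit one Observation per row, linked to the patient.
        entries.append({
            "resource": {
                "resourceType": "Observation",
                "status": "final",
                # A real mapping would use coded concepts; plain text is
                # used here to keep the sketch short.
                "code": {"text": row["concept"]},
                "subject": {"reference": f"Patient/{pid}"},
                "valueQuantity": {
                    "value": float(row["value"]),
                    "unit": row["unit"],
                },
            }
        })

    return {"resourceType": "Bundle", "type": "collection", "entry": entries}
```

Calling `rows_to_bundle(SAMPLE_EXPORT)` returns a plain dictionary that can be serialized with `json.dumps` and passed to a downstream loader. A production mapping would additionally need terminology codes, validation, and handling of missing values, none of which are shown here.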

Keywords: Data integration; factual databases; standardization.

MeSH terms

  • Algorithms
  • Data Warehousing
  • Electronic Health Records*
  • Hospital Information Systems*
  • Humans