Federated queries of clinical data repositories: Scaling to a national network

J Biomed Inform. 2015 Jun;55:231-6. doi: 10.1016/j.jbi.2015.04.012. Epub 2015 May 6.


Federated networks of clinical research data repositories are rapidly growing in size from a handful of sites to true national networks with more than 100 hospitals. This study creates a conceptual framework for predicting how various properties of these systems will scale as they continue to expand. Starting with actual data from Harvard's four-site Shared Health Research Information Network (SHRINE), the framework is used to imagine a future 4000-site network, representing the majority of hospitals in the United States. From this it becomes clear that several common assumptions of small networks fail to scale to a national level, such as all sites being online at all times or containing data from the same date range. On the other hand, a large network enables researchers to select subsets of sites that are most appropriate for particular research questions. Developers of federated clinical data networks should be aware of how the properties of these networks change at different scales and design their software accordingly.
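As a rough illustration of why the "all sites online" assumption stops holding at scale, the sketch below computes the chance that every site answers a query, assuming each site is independently available with the same probability. The 99% availability figure, the independence assumption, and the function names are illustrative assumptions, not values or methods taken from the study.

```python
def prob_all_online(n_sites: int, availability: float) -> float:
    """Probability that all n_sites are online at once.

    Assumes (hypothetically) that sites fail independently and share
    the same per-site availability.
    """
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return availability ** n_sites


def expected_offline(n_sites: int, availability: float) -> float:
    """Expected number of sites offline at any given moment."""
    return n_sites * (1.0 - availability)


if __name__ == "__main__":
    # Network sizes mentioned in the abstract: 4 (SHRINE), 100+, and 4000.
    for n in (4, 100, 4000):
        p = prob_all_online(n, 0.99)  # 0.99 is an assumed availability
        off = expected_offline(n, 0.99)
        print(f"{n:>5} sites: P(all online) = {p:.3g}, expected offline = {off:.2f}")
```

Under these assumptions, a four-site network is almost always fully online, while a 4000-site network is essentially never fully online and has about 40 sites offline at any time. This suggests that software for a network of that size would need to treat partial results as the normal case.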

Keywords: Algorithms; Hospital shared services; Medical record linkage; Medical records systems, computerized; Search engine.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Security
  • Confidentiality
  • Electronic Health Records / organization & administration*
  • Internet / organization & administration*
  • Meaningful Use / organization & administration*
  • Medical Record Linkage / methods*
  • Models, Organizational*
  • Search Engine*
  • United States