Standardizing the next generation of bioinformatics software development with BioHDF (HDF5)

Adv Exp Med Biol. 2010;680:693-700. doi: 10.1007/978-1-4419-5913-3_77.

Abstract

Next Generation Sequencing technologies are limited by the lack of standard bioinformatics infrastructures that can reduce data storage requirements, increase data processing performance, and integrate diverse information. HDF technologies address these requirements and have a long history of use in data-intensive science communities. They include general data file formats, libraries, and tools for working with the data. Compared to emerging standards, such as the SAM/BAM formats, HDF5-based systems demonstrate significantly better scalability, can support multiple indexes, store multiple data types, and are self-describing. For these reasons, HDF5 and its BioHDF extension are well suited for implementing data models to support the next generation of bioinformatics applications.
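
As a rough illustration of the properties named in the abstract (self-description, mixed data types, and more than one index over the same records), the following minimal sketch uses the general-purpose h5py and NumPy libraries. It is not the BioHDF schema or API: the group and dataset names (alignments, reads, index_by_pos, index_by_mapq), the record fields, and the toy values are all assumptions made up for this example.

```python
import os
import tempfile

import h5py
import numpy as np


def write_alignments(path):
    """Write toy alignment records plus two sort-order indexes (hypothetical layout)."""
    # Compound dtype: one dataset holds several data types side by side.
    dtype = np.dtype([("read_id", "S16"), ("ref_pos", "<i8"), ("mapq", "<u1")])
    records = np.array(
        [(b"read_0001", 1500, 60), (b"read_0002", 120, 37), (b"read_0003", 900, 0)],
        dtype=dtype,
    )
    with h5py.File(path, "w") as f:
        grp = f.create_group("alignments")
        ds = grp.create_dataset("reads", data=records, compression="gzip")
        # Attributes travel with the data, which is what makes the file self-describing.
        ds.attrs["reference"] = "chr1"
        ds.attrs["description"] = "toy alignment records (illustrative only)"
        # Two independent indexes over the same records, stored as row orderings.
        grp.create_dataset("index_by_pos", data=np.argsort(records["ref_pos"]))
        grp.create_dataset(
            "index_by_mapq", data=np.argsort(records["mapq"], kind="stable")
        )


def describe(path):
    """Recover field names, shape, and metadata from the file alone."""
    with h5py.File(path, "r") as f:
        ds = f["alignments/reads"]
        return {
            "fields": list(ds.dtype.names),
            "shape": ds.shape,
            "reference": ds.attrs["reference"],
        }


def positions_in_order(path, index_name="index_by_pos"):
    """Return reference positions in the row order given by the chosen index."""
    with h5py.File(path, "r") as f:
        reads = f["alignments/reads"][...]
        order = f["alignments/" + index_name][...]
    # Reorder in memory; h5py itself only accepts increasing index lists.
    return [int(p) for p in reads["ref_pos"][order]]


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "toy_alignments.h5")
    write_alignments(demo_path)
    print(describe(demo_path))
    print(positions_in_order(demo_path))
```

In this sketch a reader needs no external schema to discover the record fields or the reference name, and adding another index is just another small dataset alongside the records. A production layout would likely chunk and index far larger tables differently.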

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology
  • Computer Simulation
  • Database Management Systems
  • Databases, Genetic
  • Sequence Alignment / standards
  • Sequence Alignment / statistics & numerical data*
  • Sequence Alignment / trends
  • Sequence Analysis / standards
  • Sequence Analysis / statistics & numerical data*
  • Sequence Analysis / trends
  • Software / standards
  • Software / trends
  • Software Design
  • User-Computer Interface