With the increasing availability of more efficient data collection procedures, biomedical scientists now confront ever larger data sets and often struggle to process and interpret what they have gathered, even as still more data continue to accumulate. This torrent of biomedical information necessitates creative thinking about how the data are generated, how they might best be managed and analyzed, and eventually how they can be transformed into further scientific understanding for improving patient care. Recognizing this as a major challenge, the National Institutes of Health (NIH) has spearheaded the "Big Data to Knowledge" (BD2K) program, the agency's most ambitious biomedical informatics effort to date. In this commentary, we describe how the NIH has tackled "big data" science head-on, how a consortium of leading research centers is developing the means for handling large-scale data, and how such activities are being marshalled for the training of a new generation of biomedical data scientists. All in all, the NIH BD2K program seeks to position data science at the heart of 21st-century biomedical research.
Keywords: Biomedicine; Computing; Data science; Software; Training.
Copyright © 2017 Elsevier Inc. All rights reserved.