Widespread sharing of data from electronic health records and patient-reported outcomes can strengthen the national capacity for conducting cost-effective clinical trials and allow research to be embedded within routine care delivery. While pragmatic clinical trials (PCTs) have been performed for decades, they now can draw on rich sources of clinical and operational data that are continuously fed back to inform research and practice. The Health Care Systems Collaboratory program, initiated by the NIH Common Fund in 2012, engages healthcare systems as partners in discussing and promoting activities, tools, and strategies for supporting active participation in PCTs. The NIH Collaboratory consists of seven demonstration projects, and seven problem-specific working group 'Cores', aimed at leveraging the data captured in heterogeneous 'real-world' environments for research, thereby improving the efficiency, relevance, and generalizability of trials. Here, we introduce the Collaboratory, focusing on its Phenotype, Data Standards, and Data Quality Core, and present early observations from researchers implementing PCTs within large healthcare systems. We also identify gaps in knowledge and present an informatics research agenda that includes identifying methods for the definition and appropriate application of phenotypes in diverse healthcare settings, and methods for validating both the definition and execution of electronic health records based phenotypes.
Keywords: Clinical Research; Data quality; Phenotyping; Secondary Data Use.