Advantages and limitations of anticipating laboratory test results from regression- and tree-based rules derived from electronic health-record data

PLoS One. 2014 Apr 14;9(4):e92199. doi: 10.1371/journal.pone.0092199. eCollection 2014.


Laboratory testing is the single highest-volume medical activity, making it useful to ask how well one can anticipate whether a given test result will be high, low, or within the reference interval ("normal"). We analyzed 10 years of electronic health records, comprising 69.4 million blood tests, to see how well standard rule-mining techniques can anticipate test results based on patient age and gender, recent diagnoses, and recent laboratory test results. We evaluated rules according to their positive and negative predictive value (PPV and NPV) and area under the receiver operating characteristic curve (ROC AUC). Using a stringent cutoff of PPV and/or NPV ≥ 0.95, standard techniques yielded few rules for sendout tests but several for in-house tests, mostly for repeat laboratory tests that are part of the complete blood count and basic metabolic panel. Most rules were clinically and pathophysiologically plausible, and several seemed clinically useful for informing the pre-test probability of a given result. Overall, however, the rules were unlikely to serve as a general substitute for actually ordering a test. Improving laboratory utilization will likely require different input data and/or alternative methods.
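The evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`ppv_npv`, `roc_auc`, `passes_cutoff`) and the toy inputs are hypothetical, and the rule is simplified to a binary prediction ("result will be abnormal" versus "result will be normal"). The AUC is computed with the standard pairwise-ranking definition, which may differ from the procedure used in the paper.

```python
from typing import Sequence, Tuple


def ppv_npv(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (PPV, NPV) for a binary rule; NaN when a denominator is zero."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def roc_auc(scores: Sequence[float], actual: Sequence[bool]) -> float:
    """ROC AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, a in zip(scores, actual) if a]
    neg = [s for s, a in zip(scores, actual) if not a]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def passes_cutoff(ppv: float, npv: float, cutoff: float = 0.95) -> bool:
    """Assumed reading of 'PPV and/or NPV >= 0.95': either value suffices."""
    return ppv >= cutoff or npv >= cutoff


# Toy data, invented for illustration only.
predicted = [True, True, False, False, False]
actual = [True, False, False, False, True]
scores = [0.9, 0.8, 0.3, 0.2, 0.1]

ppv, npv = ppv_npv(predicted, actual)
auc = roc_auc(scores, actual)
keep_rule = passes_cutoff(ppv, npv)
```

Under this reading, a rule is retained if either predictive value reaches the cutoff, so a rule that reliably predicts "normal" (high NPV) can qualify even when its PPV is low.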

MeSH terms

  • Clinical Laboratory Techniques*
  • Electronic Health Records*
  • Humans
  • Linear Models
  • Predictive Value of Tests*
  • Regression Analysis
  • Statistics as Topic*

Grant support

The authors have no support or funding to report.