Learning Optimal Individualized Treatment Rules from Electronic Health Record Data

IEEE Int Conf Healthc Inform. 2016 Oct:2016:65-71. doi: 10.1109/ICHI.2016.13. Epub 2016 Dec 8.


Medical research is experiencing a paradigm shift from a "one-size-fits-all" strategy to a precision medicine approach in which the right therapy is prescribed for the right patient at the right time. We propose a statistical method that uses electronic health record (EHR) data to estimate optimal individualized treatment rules (ITRs) tailored to subject-specific features. Our approach merges statistical modeling and medical domain knowledge with machine learning algorithms to assist personalized medical decision making using EHR data. We transform the estimation of the optimal ITR into a classification problem and account for the non-experimental nature of EHR data and for confounding by clinical indication. We create a broad range of feature variables that reflect both patient health status and the healthcare data collection process. Using EHR data from the Columbia University clinical data warehouse, we construct a decision tree for choosing the best second-line therapy for patients with type 2 diabetes.
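To make the reduction from ITR estimation to classification concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes one common formulation: treat the observed treatment as the class label and weight each patient by the observed outcome divided by the estimated probability of receiving that treatment, so that a weighted classifier favors treatments that led to good outcomes while adjusting for confounding by indication. The simulated data, the single feature `x`, the binned propensity estimate, and all function names are hypothetical; a depth-one tree (a stump) stands in for a full decision tree.

```python
import random


def simulate(n, seed=0):
    """Simulate hypothetical observational data as (x, a, r) tuples."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)  # a single patient feature (assumption)
        # Confounding by indication: higher x makes treatment +1 more likely.
        p_treat = 0.2 + 0.6 * x
        a = 1 if rng.random() < p_treat else -1
        # Treatment +1 is truly better when x > 0.5, treatment -1 otherwise.
        benefit = 1.0 if (a == 1) == (x > 0.5) else 0.0
        r = 1.0 + benefit + rng.gauss(0.0, 0.1)  # higher outcome is better
        data.append((x, a, r))
    return data


def propensity_by_bin(data, n_bins=10):
    """Estimate P(A = +1 | x) as the smoothed treated fraction per bin of x."""
    treated = [0] * n_bins
    total = [0] * n_bins
    for x, a, _ in data:
        b = min(int(x * n_bins), n_bins - 1)
        total[b] += 1
        treated[b] += 1 if a == 1 else 0
    # Add-one smoothing keeps every estimate strictly between 0 and 1.
    return [(treated[b] + 1) / (total[b] + 2) for b in range(n_bins)]


def fit_stump(data, n_bins=10):
    """Fit a depth-one rule by minimizing weighted misclassification of A."""
    props = propensity_by_bin(data, n_bins)
    r_min = min(r for _, _, r in data)  # shift outcomes so weights are >= 0
    weighted = []
    for x, a, r in data:
        p1 = props[min(int(x * n_bins), n_bins - 1)]
        p_obs = p1 if a == 1 else 1.0 - p1
        weighted.append((x, a, (r - r_min) / p_obs))
    best = None
    for threshold in [i / 20 for i in range(1, 20)]:
        for direction in (1, -1):
            # Rule: recommend `direction` if x > threshold, else the opposite.
            loss = sum(w for x, a, w in weighted
                       if (direction if x > threshold else -direction) != a)
            if best is None or loss < best[0]:
                best = (loss, threshold, direction)
    return best[1], best[2]


def recommend(x, threshold, direction):
    """Apply the fitted rule to one patient's feature value."""
    return direction if x > threshold else -direction


threshold, direction = fit_stump(simulate(4000))
print("recommend treatment", direction, "when x >", threshold)
```

In this sketch the fitted rule can be compared against the cutoff built into the simulation (`x > 0.5`). With real EHR data, the propensity model and the classifier would both need many more features, and a tree learner that accepts per-sample weights could take the place of the stump.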