Poly-omic prediction of complex traits: OmicKriging

Heather E Wheeler; Keston Aquino-Michaels; Eric R Gamazon; Vassily V Trubetskoy; M Eileen Dolan; R Stephanie Huang; Nancy J Cox; Hae Kyung Im

doi:10.1002/gepi.21808

Genet Epidemiol. 2014 Jul;38(5):402-15. doi: 10.1002/gepi.21808. Epub 2014 May 2.

Authors

Heather E Wheeler¹, Keston Aquino-Michaels, Eric R Gamazon, Vassily V Trubetskoy, M Eileen Dolan, R Stephanie Huang, Nancy J Cox, Hae Kyung Im

Affiliation

¹ Section of Hematology/Oncology, Department of Medicine, University of Chicago, Chicago, Illinois, United States of America.

Abstract

High-confidence prediction of complex traits such as disease risk or drug response is an ultimate goal of personalized medicine. Although genome-wide association studies have discovered thousands of well-replicated polymorphisms associated with a broad spectrum of complex traits, the combined predictive power of these associations for any given trait is generally too low to be of clinical relevance. We propose a novel systems approach to complex trait prediction, which leverages and integrates similarity in genetic, transcriptomic, or other omics-level data. We translate the omic similarity into phenotypic similarity using a method called Kriging, commonly used in geostatistics and machine learning. Our method, called OmicKriging, emphasizes the use of a wide variety of systems-level data, such as those increasingly made available by comprehensive surveys of the genome, transcriptome, and epigenome, for complex trait prediction. Furthermore, our OmicKriging framework allows easy integration of prior information on the function of subsets of omics-level data from heterogeneous sources without the sometimes heavy computational burden of Bayesian approaches. Using seven disease datasets from the Wellcome Trust Case Control Consortium (WTCCC), we show that OmicKriging allows simple integration of sparse and highly polygenic components, yielding comparable performance at a fraction of the computing time of a recently published Bayesian sparse linear mixed model method. Using a cellular growth phenotype, we show that integrating mRNA and microRNA expression data substantially increases performance over either dataset alone. Using clinical statin response, we show improved prediction over existing methods. We provide an R package to implement OmicKriging (http://www.scandb.org/newinterface/tools/OmicKriging.html).

Keywords: Kriging; complex trait prediction; polygenic modeling; polygenic prediction; systems biology.
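
The idea of translating omic similarity into phenotypic similarity can be illustrated with a minimal Python sketch of a generic Kriging predictor. It is not the interface of the OmicKriging R package. It assumes that each omic layer gives a sample-by-sample similarity matrix, that the layers are combined as a weighted sum plus an identity term for the remaining weight, and that a test phenotype is predicted as the training mean plus a similarity-weighted combination of training deviations. The function names, the weights (0.3 and 0.4), and the simulated data are all hypothetical.

```python
import numpy as np


def similarity_matrix(features):
    """Return a samples-by-samples similarity matrix from a samples-by-features array."""
    # Standardize each feature column, then average the cross-products over features.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    return z @ z.T / features.shape[1]


def kriging_predict(sims, weights, y_train, train_idx, test_idx):
    """Predict test phenotypes from training phenotypes via a composite similarity matrix."""
    n = sims[0].shape[0]
    # Composite similarity: weighted sum of the omic matrices plus an identity
    # term carrying the remaining weight (an assumption of this sketch).
    composite = sum(w * s for w, s in zip(weights, sims))
    composite = composite + (1.0 - sum(weights)) * np.eye(n)
    k_tt = composite[np.ix_(train_idx, train_idx)]  # train-by-train block
    k_st = composite[np.ix_(test_idx, train_idx)]   # test-by-train block
    mu = y_train.mean()
    # Simple Kriging predictor: mu + K_st K_tt^{-1} (y_train - mu)
    return mu + k_st @ np.linalg.solve(k_tt, y_train - mu)


# Simulated data, purely illustrative (not from the paper).
rng = np.random.default_rng(0)
n_samples = 60
genotypes = rng.integers(0, 3, size=(n_samples, 200)).astype(float)
expression = rng.normal(size=(n_samples, 50))
phenotype = expression[:, :5].sum(axis=1) + rng.normal(size=n_samples)

sims = [similarity_matrix(genotypes), similarity_matrix(expression)]
train_idx = np.arange(45)
test_idx = np.arange(45, n_samples)

predictions = kriging_predict(
    sims, [0.3, 0.4], phenotype[train_idx], train_idx, test_idx
)
```

When every omic weight is zero, the composite matrix reduces to the identity and each prediction falls back to the training mean, which makes a convenient sanity check. In practice the weights would be chosen by cross-validation or estimated from the data rather than fixed by hand.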

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Bayes Theorem
Case-Control Studies
Cell Growth Processes / genetics
Cholesterol, LDL / blood
Computational Biology / methods*
Genetic Predisposition to Disease / genetics*
Humans
MicroRNAs / genetics
Models, Genetic
Multifactorial Inheritance / genetics*
Phenotype
RNA, Messenger / genetics
Simvastatin / pharmacology
Software
Systems Biology / methods
Time Factors

Substances

Cholesterol, LDL
MicroRNAs
RNA, Messenger
Simvastatin
