Building a genome analysis pipeline to predict disease risk and prevent disease

J Mol Biol. 2013 Nov 1;425(21):3993-4005. doi: 10.1016/j.jmb.2013.07.038. Epub 2013 Aug 5.


Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (the set of all expressed phenotypes) encoded by each individual variome (the full set of genome variants) in the context of a given environment. Numerous tools exist for computationally identifying the functional effects of a single variant. However, pipelines that take advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at building methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges of incorporating such a pipeline into everyday medical practice.

Keywords: CNV; ENCODE; Encyclopedia of DNA Elements; GWAS; HGP; Human Genome Project; InDel; NGS; SNP; WES; WGS; copy number variation; genome-wide association study; genotype-phenotype relationships; insertion/deletion variant; medical genomics; next-generation sequencing; single-nucleotide polymorphism; variant burden; variant calling; whole-exome sequencing; whole-genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Genetic Predisposition to Disease*
  • Genome, Human*
  • Humans
  • Mutation*
  • Risk
  • Sequence Analysis, DNA / methods*