Use of the l1 norm for selection of sparse parameter sets that accurately predict drug response phenotype from viral genetic sequences

AMIA Annu Symp Proc. 2005;2005:505-9.

Abstract

We describe the use of the l1 norm to select a sparse set of model parameters for predicting viral drug response from genetic sequence data of the Human Immunodeficiency Virus (HIV) reverse-transcriptase enzyme. We discuss the use of the l1 norm in the Least Absolute Shrinkage and Selection Operator (LASSO) regression model and in the Support Vector Machine model. When tested by cross-validation against laboratory measurements, these models predict viral phenotype, or resistance, in response to Reverse-Transcriptase Inhibitors (RTIs) more accurately than other known models. The l1 norm is the most selective convex penalty function: it sets a large proportion of the parameters to zero, and its convexity assures that a single optimal solution will be found for a given model formulation and training data set. A statistical model that reliably predicts viral drug response is an important tool in the selection of Anti-Retroviral Therapy. These techniques apply generally to modeling phenotype from complex genetic data.
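The sparsity effect of the l1 norm described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it is a generic LASSO solver (cyclic coordinate descent with soft-thresholding) run on synthetic data. The binary "mutation indicator" matrix, the three informative positions, the noise level, and the penalty weight `lam` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X w||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-feature curvature of the squared loss
    residual = y - X @ w
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # feature never varies; leave its weight at zero
            # Correlation of feature j with the residual, adding back w[j]'s own contribution.
            rho = X[:, j] @ residual / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (w[j] - w_new)  # keep residual consistent with w
            w[j] = w_new
    return w


# Synthetic stand-in for sequence data (an assumption, not the paper's data set):
# each column marks the presence (1) or absence (0) of a mutation at one position.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
X = rng.integers(0, 2, size=(n_samples, n_features)).astype(float)

# Only three hypothetical positions influence the simulated phenotype.
true_w = np.zeros(n_features)
true_w[[3, 17, 41]] = [2.0, -1.5, 1.0]
y = X @ true_w + 0.1 * rng.standard_normal(n_samples)

# Center the data so that no intercept term is needed.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

w = lasso_coordinate_descent(Xc, yc, lam=0.1)
selected = np.flatnonzero(w)
print("nonzero parameters:", selected.size, "of", n_features)
print("selected positions:", selected.tolist())
```

The soft-thresholding step is where the l1 penalty sets parameters to exactly zero rather than merely shrinking them. In practice `lam` would be chosen by cross-validation, as the abstract describes for model evaluation.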

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Anti-Retroviral Agents / therapeutic use*
  • Decision Trees
  • Drug Resistance, Viral / genetics*
  • Expert Systems
  • HIV Reverse Transcriptase / genetics*
  • HIV-1 / genetics*
  • Humans
  • Models, Statistical*
  • Mutation
  • Phenotype
  • Regression Analysis

Substances

  • Anti-Retroviral Agents
  • HIV Reverse Transcriptase