Haplotype reconstruction from SNP fragments by minimum error correction

Bioinformatics. 2005 May 15;21(10):2456-62. doi: 10.1093/bioinformatics/bti352. Epub 2005 Feb 24.


Motivation: Haplotype reconstruction from aligned single nucleotide polymorphism (SNP) fragments aims to infer a pair of haplotypes from localized polymorphism data gathered through short genome fragment assembly. An important computational model of this problem is the minimum error correction (MEC) model, which has been discussed in several publications. The model retrieves a pair of haplotypes by correcting the minimum number of SNPs in the given genome fragments, which come from an individual's DNA.
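To make the MEC objective concrete, the following is a minimal sketch, not the exact algorithm of the paper. It assumes biallelic SNPs encoded as '0'/'1', a '-' for positions a fragment does not cover, and fragments already aligned to the same SNP sites. The function names (distance, mec_cost, brute_force_mec) are illustrative. The search enumerates every haplotype pair, so it is only usable for a handful of SNP sites.

```python
from itertools import product


def distance(fragment, haplotype):
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)


def mec_cost(fragments, h1, h2):
    """Number of SNP corrections needed if each fragment is assigned
    to whichever of the two haplotypes it matches best."""
    return sum(min(distance(f, h1), distance(f, h2)) for f in fragments)


def brute_force_mec(fragments):
    """Exhaustively search all unordered haplotype pairs (4^n growth in n sites).

    Returns (cost, h1, h2) for the first pair found with minimum cost.
    """
    n = len(fragments[0])
    candidates = [''.join(bits) for bits in product('01', repeat=n)]
    best = None
    for i, h1 in enumerate(candidates):
        for h2 in candidates[i:]:
            cost = mec_cost(fragments, h1, h2)
            if best is None or cost < best[0]:
                best = (cost, h1, h2)
    return best


# Hypothetical toy input: five fragments over four SNP sites,
# the last one carrying a single sequencing error.
fragments = ["0011", "001-", "-100", "1100", "1101"]
result = brute_force_mec(fragments)  # (1, '0011', '1100')
```

On this toy input one correction (the last allele of the fragment "1101") is enough to make all fragments consistent with the pair 0011/1100.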

Results: In the first part of this paper, an exact algorithm for the MEC model is presented. Because the MEC model is NP-hard, we also design a genetic algorithm (GA). The GA is intended for large problem instances and performs very well. The strengths and weaknesses of the MEC model are shown using experimental results on real and simulated data. In the second part of the paper, to improve the MEC model for haplotype reconstruction, we propose a new computational model that also uses the individual's genotype information during SNP correction; it is called MEC with genotype information (MEC/GI for short). Computational results on extensive datasets show that the new model has much higher accuracy in haplotype reconstruction than the pure MEC model.
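One way to read the MEC/GI idea is as the same correction count minimized only over haplotype pairs that are compatible with the genotype. The sketch below illustrates that reading under stated assumptions and is not the paper's formulation or algorithm. The genotype encoding ('0' and '1' for homozygous sites, '2' for heterozygous sites), the '-' gap symbol, and the function names are all hypothetical.

```python
from itertools import product


def mismatches(fragment, haplotype):
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)


def compatible_pairs(genotype):
    """Yield every ordered haplotype pair consistent with the genotype.

    Homozygous sites ('0' or '1') fix both haplotypes; at each
    heterozygous site ('2') the two haplotypes carry opposite alleles.
    """
    het = [i for i, g in enumerate(genotype) if g == '2']
    for bits in product('01', repeat=len(het)):
        h1 = list(genotype)
        h2 = list(genotype)
        for i, b in zip(het, bits):
            h1[i] = b
            h2[i] = '1' if b == '0' else '0'
        yield ''.join(h1), ''.join(h2)


def mec_gi(fragments, genotype):
    """Return (cost, h1, h2): the first genotype-compatible pair found
    that needs the fewest SNP corrections."""
    best = None
    for h1, h2 in compatible_pairs(genotype):
        cost = sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)
        if best is None or cost < best[0]:
            best = (cost, h1, h2)
    return best


# Hypothetical toy input: sites 0-2 heterozygous, site 3 homozygous for allele 0.
fragments = ["0010", "001-", "-100", "1100", "1101"]
genotype = "2220"
result = mec_gi(fragments, genotype)  # (1, '0010', '1100')
```

The genotype restricts the search to 2^k pairs for k heterozygous sites and rules out reconstructions that fit the fragments but contradict the known genotype, such as a haplotype carrying allele 1 at the homozygous site in this example.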

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Chromosome Mapping / methods*
  • Computer Simulation
  • DNA Mutational Analysis / methods*
  • Genome, Human
  • Haplotypes / genetics*
  • Humans
  • Models, Genetic*
  • Models, Statistical
  • Phylogeny
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*