T-Coffee: A novel method for fast and accurate multiple sequence alignment

C Notredame; D G Higgins; J Heringa

doi:10.1006/jmbi.2000.4042

J Mol Biol. 2000 Sep 8;302(1):205-17. doi: 10.1006/jmbi.2000.4042.

Authors

C Notredame¹, D G Higgins, J Heringa

Affiliation

¹ National Institute for Medical Research, The Ridgeway, London, NW7 1AA, UK. cedric.notredame@europe.com

PMID: 10964570

Abstract

We describe a new method (T-Coffee) for multiple sequence alignment that provides a dramatic improvement in accuracy with a modest sacrifice in speed as compared to the most commonly used alternatives. The method is broadly based on the popular progressive approach to multiple alignment but avoids the most serious pitfalls caused by the greedy nature of this algorithm. With T-Coffee we pre-process a data set of all pair-wise alignments between the sequences. This provides us with a library of alignment information that can be used to guide the progressive alignment. Intermediate alignments are then based not only on the sequences to be aligned next but also on how all of the sequences align with each other. This alignment information can be derived from heterogeneous sources such as a mixture of alignment programs and/or structure superposition. Here, we illustrate the power of the approach by using a combination of local and global pair-wise alignments to generate the library. The resulting alignments are significantly more reliable, as determined by comparison with a set of 141 test cases, than any of the popular alternatives that we tried. The improvement, especially clear with the more difficult test cases, is always visible, regardless of the phylogenetic spread of the sequences in the tests.
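
The library idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`aligned_pairs`, `percent_identity`, `build_library`), the tuple layout of the library keys, and the choice of percent identity as the weight of each pair-wise alignment are assumptions made for illustration. The sketch also assumes that every alignment row starts at the first residue of its sequence, which a real local alignment generally would not. It only shows how residue pairs from heterogeneous sources (here one global and one local alignment of the same two sequences) could be pooled into one weighted library.

```python
from collections import defaultdict


def aligned_pairs(row_a, row_b):
    """Yield (i, j) residue indices matched in a gapped pair-wise alignment.

    Assumes both rows have equal length, use '-' for gaps, and start at
    residue 0 of their sequence (an illustrative simplification).
    """
    i = j = 0
    for a, b in zip(row_a, row_b):
        if a != "-" and b != "-":
            yield i, j
        if a != "-":
            i += 1
        if b != "-":
            j += 1


def percent_identity(row_a, row_b):
    """Percent identity over the gap-free columns of the alignment."""
    matched = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not matched:
        return 0.0
    same = sum(1 for a, b in matched if a == b)
    return 100.0 * same / len(matched)


def build_library(alignments):
    """Pool residue pairs from several pair-wise alignments.

    `alignments` is an iterable of (name_a, row_a, name_b, row_b).
    Each aligned residue pair receives the weight of the alignment it
    came from; pairs supported by several sources accumulate weight.
    """
    library = defaultdict(float)
    for name_a, row_a, name_b, row_b in alignments:
        weight = percent_identity(row_a, row_b)  # assumed weighting scheme
        for i, j in aligned_pairs(row_a, row_b):
            library[(name_a, i, name_b, j)] += weight
    return library


# Hypothetical input: a global and a local alignment of the same two sequences.
example = [
    ("s1", "ACGT", "s2", "AC-T"),  # stands in for a global alignment
    ("s1", "ACG", "s2", "ACT"),    # stands in for a local alignment
]
library = build_library(example)
```

In this toy input, the residue pairs found by both alignments end up with a higher weight than pairs found by only one of them, which is the kind of signal a progressive alignment could then consult when deciding how to align the next sequences.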

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Amino Acid Motifs
Amino Acid Sequence
Animals
Computational Biology / methods*
Databases as Topic
Humans
Molecular Sequence Data
Protein Serine-Threonine Kinases / chemistry
Reproducibility of Results
Sensitivity and Specificity
Sequence Alignment / methods*
Sequence Homology, Amino Acid
Software

Substances

Protein Serine-Threonine Kinases