Quickly finding orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: how much do we miss?

PLoS One. 2014 Jul 11;9(7):e101850. doi: 10.1371/journal.pone.0101850. eCollection 2014.

Abstract

Reciprocal Best Hits (RBHs) are a common proxy for orthology in comparative genomics. Essentially, an RBH is found when the proteins encoded by two genes, each in a different genome, find each other as the best-scoring match in the other genome. NCBI's BLAST is the software most commonly used for the sequence comparisons needed to find RBHs. Since sequence comparison can be time consuming, we decided to compare the number and quality of RBHs detected using algorithms that run in a fraction of the time required by BLAST. We tested BLAT, LAST and UBLAST. All three programs ran in between one hundredth and one twenty-fifth of the time required to run BLAST. A reduction in the number of homologs and RBHs found by the faster algorithms, compared to BLAST, becomes apparent as the genomes compared become more dissimilar. BLAT, a program optimized for quickly finding very similar sequences, missed both the most homologs and the most RBHs. Although LAST came closest to BLAST in the number of homologs and RBHs detected, UBLAST was very close: between dissimilar genomes, each of the two programs found between 0.6 and 0.8 of the RBHs found by BLAST, while in more similar genomes the differences were barely apparent. UBLAST ran faster than LAST, making it the best option among the programs tested.
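The RBH criterion described above can be illustrated with a minimal Python sketch. It assumes the hits from each search direction have already been parsed into (query, subject, score) tuples. The function names, the toy identifiers and the scores are hypothetical, and the tie-breaking rule (first hit seen wins) is an assumption. The abstract does not state the thresholds or filters used in the study, so none are applied here.

```python
from typing import Dict, Iterable, List, Tuple

# A hit is (query_id, subject_id, score); higher score means better match.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject.

    Assumption: on a tied score, the first hit encountered is kept.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )


# Toy data (made up for illustration): genome A proteins searched against
# genome B, and genome B proteins searched against genome A.
a_vs_b = [
    ("a1", "b1", 200.0),
    ("a1", "b2", 150.0),
    ("a2", "b2", 90.0),
    ("a3", "b1", 120.0),
]
b_vs_a = [
    ("b1", "a1", 198.0),
    ("b2", "a1", 140.0),
    ("b2", "a2", 95.0),
]

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

In this toy input only a1 and b1 pick each other as best match. The best hit of a2 is b2, but b2 scores higher against a1, so that pair is not reciprocal. The same function applies unchanged to hits produced by any of the search programs, as long as the scores within one search direction are comparable.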

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Genomics / methods*

Grant support

Research supported by Wilfrid Laurier University and by a Discovery grant to GMH from the Natural Sciences and Engineering Research Council of Canada (NSERC). Wilfrid Laurier University provided funds for equipment. The NSERC Discovery grant provided funds for equipment and publication fees. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.