Choosing BLAST options for better detection of orthologs as reciprocal best hits
- PMID: 18042555
- DOI: 10.1093/bioinformatics/btm585
Abstract
Motivation: The analysis of the increasing number of genome sequences requires shortcuts for detecting orthologs, such as Reciprocal Best Hits (RBH), where two genes, each in a different genome, are assumed to be orthologs if each finds the other as its best hit in the other genome. Two BLAST options seem to affect alignment scores the most, and thus the choice of a best hit: the filtering of low-information sequence segments and the algorithm used to produce the final alignment. We therefore tested whether these options would help detect orthologs better.
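The RBH criterion described above can be illustrated with a minimal Python sketch. It assumes that BLAST hits have already been parsed into (query, subject, bit score) tuples; the function names and the toy gene identifiers are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed input shape: one tuple per BLAST hit (query id, subject id, bit score).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject.

    On ties, the first hit encountered is kept (an arbitrary choice for this sketch).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Toy data with hypothetical gene identifiers.
genome_a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
                 ("a2", "b2", 90.0), ("a3", "b1", 50.0)]
genome_b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 80.0), ("b2", "a2", 95.0)]

print(sorted(reciprocal_best_hits(genome_a_vs_b, genome_b_vs_a)))
```

In the toy data, a3 finds b1 as its best hit, but b1's best hit is a1, so a3 is not paired. Because the best hit is chosen by score, any option that changes alignment scores can change which pairs qualify.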
Results: Using Escherichia coli K12 as an example, we compared the number and quality of orthologs detected as RBH. We tested four different conditions derived from two options: filtering of low-information segments, hard (default) versus soft; and alignment algorithm, default (based on matching words) versus Smith-Waterman. All options resulted in significant differences in the number of orthologs detected, with the highest numbers obtained with the combination of soft filtering with Smith-Waterman alignments. We compared these results with those of Reciprocal Shortest Distances (RSD), which is supposed to be superior to RBH because it uses an evolutionary measure of distance, rather than BLAST statistics, to rank homologs and thus detect orthologs. RSD barely increased the number of orthologs detected over those found with RBH. Error estimates, based on analyses of conservation of gene order, found small differences in the quality of orthologs detected using RBH. However, RSD showed the highest error rates. Thus, RSD has no advantage over RBH.
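The four conditions (hard or soft filtering, crossed with default or Smith-Waterman final alignments) can be sketched as command lines. The abstract does not give the exact commands used, so the following is only an assumption-laden illustration: it maps the conditions onto the NCBI BLAST+ `blastp` options `-seg`, `-soft_masking` and `-use_sw_tback`, which may differ from the BLAST version and flags used in the study. The file and database names are hypothetical.

```python
from typing import List


def blastp_command(query: str, db: str, out: str,
                   soft_filter: bool, smith_waterman: bool) -> List[str]:
    """Build a blastp argument list for one filtering/alignment condition.

    Assumes BLAST+ flags; this is not the command line reported by the authors.
    """
    cmd = [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",        # tabular output, easy to parse for best hits
        "-seg", "yes",         # mask low-complexity segments with SEG
        # soft masking: mask only during word lookup, not in the final alignment
        "-soft_masking", "true" if soft_filter else "false",
    ]
    if smith_waterman:
        cmd.append("-use_sw_tback")  # Smith-Waterman for the final alignment
    return cmd


# The four conditions, as (soft_filter, smith_waterman).
conditions = {
    "hard_default": (False, False),
    "soft_default": (True, False),
    "hard_sw": (False, True),
    "soft_sw": (True, True),
}

for name, (soft, sw) in conditions.items():
    # Hypothetical file names, for illustration only.
    args = blastp_command("genomeA.faa", "genomeB", f"A_vs_B.{name}.tsv", soft, sw)
    print(" ".join(args))
```

Each command would be run in both directions (A against B and B against A) before applying the RBH criterion to the parsed hits.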
Availability: Orthologs detected as Reciprocal Best Hits using soft masking and Smith-Waterman alignments can be downloaded from http://popolvuh.wlu.ca/Orthologs.
