Domain enhanced lookup time accelerated BLAST
- PMID: 22510480
- PMCID: PMC3438057
- DOI: 10.1186/1745-6150-7-12
Abstract
Background: BLAST is a commonly used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch.
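To make the idea of a PSSM concrete, the following is a minimal sketch of how a log-odds score matrix could be derived from a set of aligned matches. It is not the PSI-BLAST or DELTA-BLAST procedure: it assumes gapless aligned sequences of equal length, a uniform background frequency, and a fixed pseudocount, whereas real implementations use sequence weighting and empirically derived background and pseudocount frequencies. The names `build_pssm` and `score` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build a toy PSSM: one dict of log-odds scores per alignment column.

    Assumptions (not from the paper): gapless equal-length sequences,
    uniform background frequencies, and a flat pseudocount per residue.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Count the residues observed in this column across all matches.
        counts = Counter(seq[col] for seq in aligned if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Positive score: residue is enriched relative to background.
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, segment):
    """Sum the column scores of a segment aligned to the PSSM without gaps."""
    return sum(column[aa] for column, aa in zip(pssm, segment))
```

In an iterated search, a matrix of this kind built from the matches of round i would replace the generic substitution matrix when scoring database sequences in round i + 1.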
Results: We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI's Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST.
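The ROC5000 figure above is an instance of the ROC_n score: the area under the curve of true positives against false positives, truncated at the n-th false positive and normalized to lie between 0 and 1. A minimal sketch of that calculation follows. The function name `roc_n` and the handling of result lists with fewer than n false positives are assumptions for illustration; the abstract does not describe how the paper pools or truncates results.

```python
def roc_n(ranked_labels, n, total_true=None):
    """Compute ROC_n for a ranked hit list (best hit first).

    ranked_labels: iterable of bools, True for a true positive.
    n: number of false positives at which the curve is truncated.
    total_true: number of true positives that could be found; defaults
        to the number present in the list (an assumption).
    """
    labels = list(ranked_labels)
    if total_true is None:
        total_true = sum(labels)
    if n <= 0 or total_true <= 0:
        raise ValueError("n and total_true must be positive")

    true_seen = 0
    false_seen = 0
    area = 0
    for is_true in labels:
        if is_true:
            true_seen += 1
        else:
            false_seen += 1
            # Each false positive adds the true positives ranked above it.
            area += true_seen
            if false_seen == n:
                break

    # Assumption: if the list holds fewer than n false positives, the
    # missing ones are treated as ranked below every hit in the list.
    area += (n - false_seen) * true_seen
    return area / (n * total_true)
```

A perfect ranking, with every true positive ahead of the first false positive, scores 1.0; for example, `roc_n([True, True, False, False], 2)` returns 1.0 and `roc_n([False, True, False, True], 2)` returns 0.25.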
Conclusions: DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the "Protein BLAST" link at http://blast.ncbi.nlm.nih.gov.
