Applications and statistics for multiple high-scoring segments in molecular sequences

S Karlin; S F Altschul

doi:10.1073/pnas.90.12.5873

Proc Natl Acad Sci U S A. 1993 Jun 15;90(12):5873-7. doi: 10.1073/pnas.90.12.5873.

Authors

S Karlin¹, S F Altschul

Affiliation

¹ Department of Mathematics, Stanford University, CA 94305.

Abstract

Score-based measures of molecular-sequence features provide versatile aids for the study of proteins and DNA. They are used by many sequence data base search programs, as well as for identifying distinctive properties of single sequences. For any such measure, it is important to know what can be expected to occur purely by chance. The statistical distribution of high-scoring segments has been described elsewhere. However, molecular sequences will frequently yield several high-scoring segments for which some combined assessment is in order. This paper describes the statistical distribution for the sum of the scores of multiple high-scoring segments and illustrates its application to the identification of possible transmembrane segments and the evaluation of sequence similarity.
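To make the idea of "the sum of the scores of multiple high-scoring segments" concrete, the following is a minimal sketch and not the procedure from the paper. It assumes each sequence position already has a numeric score (for example, a hydrophobicity value when looking for transmembrane segments), and it picks disjoint segments greedily: take the best-scoring segment, block it, and repeat. The names `best_segment` and `top_segments` and the example scores are hypothetical, and the sketch does not compute the statistical significance of the resulting sum.

```python
from typing import List, Optional, Tuple

# (start, end_exclusive, score); hypothetical representation for this sketch.
Segment = Tuple[int, int, float]


def best_segment(scores: List[float], blocked: List[bool]) -> Optional[Segment]:
    """Return the highest-scoring contiguous run of unblocked positions.

    This is a Kadane-style scan; only segments with a positive total
    score are reported, so None means nothing positive is left.
    """
    best: Optional[Segment] = None
    run_start = 0
    run_sum = 0.0
    for i, s in enumerate(scores):
        if blocked[i]:
            # A blocked position ends the current run.
            run_sum = 0.0
            run_start = i + 1
            continue
        if run_sum <= 0.0:
            # A non-positive prefix never helps, so restart here.
            run_start = i
            run_sum = 0.0
        run_sum += s
        if run_sum > 0.0 and (best is None or run_sum > best[2]):
            best = (run_start, i + 1, run_sum)
    return best


def top_segments(scores: List[float], r: int) -> List[Segment]:
    """Greedily collect up to r disjoint high-scoring segments.

    Assumption: greedy selection with blocking is a simplification
    chosen for illustration, not the definition used in the paper.
    """
    blocked = [False] * len(scores)
    found: List[Segment] = []
    for _ in range(r):
        seg = best_segment(scores, blocked)
        if seg is None:
            break
        found.append(seg)
        for i in range(seg[0], seg[1]):
            blocked[i] = True
    return found


# Made-up per-position scores, for illustration only.
example_scores = [1, -2, 3, 2, -4, -1, 2, 2, -3, 1]
segments = top_segments(example_scores, 2)
combined_score = sum(score for _, _, score in segments)
```

Here `combined_score` is the quantity whose chance distribution one would need in order to judge whether several moderately scoring segments are jointly significant.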

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Amino Acid Sequence*
Animals
Antithrombin III / genetics
Base Sequence*
Biological Evolution
Chickens
DNA*
Drosophila / genetics
Drosophila Proteins*
Eye Proteins / genetics
Fowlpox virus / genetics
Humans
Membrane Glycoproteins / genetics
Molecular Sequence Data
Probability
Proteins*
Receptor Protein-Tyrosine Kinases*
Receptors, Cell Surface / genetics
Receptors, Serotonin / genetics
Sequence Analysis*
Sequence Homology, Amino Acid*

Substances

Drosophila Proteins
Eye Proteins
Membrane Glycoproteins
Proteins
Receptors, Cell Surface
Receptors, Serotonin
Antithrombin III
DNA
Receptor Protein-Tyrosine Kinases
sev protein, Drosophila
