Prediction of complete gene structures in human genomic DNA
- PMID: 9149143
- DOI: 10.1006/jmbi.1997.0951
Abstract
We introduce a general probabilistic model of the gene structure of human genomic sequences which incorporates descriptions of the basic transcriptional, translational and splicing signals, as well as length distributions and compositional features of exons, introns and intergenic regions. Distinct sets of model parameters are derived to account for the many substantial differences in gene density and structure observed in distinct C + G compositional regions of the human genome. In addition, new models of the donor and acceptor splice signals are described which capture potentially important dependencies between signal positions. The model is applied to the problem of gene identification in a computer program, GENSCAN, which identifies complete exon/intron structures of genes in genomic DNA. Novel features of the program include the capacity to predict multiple genes in a sequence, to deal with partial as well as complete genes, and to predict consistent sets of genes occurring on either or both DNA strands. GENSCAN is shown to have substantially higher accuracy than existing methods when tested on standardized sets of human and vertebrate genes, with 75 to 80% of exons identified exactly. The program is also capable of indicating fairly accurately the reliability of each predicted exon. Consistently high levels of accuracy are observed for sequences of differing C + G content and for distinct groups of vertebrates.
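The phrase "exons identified exactly" can be made concrete with a small sketch. The Python below is an illustrative assumption, not GENSCAN's code or the paper's evaluation protocol: it counts a predicted exon as correct only when its strand and both boundaries match an annotated exon. The function name `exon_level_accuracy`, the `(strand, start, end)` tuple representation, and the toy coordinates are all hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical exon representation: (strand, start, end), inclusive coordinates.
Exon = Tuple[str, int, int]


def exon_level_accuracy(
    predicted: Iterable[Exon], annotated: Iterable[Exon]
) -> Tuple[float, float]:
    """Return (fraction of annotated exons found exactly,
    fraction of predicted exons that are exactly correct).

    An exon counts as a match only if strand, start, and end all agree.
    """
    pred = set(predicted)
    true = set(annotated)
    exact = pred & true  # exons with identical strand and boundaries
    frac_annotated_found = len(exact) / len(true) if true else 0.0
    frac_predicted_correct = len(exact) / len(pred) if pred else 0.0
    return frac_annotated_found, frac_predicted_correct


# Toy data (made up for illustration only).
annotated = [("+", 100, 200), ("+", 300, 450), ("+", 600, 720), ("-", 900, 1000)]
predicted = [
    ("+", 100, 200),    # exact match
    ("+", 300, 440),    # wrong end boundary: not counted as exact
    ("+", 600, 720),    # exact match
    ("-", 900, 1000),   # exact match on the reverse strand
    ("+", 1200, 1300),  # predicted exon with no annotated counterpart
]

found, correct = exon_level_accuracy(predicted, annotated)
print(f"annotated exons found exactly: {found:.2f}")
print(f"predicted exons exactly correct: {correct:.2f}")
```

In this toy case three of the four annotated exons are recovered exactly and three of the five predictions are exactly correct. An exon that overlaps the annotation but misses one boundary contributes to neither count, which is what makes the exact-match criterion strict.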
