AceView: a comprehensive cDNA-supported gene and transcripts annotation
- PMID: 16925834
- PMCID: PMC1810549
- DOI: 10.1186/gb-2006-7-s1-s12
Abstract
Background: Regions covering one percent of the genome, selected by ENCODE for extensive analysis, were annotated by the HAVANA/Gencode group with high quality transcripts, thus defining a benchmark. The ENCODE Genome Annotation Assessment Project (EGASP) competition aimed at reproducing Gencode and finding new genes. The organizers evaluated the protein predictions in depth. We present a complementary analysis of the mRNAs, including alternative transcript variants.
Results: We evaluate 25 gene tracks from the University of California Santa Cruz (UCSC) genome browser. We either distinguish or collapse the alternative splice variants, and compare the genomic coordinates of exons, introns and nucleotides. Whole mRNA models, seen as chains of introns, are sorted to find the best matching pairs, and compared so that each mRNA is used only once. At the mRNA level, AceView is by far the closest to Gencode: the vast majority of transcripts of the two methods, including alternative variants, are identical. At the protein level, however, due to a lack of experimental data, our predictions differ: Gencode annotates proteins in only 41% of the mRNAs whereas AceView does so in virtually all. We describe the driving principles of AceView, and how, by performing hand-supervised automatic annotation, we solve the combinatorial splicing problem and summarize all of GenBank, dbEST and RefSeq into a genome-wide non-redundant but comprehensive cDNA-supported transcriptome. AceView accuracy is now validated by Gencode.
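The mRNA-level comparison described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the greedy pairing rule, and the toy coordinates are all assumptions made for illustration. Each transcript is reduced to its chain of introns, candidate pairs are sorted so that identical chains come first, and each mRNA is paired at most once.

```python
from itertools import product


def intron_chain(exons):
    """Return the introns implied by a list of (start, end) exons.

    Simplification (assumed here): an intron is recorded as
    (end of upstream exon, start of downstream exon).
    """
    exons = sorted(exons)
    return tuple((up_end, down_start)
                 for (_, up_end), (down_start, _) in zip(exons, exons[1:]))


def match_transcripts(set_a, set_b):
    """Greedy one-to-one pairing of transcripts by shared introns.

    Hypothetical scoring: identical intron chains rank first, then
    pairs sharing the most introns. Each transcript is used only once.
    """
    chains_a = {name: set(intron_chain(ex)) for name, ex in set_a.items()}
    chains_b = {name: set(intron_chain(ex)) for name, ex in set_b.items()}

    scored = []
    for a, b in product(chains_a, chains_b):
        shared = len(chains_a[a] & chains_b[b])
        if shared:  # pairs with no intron in common are never matched
            identical = chains_a[a] == chains_b[b]
            scored.append((identical, shared, a, b))
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)

    used_a, used_b, pairs = set(), set(), []
    for identical, _, a, b in scored:
        if a in used_a or b in used_b:
            continue
        used_a.add(a)
        used_b.add(b)
        pairs.append((a, b, identical))
    return pairs


# Toy data (invented coordinates, not taken from the paper).
gencode = {
    "G1": [(100, 200), (300, 400), (500, 600)],
    "G2": [(100, 200), (500, 600)],
}
aceview = {
    "A1": [(90, 200), (300, 400), (500, 650)],   # same introns as G1
    "A2": [(100, 200), (300, 420), (500, 600)],  # shares one intron with G1
}
pairs = match_transcripts(gencode, aceview)
```

In this toy case G1 and A1 differ in their transcript ends but have the same intron chain, so they pair as identical. A2 remains unpaired because G1 is already used, and G2 shares no intron with either AceView model.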
Conclusion: Relative to a consensus mRNA catalog constructed from all evidence-based annotations, Gencode and AceView have 81% and 84% sensitivity, and 74% and 73% specificity, respectively. This close agreement validates a richer view of the human transcriptome, with three to five times more transcripts than in UCSC Known Genes (sensitivity 28%), RefSeq (sensitivity 21%) or Ensembl (sensitivity 19%).
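The abstract does not define sensitivity and specificity. A minimal sketch follows, assuming the convention common in gene-prediction evaluation: sensitivity is the fraction of the reference catalog that a track recovers, and specificity is the fraction of the track's models found in the reference. The function name and the identifiers in the example are hypothetical.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) of a track against a reference.

    Assumed definitions (not stated in the abstract):
      sensitivity = matched / size of reference
      specificity = matched / size of prediction set
    """
    predicted, reference = set(predicted), set(reference)
    matched = len(predicted & reference)
    sensitivity = matched / len(reference) if reference else 0.0
    specificity = matched / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Invented identifiers standing in for matched mRNA models.
consensus = {"c1", "c2", "c3", "c4", "c5"}
track = {"c1", "c2", "c3", "c4", "x1"}
sens, spec = sensitivity_specificity(track, consensus)
```

Here the track recovers four of the five consensus models, and four of its five models are in the consensus, so both values are 0.8.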
