Nucleic Acids Res. 2015 Jan;43(Database issue):D174-80.
doi: 10.1093/nar/gku1060. Epub 2014 Nov 5.

An Update on LNCipedia: A Database for Annotated Human lncRNA Sequences

Free PMC article

Pieter-Jan Volders et al. Nucleic Acids Res. 2015.

Erratum in

Abstract

The human genome is pervasively transcribed, producing thousands of non-coding RNA transcripts. The majority of these transcripts are long non-coding RNAs (lncRNAs), and novel lncRNA genes are being identified at a rapid pace. To streamline these efforts, we created LNCipedia, an online repository of lncRNA transcripts and annotation. Here, we present LNCipedia 3.0 (http://www.lncipedia.org), the latest version of the publicly available human lncRNA database. Compared to the previous version of LNCipedia, the database has grown more than fivefold, gaining over 90,000 new lncRNA transcripts. Assessment of the protein-coding potential of LNCipedia entries is improved with state-of-the-art methods that include large-scale reprocessing of publicly available proteomics data. As a result, a high-confidence set of lncRNA transcripts with low coding potential is defined and made available for download. In addition, a tool to assess lncRNA gene conservation between human, mouse and zebrafish has been implemented.

Figures

Figure 1.
LNCipedia has grown substantially since its first release. The first version (41) was based on sequences and annotation from three different sources and was made available to the public in 2012. For the 2013 release of LNCipedia (unpublished), no additional sources were used, but the existing sources were updated to their most recent versions. For version 3.0 of LNCipedia, new sources were added and existing sources were updated.
Figure 2.
Many lncRNA loci are conserved in mouse or zebrafish. Locus conservation is a novel tool to determine the orthologous locus of a human lncRNA in another species. When the order of the flanking protein-coding genes is conserved in another species, the lncRNA locus is considered conserved. The majority of the loci conserved in zebrafish are also conserved in mouse; this fraction is depicted in gray.
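The flanking-gene rule in this caption can be illustrated with a minimal sketch. It is not the LNCipedia implementation: the ortholog table, the per-chromosome gene order, the gene names and the choice to treat "order conserved" as "the two orthologs are direct neighbours, in either orientation" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def locus_is_conserved(
    flanking: Tuple[str, str],
    orthologs: Dict[str, str],
    target_gene_order: Dict[str, List[str]],
) -> bool:
    """Check whether the orthologs of the two flanking protein-coding genes
    are direct neighbours on one chromosome of the target species.

    Assumption: adjacency in either orientation counts as conserved order.
    """
    left, right = flanking
    left_orth = orthologs.get(left)
    right_orth = orthologs.get(right)
    # Without an ortholog for both flanking genes, no call can be made.
    if left_orth is None or right_orth is None:
        return False
    for genes in target_gene_order.values():
        if left_orth in genes and right_orth in genes:
            # Neighbouring positions in the protein-coding gene order.
            return abs(genes.index(left_orth) - genes.index(right_orth)) == 1
    # The orthologs lie on different chromosomes (or are not annotated).
    return False


# Hypothetical data: human gene -> mouse ortholog, and mouse gene order.
example_orthologs = {"GENE_A": "Gene_a", "GENE_B": "Gene_b", "GENE_C": "Gene_c"}
example_mouse_order = {
    "chr1": ["Gene_a", "Gene_b", "Gene_x"],
    "chr2": ["Gene_c"],
}

conserved = locus_is_conserved(
    ("GENE_A", "GENE_B"), example_orthologs, example_mouse_order
)
not_conserved = locus_is_conserved(
    ("GENE_A", "GENE_C"), example_orthologs, example_mouse_order
)
```

In this toy example the first locus is called conserved because both orthologs are neighbours on the same chromosome, while the second is not because the orthologs sit on different chromosomes.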
Figure 3.
Different methods suggest contamination of lncRNA data sets with coding sequences. (a) PhyloCSF benchmarking and score distributions. The score distributions of coding and non-coding transcripts in the Ensembl data set differ considerably. In addition, while the great majority of LNCipedia is presumably non-coding, it also contains a fraction of transcripts with a PhyloCSF score in the coding range. (b) Transcripts with a translation initiation site (TIS) have a significantly higher PhyloCSF score (Mann–Whitney U test) compared to other transcripts. (c) Several public lncRNA resources suffer from considerable contamination with protein-coding sequences. The percentage of transcripts with a PhyloCSF score greater than 41 is shown for the different sources in LNCipedia 3.0. Two sources already filtered with PhyloCSF are depicted in gray. In the case of RefSeq, only entries with property “biomol_ncrna_lncrna” were considered.
Figure 4.
Transcripts with likely coding potential are removed to define the high-confidence set. Transcripts with small ORFs (25), a TIS (24), a PhyloCSF score greater than 41, or peptide-spectrum matches (PSMs) with an identification confidence higher than 90% are excluded.
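The exclusion criteria in this caption amount to a simple filter, sketched below. Only the two cutoffs (PhyloCSF score greater than 41, PSM confidence higher than 90%) come from the caption; the `Transcript` record, its field names and the example identifiers are hypothetical and do not reflect the LNCipedia schema.

```python
from dataclasses import dataclass
from typing import Iterable, List

PHYLOCSF_CUTOFF = 41.0        # caption: scores greater than 41 are excluded
PSM_CONFIDENCE_CUTOFF = 0.90  # caption: PSM confidence higher than 90% is excluded


@dataclass
class Transcript:
    """Hypothetical per-transcript evidence record (assumed field names)."""
    transcript_id: str
    has_small_orf: bool        # overlaps a reported small ORF
    has_tis: bool              # overlaps a translation initiation site
    phylocsf_score: float
    max_psm_confidence: float  # best PSM confidence, 0.0 if no PSM maps


def is_high_confidence(t: Transcript) -> bool:
    """A transcript is kept only if none of the four criteria flags it."""
    return not (
        t.has_small_orf
        or t.has_tis
        or t.phylocsf_score > PHYLOCSF_CUTOFF
        or t.max_psm_confidence > PSM_CONFIDENCE_CUTOFF
    )


def high_confidence_set(transcripts: Iterable[Transcript]) -> List[Transcript]:
    """Return the transcripts that pass every exclusion criterion."""
    return [t for t in transcripts if is_high_confidence(t)]


# Made-up example records, one per exclusion criterion plus one clean entry.
examples = [
    Transcript("tx_clean", False, False, -120.0, 0.0),
    Transcript("tx_small_orf", True, False, -50.0, 0.0),
    Transcript("tx_tis", False, True, 5.0, 0.0),
    Transcript("tx_phylocsf", False, False, 88.5, 0.0),
    Transcript("tx_psm", False, False, -10.0, 0.97),
]
kept_ids = [t.transcript_id for t in high_confidence_set(examples)]
```

Because the criteria are combined with a logical OR, a single line of coding evidence is enough to drop a transcript, which makes the resulting set conservative.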


References

    1. Mercer T., Dinger M. Long non-coding RNAs: insights into functions. Nat. Rev. Genet. 2009;10:155–159.
    2. Gupta R.A., Shah N., Wang K.C., Kim J., Horlings H.M., Wong D.J., Tsai M.-C., Hung T., Argani P., Rinn J.L., et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010;464:1071–1076.
    3. Margueron R., Reinberg D. The Polycomb complex PRC2 and its mark in life. Nature. 2011;469:343–349.
    4. Tsai M.-C., Manor O., Wan Y., Mosammaparast N., Wang J.K., Lan F., Shi Y., Segal E., Chang H.Y. Long noncoding RNA as modular scaffold of histone modification complexes. Science. 2010;329:689–693.
    5. Cesana M., Cacchiarelli D., Legnini I., Santini T., Sthandier O., Chinappi M., Tramontano A., Bozzoni I. A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell. 2011;147:358–369.
