Long non-coding RNA identification over mouse brain development by integrative modeling of chromatin and genomic features

Nucleic Acids Res. 2013 Dec;41(22):10044-61. doi: 10.1093/nar/gkt818. Epub 2013 Sep 13.

Abstract

In silico prediction of genomic long non-coding RNAs (lncRNAs) is a prerequisite for constructing and elucidating non-coding regulatory networks. Chromatin modifications deposited by chromatin regulators are important epigenetic features that can be captured by prevailing high-throughput approaches such as ChIP sequencing. We demonstrate that the accuracy of lncRNA prediction can be greatly improved by incorporating high-throughput chromatin modification data, spanning mouse embryonic stem cell differentiation toward adult cerebellum, into a logistic regression model with LASSO regularization. The discriminating features include H3K9me3, H3K27ac, H3K4me1, open reading frames and several repeat elements. Importantly, chromatin information is suggested to be complementary to genomic sequence information, highlighting the importance of an integrated model. Applying the integrated model, we obtain a list of putative lncRNAs from uncharacterized fragments of a transcriptome assembly. Using expression and Gene Ontology enrichment analysis, we demonstrate that the putative lncRNAs have regulatory roles in the vicinity of known gene loci. We also show that lncRNA expression specificity can be efficiently modeled by chromatin data from the same developmental stage. The study not only supports the biological hypothesis that chromatin can regulate the expression of tissue-specific or developmental stage-specific lncRNAs but also reveals the features that discriminate lncRNAs from coding genes, which should guide further lncRNA identification and characterization.
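
As an illustration of the modeling idea named in the abstract, the following is a minimal sketch of logistic regression with a LASSO (L1) penalty, fitted by proximal gradient descent. It is not the authors' implementation: the solver, the toy data generator, the feature meanings (a chromatin-mark signal, an ORF score, an uninformative feature), and all function names and parameter values are assumptions made for this example.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam=0.05, step=0.5, n_iter=500):
    """Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    The intercept b is not penalized. Returns (weights, intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then L1 shrinkage.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n=200, seed=0):
    """Synthetic data only (assumption): label 1 = lncRNA, 0 = coding."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        mark = rng.gauss(0, 1)   # hypothetical chromatin-mark signal
        orf = rng.gauss(0, 1)    # hypothetical ORF score
        noise = rng.gauss(0, 1)  # uninformative feature
        label = 1 if mark - orf + 0.3 * rng.gauss(0, 1) > 0 else 0
        X.append([mark, orf, noise])
        y.append(label)
    return X, y


X, y = make_toy_data()
weights, intercept = fit_lasso_logistic(X, y, lam=0.05)
print(dict(zip(["mark", "orf", "noise"], weights)))
```

Because the L1 penalty shrinks weights through soft-thresholding, coefficients of uninformative features tend to be driven to exactly zero, which is what makes a LASSO fit usable for picking out discriminating features. Larger values of `lam` give sparser models.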

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Brain / embryology
  • Brain / growth & development
  • Brain / metabolism*
  • Cell Differentiation
  • Chromatin / metabolism*
  • Embryonic Stem Cells / cytology
  • Embryonic Stem Cells / metabolism
  • Gene Expression Regulation, Developmental
  • Genomics
  • Logistic Models
  • Mice
  • RNA, Long Noncoding / genetics
  • RNA, Long Noncoding / metabolism*
  • RNA, Long Noncoding / physiology

Substances

  • Chromatin
  • RNA, Long Noncoding