Long non-coding RNAs (long ncRNAs) have recently been shown to be expressed from a subset of enhancers and to be required for the distant regulation of gene expression. Several approaches to predicting enhancers have been developed, based on various chromatin marks and the occupancy of enhancer-binding proteins. Despite rapid advances in the field, no consensus yet exists on how to define tissue-specific enhancers. Here, we identify 2,695 long ncRNAs annotated by ENCODE (corresponding to 28% of all ENCODE-annotated long ncRNAs) that overlap tissue-specific enhancers. To predict tissue-specific enhancers we use PreSTIGE, a recently developed algorithm based on the H3K4me1 mark and the tissue-specific expression of mRNAs. Expression of the long ncRNAs overlapping enhancers is significantly higher in cell lines where the enhancer is predicted to be active, suggesting a general interdependency between active enhancers and long ncRNA expression. This dependency is not detected with previous enhancer prediction algorithms, which do not account for the expression of the enhancers' downstream targets. The predicted enhancers that overlap annotated long ncRNAs generally have a lower ratio of H3K4me1 to H3K4me3, suggesting that enhancers expressing long ncRNAs might be associated with specific epigenetic marks. In conclusion, we demonstrate the tissue-specific predictive power of PreSTIGE and provide evidence for thousands of long ncRNAs that are expressed from active tissue-specific enhancers, suggesting a particularly important functional relationship between long ncRNAs and enhancer activity in determining tissue-specific gene expression.
Keywords: Enhancer; activating long non-coding RNA; enhancer prediction; long non-coding RNA; tissue-specific enhancer.