Comparative Study
Philos Trans R Soc Lond B Biol Sci. 2013 Nov 11;368(1632):20130025.
doi: 10.1098/rstb.2013.0025. Print 2013 Dec 19.

Many Human Accelerated Regions Are Developmental Enhancers

Free PMC article

John A Capra et al. Philos Trans R Soc Lond B Biol Sci.

Abstract

The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology.

Keywords: development; enhancers; gene regulation; human accelerated regions; primate evolution.

Figures

Figure 1.
Overlap of different sets of non-coding human accelerated regions and their top enriched gene ontology (GO) biological process annotations. The independently defined sets of ncHARs considered in this study display only modest overlap. However, the functional annotations enriched in nearby genes compared with the genomic background share common themes of development (e.g. differentiation, proliferation and morphogenesis) and regulation. The GO biological process annotations enriched when all ncHARs are considered together show similar general patterns (table 1). The full sets of enriched GO biological process annotations for each set are given in electronic supplementary material, table S1. ‘HACNSs’ are human accelerated conserved non-coding sequences [17]; ‘ANCs’ are accelerated conserved non-coding sequences [16]; and ‘HARs’ are the non-coding subset of the HARs [4,15]. The 63 accelerated regions from Bush & Lahn [18] are not pictured here to aid clarity, and because they lacked significant GO biological process enrichment. (Online version in colour.)
Figure 2.
Genomic distribution of ncHARs compared with non-coding conserved elements. (a) The genomic distribution of ncHARs demonstrates that the vast majority are found in unannotated intronic or intergenic regions. ncHARs are more likely to be found in pseudogenes and intergenic regions than non-coding evolutionarily conserved elements. ncHARs overlapping pseudogenes may have an accelerated substitution rate on the human lineage owing to loss of negative selection. (b) The ncHARs are relatively short, with an average length of 257 nt and only seven regions longer than 1 kb. The non-coding conserved regions are significantly shorter than the ncHARs (p ≈ 0, Mann–Whitney U-test (MWU)), but this difference is likely driven by the greater power to detect recent acceleration in longer conserved elements. All non-coding conserved elements shorter than the shortest ncHAR (13 nt) were not considered in this plot. (c) ncHARs are found at a range of distances and orientations from the nearest TSS, with many more than 100 kb away and a few as distant as 2 Mb. The ncHARs are more distant from the nearest TSS than non-coding conserved regions (p ≈ 0, MWU test).
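The length and TSS-distance comparisons in this caption rely on the Mann–Whitney U-test. As a minimal sketch of how such a comparison works, the Python below computes the U statistic by pairwise counting and a two-sided p-value from the normal approximation (without tie correction). The function name `mann_whitney_u` and the two length lists are hypothetical and purely illustrative; they are not data or code from the study.

```python
import math

def mann_whitney_u(xs, ys):
    """Return (U, two-sided p) for samples xs and ys.

    U counts the (x, y) pairs with x > y, scoring ties as 0.5.
    The p-value uses the large-sample normal approximation
    without a tie correction (an assumption of this sketch).
    """
    n, m = len(xs), len(ys)
    u = sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p

# Hypothetical element lengths in nucleotides (made up for illustration).
nchar_lengths = [257, 310, 198, 420, 275, 600, 230, 350]
conserved_lengths = [40, 85, 120, 60, 150, 95, 30, 110]

u, p = mann_whitney_u(nchar_lengths, conserved_lengths)
print(f"U = {u}, two-sided p = {p:.4g}")
```

With samples as large as the 2649 ncHARs and the genome-wide conserved elements, the normal approximation is the usual choice, and very small p-values such as the reported p ≈ 0 are expected whenever the two distributions differ consistently.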
Figure 3.
Predicted ncHAR enhancers and their tissues of activity. (a) We applied the EnhancerFinder enhancer prediction pipeline to the 2649 ncHARs; 773 were predicted to be developmental enhancers. Among this set, EnhancerFinder predicted 251 brain enhancers, 194 limb enhancers and 39 heart enhancers. (b) We compared the predicted fraction of enhancers and tissue-specific enhancers between the ncHARs, the genome-wide non-coding background, and a filtered set of non-coding conserved regions (mammalian phastCons elements). The ncHARs are dramatically enriched for predicted enhancer, brain enhancer, and limb enhancer activity compared with the genomic background; however, they are not significantly different from the non-coding conserved regions. Supporting these predictions, validated ncHAR developmental enhancers (table 2 and electronic supplementary material, table S4) show a strong enrichment for brain activity, though this may reflect biases in how they were selected for experimental analysis. The relative lack of heart enhancers may reflect the fact that this tissue follows an earlier developmental trajectory than brain and limb. (Online version in colour.)
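The counts in panel (a) can be turned into proportions with simple arithmetic. The short Python sketch below uses only the numbers stated in the caption. The variable names are illustrative, and it makes no assumption about whether the tissue categories overlap or how the remaining predicted enhancers are classified.

```python
# Counts taken directly from the Figure 3 caption.
total_nchars = 2649
predicted_enhancers = 773
tissue_counts = {"brain": 251, "limb": 194, "heart": 39}

# Share of all ncHARs predicted to be developmental enhancers.
enhancer_fraction = predicted_enhancers / total_nchars

# Share of the predicted enhancers assigned to each tissue.
tissue_fractions = {
    tissue: count / predicted_enhancers
    for tissue, count in tissue_counts.items()
}

print(f"Predicted enhancers: {enhancer_fraction:.1%} of ncHARs")
for tissue, fraction in tissue_fractions.items():
    print(f"  {tissue}: {fraction:.1%} of predicted enhancers")
```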
Figure 4.
The 2xHAR.238 enhancer drives activity patterns in transgenic mice suggestive of brain expression differences between human and chimpanzee. (a) 2xHAR.238 is located on chromosome 2 and flanked by GLI2 and TFCP2L1. The alignment illustrates the human-specific substitutions in the ncHAR; bases matching the human sequence are shown as dots. A kilobase of sequence surrounding the ncHAR was cloned into a LacZ reporter construct. (b,c) One representative E11.5 transgenic mouse embryo is shown in two whole mount views plus two cross sections for the human sequence (b) and chimpanzee sequence (c). Both constructs produce consistent LacZ staining (blue) in the rostral dorsal pallium (arrowhead 1), the dorsal part of caudal hindbrain, and the rostral spinal cord. A second, more caudal, part of the dorsal pallium (arrowhead 2) shows activity unique to the chimpanzee sequence. The staining in the sectioned embryos shows human and chimpanzee enhancer activity in progenitor cells of the rostral dorsal pallium (domain 1), whereas only the chimpanzee enhancer has activity in the caudal dorsal pallium (progenitor cells and neurons; domain 2). The flanking gene, GLI2, is expressed in the cortex at E11.5 in mouse, and is thus a promising candidate target. (d) The activity patterns illustrated by the example images are consistent across embryos. All embryo images are given in electronic supplementary material, figure S1.
Figure 5.
The human and chimpanzee sequences for 2xHAR.114 drive different activity patterns in the developing limbs of transgenic mice. (a) 2xHAR.114 is located on chromosome 20 and flanked by MYLK2 and FOXS1. The organization and details of this figure are the same as in figure 4. (b) Both the human and chimpanzee sequence produce consistent staining in the limb (white triangles) and neural tube, as well as suggestive staining in the brain. The flanking genes are known to be involved in heart development. Additional embryo images are given in electronic supplementary material, figure S1. (c) The chimpanzee sequence consistently drives more extensive activity in the limb at E11.5. The mean fraction of the forelimb stained across all LacZ positive mouse embryos with the human construct was significantly lower than with the chimpanzee construct (p = 0.004; t-test). (d) Cross sections of mouse embryonic forelimbs showing the patterns of LacZ expression (blue) driven by the human and chimpanzee 2xHAR.114 enhancers. Both enhancers have limb mesenchyme activity, but the chimpanzee enhancer has a much larger domain of activity.
Figure 6.
Two ncHARs drive patterns suggestive of unique brain expression in human at the midbrain–hindbrain boundary in transgenic mice. The components of this figure are the same as in figure 4. (a) The genomic context, sequence alignment, and activity domains driven by human and chimpanzee 2xHAR.164 in E11.5 transgenic mice. Nearby developmental genes include LYPD1 and NCKAP5. Both human and chimpanzee sequences drive consistent activity in several brain structures, including the dorsal telencephalon, dorsal pretectum, roof plate of the diencephalon and midbrain, ventral diencephalon, midbrain and hindbrain. However, expression in the boundary between the midbrain and hindbrain (isthmus) is human-specific (arrowhead 1). The mouse orthologue of the nearby LYPD1 gene is expressed in the midbrain at E11.5. (b) 2xHAR.170 also produces human-specific activity in the isthmus (arrowhead 1). In addition, the chimpanzee construct drives strong spinal cord expression, whereas the human construct does not (arrowhead 2). The developmental gene HAND1 is a potential target gene of the 2xHAR.170 enhancer. All embryo images are given in the electronic supplementary material, figure S1.

References

    1. Chimpanzee Sequencing and Analysis Consortium 2005. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69–87 (doi:10.1038/nature04072)
    2. Jiang Z, Tang H, Ventura M, Cardone M, Marques-Bonet T, She X, Pevzner P, Eichler E. 2007. Ancestral reconstruction of segmental duplications reveals punctuated cores of human genome evolution. Nat. Genet. 39, 1361–1368 (doi:10.1038/ng.2007.9)
    3. Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. 2010. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (doi:10.1101/gr.097857.109)
    4. Lindblad-Toh K, et al. 2011. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (doi:10.1038/nature10530)
    5. Church DM, et al. 2009. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol. 7, e1000112 (doi:10.1371/journal.pbio.1000112)
