Proc Biol Sci. 2017 Oct 11;284(1864):20171357.
doi: 10.1098/rspb.2017.1357.

Novel and divergent genes in the evolution of placental mammals

Thomas L Dunwell et al. Proc Biol Sci. 2017.

Abstract

Analysis of genome sequences within a phylogenetic context can give insight into the mode and tempo of gene and protein evolution, including inference of gene ages. This can reveal whether new genes arose on particular evolutionary lineages and were recruited for new functional roles. Here, we apply MCL clustering with all-versus-all reciprocal BLASTP to identify and phylogenetically date 'Homology Groups' among vertebrate proteins. Homology Groups include new genes and highly divergent duplicate genes. Focusing on the origin of the placental mammals within the Eutheria, we identify 357 novel Homology Groups that arose on the stem lineage of Placentalia, 87 of which are deduced to play core roles in mammalian biology as judged by extensive retention in evolution. We find the human homologues of novel eutherian genes are enriched for expression in preimplantation embryo, brain, and testes, and enriched for functions in keratinization, reproductive development, and the immune system.
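As an illustration of the "reciprocal" part of an all-versus-all BLASTP comparison, the sketch below keeps only protein pairs that hit each other in both directions. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `reciprocal_pairs`, the `(query, subject)` tuple format, and the protein identifiers are hypothetical, and no score or e-value filtering is modelled. Pairs like these are the kind of edges that could then be passed to a clustering step such as MCL.

```python
def reciprocal_pairs(hits):
    """Return unordered protein pairs that hit each other in both directions.

    hits: iterable of (query, subject) tuples, one per BLASTP hit.
    Self-hits are ignored (an assumption made for this sketch).
    """
    directed = {(q, s) for q, s in hits if q != s}
    return {
        tuple(sorted((q, s)))
        for q, s in directed
        if (s, q) in directed
    }


# Hypothetical hits: p1 and p2 hit each other; p1 -> p3 is one-directional.
example_hits = [("p1", "p2"), ("p2", "p1"), ("p1", "p3"), ("p4", "p4")]
pairs = reciprocal_pairs(example_hits)
```

Here only the p1/p2 pair survives, because the p1 to p3 hit has no matching hit in the reverse direction.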

Keywords: Eutheria; MCL clustering; Placentalia; molecular evolution; new genes.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1.
Taxon sampling and phylogeny. The number of proteins listed for each species is the combined total from NCBI RefSeq and Ensembl protein predictions. Each of the four coloured columns represents an HG. The first two columns are hypothetical examples that would be classified as Novel Ancestral Placental HG, since they contain genes found in one member of the Atlantogenata and one of the Boreoeutheria. The last two columns are hypothetical examples of Novel Core Placental HG (a subset of Novel Ancestral Placental HG), being groups found in all, or all but one, placental mammals. ‘YES’ and ‘NO’ represent presence or absence of an HG in a species. (Online version in colour.)
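The classification rule in the Figure 1 caption can be sketched as a presence/absence test. This is a hedged illustration only: the species sets below are hypothetical stand-ins rather than the paper's taxon sampling, "one member" is read as "at least one member", and the requirement that a novel group has no members outside the placental mammals is an assumption of this sketch.

```python
# Hypothetical taxon sets; the real species list is the one shown in Figure 1.
ATLANTOGENATA = {"elephant", "armadillo"}
BOREOEUTHERIA = {"human", "mouse", "cow", "dog"}
PLACENTALS = ATLANTOGENATA | BOREOEUTHERIA


def classify_hg(species_present):
    """Classify a Homology Group (HG) from the set of species it occurs in."""
    present = set(species_present)
    # Assumption: a "novel" placental group has no members in other species.
    if not present <= PLACENTALS:
        return "not placental-specific"
    # Ancestral: at least one Atlantogenata and one Boreoeutheria member.
    if not (present & ATLANTOGENATA and present & BOREOEUTHERIA):
        return "not ancestral"
    # Core: present in all, or all but one, placental mammals.
    if len(PLACENTALS - present) <= 1:
        return "Novel Core Placental"
    return "Novel Ancestral Placental"
```

Because Novel Core Placental HG are a subset of Novel Ancestral Placental HG, the function reports the more specific label whenever both apply.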
Figure 2.
BLASTP/MCL pipeline and filtering steps for identifying Novel Ancestral Placental and Novel Core Placental HG. (Online version in colour.)
Figure 3.
Distribution of genes from Novel Ancestral Placental and Novel Core Placental HG across human chromosomes. The number of proteins in Novel Ancestral Placental and Novel Core Placental HG is shown per chromosome as a percentage of the total number of protein-coding genes on that chromosome that were present in our dataset. The total number of protein-coding genes per chromosome is plotted on the secondary axis. The significance of the adjusted p-value for the enrichment or depletion of the Novel Ancestral and Novel Core proteins per chromosome is shown in the grid below the histogram (*p-value < 0.05, **p-value < 5 × 10^-3, ***p-value < 5 × 10^-29).
Figure 4.
GO annotation and pathway enrichment. Genes from Novel Ancestral Placental and Novel Core Placental HG were assessed for enrichment for GO annotation terms and KEGG pathways. Spot size is proportional to the –log2 of the p-value when a value ≤0.05 was found; terms are ordered by significance of enrichment in Novel Ancestral genes. Term and pathway IDs are shown below the term names. (Online version in colour.)
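The spot-size scaling described in the Figure 4 caption can be written as a small function. This is a sketch under assumptions: the name `spot_size` and the `scale` factor are hypothetical, and returning `None` for non-significant terms is one possible way to represent "no spot drawn".

```python
import math


def spot_size(p_value, threshold=0.05, scale=1.0):
    """Return a spot size proportional to -log2(p), or None if p > threshold."""
    if p_value <= 0:
        raise ValueError("p-value must be positive")
    if p_value > threshold:
        return None  # no spot for terms above the significance cut-off
    return scale * -math.log2(p_value)
```

With this scaling, halving the p-value adds a constant increment to the spot size, so strongly enriched terms remain readable next to marginal ones.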
Figure 5.
Heatmap of normalized gene expression for 59 human cell types and tissues. Expression data are shown for 336 human genes from 249 Novel Ancestral Placental HG across 59 human cell types and tissues. Clustering is according to expression levels for each gene across all tissues and cell types after normalizing each gene's expression to the site of highest expression. Values are shown on a scale between 0 and 1. Individual selected tissue or cell type clusters are labelled on the left edge. The peach colour in the bar running the height of the heatmap identifies genes that belong only to a Novel Ancestral Placental HG; a subset is coloured green, identifying genes that also belong to a Novel Core Placental HG.
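The per-gene normalization described in the Figure 5 caption (each gene scaled to its site of highest expression, giving values between 0 and 1) can be sketched as follows. The function name, the dictionary layout, and the gene identifier are hypothetical, and leaving an unexpressed gene as all zeros is an assumption of this sketch.

```python
def normalize_to_max(expression):
    """Scale each gene's expression values by that gene's maximum.

    expression: dict mapping gene name -> list of values, one per tissue.
    """
    normalized = {}
    for gene, values in expression.items():
        peak = max(values)
        # Assumption: a gene with no detectable expression stays all zeros.
        normalized[gene] = [v / peak if peak > 0 else 0.0 for v in values]
    return normalized


# Hypothetical gene measured in three tissues.
scaled = normalize_to_max({"GENE_A": [2.0, 4.0, 1.0]})
```

After this step every expressed gene has a value of 1 in its highest-expressing tissue, so clustering reflects where a gene is expressed rather than how strongly.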
Figure 6.
Analysis of clustering and BLASTP results for Novel Core Placental HG. BLASTP interactions for all proteins within the 87 Novel Core Placental HG were analysed to determine to which, if any, other HG BLASTP hits were detectable. (a) BLASTP interactions between the 87 Novel Core Placental HG were assessed to identify which HG had reciprocal BLASTP hits between them. The diagonal line indicates reciprocal hits within an HG to itself. Off-diagonal squares indicate BLASTP interactions between two different Novel Core Placental HG. Black lines illustrate BLASTP interactions between clusters. Numbers 1–5 represent Sets 1–5 in electronic supplementary material, table S6, where more details of the interactions are shown. (b) BLASTP interactions between the 87 Novel Core Placental HG and all other HG. Black lines between (a) and (b) are used to illustrate selected examples of where hits were detected. The coloured bars below the plot indicate which species each HG in (b) is present in. At least 25% of the proteins in a Novel Core Placental HG were required to have BLASTP hits against another cluster for a BLASTP interaction to be considered relevant. (Online version in colour.)
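The 25% rule at the end of the Figure 6 caption can be illustrated with a short filter. This is a sketch under assumptions, not the authors' code: the function name, the input dictionaries, and the protein and HG identifiers are all hypothetical.

```python
from collections import Counter


def relevant_interactions(hg_id, members, hits, protein_to_hg, min_fraction=0.25):
    """Return the other HGs hit by at least min_fraction of this HG's proteins.

    members: list of protein ids in the HG named hg_id.
    hits: dict mapping protein id -> set of protein ids it hits by BLASTP.
    protein_to_hg: dict mapping protein id -> id of the HG it belongs to.
    """
    counts = Counter()
    for protein in members:
        # Distinct other HGs reached by this protein; hits within hg_id are skipped.
        targets = {
            protein_to_hg[h]
            for h in hits.get(protein, ())
            if h in protein_to_hg
        }
        targets.discard(hg_id)
        counts.update(targets)
    return {hg for hg, n in counts.items() if n / len(members) >= min_fraction}


# Hypothetical HG "A" with five proteins: two hit HG "B", only one hits HG "C".
members = ["a1", "a2", "a3", "a4", "a5"]
hits = {"a1": {"b1"}, "a2": {"b1", "c1", "a3"}}
protein_to_hg = {m: "A" for m in members} | {"b1": "B", "c1": "C"}
relevant = relevant_interactions("A", members, hits, protein_to_hg)
```

In this example 2 of 5 proteins (40%) hit HG "B", which passes the threshold, while 1 of 5 (20%) hit HG "C", which does not.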
Figure 7.
Methods of gene evolution. Selected Novel Ancestral Placental HG which contained a single protein were used to examine how selected HG may have been generated. The syntenic region surrounding the human gene was compared to the equivalent region in opossum. (a) CCER2 as an example of how a placental mammal protein-coding gene has diverged such that it is detected as substantially different to the copy of the gene found in non-placental mammals. (b) Tandem duplication of the CLPS loci as an example of how genes can undergo duplication and subsequent divergence, resulting in one or more of the duplicates diverging substantially from the original copy. (c) IL31 as an example of a gene present in humans but not present in the syntenic location in opossum. (d) Simplified representation of rearrangements surrounding SPZ1, as an example of how new genes can be associated with large-scale changes to chromosome structure. (Online version in colour.)
