Genome Biol Evol. 2012;4(9):883-99.
doi: 10.1093/gbe/evs061. Epub 2012 Jul 19.

Repeated evolution of identical domain architecture in metazoan netrin domain-containing proteins

Lucas Leclère et al. Genome Biol Evol. 2012.

Abstract

The majority of proteins in eukaryotes are composed of multiple domains, and the number and order of these domains are important determinants of protein function. Although multidomain proteins with a particular domain architecture were initially considered to have a common evolutionary origin, recent comparative studies of protein families or whole genomes have reported that a minority of multidomain proteins could have appeared multiple times independently. Here, we test this scenario in detail for the signaling molecules netrin and secreted frizzled-related proteins (sFRPs), two groups of netrin domain-containing proteins with essential roles in animal development. Our primary phylogenetic analyses suggest that the particular domain architectures of each of these proteins were present in the eumetazoan ancestor and evolved a second time independently within the metazoan lineage from laminin and frizzled proteins, respectively. Using an array of phylogenetic methods, statistical tests, and character sorting analyses, we show that the polyphyly of netrin and sFRP is well supported and cannot be explained by classical phylogenetic reconstruction artifacts. Despite their independent origins, the two groups of netrins and of sFRPs have the same protein interaction partners (Deleted in Colorectal Cancer/neogenin and Unc5 for netrins and Wnts for sFRPs) and similar developmental functions. Thus, these cases of convergent evolution emphasize the importance of domain architecture for protein function by uncoupling shared domain architecture from shared evolutionary history. We therefore propose the term merology to describe the repeated evolution of proteins with similar domain architecture and discuss the potential of merologous proteins to help in understanding protein evolution.

Figures

Fig. 1.
Phylogenetic analyses of the complete amino acid domain datasets support polyphyly of netrins and sFRPs. (A) Netrin domain maximum likelihood (ML) analysis under a WAG + Γ(8) + I model (111 aa, 101 sequences, −ln L 21377.15); (B) LamininNT-3EGF supra-domain ML analysis under a WAG + Γ(8) + I model (363 aa, 99 sequences, −ln L 50081.22); (C) Frizzled-CRD domain ML analysis under an LG + Γ(8) + I model (112 aa, 87 sequences, −ln L 10857.68). For deep branches, nonparametric bootstrap values BP (ML), from 500 replicates, are indicated on the left (A) or above the branches (B and C), and Bayesian posterior probabilities (PP) are indicated on the right or below the branches. Asterisks indicate branches with maximum support for both BP (ML) and PP. A dash indicates branches with BP (ML) < 50% and PP < 70%. (B) Values in parentheses correspond to BP (ML) and PP values from analyses without Amphimedon and Monosiga sequences. For other branches, a black dot indicates PP ≥ 90% and a yellow dot indicates PP ≥ 95% and BP (ML) ≥ 90%. The scale bar indicates the estimated number of substitutions per site. Consistent groupings of netrin and sFRP subfamilies in individual domain phylogenies are highlighted in red and green, respectively. (A–C) Domain compositions of proteins are sketched next to each subgroup and are oriented N- to C-terminal from top to bottom in A and from left to right in B and C. Netrin and sFRP protein sketches are drawn at twice the size of those for the other proteins. The first two letters of gene names in B and C correspond to the first letters of the genus and species names (see Materials and Methods).
Fig. 2.
Polyphyly of netrins and sFRPs is confirmed in reduced amino acid (A, C, and E) and nucleotide (B, D, and F) datasets. Unstable, fast-evolving, and outgroup sequences were excluded from the datasets before reanalysis. (A and B) Netrin domain ML analysis under WAG + Γ(8) + I (111 aa, 57 sequences, −ln L 11711.10) and GTR + Γ(8) + I (333 nt, 57 sequences, −ln L 19511.50) models; (C and D) LamininNT-3EGF supra-domain ML analysis under WAG + Γ(8) + I (363 aa, 61 sequences, −ln L 30275.53) and GTR + Γ(8) + I (1089 nt, 61 sequences, −ln L 58079.49) models; (E and F) frizzled-CRD domain ML analysis under LG + Γ(8) + I (112 aa, 56 sequences, −ln L 5969.88) and GTR + Γ(8) + I (1089 nt, 56 sequences, −ln L 13272.39) models. For deep branches, nonparametric bootstrap values BP (ML), from 500 replicates, and Bayesian PP are indicated above and below the branches, respectively. Asterisks indicate branches with maximum support for both BP (ML) and PP. A dash indicates branches with BP (ML) < 50% and PP < 70%. For other branches, PP ≥ 90% is indicated by a black dot, and PP ≥ 95% together with BP (ML) ≥ 90% is indicated by a yellow dot. The scale bar indicates the estimated number of substitutions per site.
Fig. 3.
Netrin, LamininNT-EGF, and frizzled-CRD domains display a significant level of substitution saturation. Estimation of the substitution saturation of the netrin (A), LamininNT-EGF (B), and frizzled-CRD (C) domains at the amino acid level (complete datasets) as a ratio between inferred (x axis) and observed (y axis) differences for each pair of sequences. Inferred numbers of substitutions between pairs of sequences were determined using parsimony on the best ML trees. White squares and grey diamonds represent netrin-1/2/3/5-netrin-4 and sFRP-1/2/5-sFRP-3/4 pairwise comparisons, respectively. Data points on the straight line X = Y correspond to completely unsaturated comparisons.
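The comparison plotted in Fig. 3 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the inferred substitution count are hypothetical, and the inferred count is simply supplied as an input here, whereas the caption states it was obtained by parsimony on the best ML trees. The sketch only shows how an observed pairwise difference relates to an inferred one, with a ratio of 1 corresponding to a point on the line X = Y.

```python
def observed_differences(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where two sequences differ, skipping gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a != "-" and b != "-"
    )


def saturation_ratio(observed: int, inferred: int) -> float:
    """Observed / inferred differences; 1.0 means no hidden substitutions."""
    if inferred == 0:
        # Assumption of this sketch: identical sequences count as unsaturated.
        return 1.0
    return observed / inferred


# Toy aligned fragments (made up for illustration, not from the paper).
seq_a = "MKTAYIAK"
seq_b = "MKSAYLAR"

observed = observed_differences(seq_a, seq_b)
inferred = 6  # hypothetical tree-based count of substitutions for this pair
ratio = saturation_ratio(observed, inferred)
```

A ratio well below 1, as in this toy pair, would place the point below the X = Y line, which is the pattern the caption describes as saturation.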
Fig. 4.
Distribution of the polyphyly versus monophyly signal for netrins and sFRPs. Differences in per-site log likelihood (Δpsln L) between unconstrained and constrained maximum likelihood analyses of (A) the LamininNT-EGF supra-domain, with netrin-1/2/3/5 + netrin-4 + netrin-G constrained as monophyletic; (B) the LamininNT-EGF and netrin domains, with netrin-1/2/3/5 + netrin-4 constrained as monophyletic; (C) the frizzled-CRD and netrin domains, with sFRP-1/2/5 + sFRP-3/4 constrained as monophyletic. The x axes correspond to the alignment columns along the complete amino acid matrices and the y axes correspond to the Δpsln L between unconstrained and constrained ML analyses. Sites with positive y axis values have a higher likelihood for the unconstrained topology, in which netrin or sFRP is polyphyletic, whereas sites with negative y axis values have a higher likelihood for the constrained topology, in which netrin or sFRP is monophyletic.
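The sign convention described for Δpsln L can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names and the per-site log-likelihood values are made up, and in practice the per-site values would come from the ML software used for the unconstrained and constrained analyses.

```python
from typing import Dict, List, Sequence


def per_site_delta(
    unconstrained: Sequence[float], constrained: Sequence[float]
) -> List[float]:
    """Per-site log-likelihood difference: unconstrained minus constrained."""
    if len(unconstrained) != len(constrained):
        raise ValueError("both analyses must cover the same alignment columns")
    return [u - c for u, c in zip(unconstrained, constrained)]


def summarize(deltas: Sequence[float]) -> Dict[str, float]:
    """Count sites favoring each topology and sum the differences."""
    return {
        # Positive values favor the unconstrained (polyphyletic) topology.
        "polyphyly_sites": sum(1 for d in deltas if d > 0),
        # Negative values favor the constrained (monophyletic) topology.
        "monophyly_sites": sum(1 for d in deltas if d < 0),
        "total": sum(deltas),
    }


# Hypothetical per-site log likelihoods for four alignment columns.
site_lnl_unconstrained = [-3.25, -5.5, -2.0, -4.75]
site_lnl_constrained = [-3.5, -5.25, -2.0, -5.75]

deltas = per_site_delta(site_lnl_unconstrained, site_lnl_constrained)
summary = summarize(deltas)
```

Summing the per-site differences gives the overall log-likelihood difference between the two topologies, while the per-site values show whether that difference is spread across the alignment or driven by a few columns.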
Fig. 5.
Polyphylies of netrins and sFRPs are supported by slow-evolving sites and are not caused by heterotachy in the ML analyses of the LamininNT-EGF (A, B, E, F, I, and J) and frizzled-CRD (C, D, G, H, K, and L) amino acid datasets. (A and C) Proportion of sites for each rate category, corresponding to the calculated number of steps in seven monophyletic groups using parsimony. For display purposes, each category contains two merged sequential values. (B and D) Cumulative difference in per-site log likelihood between unconstrained and constrained (B: netrin-1-4 monophyletic; D: sFRP-1/2/5-3/4 monophyletic) ML analyses for all sites within each rate category. (E and G) “Evolution” of the ML bootstrap support values (100 replicates) as fast-evolving sites are progressively removed from the original dataset; (E) 90% bootstrap support is indicated by a dotted line; (G) the “evolution” of the BP (ML) support value for sFRP monophyly is also indicated as slow-evolving sites are progressively removed from the original dataset. (F and H) Estimation of the mutational saturation as a ratio between inferred (x axis) and observed (y axis) differences for each pair of sequences in the LamininNT-EGF (F) and frizzled-CRD (H) datasets containing, respectively, the 30% and 50% slowest evolving sites. Data points on the straight line X = Y correspond to completely unsaturated comparisons. Data from the analyses of the 30% slowest evolving sites of the LamininNT-EGF dataset (in A, B, E, and F) and of the 50% slowest evolving sites of the frizzled-CRD dataset (in C, D, G, and H) are shaded. (I and K) Histogram of the absolute difference of steps per site calculated between the netrin-1-laminin-γ and netrin-4-laminin-β clades for the LamininNT-EGF dataset (I) and between the frizzled-5/8-frizzled-1/2/7-3/6-sFRP-3/4 and frizzled-4-frizzled-9/10-sFRP-1/2/5 clades for the frizzled-CRD dataset (K). (J and L) Cumulative difference in per-site log likelihood between unconstrained and constrained (netrin-1-4 monophyletic in J; sFRP-1/2/5-3/4 monophyletic in L) ML analyses for all sites within each “Δsteps per site” category. Data from the analyses of the 70% nonheterotachous sites of the LamininNT-EGF dataset (I and J) and of the 84% nonheterotachous sites of the frizzled-CRD dataset (K and L) are shaded.
Fig. 6.
Evolutionary scenario for the origin and evolution of netrins and sFRPs. Schematic representation of the expansion of (A) netrins and (B) sFRPs within one evolutionary lineage by both convergent domain shuffling and gene duplication. Note that the diversification of laminin and frizzled proteins in vertebrates and the origin and diversification of laminin-α, β/γ-like and netrin-G have been omitted.
