Genome Biol. 21 (1), 15

Gene Content Evolution in the Arthropods

Gregg W C Thomas, Elias Dohmen, Daniel S T Hughes, Shwetha C Murali, Monica Poelchau, Karl Glastad, Clare A Anstead, Nadia A Ayoub, Phillip Batterham, Michelle Bellair, Greta J Binford, Hsu Chao, Yolanda H Chen, Christopher Childers, Huyen Dinh, Harsha Vardhan Doddapaneni, Jian J Duan, Shannon Dugan, Lauren A Esposito, Markus Friedrich, Jessica Garb, Robin B Gasser, Michael A D Goodisman, Dawn E Gundersen-Rindal, Yi Han, Alfred M Handler, Masatsugu Hatakeyama, Lars Hering, Wayne B Hunter, Panagiotis Ioannidis, Joy C Jayaseelan, Divya Kalra, Abderrahman Khila, Pasi K Korhonen, Carol Eunmi Lee, Sandra L Lee, Yiyuan Li, Amelia R I Lindsey, Georg Mayer, Alistair P McGregor, Duane D McKenna, Bernhard Misof, Mala Munidasa, Monica Munoz-Torres, Donna M Muzny, Oliver Niehuis, Nkechinyere Osuji-Lacy, Subba R Palli, Kristen A Panfilio, Matthias Pechmann, Trent Perry, Ralph S Peters, Helen C Poynton, Nikola-Michael Prpic, Jiaxin Qu, Dorith Rotenberg, Coby Schal, Sean D Schoville, Erin D Scully, Evette Skinner, Daniel B Sloan, Richard Stouthamer, Michael R Strand, Nikolaus U Szucsich, Asela Wijeratne, Neil D Young, Eduardo E Zattara, Joshua B Benoit, Evgeny M Zdobnov, Michael E Pfrender, Kevin J Hackett, John H Werren, Kim C Worley, Richard A Gibbs, Ariel D Chipman, Robert M Waterhouse, Erich Bornberg-Bauer, Matthew W Hahn, Stephen Richards

Gregg W C Thomas et al. Genome Biol.


Background: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods.

Results: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality, and chemoperception.

Conclusions: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.

Keywords: Arthropods; DNA methylation; Evolution; Gene content; Genome assembly; Genomics; Protein domains.

Conflict of interest statement

The authors declare that they have no competing interests.


Fig. 1
OrthoDB orthology delineation for the i5K pilot species. The bars show Metazoa-level orthologs for the 76 selected arthropods and three outgroup species (of 13 outgroup species used for orthology analysis) partitioned according to their presence and copy number, sorted from the largest total gene counts to the smallest. The 28 i5K species generated in this study, with a total of 533,636 gene models, are indicated in bold green font. A total of 38,195 orthologous protein groups were annotated among the 76 genomes.
Fig. 2
Arthropod phylogeny inferred from 569 to 4097 single-copy protein-coding genes among the six multi-species orders, crustaceans, and non-spider chelicerates (Additional file 1: Table S13) and 150 single-copy genes for the orders represented by a single species and the deeper nodes. Divergence times were estimated with non-parametric rate smoothing and fossil calibrations at 22 nodes (Additional file 1: Table S14). Species in bold are those sequenced within the framework of the i5K pilot project. All nodes, except those indicated with red shapes, have bootstrap support of 100 inferred by ASTRAL. Nodes of particular interest are labeled in orange and referred to in the text. Larger fonts indicate multi-species orders enabling CAFE 3.0 likelihood analyses (see “Methods”). Nodes leading to major taxonomic groups have been labeled with their node number and the number of genes inferred at that point. See Additional file 2: Figure S16 and Additional file 1: Table S12 for full node labels.
Fig. 3
Summary of major results from gene family, protein domain, and methylation analyses. (a) We identify 147 gene families emerging during the evolution of insects, including several which may play an important role in insect development and adaptation. (b) In contrast, we find only ten emergent gene families during the evolution of holometabolous insects, indicating that many gene families were already present during this transition. (c) Among all lineage nodes, we find that the node leading to Lepidoptera has the most emergent gene families. (d) We find that rates of gene gain and loss are highly correlated with rates of protein domain rearrangement. Leafcutter ants have experienced high rates of both types of change. (e) Blattella germanica has experienced the highest number of rapid gene family changes, possibly indicating its ability to rapidly adapt to new environments. (f) We observe signals of CpG methylation in all Araneae (spiders) genomes investigated (species shown: the brown recluse spider, Loxosceles reclusa) and in the genome of the bark scorpion, Centruroides exilicauda. The two peaks show different CG counts in different gene features, with depletion of CG sequences in the left peak due to methylated C’s mutating to T. This suggests epigenetic control of a significant number of spider genes. Additional plots for all species in this study are shown in Additional file 2: Figure S5.
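The CG depletion signal described in panel f can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes the widely used observed/expected CpG ratio (CpG o/e) as the normalization, and the function name `cpg_obs_exp` is hypothetical. Features that have lost CG dinucleotides through methylated C mutating to T would be expected to score lower than unmethylated ones, which is what produces a two-peaked distribution across gene features.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one sequence (assumed normalization).

    Uses the simple form: (count of CG * length) / (count of C * count of G).
    Returns NaN when the sequence has no C or no G, since the expected
    CG count is then zero and the ratio is undefined.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return float("nan")
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    n_cg = seq.count("CG")
    return n_cg * len(seq) / (n_c * n_g)


# Toy sequences (made up for illustration, not real gene features).
cg_rich = "CGCGCGCG"      # every C is followed by G
cg_depleted = "CATGCATG"  # same C and G content per length unit, but no CG

print(cpg_obs_exp(cg_rich))      # 2.0
print(cpg_obs_exp(cg_depleted))  # 0.0
```

Computing this ratio per gene feature and plotting a histogram is one way such a bimodal pattern could be inspected; the thresholds separating the two peaks would have to be chosen per species.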
Fig. 4
Rate of genomic change along the arthropod phylogeny: (a) frequency of amino acid substitutions per site, (b) gene gains/losses, and (c) domain changes. All rates are averaged per million years (My) and indicated by branch color on the phylogenetic tree. Species names are shown on the right; specific subclades are highlighted by colors according to the taxonomic groups noted in Fig. 2.
