Gigascience, 8 (9)

High-coverage Genomes to Elucidate the Evolution of Penguins



Hailin Pan et al. Gigascience.


Background: Penguins (Sphenisciformes) are a remarkable order of flightless wing-propelled diving seabirds distributed widely across the southern hemisphere. They share a volant common ancestor with Procellariiformes close to the Cretaceous-Paleogene boundary (66 million years ago) and subsequently lost the ability to fly but enhanced their diving capabilities. With ∼20 species among 6 genera, penguins range from the tropical Galápagos Islands to the oceanic temperate forests of New Zealand, the rocky coastlines of the sub-Antarctic islands, and the sea ice around Antarctica. To inhabit such diverse and extreme environments, penguins evolved many physiological and morphological adaptations. However, they are also highly sensitive to climate change. Therefore, penguins provide an exciting target system for understanding the evolutionary processes of speciation, adaptation, and demography. Genomic data are an emerging resource for addressing questions about such processes.

Results: Here we present a novel dataset of 19 high-coverage genomes that, together with 2 previously published genomes, encompass all extant penguin species. We also present a well-supported phylogeny to clarify the relationships among penguins. In contrast to recent studies, our results demonstrate that the genus Aptenodytes is basal and sister to all other extant penguin genera, providing intriguing new insights into the adaptation of penguins to Antarctica. As such, our dataset provides a novel resource for understanding the evolutionary history of penguins as a clade, as well as the fine-scale relationships of individual penguin lineages. Against this background, we introduce a major consortium of international scientists dedicated to studying these genomes. Moreover, we highlight emerging issues regarding ensuring legal and respectful indigenous consultation, particularly for genomic data originating from New Zealand Taonga species.

Conclusions: We believe that our dataset and project will be important for understanding evolution, increasing cultural heritage and guiding the conservation of this iconic southern hemisphere species assemblage.

Keywords: Antarctica; Sphenisciformes; biogeography; climate change; comparative evolution; demography; evolution; genomics; phylogenetics; speciation.


Figure 1:
Locations of breeding colonies of penguins and sampling sites for the final genomes, adapted from Ksepka et al. [1]. Sampling locations are shown with a small white ellipse. Note that the sampling location of the Humboldt penguin (Spheniscus humboldti) is unclear because this individual was bred in the Copenhagen zoo, with ancestors imported from Peru and Chile in 1972. AMS: Amsterdam Island; ANT: Antipodes Islands; AUC: Auckland Islands; BOU: Bouvet; CAM: Campbell Island; CHA: Chatham Islands; CRZ: Crozet; FAL: Falkland Islands/Malvinas; GAL: Galápagos Islands; GOU: Gough Island; HEA: Heard Island; KER: Kerguelen; MAC: Macquarie Island; NZ: New Zealand; PEI: Prince Edward/Marion Island; SG: South Georgia; SNA: The Snares; SO: South Orkney Islands; SS: South Sandwich Islands.
Figure 2:
Genome assembly statistics of all penguin species. A, Dot plot of the quality of each index showing contig N50 (maximum is Eudyptes chrysolophus chrysolophus with 163,848 bp; minimum is Spheniscus humboldti with 19,849 bp) and scaffold N50 (maximum is Eudyptula novaehollandiae with 29,280,209 bp; minimum is Eudyptes robustus with 363,310 bp). Each symbol indicates a penguin species, the x-axis indicates the scaffold N50, and the y-axis indicates the contig N50 for each species. B, Genome size for each penguin species (maximum is Eudyptula minor with 1,466,686,831 bp; minimum is Eudyptes sclateri with 1,211,737,899 bp). C, BUSCO assessments of all penguin genomes, showing the percentage of complete, duplicated, fragmented, or missing data. See Table 3 for more details. The symbols for each penguin species correspond to the symbols used in Fig. 1 and Fig. 3.
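For readers unfamiliar with the N50 statistic reported in Figure 2A, the following is a minimal sketch of the standard definition: the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembly. It is not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (standard definition).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first length where at least half the assembly is covered.
        if 2 * running >= total:
            return length


# Toy contig lengths in bp (made-up values, not data from the study).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

The same function applies to scaffold lengths; a higher value indicates a more contiguous assembly, which is why the caption reports both contig and scaffold N50.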
Figure 3:
Phylogenomic reconstruction of penguins inferred by the ExaML method with no missing data. The topology of all clades was strongly supported (bootstrap support: 100). The topology and support were identical using the MP-EST and ASTRAL methods (with no missing data) except for the outgroup (bootstrap support for the split between Hydrobates tethys and Oceanites oceanicus: 37) and within the penguin genus Spheniscus (bootstrap support for the split between the African penguin [Spheniscus demersus] and the Magellanic penguin [S. magellanicus]: 97).



    1. Ksepka DT, Bertelli S, Giannini NP. The phylogeny of the living and fossil Sphenisciformes (penguins). Cladistics. 2006;22(5):412–41.
    2. Cole TL, Waters J, Shepherd LD, et al. Ancient DNA reveals that the ‘extinct’ Hunter Island penguin (Tasidyptes hunteri) is not a distinct taxon. Zool J Linn Soc. 2018;182(2):459–64.
    3. Cole TL, Ksepka DT, Mitchell KJ, et al. Mitogenomes uncover extinct penguin taxa and reveal island formation as a key driver of speciation. Mol Biol Evol. 2019;36(4):784–97.
    4. Challies CW, Burleigh RR. Abundance and breeding distribution of the white-flippered penguin (Eudyptula minor albosignata) on Banks Peninsula, New Zealand. Notornis. 2004;51(1):1–6.
    5. Grosser S, Rawlence NJ, Anderson CNK, et al. Invader or resident? Ancient-DNA reveals rapid species turnover in New Zealand little penguins. Proc Biol Sci. 2016;283(1824):20152879.
