Predicting phenotype from genotype represents the epitome of biological questions. Comparative genomics of appropriate model organisms holds the promise of making it possible. However, the high heterozygosity of many eukaryotes currently prohibits assembling their genomes. Here, we report the 376 Mb genome sequence of Papilio glaucus (Pgl), the first sequenced genome from the Papilionidae family. We obtained the genome from a wild-caught specimen using a cost-effective strategy that overcomes the high (2%) heterozygosity problem. Comparative analyses suggest the molecular bases of various phenotypic traits, including terpene production in the osmeterium, a Papilionidae-specific organ. Comparison of Pgl and Papilio canadensis (Pca) transcriptomes reveals mutation hotspots (4% of genes) associated with their divergence: four key circadian clock proteins are enriched in inter-species mutations and are likely responsible for the difference in pupal diapause. Finally, the Pgl genome confirms Papilio appalachiensis as a hybrid of Pgl and Pca, but suggests it inherited three-quarters of its genes from Pca.
Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.