Gigascience. 2020 Aug 1;9(8):giaa080.
doi: 10.1093/gigascience/giaa080.

Initial data release and announcement of the 10,000 Fish Genomes Project (Fish10K)

Guangyi Fan et al. Gigascience.

Abstract

Background: With more than 30,000 species, fish (including bony, jawless, and cartilaginous fish) are the largest vertebrate group and include some of the earliest vertebrates. Despite their critical roles in many ecosystems and in human society, fish genomics lags behind work on birds and mammals. This severely limits our understanding of evolution and hinders progress on the conservation and sustainable utilization of fish.

Results: Here, we announce the Fish10K project, a portion of the Earth BioGenome Project aiming to sequence 10,000 representative fish genomes in a systematic fashion within 10 years, and we officially welcome collaborators to join this effort. As a step towards this goal, we herein describe a feasible workflow for the procurement and storage of biospecimens, as well as sequencing and assembly strategies.

Conclusions: To illustrate, we present the genomes of 10 fish species from a cohort of 93 species chosen for technology development.

Keywords: Fish10K; evolution; fish; genome sequencing; phylogenetics; stLFR.

Conflict of interest statement

Some of the authors are employees of BGI Group. The authors otherwise declare that they have no competing interests.

Figures

Figure 1:
Assembly statistics of fish genomes in public databases. (a) Summary of genome size. (b, c) N50 statistics. A scaffold is a set of contigs linked together with gaps introduced in between. N50 is the length of the shortest sequence in the smallest set of longest contigs (or scaffolds) that together cover at least 50% of the total assembly length. It is a metric commonly used to evaluate the contiguity, and hence the quality, of a genome assembly.
Figure 2:
The sequencing and assembly strategies. In the preferred strategy (Strategy II), high-quality DNA fragments (≥40 kb) are used to construct an stLFR library, which is sequenced using the DNBSEQ platform. Low-depth long reads are used only to improve the continuity of highly complex regions (that is, to increase the contig N50). In the alternative Strategy I, high-depth long reads are used to construct contigs, while low-depth stLFR reads are used to polish the contigs and link them into scaffolds. Hi-C data are used to generate a chromosome-level assembly.
Figure 3:
The roadmap and organization of Fish10K. Fish10K is divided into 3 phases, based on the evolutionary relationship of fish, and 3 working groups (steering committee, scientific groups, and species groups).
Figure 4:
Phylogenetic tree of fish. Jawed vertebrates (gnathostomes) are divided into 2 major groups: cartilaginous fish (Chondrichthyes; in orange) and bony vertebrates (Osteichthyes; in blue and green). Bony fish are grouped into 2 subgroups: Sarcopterygii (green) and Actinopterygii (blue). The numbers of families and species in the 5 largest orders are labeled. The remaining 10 orders of bony fish (Caproiformes, Callionymiformes, Gobiesociformes, Icosteiformes, Lepisosteiformes, Moroniformes, Scombrolabraciformes, Scorpaeniformes, Trachichthyiformes, and Trachiniformes) and 2 orders of cartilaginous fish (Rhinopristiformes and Squatiniformes) are not included in the phylogenetic tree because of their uncertain positions.
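
The N50 statistic referred to in the Figure 1 and Figure 2 captions can be illustrated with a minimal Python sketch. The function name `n50` and the toy lengths below are assumptions made for this example; they are not taken from the Fish10K workflow or from any specific assembly tool. The sketch sorts sequence lengths from longest to shortest and returns the length at which the running total first reaches half of the assembly size.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    if not ordered:
        raise ValueError("at least one sequence length is required")

    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length

    # Not reached for non-empty input: the full sum always covers half.
    return ordered[-1]


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
```

For these toy lengths the total is 300, so the threshold is 150; the two longest sequences (80 and 70) reach it, giving an N50 of 70. The median length of the same set is 40, which shows that N50 is weighted by sequence length and generally differs from the median.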
