BMC Genomics. 2011 Jan 21;12:53.
doi: 10.1186/1471-2164-12-53.

Generation of Genome-Scale Gene-Associated SNPs in Catfish for the Construction of a High-Density SNP Array

Free PMC article

Shikai Liu et al. BMC Genomics.

Abstract

Background: Single nucleotide polymorphisms (SNPs) have become the marker of choice for genome-wide association studies. To provide the best genome coverage for the analysis of performance and production traits, a large number of relatively evenly distributed SNPs is needed. Gene-associated SNPs may fulfill these requirements of large numbers and genome-wide distribution. In addition, gene-associated SNPs could themselves be causative SNPs for traits. The objective of this project was to identify large numbers of gene-associated SNPs using high-throughput next-generation sequencing.

Results: Transcriptome sequencing was conducted for channel catfish and blue catfish using Illumina next-generation sequencing technology. Approximately 220 million reads (15.6 Gb) for channel catfish and 280 million reads (19.6 Gb) for blue catfish were obtained by sequencing gene transcripts derived from various tissues of multiple individuals from a diverse genetic background. In total, over 35 billion base pairs of expressed short-read sequences were generated. Over two million putative SNPs were identified from channel catfish and almost 2.5 million putative SNPs were identified from blue catfish. From these putative SNPs, a filtered set was identified, comprising 342,104 intra-specific SNPs for channel catfish, 366,269 intra-specific SNPs for blue catfish, and 420,727 inter-specific SNPs between channel catfish and blue catfish. These filtered SNPs are distributed within 16,562 unique genes in channel catfish and 17,423 unique genes in blue catfish.
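As a quick consistency check on the sequencing yields reported above, the following Python sketch sums the two per-species yields and derives the implied mean read length. The variable and function names are illustrative and do not come from the paper; the inputs are the approximate read counts and yields quoted in the abstract, so the results are approximate as well.

```python
# Approximate figures quoted in the abstract: (number of reads, number of bases).
yields = {
    "channel catfish": (220e6, 15.6e9),
    "blue catfish": (280e6, 19.6e9),
}

def mean_read_length(reads, bases):
    """Implied average read length in base pairs."""
    return bases / reads

# Combined yield across both species.
total_bases = sum(bases for _, bases in yields.values())
print(f"total: {total_bases / 1e9:.1f} Gb")

for species, (reads, bases) in yields.items():
    print(f"{species}: ~{mean_read_length(reads, bases):.0f} bp per read")
```

Under these rounded inputs the total comes to about 35.2 Gb, consistent with the "over 35 billion base pairs" stated above, and the implied read length is roughly 70 bp for both species.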

Conclusions: For aquaculture species, transcriptome analysis of pooled RNA samples from multiple individuals using Illumina sequencing technology is both technically efficient and cost-effective for generating expressed sequences. Such an approach is most effective when coupled with existing EST resources generated using traditional sequencing approaches, because the reference ESTs facilitate effective assembly of the expressed short reads. When multiple individuals with different genetic backgrounds are used, RNA-Seq is very effective for the identification of SNPs. The SNPs identified in this report will provide a much-needed resource for genetic studies in catfish and will contribute to the development of a high-density SNP array. Validation and testing of these SNPs using SNP arrays will form the material basis for genome association studies and whole genome-based selection in catfish.

Figures

Figure 1
Similarity of GO-term assignments for catfish and zebrafish genes. Proportions of GO-terms assigned to annotated contigs from the catfish assembly, compared with the proportions found in the zebrafish genome annotation. The comparison serves as an indicator of the extent to which the catfish transcriptome has been characterized.
Figure 2
Distribution of minor allele frequencies of SNPs identified for channel catfish, blue catfish and inter-species, as derived from analysis of sequence tags from the Illumina sequencing. A: Intra-specific SNPs in channel catfish; B: Intra-specific SNPs in blue catfish; C: Inter-specific SNPs between the two species. The X-axis represents the sequence-derived minor allele frequency of the SNPs (in percent), while the Y-axis represents the number of SNPs with a given minor allele frequency. Note that the majority of SNPs have minor allele frequencies greater than 15%.
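The sequence-derived minor allele frequency plotted in Figure 2 can be illustrated with a minimal sketch. It assumes that, at a biallelic SNP position, the minor allele frequency is the read count of the less frequent allele divided by the total read count for both alleles. The function name and the example counts are hypothetical, and the paper's own pipeline may compute this value differently.

```python
def minor_allele_frequency(count_a, count_b):
    """Return the minor allele frequency (in percent) from two allele read counts."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100.0 * min(count_a, count_b) / total

# Hypothetical read counts for one biallelic SNP: 30 reads carry A, 10 carry G.
print(minor_allele_frequency(30, 10))  # 25.0
```

By this definition the value can never exceed 50%, which is the upper bound of the X-axis range one would expect in such a plot.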
Figure 3
Comparative analysis of the genes containing SNPs on the 25 chromosomes of the zebrafish genome. Each of the 25 zebrafish chromosomes was laid out along the X-axis in intervals of one million base pairs, and the number of genes containing filtered SNPs that reside in each interval was plotted on the Y-axis.
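The per-interval counts described for Figure 3 amount to binning gene positions into one-million-base-pair windows on each chromosome. A minimal sketch follows, assuming each SNP-containing gene is represented by a (chromosome, start position) pair. The function name is hypothetical and the input records are made up for illustration; they are not data from the paper.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # one million base pairs per interval

def count_genes_per_bin(genes, bin_size=BIN_SIZE):
    """Count genes per (chromosome, interval index); genes are (chromosome, start) pairs."""
    return Counter((chrom, start // bin_size) for chrom, start in genes)

# Made-up gene positions, not data from the paper.
genes = [(1, 250_000), (1, 900_000), (1, 1_200_000), (2, 50_000)]
counts = count_genes_per_bin(genes)
for (chrom, interval), n in sorted(counts.items()):
    print(f"chr{chrom} interval {interval}: {n} gene(s)")
```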
Figure 4
Frequency of contigs of various sizes from the all-catfish reference assembly. The X-axis represents contig size (number of reads per contig). The curved line denotes the cumulative percentage of reads assembled. Note that a small number of very large contigs account for the majority of total reads. For instance, contigs with over 100,000 reads each make up less than 0.3% of the contigs but represent over 32% of all sequence reads assembled.
Figure 5
Distribution of filtered SNPs per contig. The histogram depicts the frequency of contigs with a given number of SNPs identified. Note that the majority of contigs have 5 or fewer SNPs.
Figure 6
Schematic presentation of the catfish transcriptome analysis.
