Genotype calling and mapping of multisite variants using an Atlantic salmon iSelect SNP array

Bioinformatics. 2011 Feb 1;27(3):303-10. doi: 10.1093/bioinformatics/btq673. Epub 2010 Dec 12.

Abstract

Motivation: Due to a genome duplication event in the recent history of salmonids, modern Atlantic salmon (Salmo salar) have a mosaic genome, roughly one-third of which is tetraploid. This complicates genotyping and genetic mapping, since polymorphisms within duplicated regions (multisite variants; MSVs) are challenging to call and to assign to the correct paralog. Standard genotyping software offered by Illumina was not written to interpret MSVs and will either fail or miscall these polymorphisms. For mapping, linkage or association studies in non-diploid species, there is a pressing need for software that analyzes MSVs in addition to regular single nucleotide polymorphism (SNP) markers.

Results: A software package is presented for the analysis of partially tetraploid genomes genotyped using Illumina Infinium BeadArrays (Illumina Inc.); it includes pre-processing, clustering, plotting and validation routines. More than 3000 salmon from an aquacultural strain in Norway, distributed among 266 full-sib families, were genotyped on a 15K BeadArray including both SNP and MSV markers. A total of 4268 SNPs and 1471 MSVs were identified, with average call accuracies of 0.97 and 0.86, respectively. A total of 150 MSVs polymorphic in both paralogs were dissected and mapped to their respective chromosomes, yielding insights into the salmon genome's reversion to diploidy and improving marker genome coverage. Several retained homologies were found and are reported.

Availability and implementation: The R package beadarrayMSV is freely available on the web at http://cran.r-project.org/.
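
The calling problem described under Motivation can be illustrated with a minimal Python sketch. It is not the clustering method of beadarrayMSV. It assumes idealized signals in which the B-allele share of total intensity sits at k/2 for a diploid SNP and at k/4 for a marker present in four copies, and the function names and the 0.08 tolerance are hypothetical choices made for illustration only.

```python
from typing import Optional


def b_allele_ratio(a_intensity: float, b_intensity: float) -> float:
    """Share of the total signal that comes from the B allele (0.0 to 1.0)."""
    total = a_intensity + b_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return b_intensity / total


def call_dosage(ratio: float, n_copies: int, max_dist: float = 0.08) -> Optional[int]:
    """Return the number of B alleles (0..n_copies), or None if uncallable.

    Assumption: ideal cluster centres lie at k / n_copies. The call is the
    nearest centre, rejected when the ratio is further than max_dist from it.
    """
    nearest = round(ratio * n_copies)
    nearest = min(max(nearest, 0), n_copies)  # clamp to the valid dosage range
    if abs(ratio - nearest / n_copies) > max_dist:
        return None
    return nearest


# A sample with roughly one B allele out of four copies (hypothetical intensities).
ratio = b_allele_ratio(a_intensity=730.0, b_intensity=270.0)
diploid_call = call_dosage(ratio, n_copies=2)     # three-cluster SNP model
tetraploid_call = call_dosage(ratio, n_copies=4)  # five-cluster MSV model
```

Under these assumptions, the sample falls between the diploid cluster centres and is left uncalled by the two-copy model, while the four-copy model assigns it a dosage of one B allele. Real array data are noisier and cluster centres drift from these ideal positions, which is why a dedicated clustering step is needed.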

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromosome Mapping / methods*
  • Genome
  • Genotype
  • Norway
  • Oligonucleotide Array Sequence Analysis*
  • Polymorphism, Single Nucleotide*
  • Reproducibility of Results
  • Salmo salar / genetics*
  • Software*