Genet Sel Evol. 2014 Sep 2;46(1):51.
doi: 10.1186/s12711-014-0051-y.

Quantitative analysis of low-density SNP data for parentage assignment and estimation of family contributions to pooled samples

John M Henshall et al. Genet Sel Evol.

Abstract

Background: While much attention has focused on the development of high-density single nucleotide polymorphism (SNP) assays, the costs of developing and running low-density assays have fallen dramatically. This makes it feasible to develop and apply SNP assays for agricultural species beyond the major livestock species. Although low-cost low-density assays may not have the accuracy of the high-density assays widely used in human and livestock species, we show that when combined with statistical analysis approaches that use quantitative instead of discrete genotypes, their utility may be improved. The data used in this study are from a 63-SNP marker Sequenom® iPLEX Platinum panel for the Black Tiger shrimp, for which high-density SNP assays are not currently available.

Results: For quantitative genotypes that could be estimated, in 5% of cases the most likely genotype for an individual at a SNP had a probability of less than 0.99. Matrix formulations of maximum likelihood equations for parentage assignment were developed for the quantitative genotypes and also for discrete genotypes perturbed by an assumed error term. Assignment rates that were based on maximum likelihood with quantitative genotypes were similar to those based on maximum likelihood with perturbed genotypes but, for more than 50% of cases, the two methods resulted in individuals being assigned to different families. Treating genotypes as quantitative values allows the same analysis framework to be used for pooled samples of DNA from multiple individuals. Resulting correlations between allele frequency estimates from pooled DNA and individual samples were consistently greater than 0.90, and as high as 0.97 for some pools. Estimates of family contributions to the pools based on quantitative genotypes in pooled DNA had a correlation of 0.85 with estimates of contributions from DNA-derived pedigree.

Conclusions: Even with low numbers of SNPs of variable quality, parentage testing and family assignment from pooled samples are sufficiently accurate to provide useful information for a breeding program. Treating genotypes as quantitative values is an alternative to perturbing genotypes using an assumed error distribution, but can produce very different results. An understanding of the distribution of the error is required for SNP genotyping platforms.
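
The idea of carrying quantitative genotypes into a likelihood-based parentage test can be illustrated with a minimal sketch. It is not the matrix formulation developed in the paper, which the abstract does not spell out. It assumes that each individual's genotype at a SNP is a probability vector over 0, 1 or 2 copies of one allele, that SNPs are independent, and that the alternative hypothesis is an unrelated individual in Hardy-Weinberg proportions. The function names (`transmission`, `trio_lod`) and all example numbers are hypothetical.

```python
import math

def transmission(g_sire, g_dam):
    """Mendelian probabilities that an offspring carries 0, 1 or 2 copies
    of the counted allele, given the parents' copy numbers."""
    a, b = g_sire / 2.0, g_dam / 2.0  # chance each parent passes the allele
    return [(1 - a) * (1 - b), a * (1 - b) + (1 - a) * b, a * b]

def trio_lod(sire, dam, child, freq):
    """LOD score at one SNP for a candidate sire-dam-offspring trio.

    sire, dam, child: [P(0 copies), P(1 copy), P(2 copies)] (assumed inputs)
    freq: population frequency of the counted allele (assumed known)
    Assumes neither likelihood is exactly zero, otherwise log10 fails.
    """
    related = 0.0
    for gs in range(3):
        for gd in range(3):
            t = transmission(gs, gd)
            for gc in range(3):
                # Weight each Mendelian outcome by the genotype probabilities.
                related += sire[gs] * dam[gd] * t[gc] * child[gc]
    # Likelihood of the offspring genotype if it is unrelated to both parents.
    hardy_weinberg = [(1 - freq) ** 2, 2 * freq * (1 - freq), freq ** 2]
    unrelated = sum(h * c for h, c in zip(hardy_weinberg, child))
    return math.log10(related / unrelated)

# Hypothetical genotype probabilities at a single SNP.
child = [0.01, 0.04, 0.95]
certain_two = [0.0, 0.0, 1.0]
certain_zero = [1.0, 0.0, 0.0]
lod_match = trio_lod(certain_two, certain_two, child, freq=0.5)
lod_mismatch = trio_lod(certain_zero, certain_zero, child, freq=0.5)
# Positive for the compatible parents, negative for the incompatible pair.
print(round(lod_match, 3), round(lod_mismatch, 3))
```

Under these assumptions, summing `trio_lod` over SNPs would give one LOD score per candidate sire-dam pair. Because the offspring's genotype is a probability vector rather than a hard call, an apparently incompatible pair is penalised rather than excluded outright.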

Figures

Figure 1
Variation between SNPs and effect of adjusting for uncertainty. Intensity is plotted against allelic proportion p (see Equation 1) for all individuals for four selected SNPs, using both the unadjusted areas (left hand side) and areas adjusted for the uncertainty associated with the area estimate as provided by the genotyping provider (right hand side); intensities are estimated as Euclidean distances from the origin to the data points in Cartesian coordinates; for the adjusted areas, the mean and standard deviation of the allelic proportion estimates (p) are provided.
Figure 2
Comparison of genotype probabilities for quantitative and perturbed genotypes. Perturbed genotype probabilities were estimated using either an estimated error (top three panels) or an assumed error (bottom three panels), and were only estimated when the quantitative genotype probability exceeded 0.98; the data in the left hand panels (A and D) are expanded in the centre (B and E) and right hand panels (C and F).
Figure 3
Comparison of LOD scores obtained with quantitative and perturbed genotypes. In the top two panels, the genotypes were perturbed (x-axis) using the estimated error rate, and in the bottom two panels, the genotypes were perturbed (x-axis) using the assumed error rate; the left hand panels contain LOD scores for all pedigrees, while in the right hand panels only LOD scores for the 10 most likely pedigrees for each progeny are plotted; these are coloured according to whether the pedigree appeared in the top 10 pedigrees obtained using perturbed genotypes, the top 10 pedigrees obtained using quantitative genotypes, or the top 10 pedigrees for both approaches.
Figure 4
Comparison of LOD scores for most likely pedigrees obtained with quantitative and perturbed genotypes. For each progeny, the LOD score of the most likely sire-dam family (x-axis) and δ, the difference between the LOD for the most likely and second most likely sire dam family (y-axis), are plotted; colours indicate whether quantitative or perturbed (estimated error or assumed error) genotypes were used.
Figure 5
Effect of constraining the pedigree. The maximum sire-dam-offspring trio LOD score is plotted for each of the 207 G10 progeny; on the y-axis, the maximum LOD score for the unrestricted pedigree is plotted, while on the x-axis the maximum LOD scores for the half-sib and full-sib pedigrees are plotted; the order of plotting is full-sibs followed by half-sibs; full-sib data points are masked by half-sib data points, except when the two differ.
Figure 6
Accuracy of allele frequencies estimated from pooled DNA. SNP allele frequencies estimated from pooled or individual samples are compared in panel A, and, in panel B, the absolute value of the difference between allele frequency estimates from pooled and individual samples is plotted as a function of the estimate of the relevant Welch statistic (τ).
Figure 7
Estimation of family contributions to pools. Estimates of family contributions to pools from pooled DNA samples (y-axis) are compared to family contributions to pools estimated from pedigree (x-axis, left hand panels) and to family contributions estimated from individual DNA samples (x-axis, right hand panels).
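
One way to picture the estimation of family contributions is as a mixing problem: the allele frequency of a pool at each SNP is a weighted average of the expected allele frequencies of the contributing families, and the weights are the contributions. The sketch below is an illustration under that assumption only. The solver (least squares with an exponentiated-gradient update, which keeps the weights positive and summing to 1) is a generic technique chosen for the example, not the estimator used in the paper, and the function name `estimate_contributions` and all numbers are hypothetical.

```python
import math

def estimate_contributions(family_freqs, pool_freqs, steps=2000, rate=0.5):
    """Estimate family contributions (non-negative, summing to 1).

    family_freqs[k][j]: expected allele frequency of family k at SNP j
    pool_freqs[j]: allele frequency estimated from the pooled DNA at SNP j
    """
    n_fam = len(family_freqs)
    n_snp = len(pool_freqs)
    contrib = [1.0 / n_fam] * n_fam  # start from equal contributions
    for _ in range(steps):
        # Pool frequencies predicted by the current contributions.
        pred = [sum(contrib[k] * family_freqs[k][j] for k in range(n_fam))
                for j in range(n_snp)]
        # Gradient of the mean squared error for each family's weight.
        grad = [2.0 / n_snp * sum((pred[j] - pool_freqs[j]) * family_freqs[k][j]
                                  for j in range(n_snp))
                for k in range(n_fam)]
        # Multiplicative update keeps weights positive; renormalise to sum to 1.
        contrib = [c * math.exp(-rate * g) for c, g in zip(contrib, grad)]
        total = sum(contrib)
        contrib = [c / total for c in contrib]
    return contrib

# Hypothetical expected allele frequencies for three families at six SNPs.
family_freqs = [
    [0.0, 0.5, 1.0, 0.25, 0.75, 0.5],
    [1.0, 0.5, 0.0, 0.75, 0.25, 0.0],
    [0.5, 0.0, 0.5, 1.0, 0.0, 1.0],
]
true_contrib = [0.5, 0.3, 0.2]
# Noise-free pool frequencies implied by the true contributions.
pool_freqs = [sum(c * f[j] for c, f in zip(true_contrib, family_freqs))
              for j in range(6)]
est = estimate_contributions(family_freqs, pool_freqs)
print([round(e, 3) for e in est])
```

With noise-free inputs the estimates should land close to the weights used to build the pool. Real pooled intensities carry measurement error, so the agreement would be looser, in line with the correlation of 0.85 reported in the abstract.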
