BMC Genet. 2015 Mar 12:16:24.
doi: 10.1186/s12863-015-0185-0.

The effect of rare alleles on estimated genomic relationships from whole genome sequence data


Sonia E Eynard et al. BMC Genet. 2015.

Abstract

Background: Relationships between individuals and inbreeding coefficients are commonly used for breeding decisions, but may be affected by the type of data used for their estimation. The proportion of variants with low Minor Allele Frequency (MAF) is larger in whole genome sequence (WGS) data than on Single Nucleotide Polymorphism (SNP) chips. Therefore, WGS data are expected to provide estimates closer to the true relationships between individuals, which may influence breeding decisions and prioritisation for conservation of genetic diversity in livestock. This study identifies differences between relationships and inbreeding coefficients estimated using pedigree, SNP or WGS data for 118 Holstein bulls from the 1000 Bull genomes project. To determine the impact of rare alleles on the estimates, we compared three scenarios of MAF restrictions: variants with a MAF higher than 5%, variants with a MAF higher than 1% and variants with a MAF between 1% and 5%.
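The three MAF restriction scenarios can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the scenario labels, the 0/1/2 genotype coding and the handling of the boundary values (inclusive lower bound, exclusive upper bound) are all assumptions made for illustration.

```python
def minor_allele_frequency(genotypes):
    """MAF of one variant, given genotypes coded as 0, 1 or 2 allele copies."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


# Hypothetical scenario labels mapped to (lower bound, upper bound) on MAF.
# None means no upper restriction. Boundary handling is an assumption.
SCENARIOS = {
    "maf_ge_5": (0.05, None),
    "maf_ge_1": (0.01, None),
    "maf_1_to_5": (0.01, 0.05),
}


def select_variants(variants, scenario):
    """Return the indices of variants whose MAF falls inside the scenario."""
    low, high = SCENARIOS[scenario]
    kept = []
    for index, genotypes in enumerate(variants):
        maf = minor_allele_frequency(genotypes)
        if maf >= low and (high is None or maf < high):
            kept.append(index)
    return kept


# Toy data: 4 variants genotyped on 20 individuals (made-up values).
variants = [
    [1] * 10 + [0] * 10,   # MAF 0.25, common
    [1] + [0] * 19,        # MAF 0.025, rare
    [0] * 20,              # monomorphic, never selected
    [1] + [2] * 19,        # MAF 0.025, rare (counted allele is the major one)
]

for name in SCENARIOS:
    print(name, select_variants(variants, name))
```

The rare variants only enter the analysis in the second and third scenarios, which is what allows their effect on the estimates to be isolated.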

Results: We observed significant differences between estimated relationships and, although less significantly, inbreeding coefficients from pedigree, SNP or WGS data, and between MAF restriction scenarios. Computed correlations between pedigree and genomic relationships, within groups with similar relationships, ranged from negative to moderate for both estimated relationships and inbreeding coefficients, but were high between estimates from SNP and WGS (0.49 to 0.99). Estimated relationships from genomic information exhibited higher variation than those from pedigree. Analysis of inbreeding coefficients showed that more complete pedigree records led to higher correlations between inbreeding coefficients from pedigree and genomic data. Finally, estimates in the genomic (G) relationship matrix and its correlations with the additive genetic (A) relationship matrix were lower, and variances of the relationships were larger, when accounting for allele frequencies than without accounting for allele frequencies.
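The contrast between estimates that account for allele frequencies and those that do not can be sketched as follows. The first function uses the estimator of Yang et al. (2010), as implemented in GCTA, with allele frequencies taken from the sample itself. The second is a plain identity-by-state similarity, which is an assumption standing in for the similarity-based method of the paper and may differ from what the authors computed. Function names and toy genotypes are hypothetical.

```python
def allele_frequencies(genotypes):
    """Per-variant frequency of the counted allele.

    genotypes: one list per individual, each holding 0/1/2 counts per variant.
    Frequencies are estimated from the sample itself (an assumption).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]


def yang_relationship(genotypes, j, k):
    """Relationship between individuals j and k, weighted by allele frequency."""
    total, used = 0.0, 0
    for i, p in enumerate(allele_frequencies(genotypes)):
        het = 2 * p * (1 - p)
        if het == 0:
            continue  # monomorphic variants carry no information
        x_j, x_k = genotypes[j][i], genotypes[k][i]
        if j == k:
            # Diagonal element: 1 + inbreeding estimate
            total += 1 + (x_j ** 2 - (1 + 2 * p) * x_j + 2 * p ** 2) / het
        else:
            total += (x_j - 2 * p) * (x_k - 2 * p) / het
        used += 1
    return total / used


def similarity_relationship(genotypes, j, k):
    """Mean identity-by-state similarity, ignoring allele frequencies."""
    m = len(genotypes[0])
    shared = sum(1 - abs(genotypes[j][i] - genotypes[k][i]) / 2 for i in range(m))
    return shared / m


# Toy data: 4 individuals, 3 variants (made-up values).
genotypes = [
    [0, 1, 2],
    [0, 1, 2],
    [2, 1, 0],
    [1, 1, 1],
]

print(yang_relationship(genotypes, 0, 1), similarity_relationship(genotypes, 0, 1))
print(yang_relationship(genotypes, 0, 2), similarity_relationship(genotypes, 0, 2))
```

Dividing by 2p(1 - p) gives rare variants a large weight, so the frequency-based estimator spreads values over a wider range than the similarity, which is bounded between 0 and 1.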

Conclusions: Relationship and inbreeding coefficient estimates differed significantly between pedigree data and genomic information, and depending on whether variants with a MAF below 5% were included or excluded. Estimated relationships and inbreeding coefficients are the basis for selection decisions. Therefore, it can be expected that using WGS instead of SNP data can affect selection decisions. Inclusion of rare variants will give access to the variation they carry, which is of interest for conservation of genetic diversity.


Figures

Figure 1
Distribution plot of the number of variants per class of MAF. Histograms of the number of segregating variants in each Minor Allele Frequency category (116 bins) from 1% to 50%, with density curve. The histogram on the left represents the distribution of variants from the Bovine 50 K SNP chip. The histogram on the right represents the distribution of variants from whole genome sequence (WGS) data.
Figure 2
Linear regression plots for A, SNP and WGS against each other (Yang method). Plots of linear regressions of A estimated relationships from pedigree (A ped), G estimated relationships for Single Nucleotide Polymorphism (G SNP) and whole genome sequence (G WGS) data using the Yang method. Each linear regression was performed for the scenarios with Minor Allele Frequency (MAF) ≥ 5% (5+), ≥ 1% (1+) and between 1% and 5% (1_5). The first row represents the plots for scenario 5+, the second for 1+ and the third for 1_5. The first column shows the linear regression plots of G SNP on A ped. The second column shows the linear regression plots of G WGS on A ped. The third column shows the linear regression plots of G WGS on G SNP. The black line is the regression line for an exact linear model (intercept = 0, slope = 1) and the red line is the actual overall regression line. The overall correlation coefficient for each linear regression appears in the top left corner.
Figure 3
Linear regression plots for A, SNP and WGS against each other (based on similarities). Plots of linear regressions of A estimated relationships from pedigree (A ped), G estimated relationships for Single Nucleotide Polymorphism (G SNP) and whole genome sequence (G WGS) data, based on similarities. Each linear regression was performed for the scenarios with Minor Allele Frequency (MAF) ≥ 5% (5+), ≥ 1% (1+) and between 1% and 5% (1_5). The first row represents the plots for scenario 5+, the second for 1+ and the third for 1_5. The first column shows the linear regression plots of G SNP on A ped. The second column shows the linear regression plots of G WGS on A ped. The third column shows the linear regression plots of G WGS on G SNP. The black line is the regression line for an exact linear model (intercept = 0, slope = 1) and the red line is the actual overall regression line. The overall correlation coefficient for each linear regression appears in the top left corner.
