Am J Hum Genet. 2018 Jun 7;102(6):1185-1194.
doi: 10.1016/j.ajhg.2018.03.021. Epub 2018 May 10.

Estimation of Genetic Correlation via Linkage Disequilibrium Score Regression and Genomic Restricted Maximum Likelihood

Guiyan Ni et al. Am J Hum Genet. 2018.

Abstract

Genetic correlation is a key population parameter that describes the shared genetic architecture of complex traits and diseases. It can be estimated by current state-of-the-art methods, i.e., linkage disequilibrium score regression (LDSC) and genomic restricted maximum likelihood (GREML). The massively reduced computing burden of LDSC compared to GREML makes it an attractive tool, although the accuracy (i.e., magnitude of standard errors) of LDSC estimates has not been thoroughly studied. In simulation, we show that the accuracy of GREML is generally higher than that of LDSC. When there is genetic heterogeneity between the actual sample and the reference data from which LD scores are estimated, the accuracy of LDSC decreases further. In real data analyses estimating the genetic correlation between schizophrenia (SCZ) and body mass index, we show that GREML estimates based on ∼150,000 individuals give a higher accuracy than LDSC estimates based on ∼400,000 individuals (from combined meta-data). A GREML genomic partitioning analysis reveals that the genetic correlation between SCZ and height is significantly negative for regulatory regions, which whole-genome or LDSC approaches have less power to detect. We conclude that LDSC estimates should be carefully interpreted, as there can be uncertainty about homogeneity among combined meta-datasets. We suggest that any interesting findings from massive LDSC analyses of a large number of complex traits should be followed up, where possible, with more detailed analyses using GREML methods, even if sample sizes are smaller.
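
For orientation, the quantity being estimated and the regression that LDSC fits can be written out as below. This is the standard textbook form rather than notation taken from the abstract, so every symbol is an assumption of this sketch: \sigma_{g_{12}} is the genetic covariance between the two traits, \sigma^2_{g_1} and \sigma^2_{g_2} are their genetic variances, z_{1j} and z_{2j} are the association z-scores of SNP j for each trait, \ell_j is its LD score, M is the number of SNPs, N_1 and N_2 are the sample sizes, N_s is the number of overlapping individuals, and \rho is the phenotypic correlation among those individuals.

```latex
% Genetic correlation between trait 1 and trait 2
r_g = \frac{\sigma_{g_{12}}}{\sqrt{\sigma^2_{g_1}\,\sigma^2_{g_2}}}

% Cross-trait LD score regression: the slope carries the genetic covariance,
% the intercept absorbs sample overlap
\mathbb{E}\left[ z_{1j}\, z_{2j} \right]
  = \frac{\sqrt{N_1 N_2}\;\sigma_{g_{12}}}{M}\,\ell_j
  + \frac{\rho\, N_s}{\sqrt{N_1 N_2}}
```

GREML estimates the same variance and covariance components directly from individual-level genotypes, whereas LDSC recovers them from summary statistics and LD scores computed in a reference sample. That difference is the reason a mismatch between the study sample and the reference can matter for LDSC.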

Keywords: SNP heritability; accuracy; biasedness; body mass index; genetic correlation; genome-wide SNPs; genomic restricted maximum likelihood; height; linkage disequilibrium score regression; schizophrenia.

Figures

Figure 1
The Ratio of SE of LDSC Estimate to that of GREML Estimate using Simulated Phenotypes Based on UK Biobank Genotypes. Bars are 95% CI based on 100 replicates. The unit for the number of SNPs is thousands. This result was based on 858K SNPs (after QC) and 10,000 individuals randomly selected from UK Biobank. SNPs in each bin were randomly and independently drawn from the 858K SNPs. In each bin, 10,000 causal SNPs were randomly selected. The true simulated value for the genetic correlation was 0.6 and that for the heritability was 0.5 for both traits. Overlap (0%, 10%, and 20%) stands for the percentage of individuals shared between the first and second traits.
Figure 2
Estimated Genetic Correlation with GREML and LDSC (without Constraining the Intercept) Based on Different Genetic Datasets. Simulation was based on 10,000 individuals randomly selected from UKBB, WTCCC2, GERA, and UKBBr (the raw genotypes of UKBB), with 858K, 432K, 239K, and 124K SNPs, respectively. Bars are 95% CI based on 100 replicates. Overlap (0%, 10%, and 20%) stands for the percentage of individuals shared between the first and second traits. The gray dashed line marks the true simulated genetic correlation of 0.6.
Figure 3
Estimated Genetic Correlation of Simulated Data Based on a Genomic Partitioning Model. Simulation was based on 10,000 individuals randomly selected from UKBB with 858K SNPs. Based on Gusev et al., the 858K SNPs across the genome were stratified into two categories: DHS (194K SNPs with 2,268 causal SNPs) and non-DHS (664K SNPs with 7,732 causal SNPs). The genetic correlation for the simulated phenotypes between the first and second traits was 0.6 and −0.6 in the DHS and non-DHS regions, respectively. Bars are 95% CI based on 100 replicates. LDSC-CEU: using LD scores estimated from 1KG reference data. LDSC-OWN: using LD scores estimated from UKBB. sLDSC-CEU: using stratified LD scores estimated from 1KG reference data. sLDSC-OWN: using stratified LD scores estimated from UKBB. The presented results were based on 0% overlapping samples between the first and second traits; those based on other scenarios (e.g., 10% and 20%) are presented in Table S1.
Figure 4
Genetic Correlation between SCZ and Height and Heritability Based on SNPs in Partitioned Genomic Regions Estimated with GREML. A joint model was applied by fitting four genomic relationship matrices simultaneously, each estimated from the set of SNPs belonging to one of the functional categories (regulatory, intronic, intergenic, and DHS). The bars are standard errors. The p values for testing whether the estimate differs from 0 were 0.0028, 0.52, 0.91, and 0.67 for the regulatory, intronic, intergenic, and DHS regions, respectively.
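
The simulation settings quoted in the captions for Figures 1 to 3 (a true genetic correlation of 0.6 and a heritability of 0.5 for both traits) can be illustrated with a small toy example. The sketch below is not the authors' pipeline: it assumes independent SNPs with made-up allele frequencies instead of real UK Biobank genotypes, uses far fewer individuals and causal SNPs than the paper, and ignores sample overlap. The names `simulate_two_traits`, `pearson`, and `variance` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance(x):
    """Population variance of a sequence."""
    m = sum(x) / len(x)
    return sum((a - m) ** 2 for a in x) / len(x)


def simulate_two_traits(n_ind=500, n_causal=500, rg=0.6, h2=0.5, seed=1):
    """Return [(g1, y1), (g2, y2)]: genetic values and phenotypes per trait."""
    rng = random.Random(seed)
    # Toy genotypes: independent SNPs coded 0/1/2 (assumption: no LD, not real data).
    freqs = [rng.uniform(0.1, 0.5) for _ in range(n_causal)]
    geno = [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n_ind)]
    # Standardize each SNP column to mean 0 and variance 1.
    for j in range(n_causal):
        col = [row[j] for row in geno]
        mean = sum(col) / n_ind
        sd = math.sqrt(variance(col)) or 1.0  # guard against a monomorphic SNP
        for row in geno:
            row[j] = (row[j] - mean) / sd
    # Causal effects for the two traits, drawn with expected correlation rg.
    b1 = [rng.gauss(0, 1) for _ in range(n_causal)]
    b2 = [rg * b + math.sqrt(1 - rg ** 2) * rng.gauss(0, 1) for b in b1]
    traits = []
    for effects in (b1, b2):
        g = [sum(z * b for z, b in zip(row, effects)) for row in geno]
        # Rescale genetic values so that their variance equals h2.
        scale = math.sqrt(h2 / variance(g))
        g = [v * scale for v in g]
        # Residuals with variance 1 - h2 give an expected heritability of h2.
        y = [v + rng.gauss(0, math.sqrt(1 - h2)) for v in g]
        traits.append((g, y))
    return traits
```

Calling `simulate_two_traits()` returns a (genetic values, phenotypes) pair for each trait. In any single replicate, the realized correlation between the two sets of genetic values fluctuates around `rg`, which is one reason the figures summarize 100 replicates rather than a single run.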
