Genet Med. 2018 Aug;20(8):855-866.
doi: 10.1038/gim.2017.192. Epub 2017 Nov 16.

Characterizing Reduced Coverage Regions Through Comparison of Exome and Genome Sequencing Data Across 10 Centers

Free PMC article

Rashesh V Sanghvi et al. Genet Med.

Abstract

Purpose: As massively parallel sequencing is increasingly being used for clinical decision making, it has become critical to understand parameters that affect sequencing quality and to establish methods for measuring and reporting clinical sequencing standards. In this report, we propose a definition for reduced coverage regions and describe a set of standards for variant calling in clinical sequencing applications.

Methods: To enable sequencing centers to assess the regions of poor sequencing quality in their own data, we optimized and used a tool (ExCID) to identify reduced coverage loci within genes or regions of particular interest. We used this framework to examine sequencing data from 500 patients generated in 10 projects at sequencing centers in the National Human Genome Research Institute/National Cancer Institute Clinical Sequencing Exploratory Research Consortium.

Results: This approach identified reduced coverage regions in clinically relevant genes, including known clinically relevant loci that were uniquely missed at individual centers, in multiple centers, and in all centers.

Conclusion: This report provides a process road map for clinical sequencing centers looking to perform similar analyses on their data.

Keywords: clinical sequencing; exome; genome; sequencing standards.

Conflict of interest statement

Financial Disclosures/Conflicts of Interest: L.G.B. is an uncompensated consultant for Illumina, receives royalties from Genentech, and honoraria from Wiley Blackwell. L.A.G. is a consultant for Foundation Medicine, Novartis, Boehringer Ingelheim, Third Rock; an equity holder in Foundation Medicine; and a member of the Scientific Advisory Board at Warp Drive. L.A.G. receives sponsored research support from Novartis, Astellas, BMS, and Merck. N.W. is a consultant for Novartis; is an equity holder in Foundation Medicine; and receives sponsored research support from Novartis, Genentech, and Merck.

Figures

Figure 1. Reduced coverage regions in the GeneTests List.
(A) Comparison of reduced coverage bases among all centers. (B) Comparison of GeneTests exons affected by the reduced coverage bases among all centers. (C) Comparison of GeneTests exons affected by the reduced coverage bases among all centers. (D) Pairwise comparisons of reduced coverage regions between any two centers. Absolute values represent the reduced coverage bases common to two centers. Percentages represent the overlap in reduced coverage bases between two centers relative to the union of reduced coverage bases at the two centers. High correlation existed between the two GS centers (I and J) and among the ES centers using the same capture design. (E) Disease-associated ClinVar variants overlapping the reduced coverage bases in each center.
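The pairwise comparison in panel (D) reports a shared base count and that count as a percentage of the union of the two centers' reduced coverage bases. A minimal sketch of that calculation follows. The center labels, the toy coordinates, and the function name `overlap_stats` are assumptions made for illustration and are not taken from the paper or from ExCID.

```python
from itertools import combinations


def overlap_stats(bases_a, bases_b):
    """Return (shared base count, shared bases as a percent of the union)."""
    shared = len(bases_a & bases_b)
    union = len(bases_a | bases_b)
    percent = 100.0 * shared / union if union else 0.0
    return shared, percent


# Hypothetical toy data: each reduced coverage base is a (chromosome, position) pair.
centers = {
    "A": {("chr1", 100), ("chr1", 101), ("chr1", 102), ("chr2", 50)},
    "B": {("chr1", 101), ("chr1", 102), ("chr2", 51)},
    "C": {("chr1", 102)},
}

# One entry per unordered pair of centers, as in a pairwise comparison matrix.
pairwise = {
    (x, y): overlap_stats(centers[x], centers[y])
    for x, y in combinations(sorted(centers), 2)
}

for pair, (shared, percent) in pairwise.items():
    print(pair, shared, f"{percent:.1f}%")
```

Representing bases as sets keeps the sketch short; for real exome-scale data an interval-based representation would likely be more practical.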
Figure 2. Reduced coverage bases in GeneTests.
(A) Comparison of reduced coverage regions among eight ES centers. (B) The percent of total bases, total whole exons, and total whole genes among the 4,656 genes on the GeneTests list successfully covered at one or more centers. To be included, every base in an exon or gene must have been a usable base (coverage ≥20X, mapping quality ≥20, base quality ≥20).
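The usable-base criterion in panel (B) can be sketched as a per-position filter followed by a merge of adjacent failing positions into intervals. This is an illustrative sketch only, not ExCID's implementation. It assumes that depth is counted from read bases that individually meet both quality thresholds, and the `Observation` record, the `pileup` dictionary, and the function names are hypothetical.

```python
from dataclasses import dataclass

MIN_DEPTH = 20   # coverage threshold from the caption (>=20X)
MIN_MAPQ = 20    # mapping quality threshold from the caption
MIN_BASEQ = 20   # base quality threshold from the caption


@dataclass
class Observation:
    """One read base aligned over a position (hypothetical record type)."""
    mapq: int
    baseq: int


def usable_depth(observations):
    """Count read bases that pass both quality thresholds (an assumption)."""
    return sum(
        1 for o in observations
        if o.mapq >= MIN_MAPQ and o.baseq >= MIN_BASEQ
    )


def reduced_coverage_intervals(pileup, start, end):
    """Return half-open (s, e) intervals in [start, end) with usable depth < MIN_DEPTH.

    pileup maps position -> list of Observation; missing positions have depth 0.
    """
    intervals = []
    run_start = None
    for pos in range(start, end):
        low = usable_depth(pileup.get(pos, [])) < MIN_DEPTH
        if low and run_start is None:
            run_start = pos                      # open a new low-coverage run
        elif not low and run_start is not None:
            intervals.append((run_start, pos))   # close the current run
            run_start = None
    if run_start is not None:
        intervals.append((run_start, end))
    return intervals


def fully_usable(pileup, start, end):
    """True only if every base in [start, end) is a usable base."""
    return not reduced_coverage_intervals(pileup, start, end)


# Hypothetical toy pileup over positions 100..105.
good = [Observation(60, 30)] * 25       # 25 high-quality read bases
low_mapq = [Observation(10, 30)] * 25   # deep, but every read fails mapping quality
thin = [Observation(60, 30)] * 5        # good quality, but only 5X
pileup = {100: good, 101: good, 102: low_mapq, 103: thin, 104: good}

print(reduced_coverage_intervals(pileup, 100, 106))
print(fully_usable(pileup, 100, 102))
```

Position 102 shows why raw depth alone is not enough under this assumption: it has 25 reads, yet none count toward usable depth.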
Figure 3. Analysis of Reduced Coverage Regions Common to All Centers
(A) Overall, there were 735 missing intervals totaling 66.4 Kbp in the intersection of exome and genome centers. Forty-two percent of all missing intervals were 5 bp or shorter. The lengths of the remaining missing intervals ranged widely and occurred less frequently, but they accounted for 65.7 Kbp or 97.7% of the total length of all missing intervals combined. (B) Of the >67K exons accounting for 4,656 genes in GeneTests, 533 had reduced coverage regions. These regions fell into two distinct groups: either a small part (<20%, with the vast majority less than 10%) of the entire exon had reduced coverage, or most of an exon (>90%) had reduced coverage. (C) Comparison of GC% distribution between the GeneTests baseline and the reduced coverage regions in all centers.
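Panels (A) and (B) rest on two simple summaries: how short intervals can dominate by count while long intervals dominate by total length, and what fraction of an exon falls in reduced coverage. The sketch below illustrates both on toy data. The interval coordinates and function names are hypothetical, the intervals are assumed to be non-empty, half-open, and non-overlapping, and nothing here reproduces the paper's actual numbers.

```python
def length_summary(intervals, short_max=5):
    """Return (share of intervals that are short, share of total length in longer ones).

    intervals is a non-empty list of half-open (start, end) pairs.
    """
    lengths = [end - start for start, end in intervals]
    total = sum(lengths)
    short = [n for n in lengths if n <= short_max]
    count_share = len(short) / len(lengths)
    long_length_share = (total - sum(short)) / total
    return count_share, long_length_share


def exon_fraction_reduced(exon, intervals):
    """Fraction of an exon's bases lying in reduced coverage intervals.

    Assumes the intervals do not overlap one another.
    """
    exon_start, exon_end = exon
    covered = sum(
        max(0, min(end, exon_end) - max(start, exon_start))
        for start, end in intervals
    )
    return covered / (exon_end - exon_start)


# Hypothetical toy intervals: three short ones and one long one.
intervals = [(10, 12), (20, 23), (100, 195), (300, 305)]

count_share, long_length_share = length_summary(intervals)
print(f"{count_share:.0%} of intervals are <=5 bp")
print(f"{long_length_share:.1%} of total length is in longer intervals")

# An exon mostly in reduced coverage versus one only slightly affected.
print(exon_fraction_reduced((100, 200), intervals))
print(exon_fraction_reduced((0, 50), intervals))
```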
