Bioinformatics. 2016 Jun 1;32(11):1749-51.
doi: 10.1093/bioinformatics/btw044. Epub 2016 Jan 30.

BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data

Vagheesh Narasimhan et al. Bioinformatics. 2016.

Abstract

Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
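As a rough illustration of the hidden Markov model idea described above, the following Python sketch decodes a sequence of heterozygous/homozygous calls into autozygous ("AZ") and non-autozygous ("HW") states with the Viterbi algorithm. It is a toy under stated assumptions, not the BCFtools/RoH model: the state names, the emission probabilities in `P_HET`, and the switch probability `P_SWITCH` are all hypothetical values chosen for illustration, and the abstract does not specify the published model's parameters.

```python
import math

# Hypothetical parameters, chosen only for illustration (not from the paper).
STATES = ("AZ", "HW")             # autozygous, non-autozygous
P_HET = {"AZ": 0.01, "HW": 0.30}  # assumed P(heterozygous call | state)
P_SWITCH = 0.05                   # assumed per-site probability of changing state


def viterbi_roh(is_het):
    """Return the most likely state path for a list of booleans (True = het site)."""
    if not is_het:
        return []
    log_stay, log_switch = math.log(1 - P_SWITCH), math.log(P_SWITCH)

    def log_emit(state, het):
        p = P_HET[state]
        return math.log(p if het else 1 - p)

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Initialise with a uniform prior over the two states.
    score = {s: math.log(0.5) + log_emit(s, is_het[0]) for s in STATES}
    back = []
    for het in is_het[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: score[r] + log_trans(r, s))
            pointers[s] = best_prev
            new_score[s] = score[best_prev] + log_trans(best_prev, s) + log_emit(s, het)
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    flank = [True, False, True, False, True]
    calls = flank + [False] * 40 + flank
    print("".join("A" if s == "AZ" else "-" for s in viterbi_roh(calls)))
```

Working in log space avoids numerical underflow on long chromosomes. A long run of homozygous calls outweighs the cost of two state switches, so the middle stretch is labelled autozygous, while isolated homozygous calls between heterozygous sites are not.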

Availability and implementation: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools

Contact: vn2@sanger.ac.uk or pd3@sanger.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Comparison of error rates of BCFtools/RoH and other existing methods, and performance on real data. (A) Performance on simulated data. False positive rates (FPR) and false negative rates (FNR) in data simulated with varying levels of autozygosity and SNP calling error, analyzed using three different detection methods. (B) Performance on real data. We compare the inbreeding coefficient F, estimated either by our method as the percentage of the genome that is autozygous, or as the deviation from Hardy-Weinberg equilibrium (HWE) estimated across all sites, for 31 CEU individuals. (C) Example autozygous segments in simulated data (green) and detected by our method (red). For each chromosome, the y-axis shows the normalized density of heterozygous sites in bins of 0.1 Mb. The x-axis shows the position along the chromosome (in units of 1e8 bp). The overlapping red and green sections show that the regions identified as autozygous using our HMM approach accurately reflect the true length and location of autozygous sections in the simulated data.
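Two quantities in the caption can be sketched in a few lines of Python: F as the fraction of the genome covered by autozygous segments, and the density of heterozygous sites in 0.1 Mb bins. This is a minimal sketch, not the authors' code. The function names, the half-open `(start, end)` segment convention, and the choice to normalize by the densest bin are assumptions made here for illustration; the caption does not say how the density was normalized.

```python
from collections import Counter

BIN_SIZE = 100_000  # 0.1 Mb, the bin width given in the caption


def autozygous_fraction(segments, genome_length):
    """Fraction of genome_length covered by half-open (start, end) segments.

    Overlapping segments are counted once. Coordinates are assumed non-negative.
    """
    covered, current_end = 0, 0
    for start, end in sorted(segments):
        start = max(start, current_end)  # skip any part already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


def het_density(het_positions, bin_size=BIN_SIZE):
    """Count het sites per bin, scaled so the densest bin equals 1.0 (assumed normalization)."""
    counts = Counter(pos // bin_size for pos in het_positions)
    if not counts:
        return {}
    peak = max(counts.values())
    return {b: n / peak for b, n in sorted(counts.items())}


if __name__ == "__main__":
    print(autozygous_fraction([(0, 10), (5, 20)], 100))
    print(het_density([50, 150_000, 160_000]))
```

Multiplying the returned fraction by 100 gives the percentage form used in panel (B).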
