A detailed characteristics of bias associated with long runs of homozygosity identification based on medium density SNP microarrays

J Genomics. 2020 Apr 7;8:43-48. doi: 10.7150/jgen.39147. eCollection 2020.


In the present study, runs of homozygosity (ROH) detected with a standard bovine 54k single nucleotide polymorphism (SNP) genotyping assay and two ROH detection approaches, based on 50 (M1) or 15 (M2) consecutive SNPs, were compared with the results of whole genome sequencing. Both microarray-based methods accurately recognised medium-sized ROH. The M2 method appeared to identify short ROH better than M1 but greatly overestimated their number, producing numerous false positive calls. Moreover, long ROH identified with microarray data tended to break into shorter segments in sequencing data because of regions of high heterozygosity within the ROH sequences. This may indicate that these long ROH are formed by closely positioned shorter homozygous segments that may be of older origin, or that they are created by two similar but not identical haplotypes showing minor signs of internal recombination. This finding also suggests that at least some results of previous studies regarding long ROH may be biased, leading to inaccurate estimates of genome autozygosity when ROH are classified into length categories.
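
To illustrate how a minimum-SNP threshold such as those of M1 and M2 shapes ROH calls, the following is a minimal sketch rather than the detection procedure used in the study. It assumes genotypes coded as 0 or 2 (homozygous) and 1 (heterozygous), allows no heterozygous or missing calls inside a run, and uses a hypothetical function name (`find_roh`) with toy marker positions and toy thresholds far below 50 or 15.

```python
from typing import List, Tuple


def find_roh(genotypes: List[int], positions: List[int],
             min_snps: int) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for every run of consecutive
    homozygous SNPs that contains at least `min_snps` markers.

    Assumed coding: 0 or 2 = homozygous, 1 = heterozygous.
    Any heterozygous call ends the current run (no tolerance).
    """
    runs = []
    run_start = None  # index of the first SNP in the current run
    for i, g in enumerate(genotypes):
        if g != 1:
            if run_start is None:
                run_start = i
        else:
            # A heterozygous SNP closes the run that ended at i - 1.
            if run_start is not None and i - run_start >= min_snps:
                runs.append((positions[run_start], positions[i - 1],
                             i - run_start))
            run_start = None
    # Close a run that extends to the last marker.
    n = len(genotypes)
    if run_start is not None and n - run_start >= min_snps:
        runs.append((positions[run_start], positions[n - 1], n - run_start))
    return runs


# Toy data: 15 markers at a hypothetical 50 kb spacing.
toy_genotypes = [0] * 6 + [1] + [2] * 3 + [1] + [0] * 4
toy_positions = [i * 50_000 for i in range(len(toy_genotypes))]

strict = find_roh(toy_genotypes, toy_positions, min_snps=5)
lenient = find_roh(toy_genotypes, toy_positions, min_snps=3)
print(len(strict), len(lenient))  # prints "1 3"
```

In this toy case the stricter threshold reports one segment while the lenient one reports three, since a lower threshold admits more short segments. Whether such extra calls are true or false positives can only be judged against denser data, as the comparison with sequencing does here. The sketch also shows how a single heterozygous marker splits a run, which is the mechanism by which long array-based ROH fragment once sequencing reveals internal heterozygous sites.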

Keywords: autozygosity; microarray; next generation sequencing; runs of homozygosity.