Motivation: Next generation sequencing technology considerably changed the way we screen for pathogenic mutations in rare Mendelian disorders. However, the identification of the disease-causing mutation amongst thousands of variants of partly unknown relevance is still challenging and efficient techniques that reduce the genomic search space play a decisive role. Often segregation- or linkage analysis are used to prioritize candidates, however, these approaches require correct information about the degree of relationship among the sequenced samples. For quality assurance an automated control of pedigree structures and sample assignment is therefore highly desirable in order to detect label mix-ups that might otherwise corrupt downstream analysis.
Results: We developed an algorithm based on likelihood ratios that discriminates between different classes of relationship for an arbitrary number of genotyped samples. By identifying the most likely class we are able to reconstruct entire pedigrees iteratively, even for highly consanguineous families. We tested our approach on exome data of different sequencing studies and achieved high precision for all pedigree predictions. By analyzing the precision for varying degrees of relatedness or inbreeding we could show that a prediction is robust down to magnitudes of a few hundred loci.
Availability and implementation: A java standalone application that computes the relationships between multiple samples as well as a Rscript that visualizes the pedigree information is available for download as well as a web service at www.gene-talk.de CONTACT: firstname.lastname@example.orgSupplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.