It's challenging work to identify disease-causing genes from the next-generation sequencing (NGS) data of patients with Mendelian disorders. To improve this situation, researchers have developed many phenotype-driven gene prioritization methods using a patient's genotype and phenotype information, or phenotype information only as input to rank the candidate's pathogenic genes. Evaluations of these ranking methods provide practitioners with convenience for choosing an appropriate tool for their workflows, but retrospective benchmarks are underpowered to provide statistically significant results in their attempt to differentiate. In this research, the performance of ten recognized causal-gene prioritization methods was benchmarked using 305 cases from the Deciphering Developmental Disorders (DDD) project and 209 in-house cases via a relatively unbiased methodology. The evaluation results show that methods using Human Phenotype Ontology (HPO) terms and Variant Call Format (VCF) files as input achieved better overall performance than those using phenotypic data alone. Besides, LIRICAL and AMELIE, two of the best methods in our benchmark experiments, complement each other in cases with the causal genes ranked highly, suggesting a possible integrative approach to further enhance the diagnostic efficiency. Our benchmarking provides valuable reference information to the computer-assisted rapid diagnosis in Mendelian diseases and sheds some light on the potential direction of future improvement on disease-causing gene prioritization methods.
Keywords: HPO; Mendelian diseases; benchmarking; gene prioritization.
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org.