Artificial intelligence and database for NGS-based diagnosis in rare disease

Front Genet. 2024 Jan 25:14:1258083. doi: 10.3389/fgene.2023.1258083. eCollection 2023.

Abstract

Rare diseases (RDs) are complex genetic diseases that, by a conservative estimate, affect 300 million people worldwide. Recent Next-Generation Sequencing (NGS) studies are unraveling the underlying genetic heterogeneity of this group of diseases, and NGS-based methods have improved the diagnosis and management of RDs. Concomitantly, a suite of bioinformatics tools has been developed to sort through the big data generated by NGS and to understand RDs better. However, there are concerns about the lack of consistency among different methods, primarily linked to the lack of uniformity in input and output formats, the absence of a standardized measure of predictive accuracy, and differences in how regularly annotation databases are updated. Today, artificial intelligence (AI), particularly deep learning, is widely used in a variety of biological contexts and is changing the healthcare system. AI has demonstrated promising capabilities in boosting variant calling precision, refining variant prediction, and enhancing the user-friendliness of electronic health record (EHR) systems in NGS-based diagnostics. This paper reviews the state of the art of AI in NGS-based genetics, along with its future directions and challenges. It also compares several rare disease databases.

Keywords: artificial intelligence; data science; diagnosis; machine learning; next-generation sequencing; rare disease.

Publication types

  • Review

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was sponsored by ASPIRE, the technology program management pillar of Abu Dhabi’s Advanced Technology Research Council (ATRC), via the ASPIRE Precision Medicine Research Institute Abu Dhabi (ASPIREPMRIAD) award grant number VRI-20‐10. The United Arab Emirates University also supported this work through the Research Start‐up Program (Grant # 12M109).