Designing sensitive viral diagnostics with machine learning

Nat Biotechnol. 2022 Jul;40(7):1123-1131. doi: 10.1038/s41587-022-01213-5. Epub 2022 Mar 3.

Abstract

Design of nucleic acid-based viral diagnostics typically follows heuristic rules and, to contend with viral variation, focuses on a genome's conserved regions. A design process could, instead, directly optimize diagnostic effectiveness using a learned model of sensitivity for targets and their variants. Toward that goal, we screen 19,209 diagnostic-target pairs, concentrated on CRISPR-based diagnostics, and train a deep neural network to accurately predict diagnostic readout. We join this model with combinatorial optimization to maximize sensitivity over the full spectrum of a virus's genomic variation. We introduce Activity-informed Design with All-inclusive Patrolling of Targets (ADAPT), a system for automated design, and use it to design diagnostics for 1,933 vertebrate-infecting viral species within 2 hours for most species and within 24 hours for all but three. We experimentally show that ADAPT's designs are sensitive and specific to the lineage level and permit lower limits of detection, across a virus's variation, than the outputs of standard design techniques. Our strategy could facilitate a proactive resource of assays for detecting pathogens.
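The idea of joining a predictive model with combinatorial optimization can be illustrated with a minimal sketch. This is not ADAPT's implementation. It assumes a toy objective (the mean, over known variants, of the best predicted activity among the chosen probes) and a simple greedy search. The function `toy_activity` is a hypothetical stand-in for a learned model that only counts mismatches, and all names and sequences here are invented for illustration.

```python
from typing import Callable, List, Sequence


def toy_activity(probe: str, target: str) -> float:
    """Hypothetical stand-in for a learned activity model (assumption):
    predicted activity drops by 0.25 per mismatch, floored at 0."""
    mismatches = sum(a != b for a, b in zip(probe, target))
    return max(0.0, 1.0 - 0.25 * mismatches)


def expected_activity(
    probes: Sequence[str],
    variants: Sequence[str],
    activity: Callable[[str, str], float] = toy_activity,
) -> float:
    """Mean over variants of the best predicted activity among the probes."""
    if not probes:
        return 0.0
    total = sum(max(activity(p, v) for p in probes) for v in variants)
    return total / len(variants)


def greedy_design(
    candidates: Sequence[str],
    variants: Sequence[str],
    max_probes: int,
    activity: Callable[[str, str], float] = toy_activity,
) -> List[str]:
    """Greedily add the candidate with the largest marginal gain in
    expected activity, stopping when no candidate improves the objective."""
    chosen: List[str] = []
    for _ in range(max_probes):
        current = expected_activity(chosen, variants, activity)
        best, best_gain = None, 0.0
        for cand in candidates:
            if cand in chosen:
                continue
            gain = expected_activity(chosen + [cand], variants, activity) - current
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:
            break
        chosen.append(best)
    return chosen


# Toy target-site variants (invented); the dominant variant appears three times.
variants = ["ACGT", "ACGT", "ACGT", "ACGA", "TTGA"]
candidates = ["ACGT", "ACGA", "TTGA"]
design = greedy_design(candidates, variants, max_probes=2)
```

In this toy setting the second probe is chosen for the variant that the first probe covers worst, rather than for the variant closest to the consensus, which is the behavior an objective defined over the full spectrum of variation is meant to encourage.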

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Machine Learning*
  • Neural Networks, Computer
  • Nucleic Acids*

Substances

  • Nucleic Acids