Background: Identification of antibiotic resistance genes (ARGs) from environmental samples is a critical sub-domain of gene discovery that is directly connected to human health. It has drawn extraordinary attention in recent years, as antibiotic resistance is regarded as a severe threat to human health by many institutions around the world. To meet the need for efficient ARG discovery, a series of online ARG databases have been published. This article presents an in-depth analysis of CARD, one of the most widely used ARG databases.
Results: The decision model of CARD is based on the alignment score with a single ARG type. We identify cases in which the model is likely to make false predictions, and then propose an optimization method that builds on the current CARD model. The optimization is expected to increase consistency with BLAST homology relationships and to improve confidence in ARG identification using the database.
Conclusions: The absence of a publicly recognized benchmark makes it challenging to evaluate the performance of ARG identification. However, likely mispredictions, and methods for resolving them, can be inferred by computational analysis of the identification method and the underlying reference sequences. We hope our work offers insight into the task of precise ARG type classification.
Keywords: Antibiotic resistance gene; CARD database; RND efflux pumps.