Using MS-FINDER for identifying 19 natural products in the CASMI 2016 contest

Phytochem Lett. 2017 Sep:21:306-312. doi: 10.1016/j.phytol.2016.12.008. Epub 2016 Dec 9.

Abstract

In its fourth year, the CASMI 2016 contest was organized to evaluate current chemical structure identification strategies for 19 natural products from high-resolution LC-MS and LC-MS/MS challenge datasets, using automated methods alone or in combination with other tools. These natural products originate from plants, fungi, marine sponges, algae, or micro-algae. Every compound annotation workflow must start with the determination of elemental compositions. Of these 19 challenges, one was excluded by the organizers after submission. For the remaining 18 challenges, we used three software programs. MS-FINDER version 1.62 correctly identified 89% of the molecular formulas using an internal database comprising 13 metabolomics repositories with 45,181 formulas. SIRIUS correctly identified 61% of the compositions using PubChem formulas, and Seven Golden Rules correctly identified 83% using the Dictionary of Natural Products as a targeted database. Next, we performed structural dereplication, for which we used the consensus formula from the three software programs. We submitted two solution sets for these challenges. In the first solution set, avaniya001, we used only the internal MS-FINDER functions for predicting and ranking structures, correctly identifying 53% of the structures as the top hit, 72% within the top-3 structures, and 78% within the top-10 hits. For our second set, avaniya002, we used MS-FINDER predictions together with MS/MS queries against the commercial NIST 14 and METLIN libraries and the public MassBank of North America library. Here we correctly identified 78% of the structures as the top hit and 83% within the top-3 hits. Three challenge spectra were not identified within the top-10 hits in either of our submissions.
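
The top-hit, top-3, and top-10 figures above are cumulative hit rates over the ranked candidate lists. As a minimal sketch of how such rates could be computed, assuming each challenge is summarized by the rank of its correct structure (or `None` when the correct structure is absent from the candidate list): the function name `top_k_rate` and the example ranks are hypothetical and are not the CASMI 2016 results.

```python
from typing import Optional, Sequence


def top_k_rate(ranks: Sequence[Optional[int]], k: int) -> float:
    """Return the fraction of challenges whose correct structure is ranked at k or better.

    ranks holds one entry per challenge: the 1-based rank of the correct
    structure, or None if it was not among the candidates at all.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


# Hypothetical ranks for six challenges (illustration only, NOT contest data).
example_ranks = [1, 1, 3, 7, None, 12]

for k in (1, 3, 10):
    print(f"top-{k}: {top_k_rate(example_ranks, k):.0%}")
```

Because every rate is divided by the total number of challenges, a structure that is never retrieved counts against all three rates, which is how a challenge can remain unidentified within the top-10 hits.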

Keywords: CASMI; MS-FINDER; compound identification; mass spectrometry; natural products; tandem mass spectrometry.