Second-order peak detection for multicomponent high-resolution LC/MS data

Anal Chem. 2006 Feb 15;78(4):975-83. doi: 10.1021/ac050980b.

Abstract

The first step when analyzing multicomponent LC/MS data from complex samples such as biofluid metabolic profiles is to separate the data into information and noise via, for example, peak detection. Due to the nature of this type of data, with problems such as alternating backgrounds and differing peak shapes, this can be a very difficult task. This paper presents and evaluates a two-dimensional peak detection algorithm based on raw vector-represented LC/MS data. The algorithm exploits the fact that in high-resolution centroid data chromatographic peaks emerge flanked by data voids along the corresponding mass axis. According to the proposed method, only 4 per thousand (0.4%) of the total amount of data from a urine sample is defined as chromatographic peaks; however, 94% of the raw data variance is captured within these peaks. A comparison with bucketed data shows that essentially the same features that an experienced analyst would define as peaks can be extracted automatically with a minimum of noise and background. The method is simple and requires a priori knowledge of only the minimum chromatographic peak width, a system-dependent parameter that is easily assessed. Additional meta parameters are estimated from the data themselves. The result is well-defined chromatographic peaks that are consistently arranged in a matrix at their corresponding m/z values. In the context of automated analysis, the method thus provides an alternative to the traditional approach of bucketing the data followed by denoising and/or one-dimensional peak detection. The software implementation of the proposed algorithm is available at http://www.anchem.su.se/peakd as compiled code for Matlab.
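The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (which is distributed as compiled Matlab code) and does not reproduce their algorithm or its estimated meta parameters. It only shows the general principle: in centroid data a chromatographic peak reappears at nearly the same m/z in consecutive scans, with empty m/z space on either side, whereas noise does not persist. The function names `detect_traces` and `summarize`, the tolerance `mz_tol`, the toy data, and the use of summed squared intensity as a stand-in for "variance" are all assumptions made for illustration; `min_width` plays the role of the minimum chromatographic peak width, expressed here in scans.

```python
from collections import defaultdict

def detect_traces(points, mz_tol, min_width):
    """Group centroids (scan, mz, intensity) into traces that persist over
    consecutive scans at nearly constant m/z, and keep only traces that are
    at least min_width scans long. Hypothetical helper, not the paper's code."""
    by_scan = defaultdict(list)
    for scan, mz, inten in points:
        by_scan[scan].append((scan, mz, inten))

    finished = []
    open_traces = []  # traces whose last point lies in the previous scan
    for scan in sorted(by_scan):
        # A gap in scan numbers ends every open trace.
        if open_traces and open_traces[0][-1][0] != scan - 1:
            finished.extend(open_traces)
            open_traces = []
        unused = list(by_scan[scan])
        next_open = []
        for trace in open_traces:
            last_mz = trace[-1][1]
            # Closest unused centroid within the m/z tolerance, if any.
            candidates = [p for p in unused if abs(p[1] - last_mz) <= mz_tol]
            if candidates:
                best = min(candidates, key=lambda p: abs(p[1] - last_mz))
                unused.remove(best)
                trace.append(best)
                next_open.append(trace)
            else:
                finished.append(trace)  # trace ended: nothing at this m/z
        # Centroids that extended no trace start new traces.
        next_open.extend([p] for p in unused)
        open_traces = next_open
    finished.extend(open_traces)
    return [t for t in finished if len(t) >= min_width]

def summarize(points, traces):
    """Fraction of data points kept, and fraction of summed squared
    intensity kept (used here as a simple proxy for captured variance)."""
    kept = [p for t in traces for p in t]
    total_ss = sum(i * i for _, _, i in points)
    kept_ss = sum(i * i for _, _, i in kept)
    return len(kept) / len(points), kept_ss / total_ss

# Toy data (invented): one peak near m/z 200 over scans 1-5 plus isolated noise.
points = [
    (1, 200.001, 10.0), (2, 200.000, 50.0), (3, 199.999, 100.0),
    (4, 200.002, 50.0), (5, 200.000, 10.0),
    (1, 150.3, 2.0), (2, 310.7, 3.0), (3, 150.9, 2.0),
    (4, 420.1, 1.0), (5, 99.5, 2.0),
]
traces = detect_traces(points, mz_tol=0.01, min_width=4)
frac_points, frac_ss = summarize(points, traces)
print(len(traces), frac_points, round(frac_ss, 4))
```

In this toy case a small share of the points carries almost all of the squared intensity, which mirrors, in spirit only, the abstract's observation that a small fraction of the data captures most of the raw data variance.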

MeSH terms

  • Algorithms
  • Chromatography, Liquid / methods*
  • Spectrometry, Mass, Electrospray Ionization / methods*