Recent technological advances in flow cytometry instrumentation provide the basis for high-dimensionality and high-throughput biological experimentation in a heterogeneous cellular context. Concomitant advances in scalable computational algorithms are necessary to better utilize the information contained in these high-complexity experiments. The development of such tools has the potential to expand the utility of flow cytometric analysis from a predominantly hypothesis-driven mode to one of discovery, or hypothesis-generating, research. A new method for the analysis of flow cytometric data, called Cytometric Fingerprinting (CF), has been developed. CF captures the set of multivariate probability distribution functions corresponding to list-mode data and then "flattens" them into a computationally efficient fingerprint representation that facilitates quantitative comparisons of samples. Experimental and synthetic data sets were generated to act as reference sets for evaluating CF. Without the introduction of prior knowledge, CF was able to "discover" the location and concentration of spiked cells in ungated analyses over a concentration range covering four orders of magnitude, to a lower limit on the order of 10 spiked events in a background of 100,000 events. We describe a new method for quantitative analysis of list-mode cytometric data. CF includes a novel algorithm for space subdivision that improves estimation of the probability density function by dividing space into nonrectangular polytopes. Additionally, it renders a multidimensional distribution in the form of a one-dimensional multiresolution hierarchical fingerprint, creating a computationally efficient representation of high-dimensionality distribution functions. CF supports both the generation and testing of hypotheses, eliminates sources of operator bias, and provides an increased level of automation of data analysis.
(c) 2008 International Society for Advancement of Cytometry.
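
The idea of flattening a multivariate distribution into a one-dimensional multiresolution fingerprint can be illustrated with a minimal sketch. It is not the CF algorithm itself: it assumes simple axis-aligned median splits as a stand-in for the nonrectangular polytope subdivision described above, and the function names (`build_tree`, `fingerprint`), the tree depth, and the toy spiked data are all hypothetical choices made for illustration.

```python
import random
from statistics import median, pvariance

def build_tree(events, depth, n_dims):
    """Recursively split events at the median of the highest-variance axis.

    Assumption: axis-aligned splits, used here only as a simple stand-in
    for the nonrectangular subdivision named in the abstract.
    """
    if depth == 0:
        return None
    if len(events) >= 2:
        axis = max(range(n_dims), key=lambda d: pvariance([e[d] for e in events]))
        cut = median(e[axis] for e in events)
    else:
        # Fallback for nearly empty bins so the tree stays complete.
        axis, cut = 0, (events[0][0] if events else 0.0)
    left = [e for e in events if e[axis] <= cut]
    right = [e for e in events if e[axis] > cut]
    return {"axis": axis, "cut": cut,
            "left": build_tree(left, depth - 1, n_dims),
            "right": build_tree(right, depth - 1, n_dims)}

def fingerprint(tree, events, depth):
    """Flatten per-level bin fractions into one vector, coarse to fine."""
    counts = [[0] * (2 ** level) for level in range(1, depth + 1)]
    for e in events:
        node, idx = tree, 0
        for level in range(depth):
            went_right = e[node["axis"]] > node["cut"]
            idx = 2 * idx + (1 if went_right else 0)
            counts[level][idx] += 1
            node = node["right"] if went_right else node["left"]
    n = len(events)
    return [c / n for level in counts for c in level]

# Toy data (hypothetical): a 2-D Gaussian background and a spiked sample.
rng = random.Random(0)
DEPTH = 3
reference = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
spiked = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1800)]
spiked += [(rng.gauss(4, 0.2), rng.gauss(4, 0.2)) for _ in range(200)]

# The binning model is learned from the reference sample only.
tree = build_tree(reference, DEPTH, n_dims=2)
ref_fp = fingerprint(tree, reference, DEPTH)
spiked_fp = fingerprint(tree, spiked, DEPTH)

# Bins whose fraction deviates most from the reference flag the spike.
diffs = [abs(a - b) for a, b in zip(ref_fp, spiked_fp)]
print("fingerprint length:", len(ref_fp))
print("largest per-bin deviation:", max(diffs))
```

Because both samples are binned with the same tree, their fingerprints are directly comparable vectors; in this sketch the spiked population shows up as bins whose event fraction departs from the reference, with no gating or prior knowledge of where the spike lies.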