Integrating and mining diverse data in human immunological studies

Bioanalysis. 2014 Jan;6(2):209-23. doi: 10.4155/bio.13.309.


Bioanalysts and immunologists can interrogate the immune system with a variety of high-throughput technologies, such as gene expression profiling, multiplex bead arrays and flow cytometry. Conceptually, these assays support systems immunology studies, in which phenomena can be measured and correlated across biological compartments. First, however, the resulting high-dimensional data must be combined in a consistent fashion that supports analysis of the data as an integrated whole. Next, analytical methods must be applied to the hundreds or thousands of readouts. We recommend a four-part analytical pipeline: data integration; hypothesis generation; prediction and hypothesis testing; and validation. We describe a variety of established methods appropriate for these integrated datasets and highlight their application to human immunological studies. Our goal is to give bioanalysts, immunologists and data analysts a perspective from which to approach the multiassay, high-dimensional datasets generated by contemporary immunological studies.
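As a rough illustration of the first two pipeline stages named in the abstract (data integration, then hypothesis generation by correlating readouts across assays), the following is a minimal sketch and not the authors' method. The subject IDs, readout names, and values are all invented for the example, and the helper functions `integrate`, `zscore_columns`, and `pearson` are hypothetical names introduced here.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical toy readouts keyed by subject ID (all values invented).
gene_expression = {"S1": {"IFNG_mRNA": 2.0}, "S2": {"IFNG_mRNA": 4.0},
                   "S3": {"IFNG_mRNA": 6.0}, "S4": {"IFNG_mRNA": 8.0}}
bead_array = {"S1": {"IFNg_pg_ml": 10.0}, "S2": {"IFNg_pg_ml": 20.0},
              "S3": {"IFNg_pg_ml": 30.0}, "S5": {"IFNg_pg_ml": 50.0}}
flow_cytometry = {"S1": {"CD4_pct": 40.0}, "S2": {"CD4_pct": 35.0},
                  "S3": {"CD4_pct": 30.0}, "S4": {"CD4_pct": 25.0}}


def integrate(*assays):
    """Inner-join assay tables on subject ID into one row per subject."""
    shared = set(assays[0])
    for assay in assays[1:]:
        shared &= set(assay)
    table = {}
    for subject in sorted(shared):
        row = {}
        for assay in assays:
            row.update(assay[subject])
        table[subject] = row
    return table


def zscore_columns(table):
    """Standardize each readout so assays on different scales are comparable."""
    columns = sorted(next(iter(table.values())))
    scaled = {subject: {} for subject in table}
    for col in columns:
        values = [table[s][col] for s in table]
        mu, sd = mean(values), pstdev(values)
        for s in table:
            scaled[s][col] = (table[s][col] - mu) / sd if sd else 0.0
    return scaled


def pearson(table, col_a, col_b):
    """Pearson correlation between two readouts across subjects."""
    xs = [table[s][col_a] for s in table]
    ys = [table[s][col_b] for s in table]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / denom


integrated = integrate(gene_expression, bead_array, flow_cytometry)
scaled = zscore_columns(integrated)
r_mrna_protein = pearson(scaled, "IFNG_mRNA", "IFNg_pg_ml")
r_mrna_cd4 = pearson(scaled, "IFNG_mRNA", "CD4_pct")
```

The inner join keeps only subjects measured in every assay, which is one simple design choice among several. A real study would also need to handle missing values, time points, and batch effects, none of which this sketch attempts.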

Publication types

  • Review

MeSH terms

  • Cluster Analysis
  • Data Mining*
  • Humans
  • Influenza Vaccines / immunology
  • Interferon-gamma / metabolism
  • Longitudinal Studies
  • Neural Networks, Computer
  • Principal Component Analysis
  • T-Lymphocytes / immunology
  • T-Lymphocytes / metabolism


Substances

  • Influenza Vaccines
  • Interferon-gamma