Graphical methods for reducing, visualizing and analyzing large data sets using hierarchical terminologies

AMIA Annu Symp Proc. 2011;2011:635-43. Epub 2011 Oct 22.


Objective: To explore new graphical methods for reducing and analyzing large data sets in which the data are coded with a hierarchical terminology.

Methods: We use a hierarchical terminology to organize a data set and display it as a graph. We reduce the size and complexity of the data set by considering the terminological structure, the data set itself (using a variety of thresholds), and the contributions of child-level nodes to parent-level nodes.
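
To make the idea of threshold-based reduction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a terminology in which every term has at most one parent (real terminologies may allow several), and it assumes two illustrative filters: a minimum aggregated count per node (min_count) and a minimum share of the parent's count (min_share). The function names, parameter names, terms, and counts are all hypothetical.

```python
from collections import defaultdict


def aggregate_counts(parent_of, direct_counts):
    """Add each term's direct count to the term and to every ancestor."""
    totals = defaultdict(int)
    for term, count in direct_counts.items():
        node = term
        while node is not None:
            totals[node] += count
            node = parent_of.get(node)
    return dict(totals)


def reduce_graph(parent_of, direct_counts, min_count=0, min_share=0.0):
    """Keep nodes that pass both thresholds and whose ancestors all pass too."""
    totals = aggregate_counts(parent_of, direct_counts)
    passed = set()
    for node, total in totals.items():
        parent = parent_of.get(node)
        if total < min_count:
            continue  # too few data points at or below this node
        if parent is not None and total < min_share * totals[parent]:
            continue  # contributes too little to its parent
        passed.add(node)

    def ancestors_passed(node):
        # A node is shown only if the whole path to the root survives.
        parent = parent_of.get(node)
        while parent is not None:
            if parent not in passed:
                return False
            parent = parent_of.get(parent)
        return True

    return {n: totals[n] for n in passed if ancestors_passed(n)}


# Hypothetical toy terminology (child -> parent) and toy counts.
parent_of = {
    "Diseases": None,
    "Cardiovascular": "Diseases",
    "Infections": "Diseases",
    "Hypertension": "Cardiovascular",
    "Arrhythmia": "Cardiovascular",
    "Influenza": "Infections",
}
direct_counts = {"Hypertension": 40, "Arrhythmia": 5, "Influenza": 3, "Cardiovascular": 2}

by_count = reduce_graph(parent_of, direct_counts, min_count=5)
by_share = reduce_graph(parent_of, direct_counts, min_share=0.2)
combined = reduce_graph(parent_of, direct_counts, min_count=5, min_share=0.2)
```

In this toy example the count filter alone removes the sparsely populated "Infections" branch, while the share filter additionally removes "Arrhythmia" because it contributes only a small fraction of its parent's total.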

Results: Our methods can reduce large data sets to a manageable size and highlight the differences among graphs. The thresholds used as filters to reduce the data set can be applied alone or in combination. We applied our methods to two data sets containing information about how nurses and physicians query online knowledge resources. The reduced graphs make the differences between the two groups readily apparent.
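
One simple way to see why reduced graphs ease comparison is to look at which nodes survive in each group. The sketch below is only an illustration under assumptions: each reduced graph is represented as a mapping from term to count, the function name compare_reduced is hypothetical, and the terms and counts are invented and are not results from the study.

```python
def compare_reduced(graph_a, graph_b):
    """Split the terms of two reduced graphs into unique and shared sets."""
    terms_a, terms_b = set(graph_a), set(graph_b)
    return {
        "only_a": sorted(terms_a - terms_b),
        "only_b": sorted(terms_b - terms_a),
        "shared": sorted(terms_a & terms_b),
    }


# Invented example data, not taken from the nurse or physician query logs.
group_a = {"Diseases": 50, "Cardiovascular": 47, "Hypertension": 40}
group_b = {"Diseases": 80, "Infections": 60, "Influenza": 55}

difference = compare_reduced(group_a, group_b)
```

With only a handful of nodes left in each graph, the terms unique to one group can be read off directly.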

Conclusions: This is a new approach for reducing the size and complexity of large data sets and simplifying their visualization. It can be applied to any data set that is coded with a hierarchical terminology.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Algorithms*
  • Data Display*
  • Vocabulary, Controlled*