Measuring the performance of visual to auditory information conversion

PLoS One. 2013 May 16;8(5):e63042. doi: 10.1371/journal.pone.0063042. Print 2013.


Background: Visual-to-auditory conversion systems have existed for several decades. Such systems are among the front runners in providing visual capabilities to blind users, and the auditory cues generated by image sonification systems remain easier to learn and adapt to than those of other comparable techniques. Other advantages include low cost, easy customizability, and universality. However, every system developed so far has its own strengths and weaknesses. To improve these systems further, we propose an automated, quantitative method to measure their performance. With these quantitative measurements, it is possible to gauge the relative strengths and weaknesses of different systems and rank them accordingly.

Methodology: Performance is measured by both the interpretability and the information preservation of visual-to-auditory conversions. Interpretability is measured by computing the correlation between inter-image distance (IID) and inter-sound distance (ISD), whereas information preservation is computed by applying information theory to measure the entropy of both the visual and the corresponding auditory signals. These measurements provide a basis, and some insight, into how the systems work.
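The two measures described above can be sketched in a few lines of code. This is only an illustrative reconstruction, not the authors' implementation: the function names, the sample distance values, and the histogram binning for entropy are all assumptions made for the example.

```python
import math

def pearson(xs, ys):
    # Pearson correlation between two equal-length lists of distances.
    # A faithful image-to-sound mapping should yield a value near 1.0.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def entropy(signal, bins=8):
    # Shannon entropy (in bits) of a signal, estimated from a histogram
    # over `bins` equal-width bins. Used to compare how much information
    # the auditory signal preserves relative to the visual one.
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for flat signals
    counts = [0] * bins
    for v in signal:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(signal)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Hypothetical pairwise distances for five image/sound pairs:
iid = [0.10, 0.35, 0.50, 0.80, 0.95]  # inter-image distances (IID)
isd = [0.12, 0.30, 0.55, 0.75, 0.99]  # inter-sound distances (ISD)

interpretability = pearson(iid, isd)
```

The underlying intuition: if two images that look similar also sound similar after conversion (and dissimilar images sound dissimilar), the IID/ISD correlation is high and a listener can plausibly interpret the sounds; comparing the entropy of input and output then indicates how much visual information the sonification discards.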

Conclusions: With an automated interpretability measure as a standard, more image sonification systems can be developed, compared, and improved. Although the measure does not test systems as thoroughly as carefully designed psychological experiments, a quantitative measurement like the one proposed here can compare systems to a certain degree without incurring much cost. Underlying this research is the hope that a major breakthrough in image sonification systems will allow blind users to cost-effectively regain enough visual function to lead secure and productive lives.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Auditory Perception / physiology*
  • Computer Simulation
  • Humans
  • Visual Perception / physiology*

Grants and funding

This work was partly supported by the Malaysian Ministry of Higher Education (MOHE) under the Fundamental Research Grant Scheme (FRGS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.