The word frequency effect: a review of recent developments and implications for the choice of frequency estimates in German

Exp Psychol. 2011;58(5):412-24. doi: 10.1027/1618-3169/a000123.


We review recent evidence indicating that researchers in experimental psychology may have used suboptimal estimates of word frequency. Word frequency measures should be based on a corpus of at least 20 million words that contains language that participants in psychology experiments are likely to have been exposed to. In addition, the quality of word frequency measures should be ascertained by correlating them with behavioral word processing data. When we apply these criteria to the word frequency measures available for German, we find that the commonly used Celex frequencies are the weakest predictors of lexical decision times. Better results are obtained with the Leipzig frequencies, the dlexDB frequencies, and the Google Books 2000-2009 frequencies. However, as in other languages, the best performance is observed with subtitle-based word frequencies. The SUBTLEX-DE word frequencies collected for the present manuscript are made available in easy-to-use files and are free for educational purposes.
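
The validation criterion described above (correlating a frequency measure with behavioral data) can be illustrated with a minimal sketch. It is not the procedure used in the article: the function names, the log10(per-million + 1) transform, and the use of a simple Pearson correlation are all assumptions made for illustration, and any counts or reaction times passed in would have to come from a real corpus and a real lexical decision study.

```python
import math


def log_frequency(count, corpus_size_millions):
    """Hypothetical helper: log10 of (occurrences per million words + 1).

    The +1 keeps words with a zero count defined; this transform is an
    assumption of the sketch, not a detail taken from the article.
    """
    per_million = count / corpus_size_millions
    return math.log10(per_million + 1)


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def variance_explained(counts, corpus_size_millions, reaction_times):
    """Share of reaction-time variance (r squared) accounted for by
    log frequency. Higher values mean a better frequency measure
    under the criterion described in the abstract."""
    logs = [log_frequency(c, corpus_size_millions) for c in counts]
    r = pearson_r(logs, reaction_times)
    return r * r
```

Under these assumptions, two frequency measures would be compared by calling `variance_explained` once per measure on the same set of words and reaction times; the measure with the larger value predicts lexical decision times better.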

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Choice Behavior
  • Humans
  • Language*
  • Linguistics
  • Recognition, Psychology*
  • Vocabulary*