Traditional Chinese medicine tongue inspection: an examination of the inter- and intrapractitioner reliability for specific tongue characteristics

J Altern Complement Med. 2008 Jun;14(5):527-36. doi: 10.1089/acm.2007.0079.


Aim: To examine the reliability of Traditional Chinese Medicine (TCM) tongue inspection by evaluation of inter- and intrapractitioner agreement levels for specific tongue characteristics, achieved by a group of TCM practitioners.

Method: Ten (10) realistic tongue slides and a set of questionnaires were used to examine the agreement levels among 30 TCM practitioners in two data collection sessions. Statistical analysis involved descriptive statistics, predominantly percentage frequency agreement, with an agreement level of ≥80% set as the criterion for acceptable reliability.
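The abstract does not spell out how percentage frequency agreement was calculated. As a minimal sketch, assuming it means the share of practitioners who chose the most frequent response to a given question on a given slide, the ≥80% criterion could be checked as follows. The function name, question names, and response data are hypothetical and are not taken from the study.

```python
from collections import Counter

THRESHOLD = 80.0  # acceptance criterion stated in the Method (>= 80%)


def modal_agreement(responses):
    """Return the percentage of raters who chose the most frequent response.

    Assumption: this is one plausible reading of "percentage frequency
    agreement"; the abstract does not give the exact formula.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    _, count = Counter(responses).most_common(1)[0]
    return 100.0 * count / len(responses)


# Hypothetical data: 30 practitioners answer a dichotomous question for one slide.
coat_present = ["yes"] * 26 + ["no"] * 4
# Hypothetical data: the same 30 practitioners answer a multi-choice question.
coat_colour = ["white"] * 14 + ["yellow"] * 10 + ["grey"] * 6

for name, answers in [("coat_present", coat_present), ("coat_colour", coat_colour)]:
    pct = modal_agreement(answers)
    print(f"{name}: {pct:.1f}% agreement, acceptable={pct >= THRESHOLD}")
```

In this made-up example the dichotomous question clears the criterion and the multi-choice question does not. With more response options, answers can spread more thinly, which makes high modal agreement harder to reach.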

Results: Overall, both inter- and intrapractitioner agreement levels were low. Interpractitioner (between practitioners) agreement levels of ≥80% were achieved on only 19 occasions (17.3%) in session 1 and 21 occasions (19.1%) in session 2. Moreover, most of these occasions (15 in session 1 and 14 in session 2) involved questions with simple dichotomous response choices, and the practitioners achieved ≥80% agreement on only 5% of occasions where more complex response choices were offered. In terms of intrapractitioner (within a practitioner) agreement, the highest agreement levels were achieved for the dichotomous response choice questions on presence of coat and presence of crack, with 29 and 20 practitioners, respectively, achieving ≥80% intrapractitioner agreement. Only 2 practitioners achieved an intrapractitioner agreement level higher than 80% across all tongue slides and all questions, the highest being 88%, followed by 82%. The findings showed that TCM tongue inspection for the specific characteristics examined was not a reliable diagnostic method, at least for the group of TCM practitioners involved in this study.
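Intrapractitioner agreement can be illustrated in the same spirit. The sketch below assumes it is the percentage of items on which one practitioner gave the same response in both sessions. The abstract does not state the formula, and the function name and responses are hypothetical.

```python
THRESHOLD = 80.0  # acceptance criterion used in the study (>= 80%)


def intra_agreement(session1, session2):
    """Percentage of items on which one rater gave the same response twice.

    Assumption: items are paired by position, for example the same slide
    shown in session 1 and again in session 2.
    """
    if not session1 or len(session1) != len(session2):
        raise ValueError("sessions must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(session1, session2))
    return 100.0 * same / len(session1)


# Hypothetical data: one practitioner rates "presence of crack" on 10 slides, twice.
s1 = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "yes"]
s2 = ["yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "no"]

pct = intra_agreement(s1, s2)
print(f"intrapractitioner agreement: {pct:.1f}%, acceptable={pct >= THRESHOLD}")
```

Here the two sessions differ on 2 of the 10 slides, so this hypothetical practitioner only just meets the ≥80% criterion.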

Conclusions: The findings suggest that a major contributor to the low levels of inter- and intrapractitioner agreement is the inadequate operational definition of both the tongue characteristics studied and the inspection regions of the tongue.

MeSH terms

  • Adult
  • Australia
  • Clinical Competence*
  • Diagnosis, Differential
  • Female
  • Humans
  • Male
  • Medicine, Chinese Traditional / methods*
  • Middle Aged
  • Physical Examination / methods
  • Quality Assurance, Health Care
  • Reproducibility of Results
  • Research Design
  • Tongue / pathology*
  • Tongue Diseases / diagnosis*