Machine learning approach for automatic quality criteria detection of health web pages

Stud Health Technol Inform. 2007;129(Pt 1):705-9.


The number of medical websites is constantly growing [1]. Owing to the open nature of the Web, the reliability of information available on the Web is uneven. Internet users are overwhelmed by the quantity of information available on the Web. The situation is even more critical in the medical area, as the content proposed by health websites can have a direct impact on the users' well being. One way to control the reliability of health websites is to assess their quality and to make this assessment available to users. The HON Foundation has defined a set of eight ethical principles. HON's experts are working in order to manually define whether a given website complies with s the required principles. As the number of medical websites is constantly growing, manual expertise becomes insufficient and automatic systems should be used in order to help medical experts. In this paper we present the design and the evaluation of an automatic system conceived for the categorisation of medical and health documents according to he HONcode ethical principles. A first evaluation shows promising results. Currently the system shows 0.78 micro precision and 0.73 F-measure, with 0.06 errors.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Codes of Ethics
  • Health
  • Humans
  • Information Services / standards
  • Internet / standards*
  • Medical Informatics / standards*
  • Natural Language Processing
  • Quality Control