Automatic classification of communication logs into implementation stages via text analysis

Implement Sci. 2016 Sep 6;11(1):119. doi: 10.1186/s13012-016-0483-6.

Abstract

Background: To improve the quality, quantity, and speed of implementation, careful monitoring of the implementation process is required. However, some health organizations have limited capacity to collect, organize, and synthesize information relevant to their decision to implement an evidence-based program, the preparation steps necessary for successful program adoption, the fidelity of program delivery, and the sustainment of the program over time. When a large health system implements an evidence-based program across multiple sites, a trained intermediary or broker may provide such monitoring and feedback, but this task is labor intensive and not easily scaled up to large numbers of sites.

Methods: We present a novel approach to building an automated system that monitors entrances to and exits from implementation stages, based on a computational analysis of communication log notes generated by implementation brokers. Potentially discriminating keywords are identified using the definitions of the stages and experts' coding of a portion of the log notes. A machine learning algorithm then produces a decision rule to classify the remaining, uncoded log notes.
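The keyword-then-classify procedure described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the stage names, stage definitions, and log notes are toy data, the stop-word list is arbitrary, and the keyword-overlap rule is a deliberately simple stand-in for the machine learning algorithm, not the authors' method.

```python
import re
from collections import Counter, defaultdict

# Arbitrary stop-word list (an assumption for this sketch).
STOP = {"the", "of", "and", "to", "in", "for", "a", "an", "this", "with", "on"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]

def select_keywords(coded_notes, stage_definitions, top_n=10):
    """Candidate keywords per stage: words from the stage definition plus
    the most frequent words in expert-coded notes of that stage."""
    counts = defaultdict(Counter)
    for text, stage in coded_notes:
        counts[stage].update(tokenize(text))
    keywords = {}
    for stage, definition in stage_definitions.items():
        frequent = [w for w, _ in counts[stage].most_common(top_n)]
        keywords[stage] = set(tokenize(definition)) | set(frequent)
    return keywords

def classify(text, keywords):
    """Toy decision rule: the stage whose keyword set overlaps the note most.
    Ties are broken by alphabetical stage name."""
    tokens = set(tokenize(text))
    return max(sorted(keywords), key=lambda s: len(tokens & keywords[s]))

# Hypothetical stage definitions and expert-coded notes (not from the study).
stage_definitions = {
    "engagement": "first contact and expressed interest",
    "staff_training": "hiring and training of staff",
}
coded_notes = [
    ("Called county director to gauge interest in the program", "engagement"),
    ("Scheduled clinical training for newly hired therapists", "staff_training"),
]

keywords = select_keywords(coded_notes, stage_definitions)
label_a = classify("Therapists completed training this week", keywords)
label_b = classify("Director expressed interest after first call", keywords)
```

In this sketch the expert-coded notes only contribute frequent words; a learned classifier would instead weight keywords by how well they discriminate between stages.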

Results: We applied this procedure to log notes from the California 40-county implementation trial (CAL-40) of multidimensional treatment foster care, using the stages of implementation completion (SIC) measure. We found that a semi-supervised non-negative matrix factorization method accurately identified most stage transitions. A second computational model was built to determine the start and the end of each stage.
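One common way to make non-negative matrix factorization semi-supervised is sketched below, under stated assumptions: the note-by-keyword count matrix X is factored as X ≈ WH with standard multiplicative updates, and the rows of W belonging to expert-coded notes are clamped to a one-hot encoding of their stage. The abstract does not specify the authors' formulation, so this variant, the function names, and the toy matrix are all illustrative only.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def clamp(W, labels, k):
    """Force rows of coded notes to a one-hot encoding of their stage."""
    for i, lab in enumerate(labels):
        if lab is not None:
            W[i] = [1.0 if j == lab else 0.0 for j in range(k)]

def semi_supervised_nmf(X, labels, k, iters=200, eps=1e-9, seed=0):
    """Factor X (notes x keywords) as W (notes x k) times H (k x keywords).
    labels[i] is a stage index for coded notes and None for uncoded ones."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    clamp(W, labels, k)
    for _ in range(iters):
        # Multiplicative update: H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[a][b] * num[a][b] / (den[a][b] + eps) for b in range(m)]
             for a in range(k)]
        # Multiplicative update: W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
        clamp(W, labels, k)  # coded notes keep their known stage
    return W, H

def predict(W):
    """Assign each note to the stage with the largest weight in its row of W."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in W]

# Toy keyword counts (hypothetical): columns are interest, contact, training, hired.
X = [
    [2, 1, 0, 0],  # coded by an expert as stage 0
    [0, 0, 2, 1],  # coded by an expert as stage 1
    [1, 2, 0, 0],  # uncoded
    [0, 0, 1, 2],  # uncoded
]
labels = [0, 1, None, None]
W, H = semi_supervised_nmf(X, labels, k=2)
stages = predict(W)
```

Clamping ties each factor to a known stage, so the row of W for an uncoded note can be read directly as stage weights rather than as weights on anonymous topics.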

Conclusions: This automated system demonstrated feasibility in this proof-of-concept challenge. We provide suggestions on how such a system can be used to improve the speed, quality, quantity, and sustainment of implementation. The innovative methods presented here are not intended to replace the expertise and judgement of an expert rater already in place. Rather, they can be used when human monitoring and feedback are too expensive to use or maintain. These methods rely on digitized text that already exists or can be collected with minimal to no intrusiveness, and they can signal when additional attention or remediation is required during implementation. Thus, resources can be allocated according to need rather than universally applied or, worse, not applied at all because of their cost.

Keywords: Machine learning; Social systems informatics; Text mining; Unobtrusive measures.

MeSH terms

  • California
  • Communication*
  • Computer Simulation
  • Data Mining*
  • Diffusion of Innovation
  • Feasibility Studies
  • Foster Home Care
  • Humans
  • Information Storage and Retrieval
  • Machine Learning
  • Mathematics
  • Medical Informatics / methods*
  • Records
  • Sensitivity and Specificity
  • Translational Research, Biomedical