Comparing Themes Extracted via Topic Modeling and Manual Content Analysis: Korean-Language Discussions of Dementia on Twitter

Haeyoung Lee; Sun Joo Jang; Frederick F Sun; Peter Broadwell; Sunmoo Yoon

doi:10.3233/SHTI220704

Stud Health Technol Inform. 2022 Jun 29:295:230-233. doi: 10.3233/SHTI220704.

Authors

Haeyoung Lee¹, Sun Joo Jang¹, Frederick F Sun², Peter Broadwell³, Sunmoo Yoon⁴ ⁵

Affiliations

¹ Department of Nursing, Chung-Ang University, South Korea.
² Department of Rehabilitation and Regenerative Medicine, Columbia University, USA.
³ Center for Interdisciplinary Digital Research, Stanford University, USA.
⁴ General Medicine, Department of Medicine, Columbia University, USA.
⁵ Data Science Institute, Columbia University, USA.

Abstract

We randomly examined Korean-language Tweets mentioning dementia/Alzheimer's disease (n = 12,413) posted from November 28 to December 9, 2020, with no restriction on geographic location. We independently applied Latent Dirichlet Allocation (LDA) topic modeling and qualitative content analysis to the Tweet texts, then compared the themes extracted by LDA topic modeling with those identified by manual coding. Manual coding identified a total of 16 themes, with an inter-rater reliability (Cohen's kappa) of 0.842. The proportions of the most prominent themes were: burdens of family caregiving (48.50%), reports of wandering/missing family members with dementia (18.12%), stigma (13.64%), prevention strategies (5.07%), risk factors (4.91%), healthcare policy (3.26%), and elder abuse/safety issues (1.75%). The LDA topic modeling results (perplexity: -6.39, coherence score: 0.45) yielded seven themes whose contents were similar to themes derived from manual coding. Our findings suggest that LDA topic modeling can be fairly effective at extracting themes from Korean Twitter discussions, in a manner analogous to qualitative coding, to gain insights regarding caregiving for family members with dementia. Our approach can also be applied to other languages.
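For readers unfamiliar with the inter-rater reliability statistic reported above, the following is a minimal sketch of how Cohen's kappa is computed for two coders, using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohens_kappa` and the eight toy theme labels are hypothetical illustrations; they are not the study's data or the authors' code, and they do not reproduce the reported value of 0.842.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies,
    # summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only (assumed, not taken from the study).
coder_1 = ["caregiving", "caregiving", "stigma", "wandering",
           "stigma", "caregiving", "wandering", "prevention"]
coder_2 = ["caregiving", "caregiving", "stigma", "wandering",
           "caregiving", "caregiving", "wandering", "prevention"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"Cohen's kappa on the toy labels: {kappa:.3f}")
```

In this toy example the coders agree on 7 of 8 items, but kappa is lower than the raw agreement of 0.875 because some of that agreement would be expected by chance given how often each coder uses each label.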

Keywords: Dementia caregiving; online intervention; social media; topic modeling.

MeSH terms

Aged
Dementia*
Humans
Language
Reproducibility of Results
Republic of Korea
Social Media*

Grants and funding

R01 AG060929/AG/NIA NIH HHS/United States