Purpose: Clear and accurate genetic information should be available to health-care consumers at an individualized level of comprehension. The objective of this study is to evaluate the complexity of common online resources and to simplify text content using automated text processing tools.
Methods: We extracted all text from Genetics Home Reference and MedlinePlus in bulk and analyzed the content using natural language processing. We applied custom tools to improve readability and compared reading grade levels before and after text optimization.
Results: Commonly used educational materials were more complex than the reading level recommended for the general public. Genetic health information entries from Genetics Home Reference (n = 1279) were written at a median grade level of 13.0. MedlinePlus entries (n = 1030), which are not exclusively genetic, had a median grade level of 7.7. When we optimized text for the 59 actionable conditions by prioritizing medical details using a standard structure, the average reading grade level improved.
Conclusion: Factors that increase complexity are long sentences and difficult words. Future strategies to reduce complexity include prioritizing relevant details and using more illustrations. Simplifying and providing standardized online health resources would benefit diverse consumers and promote inclusivity.
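As an illustration of how sentence length and word difficulty enter a reading grade level, the following is a minimal sketch of the widely used Flesch-Kincaid grade formula. The abstract does not name the readability measure or the tools used in the study, so the choice of formula, the function names, and the vowel-group syllable heuristic are all assumptions made for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not a dictionary lookup):
    count runs of vowels and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of ASCII letters; contractions are split at the apostrophe.
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    plain = "The cat sat. The dog ran."
    dense = "Hereditary hemochromatosis necessitates comprehensive diagnostic evaluation."
    print(round(flesch_kincaid_grade(plain), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

The first term grows with average sentence length and the second with average syllables per word, so shortening sentences and substituting simpler words both lower the score under this formula.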
Keywords: consumer health informatics; educational resources; genomics; infographics; natural language processing.