Measuring the diffusion of innovations with paragraph vector topic models

PLoS One. 2020 Jan 22;15(1):e0226685. doi: 10.1371/journal.pone.0226685. eCollection 2020.

Abstract

Measuring the diffusion of innovations from textual data sources besides patent data has not been studied extensively. However, early and accurate indicators of innovation and the recognition of trends in innovation are essential for successfully promoting economic growth through technological progress via evidence-based policy making. In this study, we propose the Paragraph Vector Topic Model (PVTM) and apply it to technology-related news articles to analyze innovation-related topics over time and gain insights regarding their diffusion process. PVTM represents documents in a semantic space, which has been shown to capture latent variables of the underlying documents, e.g., the latent topics. Clusters of documents in the semantic space can then be interpreted and transformed into meaningful topics by means of Gaussian mixture modeling. Using PVTM, we identify innovation-related topics from 170,000 technology news articles published over a span of 20 years and gather insights about their diffusion state by measuring the topic importance in the corpus over time. Our results suggest that PVTM is a credible alternative to widely used topic models for the discovery of latent topics in (technology-related) news articles. An examination of three exemplary topics shows that innovation diffusion could be assessed using topic importance measures derived from PVTM. We also find that PVTM diffusion indicators for certain topics are Granger causal to Google Trends indices with matching search terms.
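
The idea of measuring topic importance in the corpus over time can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each document already has a vector of topic membership probabilities (for example, the soft assignments a Gaussian mixture model would produce over document vectors) and a publication year. The function name `topic_importance_by_period`, the toy data, and the choice of the mean membership probability per year as the importance measure are all assumptions made for illustration; the abstract does not specify the exact measure.

```python
from collections import defaultdict


def topic_importance_by_period(periods, topic_probs):
    """Average each topic's membership probability over the documents of each period.

    periods     -- one period label (e.g. publication year) per document
    topic_probs -- one list of topic probabilities per document (hypothetical
                   soft cluster assignments; every list has the same length)
    Returns a dict mapping period -> list of mean probabilities, one per topic.
    """
    if len(periods) != len(topic_probs):
        raise ValueError("periods and topic_probs must have the same length")

    sums = {}
    counts = defaultdict(int)
    for period, probs in zip(periods, topic_probs):
        if period not in sums:
            sums[period] = [0.0] * len(probs)
        for k, p in enumerate(probs):
            sums[period][k] += p
        counts[period] += 1

    # Sort the periods so the result can be read as a time series.
    return {
        period: [s / counts[period] for s in sums[period]]
        for period in sorted(sums)
    }


# Toy data (invented for illustration): four documents, two topics.
years = [2000, 2000, 2001, 2001]
probs = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.4, 0.6]]

importance = topic_importance_by_period(years, probs)
for year, values in importance.items():
    print(year, [round(v, 3) for v in values])
```

In this toy example the second topic gains weight from 2000 to 2001, which is the kind of change a diffusion indicator would be meant to pick up. A per-topic time series of this form could then be compared with an external series such as a search-interest index.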

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Comprehension
  • Diffusion of Innovation*
  • Humans
  • Information Technology*
  • Machine Learning
  • Review Literature as Topic
  • Semantics*
  • Support Vector Machine*

Grants and funding

The German Federal Ministry of Education and Research provided funding for the research project (TOBI - Text Data Based Output Indicators as Base of a New Innovation Metric; funding ID: 16IFI001, Prof. Dr. Peter Winker), https://www.bmbf.de/en/index.html. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.