Tracing the Potential Flow of Consumer Data: A Network Analysis of Prominent Health and Fitness Apps

J Med Internet Res. 2017 Jun 28;19(6):e233. doi: 10.2196/jmir.7347.


Background: A great deal of consumer data, collected actively through consumer reporting or passively through sensors, is shared among apps. Developers increasingly allow their programs to communicate with other apps, sensors, and Web-based services, and promote this connectivity as a feature to potential users. However, health apps also routinely pose risks related to information leaks, information manipulation, and loss of information. Less is known about the kinds of user data that developers are likely to collect, and who might have access to it.

Objective: We sought to describe how consumer data generated from mobile health apps might be distributed and reused. We also aimed to outline risks to individual privacy and security presented by this potential for aggregating and combining user data across apps.

Methods: We purposively sampled prominent health and fitness apps available in the United States, Canadian, and Australian Google Play and iTunes app stores in November 2015. Two independent coders extracted data from app promotional materials on app and developer characteristics, and the developer-reported collection and sharing of user data. We conducted a descriptive analysis of app, developer, and user data collection characteristics. Using structural equivalence analysis, we conducted a network analysis of sampled apps' self-reported sharing of user-generated data.

Results: We included 297 unique apps published by 231 individual developers, which requested 58 different permissions (mean 7.95, SD 6.57). We grouped apps into 222 app families on the basis of shared ownership. Analysis of self-reported data sharing revealed a network of 359 app family nodes, with one connected central component of 210 app families (58.5%). Most (143/222, 64.4%) of the sampled app families did not report sharing any data and were therefore isolated from each other and from the core network. Fifteen app families assumed more central network positions as gatekeepers on the shortest paths that data would have to travel between other app families.
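
Illustrative sketch (not part of the study): the network terms used in the Results can be made concrete with a small example. The abstract does not name the software or the exact centrality measure used, so the code below assumes that "gatekeepers on the shortest paths" corresponds to betweenness centrality, computed here with Brandes' algorithm on an undirected, unweighted graph. The node names and edges are hypothetical toy data, not the study's app families.

```python
from collections import deque

# Hypothetical toy data: each edge means two app families report sharing data.
NODES = ["A", "B", "C", "D", "E", "F", "G"]  # F and G report no sharing
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]


def build_adjacency(nodes, edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def connected_components(adj):
    """Group nodes into connected components using breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    comp.add(nb)
                    queue.append(nb)
        comps.append(comp)
    return comps


def betweenness(adj):
    """Unnormalized betweenness centrality (assumed gatekeeper measure)."""
    score = {n: 0.0 for n in adj}
    for s in adj:
        stack = []                      # nodes in order of non-decreasing distance
        preds = {n: [] for n in adj}    # predecessors on shortest paths from s
        sigma = {n: 0 for n in adj}     # number of shortest paths from s
        dist = {n: -1 for n in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair is counted from both endpoints in an undirected graph.
    return {n: c / 2 for n, c in score.items()}


adj = build_adjacency(NODES, EDGES)
components = connected_components(adj)
largest = max(components, key=len)
isolates = sorted(n for n in adj if not adj[n])
centrality = betweenness(adj)
print(len(largest), isolates, centrality)
```

In this toy network, the nodes with no edges play the role of the app families that report no sharing, the largest component stands in for the connected central component, and the nodes with the highest betweenness are the ones that most shortest paths between other nodes pass through.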

Conclusions: This cross-sectional analysis highlights the possibilities for user data collection and potential paths that data is able to travel among a sample of prominent health and fitness apps. While individual apps may not collect personally identifiable information, app families and the partners with which they share data may be able to aggregate consumer data, thus achieving a much more comprehensive picture of the individual consumer. The organizations behind the centrally connected app families represent diverse industries, including apparel manufacturers and social media platforms that are not traditionally involved in health or fitness. This analysis highlights the potential for anticipated and voluntary but also possibly unanticipated and involuntary sharing of user data, validating privacy and security concerns in mobile health.

Keywords: mobile health; privacy; smartphone.

MeSH terms

  • Cross-Sectional Studies
  • Data Collection / methods*
  • Humans
  • Mobile Applications / statistics & numerical data*
  • Telemedicine / statistics & numerical data*