Meta-Analysis of Gene Popularity: Less Than Half of Gene Citations Stem from Gene Regulatory Networks

Genes (Basel). 2021 Feb 23;12(2):319. doi: 10.3390/genes12020319.


The reasons for selecting a gene for further study vary from historical momentum to funding availability, leading to an unequal distribution of attention among genes. However, certain biological features tend to be overlooked when evaluating a gene's popularity. Here we present a meta-analysis of why different genes have been studied, and to what extent, with a focus on gene-specific biological features. From unbiased datasets, we define biological properties of genes that may reasonably affect their perceived importance. We use both linear and nonlinear computational approaches to estimate gene popularity and then compare the relative importance of these features. We find that roughly 25% of the studies are the result of historical positive feedback, which can be thought of as social reinforcement. Of the remaining features, gene family membership is the most indicative, followed by disease relevance and finally regulatory pathway association. Disease relevance was an important driver until the 1990s, after which the focus shifted to exploring every single gene. We also present a resource for studying the impact of reinforcement, which may help guide research toward genes that have not yet received proportional attention.

Keywords: Matthew effect; biological feature; gene; gene regulatory networks; genomics; linear model; machine learning.