Evaluation of five methods for genome-wide circadian gene identification

J Biol Rhythms. 2014 Aug;29(4):231-42. doi: 10.1177/0748730414537788.


Identifying circadian-regulated genes from temporal transcriptome data is important for studying the regulatory mechanisms of the circadian system. However, computational methods that adopt different strategies for identifying cycling transcripts usually yield inconsistent results even for the same dataset, making it challenging to choose the optimal method for a specific circadian study. To address this challenge, we evaluate 5 popular methods, ARSER (ARS), COSOPT (COS), Fisher's G test (FIS), HAYSTACK (HAY), and JTK_CYCLE (JTK), on both simulated and empirical datasets. Our results show that increasing the total number of samples (by raising the sampling frequency or lengthening the sampling time window) helps computational methods identify circadian transcripts and measure circadian phase accurately. For a given total number of samples, higher sampling frequency is more important for HAY and JTK, whereas a longer sampling time window is more crucial for ARS and COS, as tested on simulated and empirical datasets from which circadian signals are computationally identified. In addition, the preference for higher sampling frequency or a longer sampling time window is also evident for JTK, ARS, and COS when estimating the circadian phases of simulated periodic profiles. Our results also indicate that attention should be paid to the significance threshold used for each method when selecting circadian genes, especially when analyzing the same empirical dataset with 2 or more methods. In summary, for any study involving genome-wide identification of circadian genes from transcriptome data, our evaluation results provide suggestions for selecting an optimal method based on the specific goal and experimental design.
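
To make one of the evaluated approaches concrete, the following is a minimal sketch of Fisher's G test applied to a single evenly sampled expression profile. It is not the implementation evaluated in the study. The function name `fisher_g_test` and the toy profile are hypothetical, and the sketch assumes evenly spaced time points with no missing values. It computes the periodogram at the Fourier frequencies, takes the ratio of the largest ordinate to the sum of all ordinates as the statistic g, and derives the P value from the exact finite-sum formula for that ratio.

```python
import cmath
import math


def fisher_g_test(values):
    """Return (g, p_value, period_in_samples) for an evenly sampled series.

    Hypothetical helper for illustration only; assumes equal spacing
    between time points and no missing values.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]

    # Fourier frequencies k/n for k = 1..m (zero and Nyquist excluded).
    m = (n - 1) // 2
    if m < 2:
        raise ValueError("need at least 5 time points")

    # Periodogram ordinate at each Fourier frequency.
    power = []
    for k in range(1, m + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power.append(abs(coeff) ** 2 / n)

    total = sum(power)
    if total == 0:
        raise ValueError("series is constant")

    peak = max(power)
    g = peak / total

    # Exact P value: sum_{j=1}^{floor(1/g)} (-1)^(j-1) C(m, j) (1 - j g)^(m-1)
    p_value = 0.0
    for j in range(1, int(1 / g) + 1):
        base = max(0.0, 1.0 - j * g)  # guard against rounding below zero
        term = math.comb(m, j) * base ** (m - 1)
        p_value += term if j % 2 == 1 else -term
    p_value = min(1.0, max(0.0, p_value))

    k_peak = power.index(peak) + 1
    return g, p_value, n / k_peak


# Toy profile: 24 samples taken every 2 h over 48 h, with a 24 h rhythm.
profile = [math.cos(2 * math.pi * t / 12) for t in range(24)]
g, p_value, period_samples = fisher_g_test(profile)
period_hours = period_samples * 2  # convert using the 2 h sampling interval
```

In a genome-wide setting, one P value would be computed per transcript and then corrected for multiple testing before applying a significance threshold. Because each method produces its P values differently, the same cutoff need not be equally stringent across methods.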

Keywords: ARSER; COSOPT; Fisher’s G test; HAYSTACK; JTK_CYCLE; circadian gene; circadian rhythms; comparison.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Circadian Rhythm / genetics*
  • Computational Biology / methods
  • Gene Expression Profiling / methods
  • Genome / genetics*
  • Genome-Wide Association Study / methods*
  • Transcriptome / genetics*