Power and sample size calculations for high-throughput sequencing-based experiments

Chung-I Li; David C Samuels; Ying-Yong Zhao; Yu Shyr; Yan Guo

doi:10.1093/bib/bbx061

Brief Bioinform. 2018 Nov 27;19(6):1247-1255. doi: 10.1093/bib/bbx061.

Authors

Chung-I Li¹, David C Samuels², Ying-Yong Zhao³, Yu Shyr⁴, Yan Guo⁵

Affiliations

¹ Department of Statistics, National Cheng Kung University, Taiwan.
² Department of Molecular Physiology and Biophysics, Vanderbilt University, USA.
³ School of Life Sciences, Northwest University, China.
⁴ Department of Biostatistics, Vanderbilt University, USA.
⁵ Department of Cancer Biology, Vanderbilt University.

Abstract

Power and sample size analysis (hereafter, power analysis) estimates the likelihood of detecting a statistically significant effect in a data set. There has been growing recognition of the importance of power analysis in the proper design of experiments. Power analysis is complex, yet necessary for the success of large studies, which must be designed to produce statistically accurate and reliable results. Power computation methods are well established for both microarray-based gene expression studies and genotyping microarray-based genome-wide association studies. High-throughput sequencing (HTS) has greatly enhanced our ability to conduct biomedical studies at the highest possible resolution (per nucleotide). However, power computations are considerably more complex for sequencing data than for the simpler genotyping array data. Methods of power computation for HTS-based studies have recently been developed but are not yet well known or widely used. In this article, we describe the power computation methods that are currently available for a range of HTS-based studies, including DNA sequencing, RNA-sequencing, microbiome sequencing and chromatin immunoprecipitation sequencing. Most importantly, we review the methods of power analysis for several types of sequencing data and guide the reader to the relevant methods for each data type.
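
To make the notion of power concrete, the following is a minimal sketch of a textbook power and sample size calculation for a two-sided, two-sample z-test under a normal approximation with equal group sizes. It is a generic illustration only and is not one of the HTS-specific methods reviewed in the article; the function names, the standardized effect size parameter and the default significance level of 0.05 are assumptions introduced here.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_STD_NORMAL = NormalDist()


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample z-test (normal approximation).

    effect_size is the standardized mean difference (difference in means
    divided by the common standard deviation); this parameterization is an
    assumption of the sketch, not something taken from the article.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic under the alternative hypothesis.
    shift = abs(effect_size) * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def required_n_per_group(effect_size, power=0.8, alpha=0.05):
    """Smallest per-group sample size reaching the target power.

    Uses the usual closed-form approximation, which ignores the negligible
    chance of rejecting in the tail opposite to the true effect.
    """
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / abs(effect_size)) ** 2)


if __name__ == "__main__":
    n = required_n_per_group(0.5, power=0.8)
    print("samples per group:", n)
    print("achieved power:", round(two_sample_power(0.5, n), 3))
```

Sequencing data are typically counts rather than normally distributed measurements, so a calculation of this kind would at best be a rough first approximation for an HTS study; the data-type-specific methods described in the article are the appropriate tools there.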

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Chromatin Immunoprecipitation
Genome-Wide Association Study
Heterozygote
High-Throughput Nucleotide Sequencing / methods*
Humans
Microbiota
Mutation
Poisson Distribution
Sequence Analysis, RNA / methods