Comparing baseline characteristics between groups: an introduction to the CBCgrps package

Ann Transl Med. 2017 Dec;5(24):484. doi: 10.21037/atm.2017.09.39.


Comparing participants' baseline characteristics between study groups is common practice in observational studies. The overall population can be grouped by clinical outcome or exposure status. Baseline characteristics are usually reported in a single combined table, first for the overall population and then separately for each group, with the last column giving the P value for the comparison between groups. In the conventional research model, data are collected on a limited number of variables, so it is feasible to calculate descriptive statistics one by one and to create the table manually. The availability of electronic health records (EHR) and big data mining techniques makes it possible to explore a far larger number of variables. However, creating and revising such tables manually for big data is both error-prone and exceedingly time-consuming. In this paper, we introduce an R package called CBCgrps, which is designed to automate and streamline the generation of such tables when working with big data. The package contains two functions, twogrps() and multigrps(), which are used for comparisons between two groups and among multiple groups, respectively.
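To make the table layout concrete, the following is a minimal, language-agnostic sketch in Python (standard library only) of a two-group baseline table: one row per variable, with columns for the overall population, each group, and a P value. It is not the CBCgrps API. The names baseline_table, summarize, and permutation_p, and the patient records, are hypothetical and chosen only for illustration. The permutation test on the difference in means is a stand-in for whichever statistical tests the package actually applies.

```python
import random
import statistics


def summarize(values):
    """Format a continuous variable as 'mean ± SD' (sample SD)."""
    return f"{statistics.mean(values):.1f} ± {statistics.stdev(values):.1f}"


def permutation_p(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference in means.

    Stand-in test for this sketch only; seeded so the result is reproducible.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value strictly above zero.
    return (hits + 1) / (n_perm + 1)


def baseline_table(rows, group_var, variables):
    """Build the table as a list of rows: header first, then one row per variable."""
    groups = sorted({r[group_var] for r in rows})
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")
    sizes = [sum(1 for r in rows if r[group_var] == g) for g in groups]
    header = ["Variable", f"Total (n={len(rows)})"]
    header += [f"{group_var}={g} (n={n})" for g, n in zip(groups, sizes)]
    header.append("P")
    table = [header]
    for var in variables:
        overall = [r[var] for r in rows]
        by_group = [[r[var] for r in rows if r[group_var] == g] for g in groups]
        p = permutation_p(by_group[0], by_group[1])
        table.append(
            [var, summarize(overall)] + [summarize(v) for v in by_group] + [f"{p:.3f}"]
        )
    return table


# Hypothetical records: 'died' is the grouping variable (0 = survived, 1 = died).
ages = [50, 52, 54, 51, 53, 70, 72, 74, 71, 73]
sbps = [120, 130, 125, 135, 128, 122, 129, 126, 134, 127]
died = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
patients = [{"age": a, "sbp": s, "died": d} for a, s, d in zip(ages, sbps, died)]

table = baseline_table(patients, "died", ["age", "sbp"])
for row in table:
    print(" | ".join(row))
```

Adding a variable to this sketch means adding one name to the list passed to baseline_table, which is the property that makes automated tabulation attractive when the number of variables is large.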

Keywords: Big data; R package; baseline characteristics; observational study; publication-style.

Publication types

  • Editorial