, 3 (7), RESEARCH0034

Accurate Normalization of Real-Time Quantitative RT-PCR Data by Geometric Averaging of Multiple Internal Control Genes



Jo Vandesompele et al. Genome Biol.


Background: Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem.

Results: We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data.

Conclusions: The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.
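
The normalization factor described in the Results can be illustrated with a minimal sketch. It assumes relative expression quantities are already available per sample; the function names, gene counts and numbers are hypothetical and are not taken from the article.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize(target_quantity, control_quantities):
    """Divide a target gene's quantity by the normalization factor,
    taken here as the geometric mean of the control gene quantities."""
    return target_quantity / geometric_mean(control_quantities)


# Hypothetical relative quantities for three control genes in two samples.
controls = {"sample_A": [1.0, 2.0, 4.0], "sample_B": [2.0, 4.0, 8.0]}
# Hypothetical relative quantities for one target gene in the same samples.
target = {"sample_A": 3.0, "sample_B": 6.0}

normalized = {s: normalize(target[s], controls[s]) for s in controls}
```

In this made-up example the second sample has twice as much of every gene, so the normalized target level is the same in both samples. Unlike the arithmetic mean, the geometric mean is not dominated by the most abundant control gene.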


Figure 1
Single control normalization error values (E) were calculated as the ratio of the ratio of two control genes in two different samples (see Materials and methods), and summarized here as cumulative distribution plots for the different tissue panels, pointing at considerable variation in housekeeping gene expression.
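
The "ratio of the ratio" in this caption can be sketched as follows. This is an illustrative reading only: the function name and data are hypothetical, and folding values below 1 to their inverse (so that E is always at least 1) is an assumption of this sketch.

```python
def single_control_error(gene_j, gene_k, sample_p, sample_q):
    """Ratio of the ratios of two control genes in two samples.

    gene_j and gene_k map sample names to positive relative quantities.
    Values below 1 are inverted (assumption) so the result is a fold error >= 1.
    """
    ratio_p = gene_j[sample_p] / gene_k[sample_p]
    ratio_q = gene_j[sample_q] / gene_k[sample_q]
    e = ratio_p / ratio_q
    return e if e >= 1 else 1 / e


# Hypothetical quantities: gene_k doubles between samples, gene_j does not.
gene_j = {"p": 10.0, "q": 10.0}
gene_k = {"p": 5.0, "q": 10.0}

error_pq = single_control_error(gene_j, gene_k, "p", "q")
error_qp = single_control_error(gene_j, gene_k, "q", "p")
```

If both genes were perfectly stable controls, their ratio would be identical in every sample and E would equal 1. Here normalizing to one gene instead of the other changes the result twofold.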
Figure 2
Average expression stability values (M) of remaining control genes during stepwise exclusion of the least stable control gene in the different tissue panels (black circle, neuroblastoma; white circle, normal pool; white square, bone marrow; black square, leukocyte; gray circle, fibroblast; gray square, systematic error). See also Table 3 for the ranking of the genes according to their expression stability.
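
The stepwise exclusion summarized in this caption can be sketched as below. The caption does not define M, so this sketch assumes the commonly described geNorm formulation: for each gene, M is the average, over all other control genes, of the standard deviation of the log2 expression ratios across samples. Function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def stability_values(expr):
    """Return {gene: M} for expr = {gene: [quantity per sample]}.

    Assumed definition: M is the mean standard deviation of the
    log2 ratios between this gene and every other gene.
    """
    m = {}
    for j in expr:
        pairwise_sd = []
        for k in expr:
            if k == j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(stdev(log_ratios))
        m[j] = mean(pairwise_sd)
    return m


def rank_by_stepwise_exclusion(expr):
    """Repeatedly drop the gene with the highest M until two genes remain."""
    remaining = dict(expr)
    excluded = []
    while len(remaining) > 2:
        m = stability_values(remaining)
        worst = max(m, key=m.get)
        excluded.append((worst, m[worst]))
        del remaining[worst]
    return excluded, sorted(remaining)


# Hypothetical relative quantities for four genes in four samples.
expr = {
    "G1": [1.0, 2.0, 4.0, 8.0],
    "G2": [2.0, 4.0, 8.0, 16.0],
    "G3": [1.0, 2.0, 4.0, 9.0],
    "G4": [8.0, 1.0, 5.0, 2.0],
}
excluded, most_stable_pair = rank_by_stepwise_exclusion(expr)
```

The last two genes cannot be ranked against each other under this definition, because with only one pair left both genes share the same M.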
Figure 3
Determination of the optimal number of control genes for normalization. (a) Pairwise variation ($V_{n/n+1}$) analysis between the normalization factors $NF_n$ and $NF_{n+1}$ to determine the number of control genes required for accurate normalization (arrowhead = optimal number of control genes for normalization). (b) Selected scatterplots of normalization factors before (x-axis) and after (y-axis) inclusion of an $(n+1)$th control gene (r = Spearman rank correlation coefficient). Low variation values, V, correspond to high correlation coefficients. It is clear that there is no need to include more than three, four or five control genes for fibroblast (A), neuroblastoma (B) and the normal pooled tissues (D), respectively. In contrast, panel C demonstrates that inclusion of at least a fourth control gene is required for the normal pooled tissues.
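
The pairwise variation between two successive normalization factors can be sketched as follows. The caption does not spell out the formula, so this sketch assumes V is the standard deviation, across samples, of the log2 ratio between the normalization factor built from the n most stable genes and the one built from n + 1 genes. Function names and data are hypothetical.

```python
import math
from statistics import stdev


def normalization_factors(expr, genes):
    """Per-sample geometric mean of the listed genes' quantities."""
    n_samples = len(expr[genes[0]])
    return [
        math.exp(sum(math.log(expr[g][i]) for g in genes) / len(genes))
        for i in range(n_samples)
    ]


def pairwise_variation(expr, ranked_genes, n):
    """Assumed definition: SD of log2(NF_n / NF_{n+1}) across samples.

    ranked_genes must be ordered from most to least stable.
    """
    nf_n = normalization_factors(expr, ranked_genes[:n])
    nf_n1 = normalization_factors(expr, ranked_genes[:n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_n1)])


# Hypothetical quantities; G3 tracks G1 and G2 exactly, G4 does not.
expr = {
    "G1": [1.0, 2.0, 4.0, 8.0],
    "G2": [2.0, 4.0, 8.0, 16.0],
    "G3": [4.0, 8.0, 16.0, 32.0],
    "G4": [8.0, 1.0, 5.0, 2.0],
}
ranked = ["G1", "G2", "G3", "G4"]

v_2_3 = pairwise_variation(expr, ranked, 2)  # effect of adding G3
v_3_4 = pairwise_variation(expr, ranked, 3)  # effect of adding G4
```

A value of V close to zero means the extra gene leaves the normalization factor essentially unchanged, so including it adds little. In this made-up data, adding G3 changes nothing while adding G4 shifts the factor noticeably.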
Figure 4
Validation of the gene stability measure and the geometric averaging of carefully selected control genes for normalization. (a) Validation of the gene stability measure. The average gene-specific variation (determined as coefficient of variation, in percent) for the three control genes with the smallest variation within each tissue panel after normalization with three different factors calculated as the geometric mean of the three control genes with the lowest (NF3(1-3)), highest (NF3(8-10)) and intermediate (NF3(6-8)) gene stability values (as determined by geNorm). NB, neuroblastoma; POOL, normal pooled tissues; LEU, leukocytes; BM, bone marrow; FIB, fibroblasts. (b) Geometric averaging. Comparison of frequently applied microarray scaling factors and the proposed RT-PCR normalization factor based on the geometric mean of selected control genes (NF5, geometric mean of the five control genes with the lowest M value; NFM < 0.7, geometric mean of control genes with M < 0.7; see Results), calculated for eight hybridizations from publicly available microarray data [14].
Figure 5
Logarithmic histogram of the expression levels of 10 internal control genes determined in 13 different human tissues, normalized to the geometric mean of 6 control genes (GAPD, HPRT1, SDHA, TBP, UBC, YWHAZ). An approximately 400-fold expression difference is apparent between the most and least abundantly expressed gene, as well as tissue-specific differences in expression levels for particular genes (for example, B2M).



    1. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270:467–470.
    2. Fink L, Seeger W, Ermert L, Hanze J, Stahl U, Grimminger F, Kummer W, Bohle RM. Real-time quantitative RT-PCR after laser-assisted cell picking. Nat Med. 1998;4:1329–1333. doi: 10.1038/3327.
    3. Heid CA, Stevens J, Livak KJ, Williams PM. Real time quantitative PCR. Genome Res. 1996;6:986–994.
    4. Higuchi R, Fockler C, Dollinger G, Watson R. Kinetic PCR analysis: real-time monitoring of DNA amplification reactions. Biotechnology. 1993;11:1026–1030.
    5. Solanas M, Moral R, Escrich E. Unsuitability of using ribosomal RNA as loading control for Northern blot analyses related to the imbalance between messenger and ribosomal RNA content in rat mammary tumors. Anal Biochem. 2001;288:99–102. doi: 10.1006/abio.2000.4889.
