The pluripotency of newly developed human induced pluripotent stem cells (iPSCs) is usually characterized by physiological parameters; i.e., by their ability to maintain the undifferentiated state and to differentiate into derivatives of the 3 germ layers. Nevertheless, a molecular comparison of physiologically normal iPSCs to the "gold standard" of pluripotency, embryonic stem cells (ESCs), often reveals a set of genes whose expression and/or methylation patterns differ between iPSCs and ESCs. To evaluate the contributions of the reprogramming process, parental cell type, and fortuity to the signature of human iPSCs, we developed a complete isogenic reprogramming system. We performed a genome-wide comparison of the transcriptome and the methylome of human isogenic ESCs, 3 types of ESC-derived somatic cells (fibroblasts, retinal pigment epithelium, and neural cells), and 3 pairs of iPSC lines derived from these somatic cells. Our analysis revealed a large stochastic contribution to the iPSC signature, which does not retain specific traces of the parental cell type or the reprogramming process. We showed that 5 iPSC clones are sufficient to find, with 95% confidence, at least one iPSC clone indistinguishable from its hypothetical isogenic ESC line. Additionally, on the basis of a small set of genes that are characteristic of all iPSC lines and isogenic ESCs, we formulated an approach for selecting "the best iPSC line" and validated it on an independent dataset.
Keywords: DNA methylation; gene transcription; genome-wide analysis; human pluripotent stem cells; isogenic; reprogramming; somatic memory.
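The statement that 5 clones suffice at 95% confidence can be read through a simple binomial argument. The sketch below is an illustration only, not the authors' statistical procedure: it assumes that each clone is independently "indistinguishable" from the isogenic ESC line with the same probability p, and the function names `min_success_probability` and `min_clones` are hypothetical.

```python
import math


def min_success_probability(n_clones: int, confidence: float) -> float:
    """Smallest per-clone probability p such that at least one of
    n_clones independent clones succeeds with the given confidence.

    Assumption (not from the source): clones are independent and share
    the same success probability, so P(at least one) = 1 - (1 - p)**n.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_clones)


def min_clones(p: float, confidence: float) -> int:
    """Smallest number of clones n with 1 - (1 - p)**n >= confidence,
    under the same independence assumption."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    p_min = min_success_probability(n_clones=5, confidence=0.95)
    print(f"per-clone probability implied by 5 clones at 95%: {p_min:.3f}")
    print(f"clones needed if p = 0.5: {min_clones(0.5, 0.95)}")
```

Under this simplified model, 5 clones at 95% confidence correspond to a per-clone probability of roughly 0.45; the study's own estimate may rest on a different calculation.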