Single Cell RNA Sequencing of Human Milk-Derived Cells Reveals Sub-Populations of Mammary Epithelial Cells with Molecular Signatures of Progenitor and Mature States: a Novel, Non-invasive Framework for Investigating Human Lactation Physiology

J Mammary Gland Biol Neoplasia. 2020 Dec;25(4):367-387. doi: 10.1007/s10911-020-09466-z. Epub 2020 Nov 20.


Cells in human milk are an untapped source of material, potential "liquid breast biopsies", for investigating lactation physiology non-invasively. We used single cell RNA sequencing (scRNA-seq) to identify milk-derived mammary epithelial cells (MECs) and their transcriptional signatures in women with diet-controlled gestational diabetes (GDM) and normal lactation. We describe methodology for coordinating milk collection with single cell capture and library preparation via cryopreservation, as well as for scRNA-seq data processing and analysis of MEC transcriptional signatures. We comprehensively characterized 3740 cells from milk samples from two mothers at two weeks postpartum. Most cells (>90%) were luminal MECs (luMECs) expressing lactalbumin alpha and casein beta and positive for keratin 8 and keratin 18. Few cells were keratin 14+ basal MECs, and a small immune cell population was present (<10%). Analysis of differential gene expression among clusters identified six potentially distinct luMEC subpopulation signatures, suggesting that luMECs may differ subtly in function; one cluster was positive for both progenitor markers and mature milk transcripts. No expression of the pluripotency markers POU class 5 homeobox 1 (POU5F1, encoding OCT4), SRY-box transcription factor 2 (SOX2), or nanog homeobox (NANOG) was observed. These observations were supported by flow cytometric analysis of MECs from mature milk samples from three women with diet-controlled GDM (2-8 mo postpartum), which indicated a negligible basal/stem cell population (epithelial cell adhesion molecule (EPCAM)-/integrin subunit alpha 6 (CD49f)+, 0.07%) and a small progenitor population (EPCAM+/CD49f+, 1.1%). We provide a computational framework for use in future studies and report the first milk-derived cells to be analyzed by scRNA-seq. We discuss the clinical potential and current limitations of using milk-derived cells as material for characterizing human mammary physiology.
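
The marker-based identification of cell types summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the detection threshold, the priority order of the rules, the toy counts, and the use of PTPRC (CD45) as an immune marker are all assumptions made for illustration. Only the luminal markers (LALBA, CSN2, KRT8, KRT18) and the basal marker (KRT14) come from the abstract.

```python
from collections import Counter

# Marker genes named in the abstract (gene symbols for lactalbumin alpha,
# casein beta, keratin 8, keratin 18, and keratin 14).
LUMINAL_MARKERS = {"LALBA", "CSN2", "KRT8", "KRT18"}
BASAL_MARKERS = {"KRT14"}
# ASSUMPTION: the abstract names no immune marker; PTPRC (CD45) is used
# here only as a common pan-leukocyte stand-in.
IMMUNE_MARKERS = {"PTPRC"}


def classify_cell(counts, min_count=1):
    """Assign a coarse label to one cell from its gene -> count mapping.

    ASSUMPTION: a gene is 'detected' at >= min_count reads, and rules are
    checked in a fixed priority order (immune, basal, luminal). Both
    choices are hypothetical simplifications.
    """
    detected = {gene for gene, n in counts.items() if n >= min_count}
    if detected & IMMUNE_MARKERS:
        return "immune"
    if detected & BASAL_MARKERS:
        return "basal MEC"
    if detected & LUMINAL_MARKERS:
        return "luminal MEC"
    return "unassigned"


def composition(cells, min_count=1):
    """Return the fraction of cells carrying each label."""
    labels = Counter(classify_cell(c, min_count) for c in cells)
    total = sum(labels.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in labels.items()}


# Toy data, invented for illustration only (not from the study).
toy_cells = [
    {"LALBA": 120, "CSN2": 80, "KRT18": 4},
    {"KRT8": 2, "LALBA": 15},
    {"KRT14": 6, "KRT8": 1},
    {"PTPRC": 9},
]

if __name__ == "__main__":
    print(composition(toy_cells))
```

In practice, cell identities in scRNA-seq data are usually assigned per cluster after normalization and differential expression analysis rather than per cell with a hard threshold; the rule-based version here only makes the marker logic explicit.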

Keywords: Human lactation; Human milk; Milk-derived cells; Single cell RNA sequencing.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Computational Biology / methods*
  • Diabetes, Gestational / diet therapy
  • Diabetes, Gestational / metabolism*
  • Epithelial Cells / metabolism
  • Female
  • Flow Cytometry
  • Humans
  • Lactation / physiology*
  • Mammary Glands, Human / cytology
  • Mammary Glands, Human / metabolism*
  • Milk, Human / cytology*
  • Postpartum Period / metabolism
  • Pregnancy
  • RNA-Seq / methods
  • Randomized Controlled Trials as Topic
  • Single-Cell Analysis
  • Stem Cells / metabolism