The inner ear contains sensory epithelia that detect head movements, gravity and sound. It is unclear how to develop these sensory epithelia from pluripotent stem cells, a process that will be critical for modelling inner ear disorders or developing cell-based therapies for profound hearing loss and balance disorders. So far, attempts to derive inner ear mechanosensitive hair cells and sensory neurons have resulted in inefficient or incomplete phenotypic conversion of stem cells into inner-ear-like cells. A key insight lacking from these previous studies is the importance of the non-neural and preplacodal ectoderm, two critical precursors during inner ear development. Here we report the stepwise differentiation of inner ear sensory epithelia from mouse embryonic stem cells (ESCs) in three-dimensional culture. We show that by recapitulating in vivo development with precise temporal control of signalling pathways, ESC aggregates transform sequentially into non-neural, preplacodal and otic-placode-like epithelia. Notably, in a self-organized process that mimics normal development, vesicles containing prosensory cells emerge from the presumptive otic placodes and give rise to hair cells bearing stereocilia bundles and a kinocilium. Moreover, these stem-cell-derived hair cells exhibit functional properties of native mechanosensitive hair cells and form specialized synapses with sensory neurons that have also arisen from ESCs in the culture. Finally, we demonstrate that these vesicles are structurally and biochemically comparable to developing vestibular end organs. Our data thus establish a new in vitro model of inner ear differentiation that can be used to gain deeper insight into inner ear development and disorders.