The aim of this study was to investigate the clinical heterogeneity of Parkinson's disease (PD) in a cohort of Chinese patients in the early stages of the disease. Clinical data on demographics, motor variables, motor phenotypes, disease progression, global cognitive function, depression, apathy, sleep quality, constipation, fatigue, and L-dopa complications were collected from 138 Chinese PD subjects in early stages (Hoehn and Yahr stages 1-3). Subjects were classified into subtypes by k-means cluster analysis of the clinical data, run consecutively for five-, four-, and three-cluster solutions. Kappa statistics were used to evaluate the consistency among the different subtype solutions. The cluster analysis indicated four main subtypes: the non-tremor dominant subtype (NTD, n=28, 20.3%), rapid disease progression subtype (RDP, n=7, 5.1%), young-onset subtype (YO, n=50, 36.2%), and tremor dominant subtype (TD, n=53, 38.4%). Overall, 78.3% (108/138) of subjects were consistently assigned to the same three groups across solutions (52 always in TD, 7 in RDP, and 49 in NTD), and 98.6% (136/138) were classified consistently between the five- and four-cluster solutions. However, subjects classified as NTD in the four-cluster analysis were dispersed into different subtypes in the three-cluster analysis, with low concordance between the four- and three-cluster solutions (kappa value=-0.139, P=0.001). This study defines the clinical heterogeneity of PD patients in early stages using a data-driven approach. The subtypes generated by the four-cluster solution appear to exhibit ideal internal cohesion and external isolation.
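The consistency check between cluster solutions can be illustrated with a minimal sketch of Cohen's kappa, assuming each solution has already been mapped to named subtype labels per subject. The function names `cohen_kappa` and `stable_fraction` and the eight-subject toy labelings are hypothetical and are not taken from the study data or its software.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two labelings of the same subjects."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("labelings must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of subjects given the same label in both solutions.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, computed from the marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if p_expected == 1.0:
        return 1.0  # both solutions put every subject in the same single label
    return (p_observed - p_expected) / (1.0 - p_expected)


def stable_fraction(*labelings):
    """Share of subjects that keep the same label in every solution."""
    per_subject = list(zip(*labelings))
    return sum(len(set(labels)) == 1 for labels in per_subject) / len(per_subject)


# Hypothetical toy labelings (assumption, not study data): the NTD subjects
# of the four-cluster solution are absorbed by other subtypes when k drops to 3.
four_cluster = ["TD", "TD", "TD", "NTD", "NTD", "YO", "YO", "RDP"]
three_cluster = ["TD", "TD", "TD", "TD", "YO", "YO", "YO", "RDP"]

print(round(cohen_kappa(four_cluster, three_cluster), 3))  # 0.644
print(stable_fraction(four_cluster, three_cluster))        # 0.75
```

Kappa corrects the raw agreement for the agreement expected by chance, so a value near or below zero, as reported for the four- versus three-cluster comparison, indicates that the two solutions agree no better than random assignment with the same subtype frequencies.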