India is experiencing a rapid spread of human immunodeficiency virus type 1 (HIV-1), primarily through heterosexual transmission of subtype C viruses. To delineate the molecular features of HIV-1 circulating in India, we sequenced the V3-V4 region of viral env from 21 individuals attending an HIV clinic in Calcutta, the most populous city in the eastern part of the country, and analyzed these together with the other Indian sequences in the HIV database. Twenty individuals were infected with viruses having a subtype C env, and one had viruses with a subtype A env. Analyses of 192 subtype C sequences, comprising one sequence per subject from this study and from the HIV database, revealed that almost all sequences from India, along with a small number from other countries, form a phylogenetically distinct lineage within subtype C, which we designate C(IN). Overall, C(IN) lineage sequences were more closely related to each other (level of diversity, 10.2%) than to subtype C sequences from Botswana, Burundi, South Africa, Tanzania, and Zimbabwe (range, 15.3 to 20.7%). Three positions were identified as signature amino acid substitution sites for C(IN) sequences (K340E, K350A, and G429E): 56% of the C(IN) sequences contained all three substitutions, while 87% contained at least two of them. Among the non-C(IN) sequences, 2% contained all three substitutions, while 22% contained two or more. These results suggest that much of the current Indian epidemic is descended from a single introduction into the country. Identification of conserved signature amino acid positions could assist epidemiologic tracking and has implications for the development of a vaccine against subtype C HIV-1 in India.
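
The signature-site percentages reported above can be illustrated with a minimal sketch. It assumes each sequence has already been aligned and reduced to a mapping from position number to amino acid. The toy records, the helper names `signature_count` and `summarize`, and the data layout are hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Signature residues named in the abstract: K340E, K350A, G429E.
# Keys are position numbers; values are the substituted amino acids.
SIGNATURE: Dict[int, str] = {340: "E", 350: "A", 429: "E"}


def signature_count(residues: Dict[int, str]) -> int:
    """Count how many signature residues one sequence carries."""
    return sum(residues.get(pos) == aa for pos, aa in SIGNATURE.items())


def summarize(seqs: List[Dict[int, str]]) -> Dict[str, float]:
    """Fraction of sequences with all three, and with at least two, signature residues."""
    counts = [signature_count(s) for s in seqs]
    n = len(counts)
    return {
        "all_three": sum(c == 3 for c in counts) / n,
        "at_least_two": sum(c >= 2 for c in counts) / n,
    }


# Hypothetical toy data (not real sequences): position -> residue.
toy = [
    {340: "E", 350: "A", 429: "E"},  # carries all three signature residues
    {340: "E", 350: "A", 429: "G"},  # carries two
    {340: "K", 350: "K", 429: "E"},  # carries one
    {340: "K", 350: "K", 429: "G"},  # carries none
]
result = summarize(toy)
print(result)
```

Applied separately to the C(IN) and non-C(IN) groups, a tally of this kind would yield the two pairs of percentages quoted in the abstract.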