To analyze candidate genes and establish complex genotype-phenotype relationships against a background of high natural genome sequence variability, we have developed approaches to (i) compare candidate gene sequence information in multiple individuals; (ii) predict haplotypes from numerous variants; and (iii) classify haplotypes and identify specific sequence variants, or combinations of variants (patterns), associated with the phenotype. Using the human mu opioid receptor gene (OPRM1) as a model system, we combined these approaches to test a potential role of OPRM1 in substance (heroin/cocaine) dependence. All known functionally relevant regions of this prime candidate gene were analyzed by multiplex sequence comparison in 250 cases and controls; 43 variants were identified, and 52 different haplotypes were predicted in the subgroup of 172 African-Americans. These haplotypes were classified by similarity clustering into two functionally related categories, one of which was significantly more frequent in substance-dependent individuals. Haplotypes in this category shared a characteristic pattern of sequence variants [-1793T→A, -1699Tins, -1320A→G, -111C→T, +17C→T (A6V)], which was associated with substance dependence. This study illustrates approaches that can establish complex genotype-phenotype relationships in the presence of abundant DNA sequence variation.
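As an illustration only, the classification and association steps described above can be sketched in a few lines of Python. Everything in this sketch is an assumption: the abstract does not state which distance measure, clustering procedure, or statistical test was used. Here haplotypes are encoded as strings of 0 (reference allele) and 1 (variant allele), similarity is measured by Hamming distance, clustering is average-linkage agglomeration stopped at two categories, and the case/control comparison is a two-sided Fisher exact test. The haplotypes and counts are toy values, not data from the study, and the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def hamming(a, b):
    """Number of variant sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def two_clusters(haplotypes):
    """Average-linkage agglomerative clustering, stopped at two clusters.

    Assumed procedure for illustration; not necessarily the study's method.
    """
    clusters = [[h] for h in haplotypes]

    def avg_dist(pair):
        i, j = pair
        total = sum(hamming(a, b) for a in clusters[i] for b in clusters[j])
        return total / (len(clusters[i]) * len(clusters[j]))

    while len(clusters) > 2:
        # Merge the two clusters with the smallest mean pairwise distance.
        i, j = min(combinations(range(len(clusters)), 2), key=avg_dist)
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Toy haplotypes over five hypothetical variant sites (not study data).
TOY_HAPLOTYPES = ["00000", "00001", "10000", "11111", "11110", "01111"]
categories = two_clusters(TOY_HAPLOTYPES)
print(categories)

# Hypothetical counts: rows are cases and controls,
# columns are chromosomes in category 1 and category 2.
print(fisher_two_sided(3, 1, 1, 3))
```

With real data, the table passed to `fisher_two_sided` would hold the observed numbers of case and control chromosomes falling into each haplotype category. An established library routine for clustering and exact testing would normally be preferable to this hand-written version, which is kept dependency-free only so the logic is visible.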