Down syndrome is caused by an extra copy of human chromosome 21 and the resultant dosage-related overexpression of the genes it contains. To efficiently direct experiments that determine specific gene-phenotype correlations, it is necessary to identify all genes within 21q and to assess their functional associations and expression patterns. Analysis of the complete finished sequence of 21q resulted in the annotation of 225 genes and gene models, most of which were incomplete and/or had little or no experimental verification. Here we correct or complete the genomic structures of 16 genes, 4 of which were not reported in the annotation of the complete sequence. Our data include the identification of six genes encoding short or ambiguous open reading frames; the identification of three cases in which alternative splicing produces two structurally unrelated protein sequences; and the identification of six genes encoding proteins with functional motifs, two genes with unusually low similarity to their orthologous mouse proteins, and four genes with significant conservation in Drosophila melanogaster. We further demonstrate that an additional nine gene models represent bona fide transcripts, and we develop expression patterns for these genes plus nine additional novel chromosome 21 genes and four paralogous genes mapping elsewhere in the human genome. These data have implications for generating complete transcript maps of chromosome 21 and of the entire human genome, and for defining expression abnormalities in Down syndrome and mouse models.
© 2002 Elsevier Science (USA).