Background: Heterocellular hereditary persistence of fetal hemoglobin (HPFH) is a common multifactorial trait characterized by a modest increase in fetal hemoglobin levels in adults. We previously localized a quantitative trait locus (QTL) for HPFH to chromosome 6q23 in an extensive Asian-Indian kindred. As part of a positional cloning strategy, and as a step towards identifying the specific genetic alteration in this family, we thoroughly annotated the candidate interval by combining in silico and wet-laboratory approaches with comparative genomics.
Results: The ~1.5 Mb candidate region was shown to contain five protein-coding genes. We discovered a very large uncharacterized gene containing WD40 and SH3 domains (AHI1), and extended the annotation of four previously characterized genes (MYB, ALDH8A1, HBS1L and PDE7B). We also identified several genes that do not appear to be protein-coding, and generated 17 kb of novel transcript sequence data by re-sequencing 97 EST clones.
Conclusion: Detailed annotation of this 1.5 Mb interval in 6q confirms a high level of aberrant transcripts in testicular tissue. The candidate interval exhibited an extraordinary level of alternative splicing: 19 transcripts were identified for the 5 protein-coding genes, but a large proportion of these alternative transcripts (14 of 19) did not have an open reading frame, so their functional role is questionable. These transcripts may result from aberrant rather than regulated splicing.