Comparative Study
Proc Natl Acad Sci U S A. 2006 Apr 25;103(17):6605-10.
doi: 10.1073/pnas.0601688103. Epub 2006 Apr 24.

Short Blocks From the Noncoding Parts of the Human Genome Have Instances Within Nearly All Known Genes and Relate to Biological Processes

Free PMC article

Isidore Rigoutsos et al. Proc Natl Acad Sci U S A.

Abstract

Using an unsupervised pattern-discovery method, we processed the human intergenic and intronic regions and catalogued all variable-length patterns with identically conserved copies and multiplicities above what is expected by chance. Among the millions of discovered patterns, we found a subset of 127,998 patterns, termed pyknons, which have additional nonoverlapping instances in the untranslated and protein-coding regions of 30,675 transcripts from 20,059 human genes. The pyknons arrange combinatorially in the untranslated and coding regions of numerous human genes where they form mosaics. Consecutive instances of pyknons in these regions show a strong bias in their relative placement, favoring distances of approximately 22 nucleotides. We also found pyknons to be enriched in a statistically significant manner in genes involved in specific processes, e.g., cell communication, transcription, regulation of transcription, signaling, transport, etc. For approximately 1/3 of the pyknons, the intergenic/intronic instances of their reverse complement lie within 380,084 nonoverlapping regions, typically 60-80 nucleotides long, which are predicted to form double-stranded, energetically stable, hairpin-shaped RNA secondary structures; additionally, the pyknons subsume approximately 40% of the known microRNA sequences, thus suggesting a possible link with posttranscriptional gene silencing and RNA interference. Cross-genome comparisons reveal that many of the pyknons have instances in the 3' UTRs of genes from other vertebrates and invertebrates where they are overrepresented in similar biological processes, as in the human genome. These unexpected findings suggest potential unique functional connections between the coding and noncoding parts of the human genome.
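The abstract refers to two operations that are easy to misread: locating nonoverlapping, identically conserved instances of a pattern, and taking a pattern's reverse complement. The following is a minimal sketch of both, not the authors' pattern-discovery method. The function names are hypothetical, and the left-to-right greedy scan is an assumption about how "nonoverlapping" could be enforced.

```python
# Illustrative sketch only; function names and the greedy scan are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def nonoverlapping_instances(pattern: str, sequence: str) -> list[int]:
    """Return 0-based start positions of exact, nonoverlapping copies of
    `pattern` in `sequence`, scanning greedily from left to right."""
    if not pattern:
        return []
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume after the end of this copy so that instances never overlap.
        start = sequence.find(pattern, start + len(pattern))
    return positions
```

For example, `reverse_complement("TGCACTCCAGCCTGGG")` gives `"CCCAGGCTGGAGTGCA"`, and `nonoverlapping_instances("AA", "AAAAA")` gives `[0, 2]` because the copy that would start at position 1 overlaps the first one.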

Conflict of interest statement

No conflicts declared.

Figures

Fig. 1.
Pyknons in the 3′ UTRs of the apoptosis inhibitor birc4 (shown above the horizontal line) and nine other genes. The sequences below the line contain some of birc4's pyknons, but in different arrangements; they also contain instances of other pyknons that are not present in birc4's 3′ UTR. The 10 3′ UTRs are pyknon mosaics. The shown pyknons, whether highlighted or in dark gray, have 40 or more instances in the genome's intergenic/intronic regions and additional copies in the untranslated and coding regions of these and other genes. We highlight only those pyknons that appear two or more times in the shown 3′ UTRs. The light gray string -(xx)- indicates that xx nucleotides separate the pyknons that surround it. To appreciate the importance of this picture, it suffices to track the number of copies and relative position of TGCACTCCAGCCTGGG, TAATCCCAGCACTTTGGGA, GGCTGAGGCAGGAGAAT, and GAGGTTGCAGTGAGCC.
Fig. 2.
Probability density functions for the distance between the starting points of consecutive instances of pyknons, shown separately for 5′ UTRs, CRs, and 3′ UTRs. The distributions have long tails, and only a portion is shown. Note the peaks at x = 18, 22, 24, 26, 29, 30, and 31.
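The quantity plotted in Fig. 2, the distance between the starting points of consecutive pyknon instances, can be illustrated with a short sketch. This is one plausible reading of the caption, not the authors' code: the function names are hypothetical, the start positions are invented toy values, and the density is a plain normalized histogram.

```python
# Illustrative sketch only; names and toy data are assumptions, not from the paper.
from collections import Counter


def consecutive_start_distances(starts):
    """Distances between the start positions of consecutive instances
    within one region (for example, a single 3' UTR)."""
    ordered = sorted(starts)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def distance_density(distances):
    """Empirical probability of each observed distance (normalized counts)."""
    counts = Counter(distances)
    total = sum(counts.values())
    return {d: n / total for d, n in sorted(counts.items())}


# Invented start positions of pyknon instances in one hypothetical region.
toy_starts = [40, 0, 22, 62]
toy_distances = consecutive_start_distances(toy_starts)  # [22, 18, 22]
toy_density = distance_density(toy_distances)
```

In a real analysis the distances would presumably be pooled over all regions of one type (5′ UTRs, CRs, or 3′ UTRs) before normalizing, which is what yields one curve per region type.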
