Nat Methods. 2008 Jul;5(7):597-600.
doi: 10.1038/nmeth.1224. Epub 2008 Jun 15.

Isoform Discovery by Targeted Cloning, 'Deep-Well' Pooling and Parallel Sequencing

Free PMC article

Kourosh Salehi-Ashtiani et al. Nat Methods.

Abstract

Describing the 'ORFeome' of an organism, including all major isoforms, is essential for a system-level understanding of any species; however, conventional cloning and sequencing approaches are prohibitively costly and labor-intensive. We describe a potentially genome-wide methodology for efficiently capturing new coding isoforms using reverse transcriptase (RT)-PCR recombinational cloning, 'deep-well' pooling and a next-generation sequencing platform. This ORFeome discovery pipeline will be applicable to any eukaryotic species with a sequenced genome.

Figures

Figure 1
The isoform discovery pipeline. First, ORFs are captured by RT-PCR, recombinationally cloned and transformed into E. coli; the “minipool” of transformants for each gene may contain different isoforms. Second, “deep-well” pools are constructed by pooling the PCR-amplified ORF sequence from one transformant for each of many genes. This method of pooling ensures normalization of ORFs and avoids concurrent sequencing of multiple isoforms of the same gene. Third, parallel sequencing is carried out separately on each deep well. The reads obtained are assembled using a “Smart Bridging Assembly” (SBA) algorithm (Supplementary Methods online). The resulting ORF contigs are filtered for the presence of non-canonical splice donor/acceptor sites and for prior presence in sequence databases to identify unique novel isoforms.
Figure 2
Examples of identified transcripts. Genomic alignments of three representative genes from sets 1–3 compared with RefSeq (black), MGC (blue), GenBank (dark green) and dbEST (light green), following removal of redundant alignments. Results are shown for 3 of the 44 genes from which ORFs were cloned (the complete set is in Supplementary Fig. 2 online). Transcripts whose exon/intron structures were exactly recapitulated over their entire length by individual MGC, RefSeq or GenBank transcripts, including ESTs, are shown in gray, while novel transcripts are shown in purple (pooled tissue), orange (brain) and cyan (testis) according to the cloning experiment. The positions of primers used for RT-PCR are shown in red. Color saturation indicates % identity, ranging from light (≤ 90% identity) to dark (≥ 99% identity). Splice signals other than the canonical GT donor and AG acceptor are shown for all sequences. Novel isoforms with only canonical or GC…AG signals are indicated by an asterisk (*). For simplicity, ESTs with unusual splice signals are not shown, but they were included in the assessment of novelty. Chromosomal coordinates are indicated at the top of each panel. The blue bar at the bottom of each panel indicates the lengths of exonic (white on blue) and intronic (reversed) segments, in bp (C = 100; K = 1000); introns are compressed to highlight exons.
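The splice-signal categories used in this caption (canonical GT…AG, GC…AG, and everything else) can be illustrated with a short sketch. This is not the authors' filtering code: the function name `classify_intron` and the coordinate convention (0-based, end-exclusive, plus strand) are assumptions made for the example.

```python
def classify_intron(genome: str, start: int, end: int) -> str:
    """Classify the splice signal of the intron genome[start:end].

    Hypothetical helper: coordinates are assumed to be 0-based,
    end-exclusive, and on the plus strand.
    """
    intron = genome[start:end].upper()
    if len(intron) < 4:
        raise ValueError("intron must span at least 4 bp")
    donor = intron[:2]      # first two intronic bases (5' splice site)
    acceptor = intron[-2:]  # last two intronic bases (3' splice site)
    if (donor, acceptor) == ("GT", "AG"):
        return "canonical"
    if (donor, acceptor) == ("GC", "AG"):
        return "GC-AG"
    return f"other ({donor}...{acceptor})"


if __name__ == "__main__":
    toy = "AAAGTCCCCAGTTT"  # toy sequence, not real genomic data
    print(classify_intron(toy, 3, 11))
```

An isoform would carry the asterisk described above only if every one of its introns falls in the first two categories.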
Figure 3
Sequence assembly results and simulation. (a) Success rates of assembly using conventional and smart bridging assembly methods, at varying fold-coverage (see text for details). The percentage of ORFs with 100% correctly assembled gene structure (exon-intron) was computed (n = 10 repeats). Error bars represent the standard deviation. (b) The set of ORF sequences used in the 454 FLX run was randomly fragmented in silico with an average fragment size of 550 bp and a range of 300–800 bp. Different sequence read lengths and fold coverages were simulated. For each ORF, we assembled contigs based on all available sequence reads that have a corresponding best match in the genomic region of the ORF. The graphs illustrate sensitivity by gene, that is, the percentage of ORFs whose gene structure (all exons) is 100% correctly assembled.
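The in silico fragmentation in panel (b) can be sketched roughly as follows. The caption gives only the mean (550 bp) and range (300–800 bp) of fragment sizes, so this sketch assumes a uniform length distribution, which happens to have that mean; it also assumes one read is taken from the start of each fragment and that fold coverage means total read bases divided by ORF length. The function name `simulate_reads` and its parameters are hypothetical.

```python
import random


def simulate_reads(orf, read_len, fold_coverage,
                   min_frag=300, max_frag=800, seed=None):
    """Fragment `orf` at random and return simulated reads.

    Assumptions (not taken from the paper): fragment lengths are uniform
    on [min_frag, max_frag], each fragment yields one read from its 5' end,
    and sampling stops once total read bases >= fold_coverage * len(orf).
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    rng = random.Random(seed)
    target_bases = fold_coverage * len(orf)
    reads, total = [], 0
    while total < target_bases:
        # Clip the fragment length so short ORFs can still be sampled.
        frag_len = min(rng.randint(min_frag, max_frag), len(orf))
        start = rng.randint(0, len(orf) - frag_len)
        read = orf[start:start + min(read_len, frag_len)]
        reads.append(read)
        total += len(read)
    return reads


if __name__ == "__main__":
    toy_orf = "ACGT" * 500  # 2,000 bp toy sequence, not real data
    reads = simulate_reads(toy_orf, read_len=250, fold_coverage=10, seed=1)
    print(len(reads), "reads,", sum(map(len, reads)), "bases")
```

Varying `read_len` and `fold_coverage` over a grid and feeding each read set to an assembler would give the kind of sensitivity comparison the panel describes.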

Comment in

  • Hunting hidden transcripts.
    Carninci P. Nat Methods. 2008 Jul;5(7):587-9. doi: 10.1038/nmeth0708-587. PMID: 18587315. No abstract available.
