Comparative Study
PLoS Biol. 2003 Nov;1(2):E45.
doi: 10.1371/journal.pbio.0000045. Epub 2003 Nov 17.

The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics

Free PMC article

Lincoln D Stein et al. PLoS Biol. 2003 Nov.

Abstract

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.
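
The headline numbers in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a reader's sanity check, not part of the authors' analysis; it uses the rounded figures quoted above, and the variable names are illustrative.

```python
# Rounded figures quoted in the abstract (all approximate).
genes_total = 19_500      # predicted C. briggsae protein-coding genes
orthologs = 12_200        # genes with a clear C. elegans ortholog
homologs_only = 6_500     # genes with detectable homologs but no clear ortholog
no_match = 800            # genes with no detectable C. elegans match

genome_mbp = {"C. briggsae": 104.0, "C. elegans": 100.3}
repeat_fraction = {"C. briggsae": 0.224, "C. elegans": 0.165}

# The three gene categories should add up to the predicted total.
category_sum = orthologs + homologs_only + no_match

# Repetitive and non-repetitive sequence per genome, in Mbp.
repeat_mbp = {sp: genome_mbp[sp] * repeat_fraction[sp] for sp in genome_mbp}
nonrepeat_mbp = {sp: genome_mbp[sp] - repeat_mbp[sp] for sp in genome_mbp}

print("gene categories sum to", category_sum, "of", genes_total)
for sp in genome_mbp:
    print(f"{sp}: {repeat_mbp[sp]:.1f} Mbp repetitive, "
          f"{nonrepeat_mbp[sp]:.1f} Mbp non-repetitive")
```

With these rounded inputs, the repetitive fraction amounts to roughly 23.3 Mbp in C. briggsae and 16.5 Mbp in C. elegans, so the repeat content alone differs by more than the approximately 3.7 Mbp gap in total genome size.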

Conflict of interest statement

The authors have declared that no conflicts of interest exist.

Figures

Figure 1. Joint Refinement of C. elegans and C. briggsae Gene Models: acy-4
When annotating the C. briggsae and C. elegans acy-4 orthologs, we chose the Genefinder ce-acy-4 prediction and the Genefinder cb-acy-4 prediction because, out of the 12 possible combinations of a C. briggsae and a C. elegans prediction, this pair shows the most similarity to each other. Coding sequence (CDS) conservation between cb-acy-4 and ce-acy-4 provides evidence for as many as 12 additional N-terminal exons in the Genefinder ce-acy-4 prediction, as compared to T01C2.1, the WS77 ce-acy-4 prediction. Subsequently, four of the additional N-terminal exons that were predicted by FGENESH and Genefinder were confirmed by new EST data (marked with asterisks).
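
The selection rule described in this caption (score every combination of one C. briggsae prediction and one C. elegans prediction, then keep the most similar pair) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the similarity measure here is Python's `difflib.SequenceMatcher` ratio, used as a stand-in for a real protein alignment score, and the function name `best_pair` and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(seq_a, seq_b):
    """Stand-in similarity score in [0, 1]; a real pipeline would use an alignment score."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def best_pair(cb_predictions, ce_predictions):
    """Return (cb_name, ce_name, score) for the most similar cross-species pair.

    Both arguments map a predictor name to its predicted protein sequence.
    """
    scored = [
        (cb_name, ce_name, similarity(cb_seq, ce_seq))
        for (cb_name, cb_seq), (ce_name, ce_seq)
        in product(cb_predictions.items(), ce_predictions.items())
    ]
    return max(scored, key=lambda item: item[2])


# Toy placeholder sequences, made up purely to exercise the function.
cb_toy = {"cb_pred_1": "MKTLLVAG", "cb_pred_2": "MKT"}
ce_toy = {"ce_pred_1": "MKTLLVAG", "ce_pred_2": "QQQQ"}
chosen = best_pair(cb_toy, ce_toy)
print(chosen[0], chosen[1])
```

Scoring all combinations exhaustively is cheap when there are only a handful of predictions per gene, as in the 12 combinations mentioned in the caption.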
Figure 2. Distribution of KA/KS Ratio among Ortholog Pairs
Figure 3. Tree View of All Chemosensory Receptor Genes in the Sra Subfamily
C. elegans is shown in white background and C. briggsae in light blue background. The arrows indicate regions of C. elegans-specific expansion of the family. The inset shows a schematic of the region of C. elegans chromosome I corresponding to the sra expansion in the upper right of the tree. C. briggsae genes are named using the prefix CBG, while C. elegans genes are numbered consecutively across the cosmid on which they were first identified. The root of this tree is arbitrary.
Figure 4. A WABA Alignment over a Known C. elegans Gene (snt-1)
WABA coding segments are shown as dark blue, strong alignments as medium blue, and weak alignments as grey. Regions that do not align are shown as dotted lines. The alignments of three sequenced C. elegans mRNA sequences are also shown for comparison.
Figure 5. Representation of the C. briggsae WGS Assembly on a C. elegans Scaffold Using Colinearity Relationships
C. briggsae supercontigs are shown on the y-axis, and C. elegans chromosomes from WS77 are shown on the x-axis. Red dots and lines indicate regions of colinearity identified by WABA alignments between the two genomes. Blue dots are the positions of protein orthologs. Green areas show where blue and red intersect, indicating concordance between the positions of ortholog pairs and colinearity blocks.
Figure 6. Evolutionary Divergence across C. elegans Chromosome V
Each panel corresponds to a C. elegans chromosome, and the individual tracks show different measurements of evolutionary divergence. (A) Regions of synteny (colinearity) between C. elegans and C. briggsae. White areas correspond to areas where the two genomes could not be aligned owing to divergence and are more abundant in the chromosome arms. (B) C. elegans gene density and genetic map position. Gene density is plotted as a histogram, showing a relatively uniform distribution of genes across each chromosome. The relationship of the position of genes on the genetic map to their position on the sequence is superimposed on the y-axis. Steeper slopes in this plot indicate higher rates of meiotic recombination. Inflection points in the genetic map plot reflect the division of the chromosomes into recombinationally active “arms” and recombinationally slow “centers.” (C) C. briggsae/C. elegans orthologs normalized for gene density in 100 kbp sliding windows. Prominent regions of low ortholog density are seen on chromosome arms. (D) C. elegans “orphans,” genes with no significant protein similarity in C. briggsae or the non-C. elegans portion of SwissProt. This histogram has been normalized for gene density in 100 kbp sliding windows. Spikes in orphan density seem to correlate with regions of low ortholog density. (E) C. elegans genes that mutate to lethality or are lethal in RNAi screens, in 100 kbp sliding windows normalized to overall gene density. This track shows the distribution of essential genes and demonstrates their tendency to cluster in the chromosome centers. (F) Repetitive elements, binned in 100 kbp sliding windows. Repeat-rich regions correlate with both the absence of significant syntenic coverage and ortholog-poor regions. (G) The KA/KS ratio in ortholog pairs. Lower values indicate greater levels of purifying selection. (H) The rate of KS within ortholog pairs, in 100 kbp sliding windows.
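
Several tracks in this figure are described as counts in 100 kbp sliding windows normalized for gene density. One plausible reading of that normalization is the fraction of genes in each window that carry a given property (ortholog, orphan, essential). The sketch below assumes that reading; the caption does not state the window step or the exact formula, so the 50 kbp step, the function name `windowed_fraction`, and the input layout are all assumptions.

```python
def windowed_fraction(gene_positions, flagged, chrom_length,
                      window=100_000, step=50_000):
    """Fraction of genes per sliding window that carry a flag.

    gene_positions: gene start coordinates in bp.
    flagged: parallel list of booleans (e.g. True if the gene has an ortholog).
    Returns a list of (window_start, window_end, fraction); fraction is None
    for windows that contain no genes, so empty windows are not read as zero.
    """
    results = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        flags_in_window = [
            flag for pos, flag in zip(gene_positions, flagged)
            if start <= pos < end
        ]
        if flags_in_window:
            fraction = sum(flags_in_window) / len(flags_in_window)
        else:
            fraction = None
        results.append((start, end, fraction))
        start += step
    return results


# Toy coordinates, made up to exercise the function.
toy_positions = [10_000, 60_000, 120_000, 180_000]
toy_flags = [True, False, True, True]
toy_track = windowed_fraction(toy_positions, toy_flags, chrom_length=200_000)
for row in toy_track:
    print(row)
```

Dividing by the number of genes in the window, rather than by window length, keeps gene-poor regions from appearing artificially depleted in orthologs or essential genes.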
Figure 7. A Region on C. elegans Chromosome III Containing 33 Genes, and the Syntenic C. briggsae Region, Which Has 38 Genes
Inversions have broken the syntenic region into three conserved segments. Genes that do not have an ortholog in this syntenic region are in grey; orthologs are joined by lines. In C. elegans, genes that differ substantially in structure between the WS77 and hybrid gene sets are marked with an asterisk.
Figure 8. Phylogeny of Caenorhabditis
Courtesy of Karin Kiontke and David H. A. Fitch (unpublished data). This phylogeny is based on weighted-parsimony analysis of DNA sequences from three genes, concatenated: 18S and 28S rRNA genes, and the RNA polymerase II gene. The root of this tree is arbitrary.
