Comparative Study
Proc Natl Acad Sci U S A. 2007 Jul 31;104(31):12825-30.
doi: 10.1073/pnas.0701291104. Epub 2007 Jul 25.

Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789


Wu Wei et al. Proc Natl Acad Sci U S A. 2007.

Abstract

We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Alignment of the chromosome XIV sequences from YJM789 and S288c. (A) YJM789 contigs mapped to their locations on the S288c genome. (B) Sequence similarity between YJM789 contigs with a length of at least 10 kb and their corresponding sequences of S288c represented by color and coded from yellow (low) to orange (high). (C) Sequences of ≥100 bp that are present in YJM789 but absent in S288c are represented by blue lines. Similar sequences of <100 bp are represented by gray lines. (D) Sequence alignment between YJM789 and S288c chromosome XIV. Identical sequences are linked by lines. Red represents forward alignment. Green represents reverse complementary alignment. (E) Sequences of ≥100 bp that are absent in YJM789 but present in S288c are represented by blue lines. Similar sequences of <100 bp are represented by gray lines. (F) Repeat sequences of S288c are represented as follows: cyan rectangles, long terminal repeats; pink rectangles, retrotransposons; black rectangles, telomeres; black circle, centromere. (G) Coordinates of S288c in kilobase pairs.
Fig. 2.
Phylogenetic tree of YJM-GNAT homologs. (A) YJM-GNAT homologs were retrieved by BLASTP against the nonredundant database using the following thresholds: E value of ≤1 × 10⁻⁵, identity of ≥30%, and an alignment covering at least 75% of the length of both the query and the subject sequence. Representative species are shown. Multiple alignments were built by CLUSTALW. A phylogenetic tree was then constructed by using the neighbor-joining method of the PHYLIP package. GNAT homologs are represented by their species names. (B and C) Phylogenies of two other YJM789 genes encoding acetyltransferases for comparison: ELP3 (B) and ECM40 (C).
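The three homolog-retrieval thresholds in the Fig. 2 caption can be read as a simple per-hit filter. The following is only a minimal sketch of that reading, not the authors' pipeline: the `Hit` record, its field names, and the function `passes_thresholds` are hypothetical, and coverage is assumed here to mean the aligned span divided by the full sequence length.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One pairwise alignment. All field names are hypothetical."""
    evalue: float
    identity_pct: float  # percent identical residues in the alignment
    q_start: int         # 1-based, inclusive coordinates on the query
    q_end: int
    q_len: int           # full length of the query sequence
    s_start: int         # 1-based, inclusive coordinates on the subject
    s_end: int
    s_len: int           # full length of the subject sequence


def passes_thresholds(hit, max_evalue=1e-5, min_identity=30.0, min_coverage=0.75):
    """Return True if the hit meets all three thresholds from the caption.

    Assumption: coverage is the aligned span over the full sequence length,
    computed separately for query and subject.
    """
    q_cov = (hit.q_end - hit.q_start + 1) / hit.q_len
    s_cov = (hit.s_end - hit.s_start + 1) / hit.s_len
    return (
        hit.evalue <= max_evalue
        and hit.identity_pct >= min_identity
        and q_cov >= min_coverage
        and s_cov >= min_coverage
    )
```

Requiring coverage on both sequences, rather than on the query alone, excludes hits in which a short domain matches a much longer protein.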
Fig. 3.
Highly polymorphic region on chromosome I. (A) SNP distribution between YJM789 and S288c determined from a 1-kb sliding window over the nonrepeat sequence of S288c chromosome I. (B) Clustergram of the sequence similarity of chromosome I of YJM789 compared with S288c, RM11-1a, and S. paradoxus (S. para) using a 1-kb sliding window. (C) Clustergram of the sequence similarity of the high polymorphism region of YJM789 chromosome I compared with S288c, RM11-1a, and S. paradoxus using a 100-bp sliding window. (D) Phylogeny of chromosome I sequences excluding the interval containing the high polymorphism region. The phylogenetic tree was constructed from nucleotide sequence alignments generated by using the program VISTA and the neighbor-joining method of the PHYLIP package. (E) Phylogeny of the DUP240 region from YARWdelta6 to YARWdelta7 in S. paradoxus and all sequenced S. cerevisiae strains (38). A phylogenetic tree was constructed from nucleotide sequence alignments generated by CLUSTALW and the neighbor-joining method of the PHYLIP package. The scale bar indicates the evolutionary distance (number of substitutions per nucleotide position). (F) Alignments of YJM789, S288c, and S. paradoxus over the high polymorphism region using S288c (Upper) or YJM789 (Lower) as the reference sequence. The y axis represents the sequence similarity between two genomes along the reference sequence (graphs generated in VISTA). Sequence identity is shown for each pairwise comparison in a 100-bp sliding window. Note that differences in sequence lengths arise because of indels between YJM789 and S288c. Genes, as encoded in S288c, are represented by colored boxes: red, verified ORFs; pink, uncharacterized ORFs; gray, dubious ORFs; black, tRNAs and long terminal repeats.
Fig. 4.
Polymorphism density across PDR5 between YJM789 and S288c. (A) Polymorphism distribution on chromosome XV from kilobases 600 to 640. Dashed lines indicate the start and stop positions of the PDR5 ORF. (B) The distribution of nonsynonymous and synonymous substitutions within the PDR5 ORF as determined from a 900-bp sliding window (each slide is 90 bp). The possibilities for nonsynonymous and synonymous substitutions were calculated as described previously (61). Red, nonsynonymous substitutions; green, synonymous substitutions; blue horizontal bars, transmembrane domains; vertical bars at the bottom, substitution sites.
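The sliding-window summaries in Figs. 3 and 4 (for example, a 900-bp window advanced in 90-bp steps) can be illustrated with a short counting routine. This is a minimal sketch under stated assumptions rather than the authors' method: the function name `sliding_window_counts` is hypothetical, positions are assumed to be 0-based, and the routine counts raw sites per window, without the normalization by possible nonsynonymous and synonymous sites described in the Fig. 4 caption.

```python
import bisect


def sliding_window_counts(positions, seq_length, window=900, step=90):
    """Count polymorphic sites in each half-open window [start, start + window).

    positions: 0-based coordinates of polymorphic sites (assumed input format).
    Returns a list of (window_start, count) tuples. Only windows that fit
    entirely within the sequence are reported; if the sequence is shorter
    than one window, a single window starting at 0 is returned.
    """
    sorted_pos = sorted(positions)
    last_start = max(seq_length - window, 0)
    counts = []
    for start in range(0, last_start + 1, step):
        end = start + window
        # Binary search for the first site >= start and the first site >= end.
        lo = bisect.bisect_left(sorted_pos, start)
        hi = bisect.bisect_left(sorted_pos, end)
        counts.append((start, hi - lo))
    return counts


if __name__ == "__main__":
    sites = [0, 50, 899, 900, 1000]
    for start, n in sliding_window_counts(sites, seq_length=1080):
        print(start, n)
```

Because consecutive windows overlap by 810 bp with these defaults, neighboring values are strongly correlated; the step size sets the resolution of the plot, and the window size sets the amount of smoothing.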

References

    1. Steinmetz LM, Sinha H, Richards DR, Spiegelman JI, Oefner PJ, McCusker JH, Davis RW. Nature. 2002;416:326–330.
    2. Brem RB, Yvert G, Clinton R, Kruglyak L. Science. 2002;296:752–755.
    3. Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. Nat Genet. 2003;35:57–64.
    4. Deutschbauer AM, Davis RW. Nat Genet. 2005;37:1333–1340.
    5. Ben-Ari G, Zenvirth D, Sherman A, David L, Klutstein M, Lavi U, Hillel J, Simchen G. PLoS Genet. 2006;2:e195.
