Gene Essentiality Analyzed by In Vivo Transposon Mutagenesis and Machine Learning in a Stable Haploid Isolate of Candida albicans

mBio. 2018 Oct 30;9(5):e02048-18. doi: 10.1128/mBio.02048-18.

Abstract

Knowing the full set of essential genes for a given organism provides important information about ways to promote, and to limit, its growth and survival. For many non-model organisms, the lack of a stable haploid state and low transformation efficiencies impede the use of conventional approaches to generate a genome-wide comprehensive set of mutant strains and the identification of the genes essential for growth. Here we report on the isolation and utilization of a highly stable haploid derivative of the human pathogenic fungus Candida albicans, together with a modified heterologous transposon and machine learning (ML) analysis method, to predict the degree to which all of the open reading frames are required for growth under standard laboratory conditions. We identified 1,610 C. albicans essential genes, including 1,195 with high "essentiality confidence" scores, thereby increasing the number of essential genes (currently 66 in the Candida Genome Database) by >20-fold and providing an unbiased approach to determine the degree of confidence in the determination of essentiality. Among the genes essential in C. albicans were 602 genes also essential in the model budding and fission yeasts analyzed by both deletion and transposon mutagenesis. We also identified essential genes conserved among the four major human pathogens C. albicans, Aspergillus fumigatus, Cryptococcus neoformans, and Histoplasma capsulatum and highlight those that lack homologs in humans and that thus could serve as potential targets for the design of antifungal therapies.

IMPORTANCE: Comprehensive understanding of an organism requires that we understand the contributions of most, if not all, of its genes. Classical genetic approaches to this issue have involved systematic deletion of each gene in the genome, with comprehensive sets of mutants available only for very-well-studied model organisms. We took a different approach, harnessing the power of in vivo transposition coupled with deep sequencing to identify >500,000 different mutations, one per cell, in the prevalent human fungal pathogen Candida albicans and to map their positions across the genome. The transposition approach is efficient and less labor-intensive than classic approaches. Here, we describe the production and analysis (aided by machine learning) of a large collection of mutants and the comprehensive identification of 1,610 C. albicans genes that are essential for growth under standard laboratory conditions. Among these C. albicans essential genes, we identify those that are also essential in two distantly related model yeasts as well as those that are conserved in all four major human fungal pathogens and that are not conserved in the human genome. This list of genes with functions important for the survival of the pathogen provides a good starting point for the development of new antifungal drugs, which are greatly needed because of the emergence of fungal pathogens with elevated resistance and/or tolerance of the currently limited set of available antifungal drugs.

Keywords: Candida albicans; genome analysis; genomics; machine learning; phenotypic identification; transposons.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aspergillus fumigatus / genetics
  • Candida albicans / genetics*
  • Candida albicans / growth & development
  • Cryptococcus neoformans / genetics
  • DNA Transposable Elements
  • Genes, Essential*
  • Genes, Fungal*
  • Genetics, Microbial / methods*
  • Haploidy
  • Histoplasma / genetics
  • Machine Learning*
  • Mutagenesis, Insertional / methods*

Substances

  • DNA Transposable Elements

Associated data

  • figshare/10.6084/m9.figshare.c.4251182