
Can-Seq: A PCR and DNA Sequencing Strategy for Identifying New Alleles of Known and Candidate Genes


Jiangling Cao et al. Plant Methods.


Background: Forward genetic screens are a powerful approach for identifying the genes contributing to a trait of interest. However, mutants arising in genes already known can obscure the identification of new genes contributing to the trait. Here, we describe a strategy called Candidate gene-Sequencing (Can-Seq) for rapidly identifying and filtering out mutants carrying new alleles of known and candidate genes.

Results: We carried out a forward genetic screen and identified 40 independent Arabidopsis mutants with defects in systemic spreading of RNA interference (RNAi), or more specifically in root-to-shoot transmission of post-transcriptional gene silencing (rtp). To classify each mutant as carrying either a new allele of a known or candidate gene or a mutation in an undiscovered gene, bulk genomic DNA from up to 23 independent mutants was used as template to amplify a collection of 47 known or candidate genes. These amplified sequences were combined into Can-Seq libraries and deep sequenced. Subsequently, mutations in the known and candidate genes were identified using a custom Snakemake script, and PCR zygosity tests were then designed and used to identify the individual mutants carrying each mutation. Using this approach, we showed that 28 of the 40 rtp mutants carried homozygous nonsense, missense or splice site mutations in one or more of the 47 known or candidate genes. We conducted complementation tests to demonstrate that several of the candidate mutations were responsible for the rtp defect. Importantly, by exclusion, the Can-Seq pipeline also identified rtp mutants that did not carry a causative mutation in any of the 47 known and candidate genes, and these mutants represent an undiscovered gene(s) required for systemic RNAi.

Conclusions: Can-Seq offers an accurate, cost-effective method for classifying new mutants into known versus unknown genes. It has several advantages over existing genetic and DNA sequencing approaches that are currently being used in forward genetic screens for gene discovery. Using Can-Seq in conjunction with map-based gene cloning is a cost-effective approach towards identifying the full complement of genes contributing to a trait of interest.

Keywords: Candidate gene-Sequencing (Can-Seq); Ethyl methanesulfonate (EMS); Forward genetics; Map-based gene cloning; Post-transcriptional gene silencing (PTGS); RNA interference (RNAi); Root-to-shoot transmission of PTGS (RTP).

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.


Fig. 1
Chromosomal locations of the 47 candidate genes known or suspected to be involved in systemic RNAi in Arabidopsis. The position of the 10027-3 GFP reporter locus is indicated on the top end of chromosome 1 between CDC5 and MOS9 (10027)
Fig. 2
The Can-Seq workflow. Bulk DNA is prepared from leaf tissue of up to 23 independent mutants. Candidate gene PCR amplicons generated from this template are then combined in equimolar ratios and deep sequenced. Bioinformatic analysis using the Can-Seq script allows for identification of C to T and G to A substitutions at frequencies above an arbitrarily set threshold of 0.75%; the expected frequency for a homozygous candidate mutation in a bulk of 23 independent mutants is 1 in 23 or ~ 4%. The individual mutant containing the candidate mutation is identified via allele-specific PCR assays. Complementation tests involving crosses between independent mutants carrying candidate mutations in the same gene can be used to resolve whether the EMS-induced nucleotide variant detected by Can-Seq is the causative mutation
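The frequency reasoning in this caption can be illustrated with a minimal Python sketch. It is not the authors' Can-Seq script: the `Variant` record, the function names and the read counts are hypothetical, and it assumes every mutant contributes an equal amount of DNA to the bulk. Only the 0.75% threshold, the bulk size of 23, the restriction to C to T and G to A substitutions and the gene name RDR6 come from the text above.

```python
from dataclasses import dataclass

# EMS-type substitutions named in the caption (C to T and G to A).
EMS_SUBSTITUTIONS = {("C", "T"), ("G", "A")}

# Detection threshold from the caption: 0.75% alternate-allele frequency.
THRESHOLD = 0.0075


@dataclass
class Variant:
    """Hypothetical record for one variant call in the bulk sequencing data."""
    gene: str
    pos: int
    ref: str
    alt: str
    alt_reads: int
    depth: int

    @property
    def frequency(self) -> float:
        return self.alt_reads / self.depth


def expected_frequency(n_mutants: int, homozygous: bool = True) -> float:
    """Expected allele frequency of one mutant's mutation in a bulk.

    Assumes each diploid mutant contributes two alleles in equal amounts,
    so a homozygous mutation accounts for 2 of 2 * n_mutants alleles.
    """
    mutant_alleles = 2 if homozygous else 1
    return mutant_alleles / (2 * n_mutants)


def candidate_variants(variants, threshold=THRESHOLD):
    """Keep EMS-type substitutions whose frequency exceeds the threshold."""
    return [
        v for v in variants
        if v.depth > 0
        and (v.ref, v.alt) in EMS_SUBSTITUTIONS
        and v.frequency > threshold
    ]


if __name__ == "__main__":
    # Made-up read counts, for illustration only.
    calls = [
        Variant("RDR6", 100, "G", "A", 430, 10000),  # EMS-type, above threshold
        Variant("RDR6", 200, "A", "T", 500, 10000),  # not an EMS-type change
        Variant("RDR6", 300, "C", "T", 50, 10000),   # EMS-type, below threshold
    ]
    print(f"expected homozygous frequency: {expected_frequency(23):.2%}")
    for v in candidate_variants(calls):
        print(v.gene, v.pos, f"{v.ref}>{v.alt}", f"{v.frequency:.2%}")
```

Under the same equal-contribution assumption, a heterozygous mutation in one of 23 mutants would appear at about 1 in 46 (roughly 2%), which is still above the 0.75% threshold. This is why the workflow follows up with PCR zygosity tests on individual mutants.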
Fig. 3
Missense RDR6 mutation in EMS#159 (R828K), but not in EMS#146 (P1073L) or EMS#157 (G19E), is a putative new rdr6 allele. (a) Rosette phenotypes of EMS#153 (W227*), EMS#157 (G19E), EMS#146 (P1073L), EMS#159 (R828K) and 10027-3 wild type (WT). The 10027-3 wild type shows systemic post-transcriptional gene silencing (PTGS) of GFP. Based on backcrosses to the 10027-3 wild type and analysis of the BC1F1 phenotype and BC1F2 segregation, the rtp phenotypes of EMS#153, EMS#157, EMS#146 and EMS#159 are inherited as recessive traits. (b, c) EMS#153 was complemented by EMS#157 and EMS#146, and F1 plants from these crosses showed almost complete systemic PTGS of GFP. (d) EMS#153 was not complemented by EMS#159 and F1 plants from this cross showed defective systemic RNAi of GFP. (e) Location of the new and putative rdr6 alleles recovered by Can-Seq in the RDR6 locus (AT3G49500). Exon and intron sequences are indicated by thick and narrow lines, respectively. Rosette images are of plants grown in soil under long days for four weeks after planting
Fig. 4
Pathway to gene discovery using Can-Seq in a forward genetic screen. The Can-Seq strategy shown in blue can be used in the M2 generation to identify mutants that carry a recessive candidate mutation in a gene known to contribute to the trait of interest. By exclusion, novel mutants carrying a causative mutation in an unknown gene can also be identified. For these novel mutants, BC1F2 mapping populations can be produced, and whole genome or exome sequencing of bulked BC1F2 mutant plants can be used to determine the chromosomal vicinity of the unknown gene contributing to the trait of interest. Reverse genetics on candidate genes in the chromosomal vicinity or complementation tests by crossing multiple alleles can then be used to reveal the identity of the new gene. Additionally, mutants identified by Can-Seq to carry new missense mutations in known genes can be confirmed by using complementation tests, and then potentially used to characterize the biochemical function of the protein encoded by the gene (dotted arrow and dotted box)



