Plant Cell. 2012 Dec;24(12):4793-805.
doi: 10.1105/tpc.112.108068. Epub 2012 Dec 31.

GWAPP: a web application for genome-wide association mapping in Arabidopsis

Ümit Seren et al. Plant Cell. 2012 Dec.

Abstract

Arabidopsis thaliana is an important model organism for understanding the genetics and molecular biology of plants. Its highly selfing nature, small size, short generation time, small genome size, and wide geographic distribution make it an ideal model organism for understanding natural variation. Genome-wide association studies (GWAS) have proven a useful technique for identifying genetic loci responsible for natural variation in A. thaliana. Previously genotyped accessions (natural inbred lines) can be grown in replicate under different conditions and phenotyped for different traits. These important features greatly simplify association mapping of traits and allow for systematic dissection of the genetics of natural variation by the entire A. thaliana community. To facilitate this, we present GWAPP, an interactive Web-based application for conducting GWAS in A. thaliana. Using an efficient implementation of a linear mixed model, traits measured for a subset of 1386 publicly available ecotypes can be uploaded and mapped with a mixed model and other methods in just a couple of minutes. GWAPP features an extensive, interactive, and user-friendly interface that includes interactive Manhattan plots and linkage disequilibrium plots. It also facilitates exploratory data analysis by implementing features such as the inclusion of candidate polymorphisms in the model as cofactors.

Figures

Figure 1.
Phenotype View. The phenotype view shows phenotype specific information in four panels. Panel (A) displays phenotype name and number of values. In (B), a list of data sets is shown. Selecting a data set from that list will update the geographic distribution map (C). Two bar charts in (D) show statistical information about the phenotype. The navigation tree on the left side (E) reflects the stored phenotype structure and is used to access different views.
Figure 2.
Data Set View. (A) The filter box allows the user to exclude specific accessions as well as change the name and the description of the data set. (B) The data set list displays information for each accession in the data set. In edit mode, the user can use the checkbox to add and remove accessions from the data set. (C) A Google map shows the locations of all accessions in the data set. Clicking on one marker will show a pop-up with information about the name and ID of the selected accession. (D) The geographic distribution map (GeoMap) shows the geographic distribution of the accessions in the data set. Moving the mouse over a country will show the number of accessions located in that region.
Figure 3.
Transformation View. The transformation view consists of four panels. The list of stored transformations is displayed in (A). The user can create a new transformation, delete an existing one, or run one of three available GWAS analysis methods on the transformed phenotype values. Depending on the selected transformation, a histogram of the transformed phenotype values is displayed below the transformation list (B). The Accession-Phenotype-Explorer (C) visualizes additional accession information through a bar chart or a scatterplot. Panel (D) shows the stored GWAS results for the specific transformation.
Figure 4.
Result View. The result view displays GWAS plots for each of the five chromosomes. Each GWAS plot itself consists of three panels. The top panel (A) contains a scatterplot. The positions on the chromosome are on the x axis and the score on the y axis. The dots in the scatterplot represent SNPs (E). A horizontal dashed line (H) shows the 5% FDR threshold. At the top of the GWAS results view, a search box for genes is displayed (D). These genes will be displayed as a colored band (red in the figure). The second panel (B) shows the gene annotation and is only shown for a specific zoom range (<1.5 Mb). It will display genes, gene features, and gene names. Moving the mouse over a gene will display additional information in a pop-up (F), and clicking on a gene will open the TAIR page for the specific gene. Panel (C) displays various chromosome-wide statistics. The region highlighted by a yellow band (I) is shown in the scatterplot and in the gene annotation. The gear icon opens a pop-up (G) with the available statistics the user can choose from.
Figure 5.
LD Visualization. The LD is shown for a specific region with 500 SNPs. The triangle plot (B) below the gene annotation panel shows the r² values for the 500 SNPs. Only r² values above a certain threshold (0.3) are color coded, ranging from yellow (low) to red (high).
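As a rough illustration of the quantity shown in the triangle plot, the sketch below computes pairwise r² between SNPs and keeps only pairs above the 0.3 threshold mentioned in the caption. It assumes biallelic SNPs coded 0/1 across inbred accessions and uses the squared Pearson correlation as r²; the function names `r_squared` and `ld_pairs` and the toy genotypes are hypothetical and are not taken from GWAPP.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared Pearson correlation between two 0/1 genotype vectors."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    if var_a == 0 or var_b == 0:
        # Monomorphic SNP: r^2 is undefined; this sketch reports 0.
        return 0.0
    return cov * cov / (var_a * var_b)


def ld_pairs(genotypes, threshold=0.3):
    """Map (i, j) -> r^2 for every SNP pair whose r^2 exceeds the threshold."""
    pairs = {}
    for i, j in combinations(range(len(genotypes)), 2):
        r2 = r_squared(genotypes[i], genotypes[j])
        if r2 > threshold:
            pairs[(i, j)] = r2
    return pairs


# Toy data (assumed): rows are SNPs, columns are accessions.
snps = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 1],  # identical to SNP 0
    [1, 1, 0, 0, 1, 0],  # complement of SNP 0
    [0, 1, 0, 1, 0, 1],  # only weakly correlated with the others
]

for pair, r2 in sorted(ld_pairs(snps).items()):
    print(pair, round(r2, 3))
```

For a 500-SNP window this loop evaluates 124,750 pairs, which is small enough to compute on demand for a zoomed region.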
Figure 6.
First AMM Scan for Flowering Time. A screenshot showing the first mixed-model scan for flowering time, highlighting the positions of four interesting candidate genes (FT, FRI, FLC, and DOG1) for which there seem to be associations.
Figure 7.
Conditional Mixed-Model Scans for Flowering Time. The first AMM scan (A) without any cofactors is shown on the left. The second AMM scan (B) in the middle is the result of adding the SNP with the smallest P value within the FRI gene into the model as a cofactor. Finally, the third AMM scan (C) on the right is the result of adding the top SNP from the middle figure, which is 5 kb upstream of the FRI gene, into the model as a cofactor. The negative log P values are shown on the y axis and the positions on the x axis. The 5% FDR threshold is denoted by a horizontal, dashed, green line.
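The captions do not say which procedure yields the 5% FDR threshold. As one plausible sketch, the code below uses the Benjamini-Hochberg step-up rule and offers the Benjamini-Yekutieli correction for dependent tests as an option. The function names, the `dependency` flag, and the P values are assumptions made for illustration, not GWAPP's actual implementation.

```python
import math


def fdr_threshold(p_values, alpha=0.05, dependency=False):
    """Largest P value declared significant by a step-up FDR rule, or None.

    dependency=False: Benjamini-Hochberg.
    dependency=True:  Benjamini-Yekutieli, which divides alpha by
                      the harmonic sum 1 + 1/2 + ... + 1/m.
    """
    m = len(p_values)
    if m == 0:
        return None
    c = sum(1.0 / i for i in range(1, m + 1)) if dependency else 1.0
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # Step-up rule: keep the largest p that satisfies the rank-scaled bound.
        if p <= alpha * rank / (m * c):
            cutoff = p
    return cutoff


def neg_log10(p):
    """Score plotted on the y axis of a Manhattan plot."""
    return -math.log10(p)


# Hypothetical P values for eight SNPs.
p_vals = [0.0001, 0.004, 0.019, 0.03, 0.2, 0.5, 0.7, 0.9]
cutoff = fdr_threshold(p_vals)
if cutoff is not None:
    print("dashed line at y =", round(neg_log10(cutoff), 3))
```

SNPs whose score lies on or above the dashed line are those with a P value at or below the returned cutoff. If no P value passes, the function returns None and no line would be drawn.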
Figure 8.
Partition of Variance for the Conditional Mixed-Model Scans. Two screenshots showing the five SNPs included in the model (A) and how the partition of phenotypic variance changes as the five cofactors (FRI, FT, FLC, and DOG1) are added to the mixed model (B).
Figure 9.
Runtime for Different Mapping Methods. The time, from starting the analysis until the P values are visible in the Manhattan plot, is plotted against the number of individuals used for the GWAS. Lines for all three mapping methods are shown: AMM, LM, and the Wilcoxon rank sum test.
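To make the simplest of the three methods concrete, here is a minimal sketch of a Wilcoxon rank sum test for a single SNP. It assumes a 0/1 allele coding and uses the normal approximation without tie or continuity corrections. The function names and the phenotype values are hypothetical, and GWAPP's own implementation may differ.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(phenotypes, alleles):
    """Return (W, z, two-sided P) comparing allele 1 carriers with allele 0 carriers."""
    if len(phenotypes) != len(alleles):
        raise ValueError("phenotypes and alleles must have equal length")
    n1 = sum(1 for a in alleles if a == 1)
    n0 = len(alleles) - n1
    if n0 == 0 or n1 == 0:
        raise ValueError("both alleles must be present")
    ranks = midranks(phenotypes)
    w = sum(r for r, a in zip(ranks, alleles) if a == 1)
    mean_w = n1 * (n0 + n1 + 1) / 2
    sd_w = math.sqrt(n0 * n1 * (n0 + n1 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w, z, p


# Hypothetical phenotype values (e.g. days to flowering) and alleles at one SNP.
phenotypes = [10, 12, 11, 30, 32, 31, 29, 13]
alleles = [0, 0, 0, 1, 1, 1, 1, 0]
w, z, p = rank_sum_test(phenotypes, alleles)
print(w, round(z, 3), round(p, 4))
```

A genome-wide scan repeats this test once per SNP. Sorting the phenotype costs O(n log n) in the number of individuals, and because the ranks do not depend on the SNP they could be computed once and reused across all SNPs.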
