Plant Physiol. 2020 Aug;183(4):1898-1909.
doi: 10.1104/pp.20.00277. Epub 2020 May 27.

Increased Power and Accuracy of Causal Locus Identification in Time Series Genome-wide Association in Sorghum

Free PMC article

Chenyong Miao et al. Plant Physiol. 2020 Aug.

Abstract

The phenotypes of plants develop over time and change in response to the environment. New engineering and computer vision technologies track these phenotypic changes. Identifying the genetic loci regulating differences in the pattern of phenotypic change remains challenging. This study used functional principal component analysis (FPCA) to achieve this aim. Time series phenotype data were collected from a sorghum (Sorghum bicolor) diversity panel using a number of technologies including conventional color photography and hyperspectral imaging. This imaging lasted for 37 d and centered on reproductive transition. A new higher density marker set was generated for the same population. Several genes known to control trait variation in sorghum have been previously cloned and characterized. These genes were not confidently identified in genome-wide association analyses at single time points. However, FPCA successfully identified the same known and characterized genes. FPCA analyses partitioned the role these genes play in controlling phenotypes. Partitioning was consistent with the known molecular function of the individual cloned genes. These data demonstrate that FPCA-based genome-wide association studies can enable robust time series mapping analyses in a wide range of contexts. Moreover, time series analysis can increase the accuracy and power of quantitative genetic analyses.
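
To illustrate the idea described in the abstract, the following is a minimal sketch of a discretized FPCA. It is not the authors' pipeline. It assumes every growth curve is sampled on the same evenly spaced grid with no missing values, and it runs on simulated logistic curves. The function name `fpca_scores`, the simulation parameters, and the 25-day grid are assumptions made for illustration only.

```python
import numpy as np


def fpca_scores(curves, n_components=2):
    """Discretized FPCA sketch.

    curves: array of shape (n_plants, n_days), all sampled on one shared grid.
    Returns the mean curve, the leading eigenfunctions (as rows), the
    per-plant scores, and the fraction of variance each component explains.
    """
    curves = np.asarray(curves, dtype=float)
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are orthonormal, discretized eigenfunctions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfunctions = vt[:n_components]
    # Score = projection of each centered curve onto each eigenfunction.
    scores = centered @ eigenfunctions.T
    explained = singular_values**2 / np.sum(singular_values**2)
    return mean_curve, eigenfunctions, scores, explained[:n_components]


# Simulated data (hypothetical): 50 plants, 25 daily height values.
rng = np.random.default_rng(0)
days = np.arange(-14, 11)
final_height = rng.normal(100.0, 20.0, size=50)   # varies overall size
timing_shift = rng.normal(0.0, 2.0, size=50)      # varies growth timing
curves = final_height[:, None] / (
    1.0 + np.exp(-(days[None, :] - timing_shift[:, None]) / 3.0)
)

mean_curve, eigenfunctions, scores, explained = fpca_scores(curves)

# Each plant's curve is approximated as mean + sum_k score_k * eigenfunction_k.
reconstructed = mean_curve + scores @ eigenfunctions
```

Each column of `scores` is a single number per plant, so it could be passed to an association analysis as a trait in place of a height measured on one day. That mapping step is outside the scope of this sketch.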

Figures

Figure 1.
Reduced power to identify causal loci in phenotypically constrained populations. A, A genome-wide association analysis for plant height, defined as the distance between the soil surface and the top of the panicle at maturity, using field-collected data for 357 lines from the SAP and the set of genotype call data used in this study. The locations of Dw1, Dw2, and Ma1 are indicated with dashed lines, as is an additional known height locus (KHL) identified in multiple prior GWAS conducted on height in this population using different genetic marker data. B, Distribution of observed heights for the 357 lines employed for association analysis in A. The set of 38 lines above 2 m in height is marked in red. C, A genome-wide association analysis identical to that shown in A but with the exclusion of lines with field heights >2 m (38 lines) and those that we were not able to successfully germinate and phenotype in this study (27 lines).
Figure 2.
Different methods to define and measure plant height produce different outcomes. A, Conventional RGB images of a single sorghum plant (PI 576401) taken on eight different days spanning the transition from vegetative to reproductive development. Pixels identified as “plant” through whole plant segmentation are outlined in red. Measured plant height, defined as the distance between the plant pixels with the smallest and greatest y-axis values, is indicated by the horizontal blue bar in each image. B, Semantically segmented images of the same plant taken on the same day from a moderately different viewing angle using a hyperspectral camera. Pixels classified as “leaf” are indicated in green, pixels classified as “stem” are indicated in orange, and pixels classified as “panicle” are indicated in purple. Measured plant height, defined as the distance between stem or panicle pixels with the smallest and greatest y-axis values, is indicated by the horizontal red bar in each image. C, Observed and imputed plant heights for the same sorghum plant on each day within the range of phenotypic data collection. Blue and red circles indicate measured height values from whole plant segmentation of RGB images and semantic segmentation of hyperspectral images, respectively. Solid blue and solid red lines indicate height values imputed for unobserved time points using nonparametric regression for whole plant and semantic height datasets, respectively.
Figure 3.
Comparison of change in plant height over time for members of the SAP population when anchoring either on planting date (DAP) or panicle emergence date (DAPE). A, Growth curves imputed using nonparametric regression for 20 representative sorghum genotypes, anchored for comparison based on sharing the same date of planting. B, Growth curves imputed using nonparametric regression for the same 20 sorghum genotypes shown in A, anchored for comparison based on sharing the same date of panicle emergence. Lines with identical colors in A and B indicate data taken from the same plants. Regression lines are not extended beyond the range of observed data points. As panicle emergence occurred less than 18 d after the start of imaging for some lines and less than 17 d before the last day of imaging for others, curves in B are incomplete.
Figure 4.
Several known causal loci show statistically significant associations with plant height when sequential genome-wide association studies are conducted using data anchored to the date of panicle emergence. A summary of where statistically significant trait-associated SNPs were identified in separate genome-wide association studies conducted using height data for each day from −14 to +10 DAPE. Each vertical column summarizes the results from one of the 25 independently conducted genome-wide association studies. Each sorghum chromosome is divided into 16 bins containing equal numbers of SNP markers. Each cell in each vertical column is color coded based on the single most significant P-value observed for any marker within that bin on that day. Light pink cells indicate bins that contain no markers that exceed the multiple testing-corrected threshold for statistical significance. The locations of the two cloned dwarf genes and the one cloned maturity gene that were successfully identified in the analysis of data from at least one time point are indicated with horizontal dashed lines.
Figure 5.
Two functional principal components explain >97% of variance in the sorghum growth curves observed in this study. Functional principal component analysis seeks to describe the pattern of change in height over time of each observed plant using a mean function combined with variable weightings of a set of eigenfunctions. A, Comparison of the empirical mean function (red) for all growth curves observed in this study (gray) and the mean function estimated using functional principal component analysis (blue). B, Illustration of how changing the score for functional principal component one alters the resulting growth curve. C, Illustration of how changing the score for functional principal component two alters the resulting growth curve.
Figure 6.
Mapping genes associated with variation in functional principal component scores among sorghum genotypes. A, Distribution of functional principal component one scores among the 292 genotypes phenotyped as part of this study. Genotypes with the most negative values for functional principal component one are indicated in red, and genotypes with the most positive values for functional principal component one are indicated in blue. B, Growth curves for a subset of genotypes with the most negative values for functional principal component one are indicated in red. Growth curves for a subset of genotypes with the most positive values for functional principal component one are indicated in blue. C, Distribution of functional principal component two scores among the 292 genotypes phenotyped as part of this study. Genotypes with the most negative values for functional principal component two are indicated in red, and genotypes with the most positive values for functional principal component two are indicated in blue. D, Growth curves for a subset of genotypes with the most negative values for functional principal component two are indicated in red. Growth curves for a subset of genotypes with the most positive values for functional principal component two are indicated in blue. E, Results of conducting a genome-wide association analysis for functional principal component one scores. F, Results of conducting a genome-wide association analysis for functional principal component two scores. In E and F, the positions of the three cloned dwarf genes Dw1, Dw2, and Dw3, as well as the cloned maturity gene Ma1, are indicated using vertical dashed lines. Horizontal dashed lines indicate the multiple testing-corrected cutoff for a statistically significant association.
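
The two height definitions in the Figure 2 caption can be sketched as follows. This is a minimal illustration on a toy label image, not the segmentation code used in the study. The integer class labels, the function names, and the toy array are assumptions, and heights are in pixels because no camera calibration is modeled.

```python
import numpy as np

# Hypothetical class labels for a segmented image; 0 is background.
LEAF, STEM, PANICLE = 1, 2, 3


def vertical_extent(mask):
    """Distance in pixels between the lowest and highest row containing True."""
    rows = np.flatnonzero(mask.any(axis=1))
    if rows.size == 0:
        return 0
    return int(rows[-1] - rows[0])


def whole_plant_height(labels):
    """Height from every pixel classified as plant (any non-background class)."""
    return vertical_extent(labels != 0)


def semantic_height(labels):
    """Height from stem and panicle pixels only, ignoring leaves."""
    return vertical_extent(np.isin(labels, (STEM, PANICLE)))


# Toy image: row 0 is the top. A leaf tip reaches above the top of the stem.
labels = np.zeros((10, 5), dtype=int)
labels[4:9, 2] = STEM   # stem occupies rows 4..8
labels[1:7, 3] = LEAF   # leaf occupies rows 1..6

whole = whole_plant_height(labels)   # uses rows 1..8
semantic = semantic_height(labels)   # uses rows 4..8
```

In this toy case the whole plant measure is larger than the stem and panicle measure because an upright leaf extends above the stem, which is one way the two definitions can diverge.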
