A practical guide to environmental association analysis in landscape genomics

Mol Ecol. 2015 Sep;24(17):4348-70. doi: 10.1111/mec.13322.


Landscape genomics is an emerging research field that aims to identify the environmental factors that shape adaptive genetic variation and the gene variants that drive local adaptation. Its development has been facilitated by next-generation sequencing, which allows for screening thousands to millions of single nucleotide polymorphisms in many individuals and populations at reasonable costs. In parallel, data sets describing environmental factors have greatly improved and increasingly become publicly accessible. Accordingly, numerous analytical methods for environmental association studies have been developed. Environmental association analysis identifies genetic variants associated with particular environmental factors and has the potential to uncover adaptive patterns that are not discovered by traditional tests for the detection of outlier loci based on population genetic differentiation. We review methods for conducting environmental association analysis including categorical tests, logistic regressions, matrix correlations, general linear models and mixed effects models. We discuss the advantages and disadvantages of different approaches, provide a list of dedicated software packages and their specific properties, and stress the importance of incorporating neutral genetic structure in the analysis. We also touch on additional important aspects such as sampling design, environmental data preparation, pooled and reduced-representation sequencing, candidate-gene approaches, linearity of allele-environment associations and the combination of environmental association analyses with traditional outlier detection tests. We conclude by summarizing expected future directions in the field, such as the extension of statistical approaches, environmental association analysis for ecological gene annotation, and the need for replication and post hoc validation studies.
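As an illustration of one of the approaches listed above, the following is a minimal sketch of a single-locus logistic regression of allele presence on one environmental variable. It is not the procedure of any software package covered in the review. The data are simulated, and the function name `fit_logistic`, the variable names and the simulation parameters are all hypothetical. The sketch deliberately omits any correction for neutral genetic structure, which the review stresses is important in real analyses.

```python
import numpy as np

def fit_logistic(env, allele_present, n_iter=25):
    """Fit P(allele present) = 1 / (1 + exp(-(b0 + b1 * env))) by Newton-Raphson.

    Illustrative only: no correction for neutral genetic structure.
    """
    env = np.asarray(env, dtype=float)
    y = np.asarray(allele_present, dtype=float)
    X = np.column_stack([np.ones_like(env), env])  # intercept + environment
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))      # predicted probabilities
        w = p * (1.0 - p)                          # Newton-Raphson weights
        grad = X.T @ (y - p)                       # gradient of log-likelihood
        hess = X.T @ (X * w[:, None])              # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)  # Newton update
    return beta

# Simulated example (hypothetical values): 500 individuals, one standardized
# environmental variable, and an allele whose presence becomes more likely
# as the variable increases.
rng = np.random.default_rng(0)
env = rng.normal(size=500)
true_p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.5 * env)))
allele = rng.random(500) < true_p

b0, b1 = fit_logistic(env, allele)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

A positive fitted slope indicates that the allele becomes more frequent as the environmental variable increases. In a genome scan, a model of this kind would be fitted per locus and the results corrected for multiple testing and for population structure.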

Keywords: adaptive genetic variation; ecological association; environmental correlation analysis; genetic-environment association; genotype-environment correlation; local adaptation; natural selection; neutral genetic structure; population genomics.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Adaptation, Physiological / genetics
  • Alleles
  • Environment*
  • Gene Frequency
  • Gene-Environment Interaction
  • Genetics, Population / methods*
  • Genomics / methods*
  • Genotype
  • Linear Models
  • Logistic Models
  • Models, Genetic*
  • Phenotype
  • Software
  • Statistics as Topic