Allelematch: an R package for identifying unique multilocus genotypes where genotyping error and missing data may be present

Mol Ecol Resour. 2012 Jul;12(4):771-8. doi: 10.1111/j.1755-0998.2012.03137.x. Epub 2012 Mar 29.


We present allelematch, an R package, to automate the identification of unique multilocus genotypes in data sets where the number of individuals is unknown, and where genotyping error and missing data may be present. Such conditions commonly occur in noninvasive sampling protocols. Output from the software enables a comparison of unique genotypes and their matches, and facilitates the review of differences between profiles. The software has a variety of applications in molecular ecology, and may be valuable where a large number of samples must be processed, unique genotypes identified, and repeated observations made over space and time. We used simulations to assess the performance of allelematch and found that it can reliably and accurately determine the correct number of unique genotypes (± 3%) across a broad range of data set properties. We found that the software performs with highest accuracy when genotyping error is below 4%. The R package is available from the Comprehensive R Archive Network. Supplementary documentation and tutorials are provided.
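
To make the matching problem concrete, the following is a minimal Python sketch of grouping multilocus profiles while tolerating a small number of allele mismatches and ignoring missing calls. It is an illustration only and is not the allelematch algorithm or its API: the function names (`allele_mismatches`, `group_profiles`), the `max_mismatch` parameter, the greedy first-match grouping, and the sample data are all assumptions made for this example.

```python
from typing import Optional, Sequence

# One allele call per column; None marks a missing call (hypothetical encoding).
Profile = Sequence[Optional[int]]


def allele_mismatches(a: Profile, b: Profile) -> int:
    """Count columns where both profiles have a call and the calls differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def group_profiles(samples: dict[str, Profile],
                   max_mismatch: int = 1) -> dict[str, list[str]]:
    """Greedy grouping: the first sample of each group is its reference profile.

    A sample joins the first group whose reference differs from it by at most
    max_mismatch alleles; otherwise it starts a new group.
    """
    groups: dict[str, list[str]] = {}
    for name, profile in samples.items():
        for ref in groups:
            if allele_mismatches(samples[ref], profile) <= max_mismatch:
                groups[ref].append(name)
                break
        else:
            groups[name] = [name]
    return groups


# Made-up data: three loci, two allele columns per locus.
samples = {
    "s1": (120, 124, 200, 204, 88, 90),
    "s2": (120, 124, 200, 204, 88, 92),    # one allele differs from s1
    "s3": (120, 124, None, None, 88, 90),  # one locus missing
    "s4": (118, 126, 198, 202, 84, 86),    # different at every column
}
groups = group_profiles(samples)
```

With these made-up profiles, `s2` and `s3` fall into the same group as `s1`, while `s4` forms its own group. A greedy pass like this depends on sample order and treats missing calls as uninformative, which is a simplification chosen for brevity.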

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genotype*
  • High-Throughput Nucleotide Sequencing / methods
  • Multilocus Sequence Typing / methods*
  • Pattern Recognition, Automated*
  • Sequence Analysis, DNA
  • Software*