gkm-SVM is a sequence-based method for predicting and detecting the regulatory vocabulary encoded in functional DNA elements, and is a commonly used tool for studying gene regulatory mechanisms. Here we introduce new software, LS-GKM, which removes several limitations of our previous releases and enables training on much larger-scale (LS) datasets. LS-GKM also provides additional advanced gapped k-mer-based kernel functions. With these improvements, LS-GKM achieves considerably higher accuracy than the original gkm-SVM.
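To make the idea of a gapped k-mer kernel concrete, the following is a minimal Python sketch of a basic normalized gapped k-mer similarity between two sequences. It is an illustration only and not the LS-GKM implementation, which is written in C/C++; the function names, the parameter names `l` (word length) and `k` (number of informative positions), and the default values are assumptions chosen for this example. Each length-`l` window contributes one feature for every choice of `k` informative positions, and the kernel is the cosine similarity of the resulting count vectors.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def gapped_kmer_counts(seq, l=4, k=3):
    """Count gapped k-mers: k informative positions within each l-mer window.

    Illustrative helper (hypothetical name); not part of LS-GKM.
    """
    counts = Counter()
    position_sets = list(combinations(range(l), k))
    for start in range(len(seq) - l + 1):
        lmer = seq[start:start + l]
        for positions in position_sets:
            # A feature is the set of informative positions plus the letters there.
            key = (positions, tuple(lmer[p] for p in positions))
            counts[key] += 1
    return counts


def gkm_kernel(seq_a, seq_b, l=4, k=3):
    """Cosine similarity between the gapped k-mer count vectors of two sequences."""
    fa = gapped_kmer_counts(seq_a, l, k)
    fb = gapped_kmer_counts(seq_b, l, k)
    dot = sum(count * fb[key] for key, count in fa.items())
    norm = sqrt(sum(c * c for c in fa.values())) * sqrt(sum(c * c for c in fb.values()))
    # Sequences shorter than l have no features; define the kernel as 0 in that case.
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(gkm_kernel("ACGT", "ACGA"))  # one mismatch in a single 4-mer window
```

This brute-force enumeration is only practical for short sequences and small `l`; it is meant to show what the kernel measures, not how to compute it efficiently at scale.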
Availability and implementation: C/C++ source code and related scripts are freely available from http://github.com/Dongwon-Lee/lsgkm/ and are supported on Linux and Mac OS X.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: firstname.lastname@example.org.