A Feature Selection and Classification Algorithm Based on Randomized Extraction of Model Populations

IEEE Trans Cybern. 2018 Apr;48(4):1151-1162. doi: 10.1109/TCYB.2017.2682418. Epub 2017 Mar 30.


We introduce a novel classification approach adapted from the nonlinear model identification framework, which jointly addresses the feature selection (FS) and classifier design tasks. The classifier is constructed as a polynomial expansion of the original features, and a selection process is applied to find the relevant model terms. The selection method progressively refines a probability distribution defined on the model structure space: it extracts sample models from the current distribution and uses the aggregate information obtained from evaluating this population of models to reinforce the probability of extracting the most important terms. To reduce the initial search space, distance correlation filtering is optionally applied as a preprocessing technique. The proposed method is compared to other well-known FS and classification methods on standard benchmark problems. In addition to favorable classification accuracy, the method yields models with a simple structure that are easily amenable to interpretation and analysis.
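The selection loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the independent Bernoulli inclusion probability per term, the least-squares fit on +/-1 labels, the training-accuracy score, and the update rule (mean score of models containing a term minus mean score of models lacking it) are all assumptions chosen for illustration, and the optional distance correlation filter is omitted.

```python
import numpy as np
from itertools import combinations_with_replacement


def polynomial_terms(n_features, degree):
    """List monomials of degree 1..degree as tuples of feature indices."""
    terms = []
    for d in range(1, degree + 1):
        terms.extend(combinations_with_replacement(range(n_features), d))
    return terms


def training_accuracy(Phi, y):
    """Least-squares fit on +/-1 labels (with bias), scored by sign agreement.

    Assumption: a plain linear least-squares classifier stands in for
    whatever parameter estimation the original method uses.
    """
    A = np.column_stack([np.ones(len(y)), Phi])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean(np.sign(A @ w) == y))


def randomized_term_selection(X, y, degree=2, n_models=50, n_iter=30,
                              p_init=0.3, step=0.2, seed=0):
    """Refine per-term inclusion probabilities from populations of models.

    All hyperparameters here are hypothetical defaults, not values
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    terms = polynomial_terms(X.shape[1], degree)
    # One column per candidate polynomial term.
    Phi = np.column_stack([np.prod(X[:, list(t)], axis=1) for t in terms])
    p = np.full(len(terms), p_init)
    for _ in range(n_iter):
        # Extract a population of model structures from the distribution.
        masks = rng.random((n_models, len(terms))) < p
        scores = np.array([training_accuracy(Phi[:, m], y) for m in masks])
        for j in range(len(terms)):
            inc = masks[:, j]
            if inc.all() or not inc.any():
                continue  # no contrast available for this term
            # Reinforce terms whose presence raises the average score.
            p[j] += step * (scores[inc].mean() - scores[~inc].mean())
        p = np.clip(p, 0.0, 1.0)
    return terms, p
```

Calling `randomized_term_selection(X, y)` on a feature matrix `X` and labels `y` in {-1, +1} returns the candidate terms and their final inclusion probabilities. Terms whose probability approaches 1 would form the selected model, which is what keeps the resulting classifier small and readable.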