Feature Selection Using Improved Forest Optimization Algorithm
Feature selection is one of the most active topics in machine learning and data mining. In 2016, feature selection using the forest optimization algorithm (FSFOA) was proposed, which achieved good classification performance and dimensionality reduction. However, FSFOA has shortcomings in three stages: random initialization, formation of the candidate population, and updating of the best tree. This article proposes Feature Selection using Improved Forest Optimization Algorithm (FSIFOA) to address these problems. In the initialization stage, FSIFOA replaces the random initialization strategy with the Pearson correlation coefficient and the L1 regularization method. In the candidate population generation stage, it separates good and bad trees and fills the quantity gap between them to solve the problem of category imbalance. In the update stage, it adds to the forest trees that have the same precision as the best tree but a different dimension. In the experiments, the new algorithm is tested with the same data and parameters as the traditional algorithm on small-, medium-, and large-dimensional data sets. The results show that, compared with the traditional algorithms, the new algorithm improves the classification accuracy of classifiers and increases the dimension reduction ratio on the medium- and large-dimensional data sets.
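To make the idea of replacing random initialization more concrete, the following minimal sketch seeds an initial forest from Pearson correlations alone. It is an illustration under assumptions, not the FSIFOA procedure: the abstract does not specify how the Pearson coefficient and L1 regularization are combined, so the L1 part is omitted here. Representing a tree as a 0/1 feature mask, the rule "keep a feature with probability equal to its absolute correlation with the label", and the names `pearson` and `init_forest` are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def init_forest(X, y, n_trees, seed=0):
    """Create n_trees feature masks biased toward label-correlated features.

    Assumption: each tree is a list of 0/1 flags, one per feature, and a
    feature is switched on with probability |r|, where r is its Pearson
    correlation with the label. This rule is illustrative only.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    scores = [
        abs(pearson([row[j] for row in X], y)) for j in range(n_features)
    ]
    forest = []
    for _ in range(n_trees):
        mask = [1 if rng.random() < s else 0 for s in scores]
        if not any(mask):
            # Never return an empty feature subset: keep the top-scoring feature.
            mask[scores.index(max(scores))] = 1
        forest.append(mask)
    return scores, forest


if __name__ == "__main__":
    # Toy data: feature 0 copies the label, feature 1 is constant,
    # feature 2 is only partly related to the label.
    X = [[0, 5, 1], [1, 5, 0], [0, 5, 1], [1, 5, 1]]
    y = [0, 1, 0, 1]
    scores, forest = init_forest(X, y, n_trees=5)
    print("scores:", scores)
    for tree in forest:
        print(tree)
```

Compared with uniformly random masks, a correlation-weighted start concentrates the initial forest on features that already show a linear relationship with the label, while features with zero score are never selected. An L1-regularized model could supply a second score in the same way, but how the two are merged is left open here because the abstract does not say.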