Multi-label feature selection based on competitive swarm optimization

Document Type: Persian Original Article

Authors

Department of Computer Engineering, Faculty of Engineering, Lorestan University, Khorramabad, Iran.

Abstract

Feature selection is an important preprocessing step in data mining and machine learning, used to reduce dimensionality by selecting a subset of representative features. By removing redundant and irrelevant features, it can increase the accuracy of machine learning tasks. In this paper, a novel embedded multi-label feature selection method based on the Competitive Swarm Optimizer (CSO) is proposed. In this method, an initial population of particles is first generated; the particles are then divided into two equal groups and compete in pairs, where the winners move on to the next iteration and the losers learn from the winners, and at the end of each iteration the objective function is computed for all particles. To increase the convergence rate, half of the initial population is generated based on the similarity between features and labels, and a local search method inspired by the gradient descent algorithm is applied to discover the local structure of the data. Finally, features are selected according to the best particle. The performance of the proposed method is compared with six well-known and state-of-the-art multi-label feature selection methods. Experimental results on image and text multi-label datasets show the efficiency and superiority of the proposed method under several multi-label evaluation criteria.
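To make the pairwise competition-and-learning loop concrete, below is a minimal Python sketch of the standard competitive swarm optimizer update (random pairing, winners kept unchanged, losers pulled toward the winner and the swarm mean), adapted to feature selection by thresholding continuous particle positions into feature masks. The `fitness` callback, the 0.5 threshold, the encoding, and all parameter values are illustrative assumptions; the paper's similarity-based initialization and gradient-descent-inspired local search are omitted. This is a sketch of the underlying optimizer, not the proposed method itself.

```python
import numpy as np

def cso_feature_select(fitness, n_features, n_particles=40,
                       n_iters=100, phi=0.1, seed=0):
    """Minimal competitive swarm optimizer sketch for feature selection.

    `fitness` is a hypothetical placeholder: it maps a boolean mask of
    selected features to a score to MINIMIZE. The multi-label objective
    used by the paper is not specified in the abstract.
    """
    rng = np.random.default_rng(seed)
    X = rng.random((n_particles, n_features))   # particle positions in [0, 1]
    V = np.zeros_like(X)                        # particle velocities

    def evaluate(pos):
        # Threshold a continuous position into a feature mask
        # (an assumed encoding; the paper may use another scheme).
        return fitness(pos > 0.5)

    for _ in range(n_iters):
        mean_pos = X.mean(axis=0)               # swarm mean for the social term
        # Randomly split the swarm into two halves that compete in pairs.
        pairs = rng.permutation(n_particles).reshape(-1, 2)
        for i, j in pairs:
            # Pairwise competition: the fitter particle wins and survives as-is.
            w, l = (i, j) if evaluate(X[i]) < evaluate(X[j]) else (j, i)
            r1, r2, r3 = rng.random((3, n_features))
            # The loser learns from the winner and from the swarm mean.
            V[l] = r1 * V[l] + r2 * (X[w] - X[l]) + phi * r3 * (mean_pos - X[l])
            X[l] = np.clip(X[l] + V[l], 0.0, 1.0)

    best = min(X, key=evaluate)                 # best particle after the last iteration
    return best > 0.5                           # final feature mask
```

In practice, `fitness` would wrap a multi-label evaluation of the masked feature subset, for example the Hamming loss of a multi-label classifier trained on a validation split; since the winners are never perturbed, the best solution found so far is always retained in the swarm.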

Keywords

