Ensemble Feature Selection Based on Minimum Redundancy, Maximum Relevance: A Bi-objective Approach Based on the Pareto Dominance Concept

Article Type: Research Paper (English)

Authors

1 Faculty of Engineering, Lorestan University, Khorramabad, Iran.

2 Department of Computer Engineering, Faculty of Engineering, Lorestan University.

3 Faculty of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran.

Abstract

Ensemble methods are used to improve feature selection algorithms. In these approaches, the results of several feature selection methods are combined to produce the final feature set. Ensemble feature selection rests on the observation that a diverse collection of feature selection methods performs better than any single one: each individual algorithm may settle on a local optimum in the feature space, and ensemble methods are used to overcome this problem. In this paper, we present an ensemble feature selection algorithm based on Pareto-dominance ranking that improves the classification accuracy of existing ensemble feature selection methods as well as of the base feature selection methods. Using a bi-objective optimization process together with the crowding distance concept, the method ranks the features by considering both each feature's relevance to the class label and its redundancy with respect to the other features. We compared this method with recent ensemble feature selection methods and with base feature selection algorithms. The results show that the proposed method is superior in terms of classification accuracy and also runs in less time than the other methods.
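To make the bi-objective ranking step concrete, the following is a minimal Python sketch, not the authors' implementation: it ranks features by Pareto dominance over the two objectives named above, relevance to the class label (maximized) and redundancy with the other features (minimized), with crowding distance ordering features inside each front. Absolute Pearson correlation stands in for the paper's actual relevance and redundancy measures, all function names are illustrative, and the ensemble combination of several base rankers is omitted.

```python
# A minimal sketch of bi-objective, Pareto-dominance feature ranking.
# Assumptions: absolute Pearson correlation as the relevance/redundancy
# measures; all names are illustrative, not the paper's own.
import numpy as np

def objectives(X, y):
    """Per-feature objectives: relevance to the class label (maximize)
    and mean redundancy with the other features (minimize)."""
    n_features = X.shape[1]
    relevance = np.nan_to_num(np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]))
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    # Subtract each feature's self-correlation (1.0) before averaging.
    redundancy = (corr.sum(axis=1) - 1.0) / (n_features - 1)
    return relevance, redundancy

def pareto_fronts(relevance, redundancy):
    """Non-dominated sorting: j dominates i if it is no worse in both
    objectives and strictly better in at least one."""
    remaining = set(range(len(relevance)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(relevance[j] >= relevance[i]
                            and redundancy[j] <= redundancy[i]
                            and (relevance[j] > relevance[i]
                                 or redundancy[j] < redundancy[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)
    return fronts

def crowding_distance(front, relevance, redundancy):
    """Crowding distance inside one front; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for obj in (relevance, redundancy):
        order = sorted(front, key=lambda i: obj[i])
        dist[order[0]] = dist[order[-1]] = float("inf")
        span = float(obj[order[-1]] - obj[order[0]]) or 1.0
        for k in range(1, len(order) - 1):
            dist[order[k]] += float(obj[order[k + 1]] - obj[order[k - 1]]) / span
    return dist

def rank_features(X, y):
    """Rank features front by front; within each front, features with a
    higher crowding distance come first."""
    relevance, redundancy = objectives(X, y)
    ranking = []
    for front in pareto_fronts(relevance, redundancy):
        dist = crowding_distance(front, relevance, redundancy)
        ranking.extend(sorted(front, key=lambda i: -dist[i]))
    return ranking  # best-ranked feature indices first
```

Called as rank_features(X, y) on a sample-by-feature matrix X and label vector y, the sketch returns feature indices ordered from best to worst, so taking the top k yields a selected subset; in the paper's ensemble setting the objective values would be derived from the combined outputs of the base feature selection methods rather than from raw correlations.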

Keywords

