Instance Reduction in Data Using a Chaotic Multi-Objective Particle Swarm Optimization Algorithm

Article type: Persian research article

Authors

1 Sadjad University of Technology

2 M.Sc., Software Engineering, Faculty of Computer Engineering and Information Technology, Sadjad University of Technology, Mashhad.

Abstract

Given today's large data volumes, instance reduction is an important problem. Class imbalance, the uneven distribution of data across classes, is likewise a serious challenge in data mining. The proposed method formulates instance reduction as a multi-objective problem and performs well by weighing two conflicting criteria, classification accuracy and instance reduction rate, together with measures specific to imbalanced data. Creating and preserving balance across different kinds of data distributions is the main goal of the proposed method. The resulting multi-objective problem is solved with a chaotic particle swarm optimization algorithm. In the proposed method, a distance-based decision surface determines whether each test instance is kept or removed. Experimental results show that the proposed method outperforms state-of-the-art methods in precision, classification accuracy, and data reduction rate.
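The two conflicting objectives and the chaotic component named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names are hypothetical, the logistic map stands in for whichever chaotic map the paper uses, a plain 1-NN classifier replaces the distance-based decision surface, and the imbalance-specific measures are left out.

```python
import math

def logistic_map(x, r=4.0):
    """One step of the logistic map, a common chaotic sequence generator
    (assumed here; the abstract does not say which map is used)."""
    return r * x * (1.0 - x)

def reduction_rate(mask):
    """Fraction of instances removed; mask[i] is True when instance i is kept."""
    return 1.0 - sum(mask) / len(mask)

def one_nn_accuracy(mask, X, y):
    """Accuracy of a 1-NN classifier whose prototypes are the kept instances.
    Simplification: every instance in X is scored, including the kept ones."""
    kept = [i for i, keep in enumerate(mask) if keep]
    if not kept:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(kept, key=lambda j: math.dist(xi, X[j]))
        correct += (y[nearest] == y[i])
    return correct / len(X)

def dominates(a, b):
    """Pareto dominance between two objective tuples that are both maximised."""
    return all(p >= q for p, q in zip(a, b)) and any(p > q for p, q in zip(a, b))

# Toy data: two well-separated one-dimensional classes.
X = [(0.0,), (0.1,), (1.0,), (1.1,)]
y = [0, 0, 1, 1]

keep_half = [True, False, True, False]
keep_all = [True, True, True, True]

score_half = (one_nn_accuracy(keep_half, X, y), reduction_rate(keep_half))
score_all = (one_nn_accuracy(keep_all, X, y), reduction_rate(keep_all))
```

On this toy data, keeping one instance per class preserves accuracy while removing half of the data, so `score_half` Pareto-dominates `score_all`. In a chaotic PSO, values from a sequence such as `logistic_map` would typically replace the uniform random numbers in the velocity update; that wiring is omitted here.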

Keywords

