
Self-Adaptive Particle Swarm Optimization for Large-Scale Feature Selection in Classification

Published: 24 September 2019

Abstract

Many evolutionary computation (EC) methods have been used to solve feature selection problems, and they perform well on most small-scale problems. However, as the dimensionality of a feature selection problem increases, the solution space grows exponentially. Meanwhile, datasets contain many more irrelevant features than relevant ones, which leads to many local optima in this huge solution space. As a result, existing EC methods still tend to stagnate in local optima on large-scale feature selection problems. Furthermore, large-scale feature selection problems on different datasets may have different properties, so an existing EC method with only one candidate solution generation strategy (CSGS) may perform poorly across them. In addition, finding a suitable EC method, together with suitable parameter values, for a given large-scale feature selection problem is time-consuming if the problem is to be solved effectively and efficiently. In this article, we propose a self-adaptive particle swarm optimization (SaPSO) algorithm for feature selection, particularly large-scale feature selection. First, an encoding scheme for the feature selection problem is employed in SaPSO. Second, three important issues related to self-adaptive algorithms are investigated. After that, the SaPSO algorithm with a typical self-adaptive mechanism is proposed. Experimental results on 12 datasets show that the solution size obtained by SaPSO is smaller than that of its EC counterparts on all datasets. SaPSO also achieves better classification accuracy than its non-EC and EC counterparts, not only on most training sets but also on most test sets. Furthermore, as the dimensionality of the feature selection problem increases, the advantages of SaPSO become more prominent. This indicates that SaPSO is well suited to feature selection problems, particularly large-scale ones.
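The abstract describes the approach only at a high level: particles encode candidate feature subsets, several candidate solution generation strategies (CSGSs) are available, and a self-adaptive mechanism decides which CSGS each particle uses. The Python sketch below is a minimal, hypothetical illustration of that general idea, not the paper's SaPSO: the binary mask encoding, the k-NN wrapper fitness, the two placeholder update strategies, and the success-rate-based roulette-wheel selection are all assumptions made for illustration (a full PSO variant would also maintain velocities and personal-best attraction).

# Minimal sketch of self-adaptive strategy selection in a binary swarm for
# feature selection. All operators and parameters here are illustrative
# assumptions, not the SaPSO operators from the article.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)      # small stand-in dataset
n_features = X.shape[1]

def fitness(mask):
    # Wrapper fitness: 3-fold CV accuracy of k-NN on the selected features.
    if mask.sum() == 0:
        return 0.0                               # an empty subset is invalid
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

def move_toward(mask, guide, p_copy=0.3):
    # CSGS 1 (placeholder): probabilistically copy bits from a guide mask.
    copy = rng.random(n_features) < p_copy
    out = mask.copy()
    out[copy] = guide[copy]
    return out

def mutate(mask, p_flip=0.05):
    # CSGS 2 (placeholder): flip each bit with a small probability.
    flips = rng.random(n_features) < p_flip
    return np.where(flips, 1 - mask, mask)

n_particles, n_iter = 20, 30
swarm = (rng.random((n_particles, n_features)) < 0.5).astype(int)
pbest, pbest_fit = swarm.copy(), np.array([fitness(p) for p in swarm])
gbest = pbest[pbest_fit.argmax()].copy()
prob = np.array([0.5, 0.5])                      # CSGS selection probabilities
success, usage = np.zeros(2), np.full(2, 1e-9)

for _ in range(n_iter):
    for i in range(n_particles):
        s = rng.choice(2, p=prob)                # roulette-wheel CSGS choice
        usage[s] += 1
        cand = move_toward(swarm[i], gbest) if s == 0 else mutate(swarm[i])
        f = fitness(cand)
        swarm[i] = cand
        if f > pbest_fit[i]:                     # improvement counts as a success
            pbest[i], pbest_fit[i] = cand, f
            success[s] += 1
    gbest = pbest[pbest_fit.argmax()].copy()
    rate = success / usage                       # self-adaptation step:
    prob = (rate + 0.01) / (rate + 0.01).sum()   # re-weight CSGSs by success rate

print("selected:", int(gbest.sum()), "features, CV accuracy:", round(pbest_fit.max(), 4))

A fuller implementation would follow the article for the actual CSGS pool, the adaptation schedule, and the stopping criterion; the point of this sketch is only how strategy selection probabilities can be re-learned from search feedback.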

Supplementary Material

a50-xue-apndx.pdf (xue.zip)
Supplemental movie, appendix, image, and software files for "Self-Adaptive Particle Swarm Optimization for Large-Scale Feature Selection in Classification."




      Published In

      ACM Transactions on Knowledge Discovery from Data, Volume 13, Issue 5
      October 2019, 258 pages
      ISSN: 1556-4681
      EISSN: 1556-472X
      DOI: 10.1145/3364623
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 24 September 2019
      Accepted: 01 June 2019
      Revised: 01 April 2019
      Received: 01 March 2018
      Published in TKDD Volume 13, Issue 5


      Author Tags

      1. Feature selection
      2. classification
      3. large-scale
      4. particle swarm optimization
      5. self-adaptive

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
      • Priority Academic Program Development of Jiangsu Higher Education Institutions
      • Natural Science Foundation of Jiangsu Province
      • Natural Science Foundation of the Jiangsu Higher Education Institutions of China


