
Data stream classification using active learned neural networks

Published: 11 August 2019
Abstract

    Because many modern real-life tasks involve data that is not a static set, data stream mining has gained substantial attention from the machine learning community. The main property of such systems is a large volume of data arriving sequentially, forming a potentially endless stream of objects. Given limited resources such as memory and computational power, it is widely accepted that each instance can be processed at most once and is not stored, making re-evaluation impossible. In this work we focus on the data stream classification task, where the parameters of a classification model may vary over time, so the model must be able to adapt to such changes. This requires a forgetting mechanism ensuring that outdated samples do not influence the model. The most popular approaches are based on so-called windowing: a batch of objects is stored and, as new examples arrive, the least relevant ones are forgotten. The objects in each new window are then used to retrain the model, which is cumbersome, especially for online learners, and contradicts the principle of processing each object at most once. This work therefore employs the built-in forgetting mechanism of neural networks. Additionally, to reduce the need for expensive (and sometimes impossible) object labeling, we focus on active learning, which requests labels only for informative examples that are crucial for proper model updates. The characteristics of the proposed methods were evaluated in computer experiments performed on a diverse pool of data streams, and the results confirmed the usefulness of the proposed strategy.
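
    To make the general scheme above concrete, the sketch below shows a minimal prequential (test-then-train) loop combining an incrementally updated neural network with uncertainty-based active label querying. It is only an illustration of the setting described in the abstract, not the authors' exact method: the choice of scikit-learn's MLPClassifier.partial_fit as the incremental learner, the synthetic stream from make_classification, and the query margin of 0.2 are all assumptions made for this example.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.neural_network import MLPClassifier

        # Synthetic stand-in for a data stream: instances arrive one at a time.
        X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
        classes = np.unique(y)

        # Incremental neural network; repeated partial_fit calls gradually
        # overwrite old knowledge, acting as a built-in forgetting mechanism.
        model = MLPClassifier(hidden_layer_sizes=(20,), random_state=0)
        model.partial_fit(X[:50], y[:50], classes=classes)  # small warm-up batch

        labels_queried, correct, seen = 0, 0, 0
        for x_t, y_t in zip(X[50:], y[50:]):
            x_t = x_t.reshape(1, -1)
            # Test-then-train: evaluate on the incoming instance first.
            correct += int(model.predict(x_t)[0] == y_t)
            seen += 1
            # Active learning: ask for the label only when the network is
            # uncertain, i.e. its two top class probabilities are close.
            proba = np.sort(model.predict_proba(x_t)[0])
            if proba[-1] - proba[-2] < 0.2:    # illustrative threshold
                model.partial_fit(x_t, [y_t])  # single-pass update, no window
                labels_queried += 1

        print(f"accuracy={correct / seen:.3f}, labels queried={labels_queried}")

    In a real deployment the true label would be obtained (e.g., from a human annotator) only when the query condition fires; here every label is available because the stream is synthetic and is also used to report prequential accuracy.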




    Published In

    Neurocomputing, Volume 353, Issue C (August 2019), 120 pages

    Publisher

    Elsevier Science Publishers B.V., Netherlands


    Author Tags

    1. Pattern classification
    2. Data stream
    3. Active learning
    4. Concept drift
    5. Forgetting

    Qualifiers

    • Research-article


    Cited By

    • (2024) An active semi-supervised transfer learning method for robot pose error prediction and compensation. Engineering Applications of Artificial Intelligence 128:C. https://doi.org/10.1016/j.engappai.2023.107476. Online publication date: 14-Mar-2024.
    • (2023) Alphabet Flatting as a variant of n-gram feature extraction method in ensemble classification of fake news. Engineering Applications of Artificial Intelligence 120:C. https://doi.org/10.1016/j.engappai.2023.105882. Online publication date: 1-Apr-2023.
    • (2023) Processing data stream with chunk-similarity model selection. Applied Intelligence 53:7, 7931–7956. https://doi.org/10.1007/s10489-022-03826-4. Online publication date: 1-Apr-2023.
    • (2022) A Survey on Active Deep Learning: From Model Driven to Data Driven. ACM Computing Surveys 54:10s, 1–34. https://doi.org/10.1145/3510414. Online publication date: 23-Mar-2022.
    • (2022) An RBF online learning scheme for non-stationary environments based on fuzzy means and Givens rotations. Neurocomputing 501:C, 370–386. https://doi.org/10.1016/j.neucom.2022.06.016. Online publication date: 28-Aug-2022.
    • (2022) A K-Means clustering and SVM based hybrid concept drift detection technique for network anomaly detection. Expert Systems with Applications 193:C. https://doi.org/10.1016/j.eswa.2022.116510. Online publication date: 1-May-2022.
