
Data stream classification using active learned neural networks

Published: 11 August 2019
Abstract

    Because many modern real-life tasks involve data that is not a static set, data stream mining has gained substantial attention from the machine learning community. The main property of such systems is a large volume of data arriving sequentially, forming a potentially endless stream of objects. Given limited resources such as memory and computational power, it is widely accepted that each instance can be processed at most once and is not stored, making re-evaluation impossible. In this work we focus on the data stream classification task, where the parameters of a classification model may vary over time, so the model must be able to adapt to such changes. This requires a forgetting mechanism ensuring that outdated samples do not influence the model. The most popular approaches are based on so-called windowing: a batch of objects is stored and, as new examples arrive, the least relevant ones are forgotten. The objects in each new window are then used to retrain the model, which is cumbersome, especially for online learners, and contradicts the principle of processing each object at most once. This work therefore employs the built-in forgetting mechanism of neural networks. Additionally, to reduce the need for expensive (and sometimes impossible) object labeling, we focus on active learning, which requests labels only for informative examples that are crucial for proper model updates. The characteristics of the proposed methods were evaluated in computer experiments performed on a diverse pool of data streams, and the results confirmed the usefulness of the proposed strategy.
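
    To make the general scheme above concrete, the sketch below shows a minimal prequential (test-then-train) loop combining an incrementally updated neural network with uncertainty-based active label querying. It is only an illustration of the setting described in the abstract, not the authors' exact method: the choice of scikit-learn's MLPClassifier.partial_fit as the incremental learner, the synthetic stream from make_classification, and the query margin of 0.2 are all assumptions made for this example.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.neural_network import MLPClassifier

        # Synthetic stand-in for a data stream: instances arrive one at a time.
        X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
        classes = np.unique(y)

        # Incremental neural network; repeated partial_fit calls gradually
        # overwrite old knowledge, acting as a built-in forgetting mechanism.
        model = MLPClassifier(hidden_layer_sizes=(20,), random_state=0)
        model.partial_fit(X[:50], y[:50], classes=classes)  # small warm-up batch

        labels_queried, correct, seen = 0, 0, 0
        for x_t, y_t in zip(X[50:], y[50:]):
            x_t = x_t.reshape(1, -1)
            # Test-then-train: evaluate on the incoming instance first.
            correct += int(model.predict(x_t)[0] == y_t)
            seen += 1
            # Active learning: ask for the label only when the network is
            # uncertain, i.e. its two top class probabilities are close.
            proba = np.sort(model.predict_proba(x_t)[0])
            if proba[-1] - proba[-2] < 0.2:    # illustrative threshold
                model.partial_fit(x_t, [y_t])  # single-pass update, no window
                labels_queried += 1

        print(f"accuracy={correct / seen:.3f}, labels queried={labels_queried}")

    In a real deployment the true label would be obtained (e.g., from a human annotator) only when the query condition fires; here every label is available because the stream is synthetic and is also used to report prequential accuracy.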




    Published In

    Neurocomputing, Volume 353, Issue C (August 2019), 120 pages

    Publisher

    Elsevier Science Publishers B.V., Netherlands


    Author Tags

    1. Pattern classification
    2. Data stream
    3. Active learning
    4. Concept drift
    5. Forgetting

    Qualifiers

    • Research-article


    Cited By

    • (2024) An active semi-supervised transfer learning method for robot pose error prediction and compensation. Engineering Applications of Artificial Intelligence 128:C. https://doi.org/10.1016/j.engappai.2023.107476. Online publication date: 14-Mar-2024.
    • (2023) Alphabet Flatting as a variant of n-gram feature extraction method in ensemble classification of fake news. Engineering Applications of Artificial Intelligence 120:C. https://doi.org/10.1016/j.engappai.2023.105882. Online publication date: 1-Apr-2023.
    • (2023) Processing data stream with chunk-similarity model selection. Applied Intelligence 53:7, 7931–7956. https://doi.org/10.1007/s10489-022-03826-4. Online publication date: 1-Apr-2023.
    • (2022) A Survey on Active Deep Learning: From Model Driven to Data Driven. ACM Computing Surveys 54:10s, 1–34. https://doi.org/10.1145/3510414. Online publication date: 23-Mar-2022.
    • (2022) An RBF online learning scheme for non-stationary environments based on fuzzy means and Givens rotations. Neurocomputing 501:C, 370–386. https://doi.org/10.1016/j.neucom.2022.06.016. Online publication date: 28-Aug-2022.
    • (2022) A K-Means clustering and SVM based hybrid concept drift detection technique for network anomaly detection. Expert Systems with Applications 193:C. https://doi.org/10.1016/j.eswa.2022.116510. Online publication date: 1-May-2022.
