Streaming Data Preprocessing via Online Tensor Recovery for Large Environmental Sensor Networks

Published: 30 July 2022


Measuring the built and natural environment at a fine-grained scale is now possible with low-cost urban environmental sensor networks. However, fine-grained city-scale data analysis is complicated by tedious data cleaning, including removing outliers and imputing missing data. While many methods exist to automatically correct anomalies and impute missing entries, challenges remain for data with large spatial-temporal scales and shifting patterns. To address these challenges, we propose an online robust tensor recovery (OLRTR) method to preprocess streaming high-dimensional urban environmental datasets. A small dictionary that captures the underlying patterns of the data is computed and continually updated as new data arrive. OLRTR enables online recovery for large-scale sensor networks that provide continuous data streams, with lower memory usage than offline batch counterparts. In addition, we formulate the objective function so that OLRTR can detect structured outliers, such as faulty readings that persist over a long period of time. We validate OLRTR on a synthetically degraded National Oceanic and Atmospheric Administration temperature dataset and apply it to the Array of Things city-scale sensor network in Chicago, IL, showing superior results compared with several established online and batch-based low-rank decomposition methods.
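The abstract does not spell out the OLRTR objective, so the following is only a minimal sketch of the general idea it describes: a small dictionary updated one sample at a time, with outliers separated from a low-rank estimate and missing readings filled in. It is a matrix-based simplification in the style of online robust PCA, not the authors' tensor formulation. The class name `OnlineRobustRecovery`, the parameters `lam1` and `lam2`, and the imputation heuristic for missing entries are all assumptions made for illustration. The entrywise l1 penalty used here targets sparse outliers only; the structured outliers mentioned in the abstract would presumably need a different penalty.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


class OnlineRobustRecovery:
    """Hypothetical online low-rank + sparse recovery for one sample at a time."""

    def __init__(self, n_sensors, rank, lam1=0.1, lam2=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.L = rng.standard_normal((n_sensors, rank))  # small dictionary
        self.A = np.zeros((rank, rank))       # running sum of r r^T
        self.B = np.zeros((n_sensors, rank))  # running sum of cleaned sample times r^T
        self.lam1, self.lam2 = lam1, lam2

    def step(self, z, observed, n_inner=20):
        """Process one reading vector z; `observed` is a boolean mask."""
        rank = self.L.shape[1]
        Lo, zo = self.L[observed], z[observed]
        e = np.zeros(zo.shape)
        gram = Lo.T @ Lo + self.lam1 * np.eye(rank)
        # Alternate between the coefficients r (ridge regression on observed
        # entries) and the sparse outlier term e (soft-thresholded residual).
        for _ in range(n_inner):
            r = np.linalg.solve(gram, Lo.T @ (zo - e))
            e = soft_threshold(zo - Lo @ r, self.lam2)
        low_rank = self.L @ r  # estimate at every sensor, including missing ones
        # Assumed heuristic: missing entries are imputed with the low-rank
        # estimate before the dictionary statistics are updated.
        clean = low_rank.copy()
        clean[observed] = zo - e
        self.A += np.outer(r, r)
        self.B += np.outer(clean, r)
        # One pass of block coordinate descent over the dictionary columns.
        A_reg = self.A + self.lam1 * np.eye(rank)
        for j in range(rank):
            self.L[:, j] += (self.B[:, j] - self.L @ A_reg[:, j]) / A_reg[j, j]
        outliers = np.zeros(z.shape)
        outliers[observed] = e
        return low_rank, outliers
```

In this sketch, memory use depends only on the number of sensors and the chosen rank (through `L`, `A`, and `B`), not on the length of the stream, which is the property the abstract contrasts with offline batch methods. A caller would invoke `step` once per incoming reading vector, passing a boolean mask that marks which sensors reported a value.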


Published In

December 2022
631 pages
Association for Computing Machinery

New York, NY, United States

Published: 30 July 2022
Online AM: 04 May 2022
Accepted: 01 April 2022
Revised: 01 March 2022
Received: 01 September 2021
Published in TKDD Volume 16, Issue 6


  robust tensor recovery
  tensor factorization
  multilinear analysis
  outlier detection
  internet of things
  urban computing


Funding Sources

  USDOT Eisenhower Fellowship program




