COMPARISON OF TIME SERIES IMPUTATION METHODS - USING SAITS FOR DATA RECOVERY

Draylon Vieira Lopes; Lucas Gonçalves Brach; Emerson Cassiano da Silva; Peterson Gonçalves Alano and Rafael Stubs Parpinelli

IADIS International Conference WWW/Internet - ICWI

IADIS International Conference WWW/Internet 2025

Document Info

Title:	COMPARISON OF TIME SERIES IMPUTATION METHODS - USING SAITS FOR DATA RECOVERY
Author(s):	Draylon Vieira Lopes, Lucas Gonçalves Brach, Emerson Cassiano da Silva, Peterson Gonçalves Alano and Rafael Stubs Parpinelli
ISBN:	978-989-8704-71-9
Editors:	Paula Miranda and Pedro Isaías
Year:	2025
Edition:	Single
Keywords:	Data Imputation, Time Series, Missing Data, Industrial Monitoring, Machine Learning, Predictive Maintenance
Type:	Full Paper
First Page:	29
Last Page:	36
Language:	English
Paper Abstract:	Missing data in wind turbine Supervisory Control and Data Acquisition (SCADA) systems is a common issue caused by sensor faults, communication losses, and maintenance downtime. These gaps reduce the reliability of condition monitoring, anomaly detection, and predictive maintenance, all of which depend on complete, high-quality data. This work addresses the imputation of missing values in turbine datasets to improve the quality and usability of the data for machine learning applications and operational decision-making. We collect multivariate SCADA data from real turbine operations and artificially introduce block gaps of 6-60 samples to replicate realistic sensor interruptions. Several gap-filling strategies are evaluated: time-based linear interpolation; multivariate linear regression; MIARMA, which uses ARMA models to preserve spectral properties; and SAITS, a modern self-attention-based architecture. The comparison is carried out under identical missingness conditions, and accuracy is assessed only at the masked points using MAE, RMSE, and sMAPE. The results, aggregated across more than 200 million masked points, show that multivariate linear regression is the most effective of the classical methods and outperforms simple time interpolation, while MIARMA delivers similar results at a much higher computational cost in multivariate settings. SAITS achieves the best overall performance and fidelity on the datasets, confirming that deep learning models are highly effective at reconstructing complex turbine data, although they require greater computational resources to produce results quickly. The findings highlight the importance of exploiting cross-variable relationships in turbine monitoring and demonstrate that the proposed pipeline can serve as a reproducible framework for evaluating imputation methods in other industrial domains.
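
The evaluation protocol described in the abstract (hide blocks of 6-60 consecutive samples, impute them, and score only the hidden points) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the synthetic signal, the number of gaps, and the particular sMAPE formula (mean of 2|error| / (|true| + |imputed|), in percent) are all assumptions made for illustration, and only the simplest baseline, time-based linear interpolation, is shown.

```python
import numpy as np


def make_block_mask(n_samples, n_gaps, min_len=6, max_len=60, seed=0):
    """Return a boolean mask that is True where values are artificially hidden.

    Hypothetical helper: gap positions are drawn uniformly at random, and
    overlapping gaps are allowed to merge into longer ones.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_samples, dtype=bool)
    for _ in range(n_gaps):
        length = int(rng.integers(min_len, max_len + 1))
        # Keep the first and last samples observed so interpolation has anchors.
        start = int(rng.integers(1, n_samples - length))
        mask[start:start + length] = True
    return mask


def interpolate_linear(t, y, mask):
    """Fill masked points by linear interpolation over time stamps t."""
    filled = y.copy()
    filled[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return filled


def masked_metrics(y_true, y_imputed, mask):
    """Compute MAE, RMSE and sMAPE (percent) at the masked points only."""
    err = y_imputed[mask] - y_true[mask]
    denom = np.abs(y_true[mask]) + np.abs(y_imputed[mask])
    # Assumed sMAPE variant; terms with a zero denominator contribute 0.
    ratio = np.divide(2.0 * np.abs(err), denom,
                      out=np.zeros_like(denom), where=denom > 0)
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "sMAPE": float(100.0 * np.mean(ratio)),
    }


# Synthetic stand-in for one SCADA channel (not real turbine data).
t = np.arange(2000, dtype=float)
noise = np.random.default_rng(1).normal(0.0, 0.2, t.size)
y = 10.0 + 3.0 * np.sin(2.0 * np.pi * t / 144.0) + noise

mask = make_block_mask(y.size, n_gaps=20)
filled = interpolate_linear(t, y, mask)
print(masked_metrics(y, filled, mask))
```

Because the same mask can be reused for every method, any other imputer (a regression on the other channels, an ARMA-based filler, or a SAITS model) could be dropped in place of `interpolate_linear` and scored with `masked_metrics` under identical missingness conditions.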