To incorporate a high share of intermittent renewable sources in energy systems, energy system optimization models rely on weather and climate time series data. However, data for renewable energy sources often contain missing values due to sensor or transmission faults. This study evaluates various data imputation methods for minute-resolution time series of global horizontal irradiance, direct normal irradiance, and wind speed, with missingness ranging from two to ninety percent. Alongside standard statistical tests, a novel validation criterion is introduced by directly evaluating the impact of imputation methods on energy system modeling. While certain imputation methods demonstrate strong point-wise statistical accuracy, they do not necessarily preserve the underlying data distribution. The performance of these methods is strongly influenced by the type of time series and by the missingness mechanism, that is, whether values are missing as continuous gaps or as randomly scattered data points. In energy system optimization, multiple imputation by chained equations, k-nearest neighbors, linear interpolation, and simple moving average yield the best results, outperforming more sophisticated deep learning-based methods. Overall, k-nearest neighbors consistently outperforms the other approaches across all validation criteria. By comprehensively evaluating the statistical performance of imputation methods and their impact on energy system modeling, this study offers valuable insights for researchers and practitioners addressing missing data in energy system applications.
Data imputation methods for intermittent renewable energy sources: Implications for energy system modeling / Mantuano, Claudio; Omoyele, Olalekan; Hoffmann, Maximilian; Weinand, Jann Michael; Panella, Massimo; Stolten, Detlef. - In: ENERGY CONVERSION AND MANAGEMENT. - ISSN 0196-8904. - 339:(2025), pp. 1-22. [10.1016/j.enconman.2025.119857]
Data imputation methods for intermittent renewable energy sources: Implications for energy system modeling
Mantuano, Claudio; Panella, Massimo
2025
| File | Size | Format |
|---|---|---|
| Mantuano_Data_2025.pdf (open access). Type: publisher's version (published version with the publisher's layout). License: Creative Commons | 9.37 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


