07 Sep 2020

Review status: a revised version of this preprint is currently under review for the journal GI.

A comparison of gap-filling algorithms for eddy covariance fluxes and their drivers

Atbin Mahabbati1, Jason Beringer1, Matthias Leopold1, Ian McHugh2, James Cleverly3, Peter Isaac4, and Azizallah Izady5
  • 1School of Agriculture and Environment, The University of Western Australia, 35 Stirling Hwy, Crawley, Perth WA, 6009, Australia
  • 2School of Ecosystem and Forest Sciences, The University of Melbourne, Richmond, VIC, 3121, Australia
  • 3School of Life Sciences, University of Technology Sydney, Broadway, NSW 2007
  • 4OzFlux Central Node, TERN Ecosystem Processes, Melbourne, VIC 3159, Australia
  • 5Water Research Center, Sultan Qaboos University, Muscat, Oman

Abstract. The errors and uncertainties associated with gap-filling algorithms for water, carbon and energy flux data have long been one of the prominent challenges for the global network of microclimatological tower sites that use the eddy covariance (EC) technique. To address this concern and find more efficient gap-filling algorithms, we reviewed eight algorithms for estimating missing values of environmental drivers and, separately, of three major fluxes in EC time series. We then examined the performance of these algorithms under different gap-filling scenarios, using data from five OzFlux Network towers during 2013. The objectives of this research were (a) to evaluate the impact of training and testing window lengths on the performance of each algorithm; and (b) to compare the performance of traditional and new gap-filling techniques for EC data, for both the fluxes and their corresponding meteorological drivers. The performance of the algorithms was evaluated by generating nine different training-testing window lengths, ranging from one day to 365 days. In each scenario, the gaps were repeated consecutively so that they covered the data for the entirety of 2013, and in each step the values were modelled using data from the preceding window. After running each scenario, a variety of statistical metrics was used to evaluate the performance of the algorithms. The algorithms showed different levels of sensitivity to the training-testing windows: the Prophet Forecast Model (FBP) was the most sensitive, whilst the performance of artificial neural networks (ANNs), for instance, did not vary considerably with window length. The performance of the algorithms generally decreased with increasing training-testing window length, yet the differences were not considerable for windows shorter than 60 days.
Gap-filling of the environmental drivers showed no significant difference amongst the algorithms. The linear algorithms showed slight superiority over the machine learning (ML) algorithms, except for the random forest (RF) algorithm when estimating the ground heat flux (RMSEs of 30.17 and 34.93 for RF and CLR, respectively). For the major fluxes, though, the ML algorithms showed superiority (9 % lower RMSE on average), except for Support Vector Regression (SVR), which produced significantly biased estimates. Even though ANNs, RF and extreme gradient boost (XGB) performed similarly in gap-filling the major fluxes, RF provided more consistent results with relatively less bias. The results indicated that no single algorithm outperforms the others in all situations, but RF is a potential alternative to ANNs for flux gap-filling.
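The training-testing window scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: it assumes a single driver, a simple least-squares line in place of the eight algorithms, synthetic data, half-hourly sampling (48 samples per day), and three illustrative window lengths. The function names `fit_line` and `rolling_window_eval` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a real gap-filling model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def rolling_window_eval(driver, flux, window):
    """Treat each block of `window` samples as an artificial gap.

    Every gap is predicted by a model trained on the block of equal
    length immediately before it. Returns (rmse, bias) over all gaps.
    """
    if len(flux) < 2 * window:
        raise ValueError("series must hold at least two windows")
    errors = []
    for start in range(window, len(flux) - window + 1, window):
        train = slice(start - window, start)
        test = slice(start, start + window)
        a, b = fit_line(driver[train], flux[train])
        errors.extend(a + b * x - y for x, y in zip(driver[test], flux[test]))
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return rmse, bias


# Synthetic example (assumed values): 120 days of half-hourly data.
random.seed(0)
SAMPLES_PER_DAY = 48  # assumption: half-hourly records
n = 120 * SAMPLES_PER_DAY
driver = [math.sin(2 * math.pi * i / SAMPLES_PER_DAY) for i in range(n)]
flux = [2.0 * x + 1.0 + random.gauss(0.0, 0.1) for x in driver]

# Illustrative window lengths in days, not the nine used in the study.
results = {
    days: rolling_window_eval(driver, flux, days * SAMPLES_PER_DAY)
    for days in (1, 7, 30)
}
for days, (rmse, bias) in results.items():
    print(f"{days:>3} d window: RMSE={rmse:.3f}, bias={bias:+.3f}")
```

Swapping `fit_line` for another regressor would allow the same loop to compare algorithms at each window length, using RMSE and bias as two of the possible evaluation metrics.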


Short summary
We reviewed eight algorithms for estimating missing values of environmental drivers and, separately, of three major fluxes in eddy covariance time series. Overall, machine learning algorithms showed superiority over the rest, and among the top three models (feedforward neural networks, eXtreme Gradient Boost, and random forest), the latter showed the most solid performance across the different scenarios.