2025-01-01, Vol. 18
This study evaluates the accuracy and interpretability of 12 machine learning (ML) models for estimating daily reference evapotranspiration (ETo) under semi-arid conditions in Burkina Faso, in the West African Sahel. Meteorological data (1988-2017) from 9 synoptic stations were used to evaluate model performance. Variable importance was interpreted using SHapley Additive exPlanations (SHAP) values and compared against that of the reference FAO-56 Penman-Monteith model. Spatiotemporal patterns in the meteorological variables influencing ETo were first analysed through trend and change-point analyses. The ML models were then calibrated station-wise using a fivefold cross-validation scheme. All ML models demonstrated strong predictive capabilities at the daily timescale (R² = 0.93-1.00, RMSE = 0.05-0.20 mm day⁻¹, NRMSE = 0.40%-2.30%, KGE = 0.99-1.00). In terms of accuracy, Extreme Gradient Boosting (XGBoost) emerged as the top-performing model, with the lowest MAE (0.03 mm day⁻¹). Tree-based ensemble methods and advanced neural networks consistently outperformed the other ML approaches across multiple evaluation metrics. At the monthly and annual timescales, the ML models accurately captured ETo patterns and interannual variability. The SHAP analysis showed that Random Forest (RF), Support Vector Machine (SVM), and the boosted models (GBoost and XGBoost) most accurately represented the variable importance hierarchy for ETo estimation, although most models overestimated the contribution of wind speed. This study highlights the potential of ML approaches for ETo estimation in semi-arid regions while emphasizing the importance of model interpretability. The findings have significant implications for irrigation planning, water resource management, and climate impact assessment in semi-arid regions.
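The evaluation metrics reported above (MAE, RMSE, NRMSE, R², KGE) can be illustrated with a minimal sketch. The abstract does not give the exact formulations, so the following are assumptions: NRMSE is normalised by the mean of the reference series, R² is taken as the squared Pearson correlation, and KGE follows the widely used 2009 formulation based on correlation, variability ratio, and bias ratio. The function name `eto_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def eto_metrics(obs, sim):
    """Return MAE, RMSE, NRMSE (%), R2 and KGE for paired ETo series.

    obs: reference ETo values (e.g. FAO-56 Penman-Monteith), mm/day
    sim: model-estimated ETo values, mm/day
    """
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim must have the same length (>= 2)")
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n

    errors = [s - o for o, s in zip(obs, sim)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumption: NRMSE is expressed as a percentage of the reference mean.
    nrmse = 100.0 * rmse / mean_o

    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    var_s = sum((s - mean_s) ** 2 for s in sim) / n
    if var_o == 0 or var_s == 0:
        raise ValueError("correlation is undefined for a constant series")
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim)) / n
    r = cov / math.sqrt(var_o * var_s)

    # Assumption: KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2),
    # with alpha the ratio of standard deviations and beta the ratio of means.
    alpha = math.sqrt(var_s / var_o)
    beta = mean_s / mean_o
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

    return {"MAE": mae, "RMSE": rmse, "NRMSE": nrmse, "R2": r * r, "KGE": kge}


# Synthetic illustrative values only (mm/day), not data from the study.
reference = [4.0, 5.0, 6.0, 7.0]
estimated = [4.1, 5.1, 6.1, 7.1]
for name, value in eto_metrics(reference, estimated).items():
    print(f"{name}: {value:.4f}")
```

A constant offset such as the one in this toy series leaves the correlation and variability terms perfect, so only the bias term lowers KGE. This is one reason KGE is often reported alongside RMSE and MAE rather than in place of them.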