
Universal Downscaling Task: Model Evaluation and Comparison

The universal downscaling task compares two models. For the GLM model, 81 separate GLMs are constructed, one to predict each of the target global solar radiation grid points. For the CNN-LSTMU architecture, the grid points of all GCM predictors (except the prsn and plev variables, which have missing data) are provided as inputs and all 81 radiation grid points are generated by the network output (as shown in Figure 24). The sites Springs and Mossman South Alchera Drive were both excluded from the data set because of missing data for some of the target grid points, leaving 16 sites available for evaluation. The configuration of the CNN-LSTMU architecture was obtained from a preliminary grid search, which narrowed the choices for the CNN filter configuration; the final candidate was selected after fine-tuning the loss function and learning rate, as described in Table 12. Metrics are calculated over all 81 grid points by flattening each grid from a matrix into a single vector and concatenating the vectors for every date in the held-out test period.
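The flattening step described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names `flatten_period`, `rmse` and `mae`, the toy 2 x 2 grids (standing in for the 81-point target grid) and all values are assumptions made for the example.

```python
import math


def flatten_period(daily_grids):
    """Concatenate a sequence of 2-D grids (one per day) into one flat list."""
    return [value for grid in daily_grids for row in grid for value in row]


def rmse(observed, simulated):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, simulated):
    """Mean absolute error between two equal-length sequences."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    return sum(abs(s - o) for o, s in zip(observed, simulated)) / len(observed)


# Hypothetical two-day test period on a toy 2 x 2 grid (values in W m^-2).
obs_days = [[[200.0, 210.0], [220.0, 230.0]],
            [[190.0, 205.0], [215.0, 225.0]]]
sim_days = [[[204.0, 207.0], [220.0, 235.0]],
            [[190.0, 201.0], [218.0, 225.0]]]

obs_vec = flatten_period(obs_days)
sim_vec = flatten_period(sim_days)

print(len(obs_vec), rmse(obs_vec, sim_vec), mae(obs_vec, sim_vec))
```

Because every grid point and every date ends up in the same vector, each metric is a single pooled score for the site rather than an average of per-grid-point scores.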

Figure 49 Taylor diagram comparing GLM_UNIVERSAL and CNN-LSTM UNIVERSAL models.

As the Taylor diagram in Figure 49 illustrates, the CNN-LSTMU model exhibits slightly higher correlation and lower RMSE than the GLM model; however, its standard deviation is lower than that of the observed data. The Kling-Gupta Efficiency penalises the lower mean and standard deviation of the CNN-LSTMU, so the GLM model achieves a better score at the majority of sites (Table 22). For example, across all grid points and sites the observations have an overall mean of \(228\ \mathrm{W\,m^{-2}}\) and an overall standard deviation of \(47.27\ \mathrm{W\,m^{-2}}\). In comparison, the CNN-LSTMU simulation has an overall mean and standard deviation of \(223.74\ \mathrm{W\,m^{-2}}\) and \(37.14\ \mathrm{W\,m^{-2}}\) respectively, and the GLM simulation \(229.07\ \mathrm{W\,m^{-2}}\) and \(43.9\ \mathrm{W\,m^{-2}}\). This result contrasts with the other metrics, where the CNN-LSTMU performs better at all sites except Carpentaria Downs Station and Majors Creek.

The CNN-LSTMU model exhibits higher \(R^{2}\) at the majority of sites, with the highest value of 0.83 at Harewood and Mount Larcom Post Office and the lowest of 0.57 at Carpentaria Downs Station (Table 23). The GLM model has its highest \(R^{2}\) of 0.74 at Harewood and its minimum of 0.52 at Carpentaria Downs Station. Willmott’s Index of Agreement is higher for the GLM at four of the observation sites, with its highest value of 0.92 at Woolooga and its lowest of 0.83 at Carpentaria Downs Station (Table 23). The CNN-LSTMU model exhibits higher agreement at the majority of sites, with a maximum of 0.94 at Mount Larcom Post Office and a minimum of 0.77 at Carpentaria Downs Station. The GLM model demonstrates a higher Nash-Sutcliffe Efficiency at three sites, with a maximum of 0.73 at Harewood and Miles Post Office and a minimum of 0.32 at Carpentaria Downs Station (Table 22). The maximum efficiency achieved by the CNN-LSTMU model is 0.81 at Mount Larcom Post Office, with a minimum of 0.13 at Carpentaria Downs Station.

The GLM model has lower RMSE at Carpentaria Downs Station and Majors Creek; its minimum RMSE is \(24.43\ \mathrm{W\,m^{-2}}\) at Woolooga and its maximum is \(39.54\ \mathrm{W\,m^{-2}}\) at Woleebee Nevasa (Table 24). The CNN-LSTMU model has lower RMSE at the remaining sites, with the smallest RMSE of \(19.43\ \mathrm{W\,m^{-2}}\) at Mount Larcom Post Office and the largest of \(36.27\ \mathrm{W\,m^{-2}}\) at Carpentaria Downs Station. The maximum MAE produced by the GLM model also corresponds to the Woleebee Nevasa site, at \(31.20\ \mathrm{W\,m^{-2}}\), while its minimum in Table 24 is \(19.28\ \mathrm{W\,m^{-2}}\) at Woolooga. The CNN-LSTMU exhibits a minimum MAE of \(15.25\ \mathrm{W\,m^{-2}}\) at Mount Larcom Post Office and a maximum of \(31.00\ \mathrm{W\,m^{-2}}\) at Carpentaria Downs Station. The RRMSE for the GLM model lies within the 10% to 20% interval, with the best value of 10.83% at Glenlands and the largest of 17.23% at Woleebee Nevasa (Table 25). Two sites for the CNN-LSTMU model (Glenlands and Mount Larcom Post Office) have an RRMSE below 10%, with the best score of 8.43% at Mount Larcom Post Office. The worst RRMSE for the CNN-LSTMU model is 15.43% at Carpentaria Downs Station.
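The way the Kling-Gupta Efficiency penalises a compressed standard deviation can be illustrated with the overall means and standard deviations quoted above. The sketch below assumes the Gupta et al. (2009) form of the KGE, which may differ from the variant used in the study, and the function name `kge` is illustrative. Setting the correlation to 1 isolates the variability and bias terms, giving the highest KGE each model could reach from those two terms alone.

```python
import math


def kge(r, sim_mean, sim_std, obs_mean, obs_std):
    """Kling-Gupta Efficiency, assuming the Gupta et al. (2009) formulation."""
    alpha = sim_std / obs_std    # variability ratio
    beta = sim_mean / obs_mean   # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Overall statistics quoted in the text (W m^-2), pooled over grid points and sites.
OBS_MEAN, OBS_STD = 228.0, 47.27
CNN_MEAN, CNN_STD = 223.74, 37.14
GLM_MEAN, GLM_STD = 229.07, 43.90

# r = 1 is a deliberate simplification: it removes the correlation term so that
# only the variability and bias penalties remain.
kge_ceiling_cnn = kge(1.0, CNN_MEAN, CNN_STD, OBS_MEAN, OBS_STD)
kge_ceiling_glm = kge(1.0, GLM_MEAN, GLM_STD, OBS_MEAN, OBS_STD)

print(f"CNN-LSTMU ceiling: {kge_ceiling_cnn:.2f}, GLM ceiling: {kge_ceiling_glm:.2f}")
```

Under these assumptions the ceiling is roughly 0.78 for the CNN-LSTMU and 0.93 for the GLM, so the variability term alone costs the CNN-LSTMU about 0.2 before correlation is considered. These are pooled figures and are not expected to reproduce the per-site values in Table 22.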

Table 22 Kling-Gupta Efficiency and Nash-Sutcliffe Efficiency per site for the universal downscaling task. Values in bold indicate better scores.

| Site | Kling-Gupta Efficiency (CNN-LSTMU) | Kling-Gupta Efficiency (GLM) | Nash-Sutcliffe Efficiency (CNN-LSTMU) | Nash-Sutcliffe Efficiency (GLM) |
|---|---|---|---|---|
| Barmount | 0.77 | 0.78 | 0.69 | 0.63 |
| Carpentaria Downs Station | 0.71 | 0.70 | 0.13 | 0.32 |
| Comet Post Office | 0.76 | 0.81 | 0.67 | 0.64 |
| Glenlands | 0.81 | 0.82 | 0.79 | 0.68 |
| Harewood | 0.71 | 0.79 | 0.79 | 0.73 |
| Majors Creek | 0.77 | 0.80 | 0.55 | 0.59 |
| Miles Post Office | 0.71 | 0.80 | 0.79 | 0.73 |
| Mount Larcom Post Office | 0.81 | 0.79 | 0.81 | 0.66 |
| New Caledonia | 0.78 | 0.81 | 0.70 | 0.65 |
| Riverview Hopeland | 0.72 | 0.77 | 0.79 | 0.72 |
| Talagai | 0.77 | 0.78 | 0.66 | 0.60 |
| Woleebee Nevasa | 0.72 | 0.78 | 0.78 | 0.39 |
| Woolooga | 0.76 | 0.84 | 0.72 | 0.72 |

Table 23 Comparison of \(R^{2}\) and Willmott’s index of Agreement for universal models. Values in bold indicate better scores.

| Site | \(R^{2}\) (CNN-LSTMU) | \(R^{2}\) (GLM) | Willmott’s Index of Agreement (CNN-LSTMU) | Willmott’s Index of Agreement (GLM) |
|---|---|---|---|---|
| Barmount | 0.74 | 0.64 | 0.90 | 0.89 |
| Carpentaria Downs Station | 0.57 | 0.52 | 0.77 | 0.83 |
| Comet Post Office | 0.74 | 0.67 | 0.90 | 0.90 |
| Glenlands | 0.82 | 0.70 | 0.93 | 0.91 |
| Harewood | 0.83 | 0.74 | 0.93 | 0.92 |
| Majors Creek | 0.68 | 0.64 | 0.87 | 0.89 |
| Miles Post Office | 0.82 | 0.73 | 0.93 | 0.92 |
| Mount Larcom Post Office | 0.83 | 0.68 | 0.94 | 0.90 |
| New Caledonia | 0.75 | 0.67 | 0.91 | 0.90 |
| Riverview Hopeland | 0.82 | 0.73 | 0.93 | 0.91 |
| Talagai | 0.73 | 0.64 | 0.90 | 0.88 |
| Woleebee Nevasa | 0.81 | 0.65 | 0.92 | 0.85 |
| Woolooga | 0.80 | 0.73 | 0.91 | 0.92 |
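Tables 22 and 23 report \(R^{2}\), Willmott's Index of Agreement and the Nash-Sutcliffe Efficiency separately, and the three can diverge sharply at the same site. The sketch below shows one way they might be computed, assuming \(R^{2}\) is the squared Pearson correlation and using the original (1981) form of Willmott's index; neither choice is confirmed by this section, and the function names and data are hypothetical. The toy simulation is the observations plus a constant offset, which leaves the correlation perfect but degrades the other two scores.

```python
def _mean(values):
    return sum(values) / len(values)


def nash_sutcliffe(observed, simulated):
    """1 minus squared error relative to the variance of the observations."""
    o_bar = _mean(observed)
    num = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - o_bar) ** 2 for o in observed)
    return 1.0 - num / den


def willmott_d(observed, simulated):
    """Willmott's index of agreement (original 1981 form, assumed here)."""
    o_bar = _mean(observed)
    num = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    den = sum((abs(s - o_bar) + abs(o - o_bar)) ** 2
              for o, s in zip(observed, simulated))
    return 1.0 - num / den


def r_squared(observed, simulated):
    """Squared Pearson correlation (an assumption about how R^2 is defined)."""
    o_bar, s_bar = _mean(observed), _mean(simulated)
    cov = sum((o - o_bar) * (s - s_bar) for o, s in zip(observed, simulated))
    var_o = sum((o - o_bar) ** 2 for o in observed)
    var_s = sum((s - s_bar) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


# Hypothetical observations (W m^-2) and a simulation with a constant +40 offset.
obs = [180.0, 200.0, 220.0, 240.0, 260.0]
biased = [o + 40.0 for o in obs]

print(r_squared(obs, biased), willmott_d(obs, biased), nash_sutcliffe(obs, biased))
```

A systematic offset is invisible to a correlation-based \(R^{2}\) but is charged in full by the Nash-Sutcliffe Efficiency, which is one possible reading of sites where \(R^{2}\) is moderate while the efficiency is low.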

Table 24 Comparison of Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) for universal models. Values in bold indicate better scores.

| Site | RMSE \(\mathrm{W\,m^{-2}}\) (CNN-LSTMU) | RMSE \(\mathrm{W\,m^{-2}}\) (GLM) | MAE \(\mathrm{W\,m^{-2}}\) (CNN-LSTMU) | MAE \(\mathrm{W\,m^{-2}}\) (GLM) |
|---|---|---|---|---|
| Barmount | 24.75 | 27.27 | 19.66 | 20.90 |
| Carpentaria Downs Station | 36.27 | 31.99 | 31.00 | 24.40 |
| Comet Post Office | 25.63 | 26.87 | 20.29 | 20.51 |
| Glenlands | 20.59 | 25.13 | 16.20 | 19.56 |
| Harewood | 23.45 | 26.44 | 18.59 | 21.25 |
| Majors Creek | 28.06 | 26.82 | 23.36 | 21.22 |
| Miles Post Office | 23.65 | 26.49 | 18.76 | 20.89 |
| Mount Larcom Post Office | 19.43 | 25.74 | 15.25 | 19.90 |
| New Caledonia | 24.40 | 26.30 | 19.28 | 19.94 |
| Riverview Hopeland | 23.15 | 26.69 | 18.28 | 21.20 |
| Talagai | 25.81 | 28.11 | 20.63 | 22.16 |
| Woleebee Nevasa | 23.85 | 39.54 | 18.86 | 31.20 |
| Woolooga | 24.39 | 24.43 | 18.82 | 19.28 |

Table 25 Comparison of Relative Root Mean Square Error (RRMSE) for universal models. Values in bold indicate better scores.

| Site Name | RRMSE % (CNN-LSTMU) | RRMSE % (GLM) |
|---|---|---|
| Barmount | 10.74 | 11.81 |
| Carpentaria Downs Station | 15.43 | 13.58 |
| Comet Post Office | 11.09 | 11.60 |
| Glenlands | 8.89 | 10.83 |
| Harewood | 10.3 | 11.59 |
| Majors Creek | 12.11 | 11.56 |
| Miles Post Office | 10.38 | 11.60 |
| Mount Larcom Post Office | 8.43 | 11.14 |
| New Caledonia | 10.55 | 11.35 |
| Riverview Hopeland | 10.21 | 11.75 |
| Talagai | 11.15 | 12.12 |
| Woleebee Nevasa | 10.42 | 17.23 |
| Woolooga | 11.17 | 11.16 |
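The relationship between Tables 24 and 25 can be checked with a short calculation. The sketch assumes that RRMSE is the RMSE expressed as a percentage of the mean of the observations, a definition this section does not state; the helper names are hypothetical. If the assumption holds, dividing RMSE by RRMSE should recover the same observed mean for both models at a given site, up to rounding.

```python
def rrmse_percent(rmse, observed_mean):
    """RRMSE as a percentage, assuming normalisation by the observed mean."""
    return 100.0 * rmse / observed_mean


def implied_observed_mean(rmse, rrmse_pct):
    """Invert the assumed definition to recover the normalising mean."""
    return 100.0 * rmse / rrmse_pct


# (RMSE in W m^-2, RRMSE in %) taken from Tables 24 and 25.
reported = {
    "Mount Larcom Post Office": {"CNN-LSTMU": (19.43, 8.43), "GLM": (25.74, 11.14)},
    "Carpentaria Downs Station": {"CNN-LSTMU": (36.27, 15.43), "GLM": (31.99, 13.58)},
}

implied = {
    site: {model: implied_observed_mean(*pair) for model, pair in models.items()}
    for site, models in reported.items()
}

for site, means in implied.items():
    print(site, {model: round(value, 1) for model, value in means.items()})
```

For both sites the two models imply observed means within about 1 \(\mathrm{W\,m^{-2}}\) of each other, which is consistent with the assumed definition, although it does not prove that this is the normalisation used.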

Across these metrics, the CNN-LSTMU model performs best at Glenlands and Mount Larcom Post Office and worst at Carpentaria Downs Station.

Comparing the bias at Mount Larcom Post Office, the histogram in Figure 50 indicates that the CNN-LSTMU residuals are slightly more normally distributed than those of the GLM model, with a slightly wider spread of bias between \(-60\) and \(60\ \mathrm{W\,m^{-2}}\).

Figure 50 Histogram of the residual error \(Wm^{- 2}\) for all predictions at Mount Larcom Post Office.

Both models tend to exhibit their largest absolute bias during the summer months and their smallest during the winter months. The mean error per month of the year is higher for January to March, with the CNN-LSTMU model exhibiting a narrower overall range of mean errors, from \(-9.2\ \mathrm{W\,m^{-2}}\) to \(17.7\ \mathrm{W\,m^{-2}}\), than the GLM model, whose mean errors range from \(-12.5\ \mathrm{W\,m^{-2}}\) to \(26.9\ \mathrm{W\,m^{-2}}\) (Figure 51).
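A per-month mean error of the kind summarised above could be computed along these lines. The sketch is illustrative only: the sign convention (simulated minus observed), the function name `mean_error_by_month`, and the dates and values are all assumptions.

```python
from collections import defaultdict
from datetime import date


def mean_error_by_month(dates, observed, simulated):
    """Mean of (simulated - observed) for each calendar month present."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, obs, sim in zip(dates, observed, simulated):
        sums[day.month] += sim - obs
        counts[day.month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


# Hypothetical daily values (W m^-2) for two summer days and one winter day.
dates = [date(2005, 1, 1), date(2005, 1, 2), date(2005, 7, 1)]
observed = [250.0, 260.0, 150.0]
simulated = [262.0, 258.0, 149.0]

monthly = mean_error_by_month(dates, observed, simulated)
error_range = (min(monthly.values()), max(monthly.values()))

print(monthly, error_range)
```

The pair in `error_range` corresponds to the lowest and highest monthly mean error, the quantity quoted as a range for each model.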


Figure 51 Comparison of mean error \(Wm^{- 2}\) for month of year at Mount Larcom Post Office.

Comparing the distribution of bias at Carpentaria Downs Station, Figure 52 indicates a difference in means between the residuals of the GLM and the CNN-LSTMU. The latter has mean residuals at this site closer to \(40\ \mathrm{W\,m^{-2}}\), whereas the errors of the GLM model appear to be closer to a mean of \(0\ \mathrm{W\,m^{-2}}\), between \(-100\ \mathrm{W\,m^{-2}}\) and \(-60\ \mathrm{W\,m^{-2}}\).

Figure 52 Histogram of residuals in \(Wm^{- 2}\) at Carpentaria Downs Station for both models.

Higher extremes of bias at Carpentaria Downs Station also appear to occur during the summer months for the GLM, and during early autumn and spring for the CNN-LSTMU model. In Figure 53, the range of the monthly mean error at this site is lower for the GLM model (\(-34.34\ \mathrm{W\,m^{-2}}\) to \(6.55\ \mathrm{W\,m^{-2}}\)) than for the CNN-LSTMU model (\(-2.58\ \mathrm{W\,m^{-2}}\) to \(46.93\ \mathrm{W\,m^{-2}}\)). The GLM shows larger mean error in summer (February, December), while the CNN-LSTMU shows larger mean error in early autumn (March, April) and spring (September, October).

Figure 53 Comparison of mean error \(Wm^{- 2}\) for month of year at Carpentaria Downs Station.

Future Scenario RCP4.5 2006 to 2020

In most locations under the RCP4.5 climate warming scenario, the CNN-LSTM universal model demonstrates better performance on the evaluation metrics for the period 2006 to 2020. The exceptions occur in the KGE metric for the sites Harewood, Miles Post Office, Riverview Hopeland, Talagai and Woolooga where the GLM exhibits a higher KGE.

Table 26 Evaluation metrics under the RCP4.5 profile for both universal models 2006 to 2020. Values in bold indicate better scores.

| Site Name | Model Name | KGE | E | \(R^{2}\) | d | RMSE \(\mathrm{W\,m^{-2}}\) | MAE \(\mathrm{W\,m^{-2}}\) | RRMSE % |
|---|---|---|---|---|---|---|---|---|
| Barmount | CNN-LSTMU | 0.76 | 0.70 | 0.71 | 0.90 | 22.77 | 17.32 | 10.80 |
| Barmount | GLM | 0.74 | -0.19 | 0.68 | 0.76 | 48.65 | 41.76 | 22.05 |
| Carpentaria Downs Station | CNN-LSTMU | 0.81 | 0.64 | 0.68 | 0.90 | 21.50 | 17.88 | 9.88 |
| Carpentaria Downs Station | GLM | 0.71 | -0.63 | 0.63 | 0.71 | 49.14 | 42.10 | 21.57 |
| Comet Post Office | CNN-LSTMU | 0.76 | 0.73 | 0.74 | 0.91 | 21.72 | 16.99 | 10.11 |
| Comet Post Office | GLM | 0.75 | -0.11 | 0.68 | 0.76 | 47.74 | 40.90 | 21.21 |
| Glenlands | CNN-LSTMU | 0.77 | 0.71 | 0.74 | 0.91 | 22.19 | 16.93 | 10.59 |
| Glenlands | GLM | 0.74 | -0.16 | 0.68 | 0.76 | 47.96 | 40.99 | 21.88 |
| Harewood | CNN-LSTMU | 0.70 | 0.77 | 0.82 | 0.92 | 22.71 | 18.24 | 10.73 |
| Harewood | GLM | 0.78 | 0.15 | 0.78 | 0.81 | 47.15 | 41.24 | 21.30 |
| Majors Creek | CNN-LSTMU | 0.77 | 0.67 | 0.67 | 0.90 | 22.45 | 17.63 | 10.62 |
| Majors Creek | GLM | 0.72 | -0.42 | 0.65 | 0.73 | 49.75 | 42.96 | 22.48 |
| Miles Post Office | CNN-LSTMU | 0.70 | 0.77 | 0.81 | 0.92 | 22.60 | 18.07 | 10.64 |
| Miles Post Office | GLM | 0.77 | 0.18 | 0.76 | 0.81 | 46.05 | 39.08 | 20.72 |
| Mount Larcom Post Office | CNN-LSTMU | 0.78 | 0.70 | 0.76 | 0.91 | 22.32 | 17.01 | 10.76 |
| Mount Larcom Post Office | GLM | 0.73 | -0.33 | 0.69 | 0.73 | 50.71 | 44.12 | 23.36 |
| New Caledonia | CNN-LSTMU | 0.76 | 0.73 | 0.73 | 0.91 | 21.89 | 17.00 | 10.26 |
| New Caledonia | GLM | 0.74 | -0.10 | 0.67 | 0.77 | 47.27 | 39.89 | 21.17 |
| Riverview Hopeland | CNN-LSTMU | 0.71 | 0.76 | 0.82 | 0.92 | 22.78 | 18.29 | 10.81 |
| Riverview Hopeland | GLM | 0.77 | 0.17 | 0.78 | 0.81 | 46.02 | 39.97 | 20.88 |
| Talagai | CNN-LSTMU | 0.75 | 0.72 | 0.72 | 0.91 | 22.25 | 17.26 | 10.43 |
| Talagai | GLM | 0.76 | 0.04 | 0.67 | 0.79 | 44.07 | 37.22 | 19.74 |
| Woleebee Nevasa | CNN-LSTMU | 0.71 | 0.76 | 0.80 | 0.92 | 22.55 | 17.91 | 10.60 |
| Woleebee Nevasa | GLM | 0.66 | -0.67 | 0.67 | 0.71 | 64.65 | 56.39 | 29.02 |
| Woolooga | CNN-LSTMU | 0.73 | 0.62 | 0.78 | 0.88 | 27.02 | 21.52 | 13.39 |
| Woolooga | GLM | 0.74 | -0.15 | 0.73 | 0.76 | 50.51 | 44.38 | 23.94 |
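The per-site comparison in Table 26 can be tallied mechanically. The sketch below copies the KGE and E columns from the table and lists the sites where the GLM scores a higher KGE, together with the sites where the GLM's Nash-Sutcliffe Efficiency (E) is negative. The variable names are illustrative, and the tally uses the rounded values as printed.

```python
# (CNN-LSTMU, GLM) pairs copied from Table 26, in table order.
kge_rcp45 = {
    "Barmount": (0.76, 0.74),
    "Carpentaria Downs Station": (0.81, 0.71),
    "Comet Post Office": (0.76, 0.75),
    "Glenlands": (0.77, 0.74),
    "Harewood": (0.70, 0.78),
    "Majors Creek": (0.77, 0.72),
    "Miles Post Office": (0.70, 0.77),
    "Mount Larcom Post Office": (0.78, 0.73),
    "New Caledonia": (0.76, 0.74),
    "Riverview Hopeland": (0.71, 0.77),
    "Talagai": (0.75, 0.76),
    "Woleebee Nevasa": (0.71, 0.66),
    "Woolooga": (0.73, 0.74),
}

# GLM Nash-Sutcliffe Efficiency (column E) from the same table.
glm_nse_rcp45 = {
    "Barmount": -0.19, "Carpentaria Downs Station": -0.63,
    "Comet Post Office": -0.11, "Glenlands": -0.16, "Harewood": 0.15,
    "Majors Creek": -0.42, "Miles Post Office": 0.18,
    "Mount Larcom Post Office": -0.33, "New Caledonia": -0.10,
    "Riverview Hopeland": 0.17, "Talagai": 0.04,
    "Woleebee Nevasa": -0.67, "Woolooga": -0.15,
}

glm_higher_kge = [site for site, (cnn, glm) in kge_rcp45.items() if glm > cnn]
glm_negative_nse = [site for site, value in glm_nse_rcp45.items() if value < 0]

print(glm_higher_kge)
print(len(glm_negative_nse), "of", len(glm_nse_rcp45), "sites have negative GLM E")
```

A negative Nash-Sutcliffe Efficiency means the simulation has a larger squared error than simply predicting the mean of the observations, so the GLM's negative E values indicate a substantial loss of skill under this scenario even where its KGE remains competitive.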

Future Scenario RCP8.5 2006 to 2020

Under the RCP8.5 climate warming scenario, the KGE indicates better performance for the GLM model at a majority of sites, similar to the test set. However, the CNN-LSTMU exhibits good performance on all other metrics, as shown in Table 27.

Table 27 Evaluation metrics under the RCP8.5 profile for both universal models 2006 to 2020. Values in bold indicate better scores.

| Site Name | Model Name | KGE | E | \(R^{2}\) | d | RMSE \(\mathrm{W\,m^{-2}}\) | MAE \(\mathrm{W\,m^{-2}}\) | RRMSE % |
|---|---|---|---|---|---|---|---|---|
| Barmount | CNN-LSTMU | 0.76 | 0.70 | 0.71 | 0.91 | 22.73 | 17.34 | 10.79 |
| Barmount | GLM | 0.77 | 0.59 | 0.64 | 0.88 | 28.54 | 22.28 | 12.94 |
| Carpentaria Downs Station | CNN-LSTMU | 0.82 | 0.66 | 0.69 | 0.90 | 21.00 | 17.35 | 9.66 |
| Carpentaria Downs Station | GLM | 0.74 | 0.29 | 0.58 | 0.82 | 32.36 | 26.21 | 14.20 |
| Comet Post Office | CNN-LSTMU | 0.76 | 0.74 | 0.74 | 0.92 | 21.56 | 16.78 | 10.03 |
| Comet Post Office | GLM | 0.81 | 0.64 | 0.69 | 0.90 | 27.34 | 20.90 | 12.15 |
| Glenlands | CNN-LSTMU | 0.77 | 0.71 | 0.74 | 0.91 | 22.27 | 17.03 | 10.63 |
| Glenlands | GLM | 0.79 | 0.58 | 0.64 | 0.88 | 28.87 | 21.71 | 13.17 |
| Harewood | CNN-LSTMU | 0.70 | 0.77 | 0.82 | 0.92 | 22.74 | 18.25 | 10.75 |
| Harewood | GLM | 0.81 | 0.70 | 0.75 | 0.91 | 27.81 | 21.60 | 12.56 |
| Majors Creek | CNN-LSTMU | 0.78 | 0.67 | 0.68 | 0.90 | 22.26 | 17.39 | 10.53 |
| Majors Creek | GLM | 0.75 | 0.47 | 0.58 | 0.85 | 30.30 | 23.82 | 13.70 |
| Miles Post Office | CNN-LSTMU | 0.70 | 0.77 | 0.81 | 0.92 | 22.59 | 18.07 | 10.64 |
| Miles Post Office | GLM | 0.83 | 0.76 | 0.76 | 0.93 | 24.96 | 19.19 | 11.23 |
| Mount Larcom Post Office | CNN-LSTMU | 0.78 | 0.70 | 0.76 | 0.91 | 22.42 | 17.10 | 10.80 |
| Mount Larcom Post Office | GLM | 0.74 | 0.58 | 0.62 | 0.87 | 28.56 | 21.88 | 13.15 |
| New Caledonia | CNN-LSTMU | 0.76 | 0.73 | 0.73 | 0.91 | 21.75 | 16.85 | 10.20 |
| New Caledonia | GLM | 0.80 | 0.62 | 0.67 | 0.89 | 27.68 | 21.43 | 12.40 |
| Riverview Hopeland | CNN-LSTMU | 0.71 | 0.76 | 0.82 | 0.92 | 22.79 | 18.32 | 10.82 |
| Riverview Hopeland | GLM | 0.79 | 0.72 | 0.75 | 0.91 | 26.83 | 20.81 | 12.17 |
| Talagai | CNN-LSTMU | 0.76 | 0.72 | 0.72 | 0.91 | 22.09 | 17.12 | 10.35 |
| Talagai | GLM | 0.81 | 0.67 | 0.68 | 0.91 | 25.92 | 19.26 | 11.61 |
| Woleebee Nevasa | CNN-LSTMU | 0.71 | 0.77 | 0.80 | 0.92 | 22.49 | 17.86 | 10.57 |
| Woleebee Nevasa | GLM | 0.77 | 0.31 | 0.65 | 0.83 | 41.49 | 34.00 | 18.62 |
| Woolooga | CNN-LSTMU | 0.73 | 0.61 | 0.78 | 0.88 | 27.32 | 21.82 | 13.54 |
| Woolooga | GLM | 0.80 | 0.59 | 0.67 | 0.89 | 30.17 | 23.47 | 14.30 |

Downscaling Global Climate Models with Convolutional and Long-Short-Term Memory Networks for Solar Energy Applications by C.P. Davey is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.