Diebold-Mariano test
This module provides a function DM that implements the one-sided version of the Diebold-Mariano (DM) test in the context of electricity price forecasting. Besides the DM test, the module also provides a function plot_multivariate_DM_test to plot the DM results when comparing multiple forecasts.
DM test
The Diebold-Mariano (DM) test is probably the most commonly used tool to evaluate the significance of differences in forecasting accuracy. Given the forecasts of two models A and B, it is an asymptotic z-test of the hypothesis that the mean of the loss differential series
\[\Delta^\mathrm{A,B}_{k} = L(\varepsilon^\mathrm{A}_{k}) - L(\varepsilon^\mathrm{B}_{k})\]
is zero, where \(\varepsilon^\mathrm{Z}_{k}=p_{k}-\hat{p}_{k}\) is the prediction error of model Z for time step \(k\) and \(L(\cdot)\) is the loss function. For point forecasts, we usually take \(L(\varepsilon^\mathrm{Z}_{k})=|\varepsilon^\mathrm{Z}_{k}|^p\) with \(p=1\) or \(2\), which correspond to the absolute and squared losses, respectively.
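As a minimal sketch, the loss differential series (taken here as the loss of model A minus the loss of model B) can be computed with NumPy as follows. The helper name loss_differential and the toy prices are illustrative assumptions and are not part of epftoolbox.

```python
import numpy as np


def loss_differential(p_real, p_pred_a, p_pred_b, p=1):
    """Loss differential L(eps_A) - L(eps_B) with L(eps) = |eps|**p.

    Hypothetical helper written for illustration only.
    """
    eps_a = p_real - p_pred_a  # prediction errors of model A
    eps_b = p_real - p_pred_b  # prediction errors of model B
    return np.abs(eps_a) ** p - np.abs(eps_b) ** p


# Toy data (made up for illustration): four time steps
p_real = np.array([30.0, 32.0, 35.0, 31.0])
p_pred_a = np.array([28.0, 33.0, 39.0, 31.0])
p_pred_b = np.array([29.0, 32.0, 36.0, 33.0])

delta_abs = loss_differential(p_real, p_pred_a, p_pred_b, p=1)  # absolute loss
delta_sq = loss_differential(p_real, p_pred_a, p_pred_b, p=2)   # squared loss
print(delta_abs)
print(delta_sq)
```

A positive entry means that model A had the larger loss at that time step, while a negative entry means that model B did.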
This module implements the one-sided version of the DM test through the function DM. Given the forecast of a model A and the forecast of a model B, the test evaluates the null hypothesis \(H_0\) that the mean loss of model A is lower than or equal to that of model B. Hence, rejecting the null \(H_0\) means that the forecasts of model B are significantly more accurate than those of model A.
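The one-sided test can be sketched as an upper-tail z-test on a loss differential series defined as the loss of model A minus the loss of model B. The helper one_sided_dm_pvalue below is hypothetical; it assumes a plain sample variance without any autocorrelation correction, which may differ from what the library does internally.

```python
import math

import numpy as np


def one_sided_dm_pvalue(delta):
    """p-value of H0: E[delta] <= 0, with delta = L(eps_A) - L(eps_B).

    Hypothetical helper, not the library's DM function.
    """
    delta = np.asarray(delta, dtype=float)
    n = delta.size
    # z-statistic: sample mean divided by its estimated standard error
    # (assumption: plain sample variance, no autocorrelation correction)
    dm_stat = delta.mean() / math.sqrt(delta.var() / n)
    # Upper-tail probability of the standard normal, 1 - Phi(dm_stat)
    return 0.5 * math.erfc(dm_stat / math.sqrt(2.0))


# Synthetic loss differentials (made up): model A loses 0.5 more on average
rng = np.random.default_rng(0)
delta = rng.normal(loc=0.5, scale=1.0, size=500)

p_b_better = one_sided_dm_pvalue(delta)   # small: H0 rejected, B more accurate
p_a_better = one_sided_dm_pvalue(-delta)  # roles swapped: H0 not rejected
print(p_b_better, p_a_better)
```

Because the test is one-sided, swapping the two models does not return the same p-value: under this sketch the two p-values add up to one.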
The module provides the two standard versions of the test in electricity price forecasting: a univariate and a multivariate version. The univariate version has the advantage of providing a deeper analysis, as it indicates which forecast is significantly better for which hour of the day. The multivariate version provides a more compact representation of the results, as it summarizes the comparison in a single p-value.
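The difference between the two versions lies in how the loss differential series is built. The sketch below uses synthetic errors and the norm-1 loss; the aggregation over hours used for the multivariate series (a sum of absolute errors per day) is an assumption made for illustration, and the library may scale it differently.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days, n_hours = 100, 24

# Synthetic errors (assumption: purely illustrative, not market data)
err_1 = rng.normal(scale=3.0, size=(n_days, n_hours))  # errors of forecast 1
err_2 = rng.normal(scale=2.0, size=(n_days, n_hours))  # errors of forecast 2

# Univariate: one loss differential series per hour, shape (n_days, n_hours)
delta_uni = np.abs(err_1) - np.abs(err_2)

# Multivariate: one series for the whole day, shape (n_days,),
# here the difference of the daily 1-norms of the error vectors
delta_multi = np.abs(err_1).sum(axis=1) - np.abs(err_2).sum(axis=1)

print(delta_uni.shape)    # (100, 24): one column per hourly test
print(delta_multi.shape)  # (100,): a single series for the joint test
```

Each column of delta_uni feeds its own test and yields its own p-value, whereas delta_multi feeds a single test.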

epftoolbox.evaluation.DM(p_real, p_pred_1, p_pred_2, norm=1, version='univariate')
Function that performs the one-sided DM test in the context of electricity price forecasting.
The test compares whether there is a difference in predictive accuracy between two forecasts p_pred_1 and p_pred_2. In particular, the one-sided DM test evaluates the null hypothesis H0 of the forecasting errors of p_pred_2 being larger (worse) than the forecasting errors of p_pred_1 vs the alternative hypothesis H1 of the errors of p_pred_2 being smaller (better). Hence, rejecting H0 means that the forecast p_pred_2 is significantly more accurate than the forecast p_pred_1. (Note that this is an informal definition; a formal one is given in the linked reference.)
Two versions of the test are possible:
1. A univariate version with as many independent tests performed as prices per day, i.e. 24 tests in most day-ahead electricity markets.
2. A multivariate version with the test performed jointly for all hours using the multivariate loss differential series (see this article for details).
Parameters:  p_real (numpy.ndarray) – Array of shape \((n_\mathrm{days}, n_\mathrm{prices/day})\) representing the real market prices
 p_pred_1 (numpy.ndarray) – Array of shape \((n_\mathrm{days}, n_\mathrm{prices/day})\) representing the first forecast
 p_pred_2 (numpy.ndarray) – Array of shape \((n_\mathrm{days}, n_\mathrm{prices/day})\) representing the second forecast
 norm (int, optional) – Norm used to compute the loss differential series. At the moment, this value must either be 1 (for the norm-1) or 2 (for the norm-2).
 version (str, optional) – Version of the test as defined above. It can have two values: 'univariate' or 'multivariate'.
Returns: The p-value after performing the test. It is a float in the case of the multivariate test and a numpy array with a p-value per hour for the univariate test.
Return type: float, numpy.ndarray
Example
>>> from epftoolbox.evaluation import DM
>>> from epftoolbox.data import read_data
>>> import pandas as pd
>>>
>>> # Generating forecasts of multiple models
>>>
>>> # Download the forecasts of the NP market available in the library repository
>>> # These forecasts accompany the original paper
>>> forecasts = pd.read_csv('https://raw.githubusercontent.com/jeslago/epftoolbox/master/' +
...                         'forecasts/Forecasts_NP_DNN_LEAR_ensembles.csv', index_col=0)
>>>
>>> # Deleting the real price field as it is the actual real price and not a forecast
>>> del forecasts['Real price']
>>>
>>> # Transforming indices to datetime format
>>> forecasts.index = pd.to_datetime(forecasts.index)
>>>
>>> # Extracting the real prices from the market
>>> _, df_test = read_data(path='.', dataset='NP', begin_test_date=forecasts.index[0],
...                        end_test_date=forecasts.index[-1])
Test datasets: 2016-12-27 00:00:00 - 2018-12-24 23:00:00
>>>
>>> real_price = df_test.loc[:, ['Price']]
>>>
>>> # Testing the univariate DM version on an ensemble of DNN models versus an ensemble
>>> # of LEAR models
>>> DM(p_real=real_price.values.reshape(-1, 24),
...    p_pred_1=forecasts.loc[:, 'LEAR Ensemble'].values.reshape(-1, 24),
...    p_pred_2=forecasts.loc[:, 'DNN Ensemble'].values.reshape(-1, 24),
...    norm=1, version='univariate')
array([9.99999944e-01, 9.97562415e-01, 8.10333949e-01, 8.85201928e-01,
       9.33505978e-01, 8.78116764e-01, 1.70135981e-02, 2.37961920e-04,
       5.52337353e-04, 6.07843340e-05, 1.51249750e-03, 1.70415008e-03,
       4.22319907e-03, 2.32808010e-03, 3.55958698e-03, 4.80663621e-03,
       1.64841032e-04, 4.55829140e-02, 5.86609688e-02, 1.98878375e-03,
       1.04045731e-01, 8.71203187e-02, 2.64266732e-01, 4.06676195e-02])
>>>
>>> # Testing the multivariate DM version
>>> DM(p_real=real_price.values.reshape(-1, 24),
...    p_pred_1=forecasts.loc[:, 'LEAR Ensemble'].values.reshape(-1, 24),
...    p_pred_2=forecasts.loc[:, 'DNN Ensemble'].values.reshape(-1, 24),
...    norm=1, version='multivariate')
0.003005725748326471
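The univariate output can be post-processed to list the positions at which H0 is rejected. The sketch below reuses the p-values of the example, each read as a probability between 0 and 1; the 5% significance level is an assumption (a common choice, not something fixed by the library).

```python
import numpy as np

# Univariate p-values from the example output, one per price of the day
p_values = np.array([
    9.99999944e-01, 9.97562415e-01, 8.10333949e-01, 8.85201928e-01,
    9.33505978e-01, 8.78116764e-01, 1.70135981e-02, 2.37961920e-04,
    5.52337353e-04, 6.07843340e-05, 1.51249750e-03, 1.70415008e-03,
    4.22319907e-03, 2.32808010e-03, 3.55958698e-03, 4.80663621e-03,
    1.64841032e-04, 4.55829140e-02, 5.86609688e-02, 1.98878375e-03,
    1.04045731e-01, 8.71203187e-02, 2.64266732e-01, 4.06676195e-02])

alpha = 0.05  # significance level (assumption, not set by the library)

# Zero-based positions in the array where H0 is rejected,
# i.e. where p_pred_2 is significantly more accurate than p_pred_1
rejected = np.flatnonzero(p_values < alpha)
print(rejected)
print(len(rejected), "of", p_values.size, "tests reject H0")
```

In this example H0 is rejected at 14 of the 24 positions, so the second forecast is significantly more accurate for a majority of the daily prices but not for all of them.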
plot_multivariate_DM_test
The plot_multivariate_DM_test function provides an easy-to-use interface to plot, as a heat map with a chessboard shape, the results of using the DM test to compare the forecasts of multiple models. An example of the heat map is provided below in the function example.
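The data behind such a heat map is a square matrix of pairwise p-values. The sketch below builds that matrix for three synthetic forecasts. The helper multivariate_pvalue is a hypothetical stand-in for a multivariate one-sided DM test with norm 1, and the model names, the noise levels and the NaN diagonal are all assumptions made for illustration.

```python
import math

import numpy as np


def multivariate_pvalue(p_real, p_pred_1, p_pred_2):
    """Hypothetical stand-in for a multivariate one-sided DM p-value (norm 1)."""
    # Daily loss differential: positive when p_pred_1 has the larger daily error
    delta = (np.abs(p_real - p_pred_1).mean(axis=1)
             - np.abs(p_real - p_pred_2).mean(axis=1))
    stat = delta.mean() / math.sqrt(delta.var() / delta.size)
    # Upper-tail probability of the standard normal
    return 0.5 * math.erfc(stat / math.sqrt(2.0))


rng = np.random.default_rng(1)
p_real = rng.normal(40.0, 10.0, size=(200, 24))
# Three synthetic forecasts with increasing noise (made-up model names)
forecasts = {
    "good": p_real + rng.normal(0.0, 1.0, size=p_real.shape),
    "medium": p_real + rng.normal(0.0, 2.0, size=p_real.shape),
    "poor": p_real + rng.normal(0.0, 4.0, size=p_real.shape),
}

names = list(forecasts)
# Rows play the role of the y-axis forecast, columns of the x-axis forecast
p_values = np.full((len(names), len(names)), np.nan)
for i, row_model in enumerate(names):
    for j, col_model in enumerate(names):
        if i != j:  # a model is not compared with itself
            p_values[i, j] = multivariate_pvalue(
                p_real, forecasts[row_model], forecasts[col_model])

print(np.round(p_values, 3))
```

A value close to 0 in row "poor" and column "good" indicates that the column forecast is significantly more accurate than the row forecast; the mirrored cell is close to 1 because the test is one-sided.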

epftoolbox.evaluation.plot_multivariate_DM_test(real_price, forecasts, norm=1, title='DM test', savefig=False, path='')
Plotting the results of comparing forecasts using the multivariate DM test.
The resulting plot is a heat map in a chessboard shape. Each cell represents the p-value of the null hypothesis of the forecast on the y-axis being more accurate than the forecast on the x-axis. In other words, p-values close to 0 represent cases where the forecast on the x-axis is significantly more accurate than the forecast on the y-axis.
Parameters:  real_price (pandas.DataFrame) – Dataframe that contains the real prices
 forecasts (pandas.DataFrame) – Dataframe that contains the forecasts of different models. The column names are the forecast/model names. The number of datapoints should equal the number of datapoints in real_price.
 norm (int, optional) – Norm used to compute the loss differential series. At the moment, this value must either be 1 (for the norm-1) or 2 (for the norm-2).
 title (str, optional) – Title of the generated plot
 savefig (bool, optional) – Boolean that selects whether the figure should be saved in the current folder
 path (str, optional) – Path to save the figure. Only necessary when savefig=True
Example
>>> from epftoolbox.evaluation import DM, plot_multivariate_DM_test
>>> from epftoolbox.data import read_data
>>> import pandas as pd
>>>
>>> # Generating forecasts of multiple models
>>>
>>> # Download the forecasts of the NP market available in the library repository
>>> # These forecasts accompany the original paper
>>> forecasts = pd.read_csv('https://raw.githubusercontent.com/jeslago/epftoolbox/master/' +
...                         'forecasts/Forecasts_NP_DNN_LEAR_ensembles.csv', index_col=0)
>>>
>>> # Deleting the real price field as it is the actual real price and not a forecast
>>> del forecasts['Real price']
>>>
>>> # Transforming indices to datetime format
>>> forecasts.index = pd.to_datetime(forecasts.index)
>>>
>>> # Extracting the real prices from the market
>>> _, df_test = read_data(path='.', dataset='NP', begin_test_date=forecasts.index[0],
...                        end_test_date=forecasts.index[-1])
Test datasets: 2016-12-27 00:00:00 - 2018-12-24 23:00:00
>>>
>>> real_price = df_test.loc[:, ['Price']]
>>>
>>> # Generating a plot to compare the models using the multivariate DM test
>>> plot_multivariate_DM_test(real_price=real_price, forecasts=forecasts)