Python – R2 from Statsmodels VAR


Is there an easy way to extract R2 from Statsmodels’ VAR package?

Following the example in the statsmodels documentation:
http://www.statsmodels.org/dev/vector_ar.html

from statsmodels.tsa.api import VAR
model = VAR(data)
results = model.fit(2)
results.summary()

Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Tue, 28, Feb, 2017
Time:                     21:38:11
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -27.5830
Nobs:                     200.000    HQIC:                  -27.7892
Log likelihood:           1962.57    FPE:                7.42129e-13
AIC:                     -27.9293    Det(Omega_mle):     6.69358e-13
--------------------------------------------------------------------
.
.

It then goes on to show the coefficients for each equation and finally the correlation matrix of the residuals. However, it does not display the R-squared of each equation.

Does anyone know of an easy way to extract the R-squared from a statsmodels VAR without having to calculate it from scratch?

Solution

One option is to apply sklearn.metrics.r2_score to each equation (unfortunately this means going outside statsmodels). The example code assumes that the DataFrame data has a column named 'foobar', which is the equation whose R2 we extract. The VAR() and fit() calls should of course be adapted to your specific situation.

import statsmodels.api as sm
import sklearn.metrics as skm
estVAR = sm.tsa.VAR(data).fit(1)
skm.r2_score(estVAR.fittedvalues['foobar']+estVAR.resid['foobar'],
  estVAR.fittedvalues['foobar'])

The reason for adding the fitted values to the residuals is to recover the actual observations for which VAR can construct fitted values (rather than the entire sample, since the first observations are lost to the lagged right-hand-side variables). Incidentally, we can confirm that this is the R2 we want by computing it directly:

import numpy as np
y = estVAR.fittedvalues['foobar'] + estVAR.resid['foobar']
R2 = 1 - np.sum(estVAR.resid['foobar'].values**2) / np.sum((y.values - y.mean())**2)
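The same formula can be applied to every equation at once. The following is a minimal sketch, not part of statsmodels: the helper name r2_per_equation is hypothetical, and it assumes that the fitted values and residuals are pandas DataFrames with one column per equation. The numbers in the example are made up for illustration and are not real VAR output.

```python
import pandas as pd


def r2_per_equation(fitted: pd.DataFrame, resid: pd.DataFrame) -> pd.Series:
    """Return one R2 per column (equation) from fitted values and residuals.

    Hypothetical helper; assumes both frames share the same index and columns.
    """
    # Recover the observations actually used in estimation
    # (the first rows of the sample are lost to the lags).
    actual = fitted + resid
    # Residual sum of squares, per column.
    ss_res = (resid ** 2).sum()
    # Total sum of squares around each column's mean.
    ss_tot = ((actual - actual.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


# Hand-made illustration data (not real VAR output).
fitted = pd.DataFrame({"foobar": [1.0, 2.0, 3.0, 4.0],
                       "baz": [2.0, 2.0, 4.0, 4.0]})
resid = pd.DataFrame({"foobar": [0.0, 0.0, 0.0, 0.0],
                      "baz": [-1.0, 1.0, -1.0, 1.0]})

r2 = r2_per_equation(fitted, resid)
print(r2)
```

With the fitted model from above, the call would presumably be r2_per_equation(estVAR.fittedvalues, estVAR.resid), provided both attributes are DataFrames, as the column indexing in the answer suggests.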
