
How to interpret adfuller test results?

I am struggling to understand the concept of p-value and the various other results of adfuller test.

The code I am using:

(I found this code on Stack Overflow)

import os
import pandas as pd
import statsmodels.api as sm
import statsmodels.tsa.stattools as ts

loc = r"C:\Stock Study\Stock Research\Hist Data"
os.chdir(loc)
xl_file1 = pd.ExcelFile("HDFCBANK.xlsx")
xl_file2 = pd.ExcelFile("KOTAKBANK.xlsx")
y1 = xl_file1.parse("Sheet1")
x1 = xl_file2.parse("Sheet1")

x = x1['Close']
y = y1['Close']


def cointegration_test(y, x):
    # Step 1: regress one variable on the other
    # (no intercept is included, since x is passed without sm.add_constant)
    ols_result = sm.OLS(y, x).fit()
    # Step 2: obtain the residuals (ols_result.resid)
    # Step 3: apply the Augmented Dickey-Fuller test to see whether
    #         the residuals have a unit root
    return ts.adfuller(ols_result.resid)


print(cointegration_test(y, x))

The output:

(-1.8481210964862593, 0.35684591783869046, 0, 1954, {'10%': -2.5675580437891359, '1%': -3.4337010293693235, '5%': -2.863020285222162}, 21029.870846458849)

If I understand the test correctly:

Value
adf : float Test statistic
pvalue : float MacKinnon’s approximate p-value based on MacKinnon (1994, 2010)
usedlag : int Number of lags used
nobs : int Number of observations used for the ADF regression and calculation of the critical values
critical values : dict Critical values for the test statistic at the 1 %, 5 %, and 10 % levels. Based on MacKinnon (2010)
icbest : float The maximized information criterion if autolag is not None.
resstore : ResultStore, optional

I am unable to completely understand the results and was hoping someone would be willing to explain them in layman's language. All the explanations I am finding are very technical.

My interpretation is: they are cointegrated, i.e. we failed to disprove the null hypothesis (i.e. a unit root exists). Confidence levels are the % numbers.

Am I completely wrong?

Sid asked Nov 17 '17 11:11


3 Answers

Part of what you stated is correct: the null hypothesis of the ADF test is that a unit root exists. By applying adfuller to the residuals of your OLS regression, you were checking whether the residuals have a unit root, in other words whether they are non-stationary.

If the adfuller p-value is lower than a chosen alpha (e.g. 5%), you may reject the null hypothesis (Ho), because a test statistic that extreme would be very unlikely to occur by random chance if Ho were true. In your output the p-value is about 0.357, which is well above 5%, so Ho cannot be rejected.

Had Ho been rejected, the alternative hypothesis (Ha) could be accepted, which in this case would be: the residual series is stationary. That is the outcome that would point to cointegration; failing to reject Ho means the test found no evidence of it.

Here are the two hypotheses:

Ho: the series has a unit root, so it is not stationary. In other words, each value is the previous value plus a random shock (yt = yt-1 + et), so shocks accumulate and never die out.

Ha: the series is stationary (that is normally what we want from regression residuals). No further transformation is needed.
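
To make the two hypotheses concrete, here is a minimal sketch, assuming a plain Dickey-Fuller regression with a constant and no lagged differences (the usedlag = 0 case from the question's output). The helper names dickey_fuller_stat and simulate are hypothetical, the simulated series are illustrative only, and the code computes just the test statistic, not the MacKinnon p-value that adfuller reports.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of b in: (y[t] - y[t-1]) = a + b * y[t-1] + error."""
    lagged = series[:-1]
    diffs = [series[i + 1] - series[i] for i in range(len(series) - 1)]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    b = sxy / sxx                      # OLS slope
    a = mean_d - b * mean_x            # OLS intercept
    rss = sum((d - a - b * x) ** 2 for x, d in zip(lagged, diffs))
    se_b = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return b / se_b


def simulate(phi, n, rng):
    """y[t] = phi * y[t-1] + noise; phi = 1.0 gives a random walk (unit root)."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y


rng = random.Random(0)
random_walk = simulate(1.0, 2000, rng)   # Ho is true for this series
stationary = simulate(0.5, 2000, rng)    # Ha is true for this series

rw_stat = dickey_fuller_stat(random_walk)
st_stat = dickey_fuller_stat(stationary)
print("random walk statistic:", round(rw_stat, 2))
print("stationary statistic: ", round(st_stat, 2))
```

The stationary series yields a strongly negative statistic, far below the 5% critical value of about -2.86 from the question, while a unit-root series usually stays above it.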

Philipe Riskalla Leal answered Nov 18 '22 23:11


Null hypothesis: Non Stationarity exists in the series.

Alternative Hypothesis: Stationarity exists in the series

Data: (-1.8481210964862593, 0.35684591783869046, 0, 1954, {'10%': -2.5675580437891359, 
'1%': -3.4337010293693235, '5%': -2.863020285222162}, 21029.870846458849)

Let's break the data down one by one.

First data point: -1.8481210964862593: The test statistic (adf) for your data

Second data point: 0.35684591783869046: The p-value, i.e. the probability of seeing a test statistic at least this extreme if the null hypothesis were true

Third data point: 0: Number of lagged difference terms used in the regression that produces the test statistic. Here no lags were included.

Fourth data point: 1954: Number of observations used in the analysis.

Fifth data point: {'10%': -2.5675580437891359, '1%': -3.4337010293693235, '5%': -2.863020285222162}: Critical values of the test statistic at the 1%, 5% and 10% significance levels.

Sixth data point: 21029.870846458849: The best value of the information criterion (icbest) found during automatic lag selection.
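
As a small sketch of the positions described here, the tuple from the question can be unpacked into names. The variable names are my own choice, mirroring the field names quoted in the question; the values are copied from the posted output.

```python
# Output tuple copied from the question.
result = (
    -1.8481210964862593,
    0.35684591783869046,
    0,
    1954,
    {'10%': -2.5675580437891359, '1%': -3.4337010293693235, '5%': -2.863020285222162},
    21029.870846458849,
)

# Positional unpacking: statistic, p-value, lags, observations,
# critical values, information criterion.
adf_stat, p_value, used_lag, n_obs, critical_values, ic_best = result

print("test statistic:", adf_stat)
print("p-value:", p_value)
print("lags used:", used_lag)
print("observations:", n_obs)
for level in ("1%", "5%", "10%"):
    print("critical value at", level, "=", critical_values[level])
print("information criterion:", ic_best)
```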

Since the test statistic -1.85 is greater than -3.43, -2.86 and -2.57 (the critical values at the 1%, 5% and 10% significance levels), the null hypothesis cannot be rejected. So the test finds no evidence against non-stationarity in your data.

Also, the p-value of 0.357 > 0.05 (taking a 5% significance level, i.e. 95% confidence), so the null hypothesis cannot be rejected.

Hence the data is treated as non-stationary (that means its statistical properties, such as mean and variance, change over time)
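
The two checks can be sketched as a small function. This is only an illustration: adf_decision is a hypothetical helper, not part of statsmodels, and the numbers are the ones posted in the question.

```python
def adf_decision(adf_stat, p_value, critical_values, level="5%"):
    """Return (reject_by_statistic, reject_by_p_value) at the given level."""
    alpha = float(level.rstrip("%")) / 100
    # Reject the unit-root null only if the statistic is MORE negative
    # than the critical value at the chosen level.
    reject_by_stat = adf_stat < critical_values[level]
    # Equivalent check through the p-value.
    reject_by_p = p_value < alpha
    return reject_by_stat, reject_by_p


critical_values = {
    '10%': -2.5675580437891359,
    '1%': -3.4337010293693235,
    '5%': -2.863020285222162,
}

for level in ("1%", "5%", "10%"):
    decision = adf_decision(-1.8481210964862593, 0.35684591783869046,
                            critical_values, level)
    print(level, decision)
```

With the question's numbers both checks come out False at every level, which matches the conclusion that the null hypothesis cannot be rejected.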

noob answered Nov 19 '22 00:11


The typical way to reject the null hypothesis is for the test statistic to be less than (more negative than) the critical value at your chosen level (1%, 5% or 10%). In this case the statistic, -1.85, is not less than any of the critical values, so the null hypothesis is not rejected.

antonio_zeus answered Nov 19 '22 00:11