Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it Safe to Say That Non-Stationary Series Can Not Be Simulated Using R

Tags:

r

arima

I am performing an experiment using classes of ARIMA models which I have to simulate for instance AR, MA, ARMA and possibly ARIMA using arima.sim() function in R. I am able to AR, MA, ARMA but ARIMA seems impossible as the coefficient of AR must be greater than 1 ($|\phi| > 1$) for the series to be non-stationary.

Simulate MA of 10 sample size with a coefficient of 0.8

set.seed(837530)
ma1 <- arima.sim(n = 10, model = list(ma = c(0.8), order = c(0, 0, 1)), sd = 1)
forecast::auto.arima(ma1, ic = "aicc")

Simulate MA of 10 sample size with coefficients of 0.4 and 0.4

set.seed(181917)
ma2 <- arima.sim(n = 10, model = list(ma = c(0.4, 0.4), order = c(0, 0, 2)), sd = 1)
forecast::auto.arima(ma2, ic = "aicc")

Simulate ARMA of 10 sample size with coefficients of 0.4 and 0.4

set.seed(455287)
arma11 <- arima.sim(n = 10, model = list(ar = c(0.4), ma = c(0.4), order = c(1, 0, 1)), sd = 1)
forecast::auto.arima(arma11, ic = "aicc")

My attempt to simulate from ARIMA

arima111 <- arima.sim(n = 10, model = list(ar = c(1.1), ma = c(0.3), order = c(1, 1, 1)), sd = 1)
Error in arima.sim(n = 10, model = list(ar = c(1.1), ma = c(0.3), order = c(1,  : 
  'ar' part of model is not stationary

WHAT I WANT

  1. Can it be said in a public domain that it is not possible to simulate a series from ARIMA distribution? (having $\phi$ value as the estimate of AR parameter and $\theta$ asthe estimate of `MA parameter).

  2. If it is possible, please demonstrate it for me using arima.sim() function or through any other way using R.

like image 726
Daniel James Avatar asked Nov 23 '25 23:11

Daniel James


1 Answers

It is possible to simulate a nonstationary process using arima.sim. From ?arima:

For ARIMA models with differencing, the differenced series follows a zero-mean ARMA model.

Hence, in the call arima.sim(model = list(order, ar, ma), n) with order = c(p, d, q), ar and ma specify parameters of an ARMA(p, q) model for a d-order differenced time series, not an undifferenced time series.

Stationarity of the the differenced time series is enforced with the following constraint on ar, which you can find in the body of arima.sim:

    p <- length(model$ar)
    if (p) {
        minroots <- min(Mod(polyroot(c(1, -model$ar))))
        if (minroots <= 1) 
            stop("'ar' part of model is not stationary")
    }

In other words, the roots of the polynomial whose coefficients are c(1, -ar) are required to lie outside of the unit circle. (The very same constraint is stated here.)

Stationarity of the differenced time series does not imply stationarity of the undifferenced time series. In fact, the undifferenced time series is never stationary. ?arima.sim gives an example:

ts.sim <- arima.sim(list(order = c(1, 1, 0), ar = 0.7), n = 200)
ts.plot(ts.sim)

But it is better to compare collections of simulations:

set.seed(1L)
X <- replicate(50L, arima.sim(list(order = c(1, 0, 0), ar = 0.7), n = 200))
Y <- replicate(50L, arima.sim(list(order = c(1, 1, 0), ar = 0.7), n = 200))

par(mfrow = c(1L, 2L))
matplot(X, type = "l", main = "AR(1) = ARIMA(1, 0, 0)")
matplot(Y, type = "l", main = "ARIMA(1, 1, 0)")

enter image description here

If you insist on p-values, then you can perform Dickey-Fuller tests:

apply(X, 2L, function(x) tseries::adf.test(x, k = 0)$p.value)
##  [1] 0.9263981 0.9900000 0.9900000 0.2561765 0.9218488
##  [6] 0.9900000 0.1816286 0.9900000 0.9900000 0.9372659
## [11] 0.6991314 0.9018098 0.8618604 0.9390623 0.6203794
## [16] 0.9740316 0.9002020 0.8196272 0.6657746 0.9593784
## [21] 0.9900000 0.9900000 0.8572043 0.1337945 0.9817778
## [26] 0.9900000 0.9900000 0.9816193 0.9562682 0.9900000
## [31] 0.4298187 0.9550181 0.7608263 0.9900000 0.9455187
## [36] 0.7616230 0.9412379 0.9588200 0.9354637 0.9900000
## [41] 0.9900000 0.9628406 0.9818668 0.9203677 0.9804385
## [46] 0.9900000 0.9900000 0.9807893 0.9048136 0.9822757

apply(X, 2L, function(x) tseries::adf.test(diff(x), k = 0)$p.value)
##  [1] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [11] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [21] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [31] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [41] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01

apply(Y, 2L, function(x) tseries::adf.test(x, k = 0)$p.value)
##  [1] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [11] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [21] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [31] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
## [41] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01

(These p-values are computed by linear interpolation on a very coarse grid. You should take 0.01 to mean some number between 0 and 0.01, and 0.99 to mean some number between 0.99 and 1.)

like image 153
Mikael Jagan Avatar answered Nov 26 '25 13:11

Mikael Jagan



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!