ARIMA

Identifying the numbers of AR or MA terms in an ARIMA model

  1. ACF and PACF plots
  2. AR and MA signatures
  3. A model for the UNITS series--ARIMA(2,1,0)
  4. Mean versus constant
  5. Alternative model for the UNITS series--ARIMA(0,2,1)
  6. Which model should we choose?
  7. Mixed models
  8. Unit roots

ACF and PACF plots

Partial Correlation

The partial correlation between two variables is the amount of correlation between them that is not explained by their mutual correlations with a given set of other variables.

  • For example, if Y is regressed on X1, X2, and X3, the partial correlation between Y and X3 is the amount of correlation between them that is not explained by their common correlations with X1 and X2.

Partial Autocorrelation

The partial autocorrelation of a series at a given lag is the amount of autocorrelation at that lag that is not explained by all lower-order lags.

  • The autocorrelation of a series Y at lag 1 is the correlation between Yt and Yt-1.
  • If the lag-1 correlation holds throughout the series, the correlation coefficient at lag 2 should be roughly the square of the lag-1 coefficient.

To decide whether AR or MA terms are needed to correct any autocorrelation that remains in the differenced series, we look at the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots.

  1. The original UNITS series
    • ACF
      Shows significant autocorrelation even at large lags, but the autocorrelation at lags 2 and above is probably just due to the propagation of the lag-1 autocorrelation.
    • PACF
      Shows a significant spike only at lag 1, meaning that all of the higher-order autocorrelations are effectively explained by the lag-1 autocorrelation.
  • The partial autocorrelations at all lags can be computed by fitting a succession of AR models: the partial autocorrelation at lag k is equal to the estimated AR(k) coefficient in an autoregressive model with k terms.

That is, it is the coefficient in a multiple regression model in which Y is regressed on LAG(Y,1), LAG(Y,2), and so on, up to LAG(Y,k). Thus, by mere inspection of the PACF you can determine how many AR terms are needed to explain the autocorrelation pattern in the series: if the PACF "cuts off" at lag k, this indicates that an autoregressive model of order k, AR(k), should be fitted.
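This correspondence is easy to check numerically. The sketch below (not part of the original notes; the simulated series and helper names are illustrative) estimates the lag-k partial autocorrelation as the last coefficient of an AR(k) regression fitted by ordinary least squares:

```python
import numpy as np

def ar_coeffs(y, k):
    # Regress y_t on 1, y_{t-1}, ..., y_{t-k} (OLS) and return the AR coefficients.
    n = len(y)
    target = y[k:]
    lags = [y[k - j: n - j] for j in range(1, k + 1)]
    X = np.column_stack([np.ones(n - k)] + lags)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta[1:]                      # drop the intercept

def pacf_ols(y, k):
    # The lag-k partial autocorrelation is the last coefficient of the AR(k) fit.
    return ar_coeffs(y, k)[-1]

# Simulate an AR(2) process: y_t = 0.6*y_{t-1} - 0.3*y_{t-2} + noise
rng = np.random.default_rng(0)
y = np.zeros(500)
eps = rng.standard_normal(500)
for t in range(2, 500):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + eps[t]

# For an AR(2) process the PACF should "cut off" after lag 2:
# large values at lags 1 and 2, values near zero beyond.
print(round(pacf_ols(y, 1), 2), round(pacf_ols(y, 2), 2), round(pacf_ols(y, 3), 2))
```

The lag-2 estimate lands near the true AR(2) coefficient (-0.3), while the lag-3 estimate is statistically indistinguishable from zero.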

The PACF of the UNITS series provides an extreme example of the cut-off phenomenon: it has a very large spike at lag 1 and no other significant spikes, indicating that in the absence of differencing an AR(1) model should be used. However, the AR(1) term in this model will turn out to be equivalent to a first difference, because the estimated AR(1) coefficient (which is the height of the PACF spike at lag 1) will be almost exactly equal to 1. Now, the forecasting equation for an AR(1) model for a series Y with no orders of differencing is:

Ŷt = μ + ϕ1Yt-1

If the AR(1) coefficient ϕ1 in this equation is equal to 1, it is equivalent to predicting that the first difference of Y is constant, i.e., it is equivalent to the equation of the random walk model with growth:

Ŷt = μ + Yt-1

So, the PACF of the UNITS series is telling us that, if no differencing is used, we should fit an AR(1) model which will turn out to be equivalent to taking a first difference. In other words, it is telling us that UNITS really does need an order of differencing to be stationarized.

AR and MA signatures: If the PACF displays a sharp cutoff while the ACF decays more slowly (i.e., has significant spikes at higher lags), we say that the stationarized series displays an "AR signature," meaning that the autocorrelation pattern can be explained more easily by adding AR terms than by adding MA terms. You will probably find that an AR signature is commonly associated with positive autocorrelation at lag 1. The reason for this is that an AR term can act like a "partial difference" in the forecasting equation. For example, in an AR(1) model, the AR term acts like a first difference if the autoregressive coefficient is equal to 1, it does nothing if the coefficient is zero, and it acts like a partial difference if the coefficient is between 0 and 1. So, if the series is slightly underdifferenced, i.e., if the nonstationary pattern of positive autocorrelation has not been completely eliminated, it will "ask for" a partial difference by displaying an AR signature. Hence, we have the following rule of thumb for deciding when to add AR terms:

Rule 6: If the PACF of the differenced series displays a sharp cutoff and/or the lag-1 autocorrelation is positive, i.e., if the series appears slightly "underdifferenced," then consider adding an AR term to the model. The lag at which the PACF cuts off is the indicated number of AR terms.

In principle, any autocorrelation pattern can be removed from a stationarized series by adding enough autoregressive terms (lags of the stationarized series) to the forecasting equation, and the PACF tells you how many such terms are likely to be needed. However, this is not always the simplest way to explain a given pattern of autocorrelation: sometimes it is more efficient to add MA terms (lags of the forecast errors) instead. The autocorrelation function (ACF) plays the same role for MA terms that the PACF plays for AR terms: the ACF tells you how many MA terms are likely to be needed to remove the remaining autocorrelation from the differenced series. If the autocorrelation is significant at lag k but not at any higher lags, i.e., if the ACF "cuts off" at lag k, this indicates that exactly k MA terms should be used in the forecasting equation. In the latter case, we say that the stationarized series displays an "MA signature," meaning that the autocorrelation pattern can be explained more easily by adding MA terms than by adding AR terms.

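An MA signature is easy to see in simulation. The sketch below (made-up parameters, not from the original notes) generates an MA(1) series and computes its sample autocorrelations, which cut off sharply after lag 1:

```python
import numpy as np

def acf(y, k):
    # Sample autocorrelation of y at lag k.
    y = y - y.mean()
    return (y[k:] * y[:-k]).sum() / (y * y).sum()

rng = np.random.default_rng(1)
e = rng.standard_normal(2000)
# MA(1) series: y_t = e_t - 0.7*e_{t-1}
y = e[1:] - 0.7 * e[:-1]

r1, r2 = acf(y, 1), acf(y, 2)
# Theoretical lag-1 autocorrelation is -theta/(1 + theta^2) = -0.7/1.49, about -0.47;
# all higher-lag autocorrelations of an MA(1) process are exactly zero.
print(round(r1, 2), round(r2, 2))
```

Note also that the lag-1 autocorrelation here is negative, consistent with the rule of thumb that negative autocorrelation points toward MA terms.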
Introduction to ARIMA: nonseasonal models

ARIMA(p,d,q) forecasting equation
ARIMA(1,0,0) = first-order autoregressive model
ARIMA(0,1,0) = random walk
ARIMA(1,1,0) = differenced first-order autoregressive model
ARIMA(0,1,1) without constant = simple exponential smoothing
ARIMA(0,1,1) with constant = simple exponential smoothing with growth
ARIMA(0,2,1) or (0,2,2) without constant = linear exponential smoothing
ARIMA(1,1,2) without constant = damped-trend linear exponential smoothing
Spreadsheet implementation

ARIMA(p,d,q) forecasting equation

A nonseasonal ARIMA model is classified as an "ARIMA(p,d,q)" model, where:

p is the number of autoregressive terms,
d is the number of nonseasonal differences needed for stationarity, and
q is the number of lagged forecast errors in the prediction equation.
The forecasting equation is constructed as follows. First, let y denote the dth difference of Y, which means:

If d=0: yt = Yt

If d=1: yt = Yt - Yt-1

If d=2: yt = (Yt - Yt-1) - (Yt-1 - Yt-2) = Yt - 2Yt-1 + Yt-2

Note that the second difference of Y (the d=2 case) is not the difference from 2 periods ago. Rather, it is the first-difference-of-the-first difference, which is the discrete analog of a second derivative, i.e., the local acceleration of the series rather than its local trend.
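This can be verified directly with numpy's `diff` (an illustrative sketch; the data values are made up):

```python
import numpy as np

Y = np.array([3.0, 5.0, 8.0, 12.0, 17.0])

d1 = np.diff(Y)          # first difference: Y_t - Y_{t-1}
d2 = np.diff(Y, n=2)     # second difference: first difference of the first difference

# Same thing written out termwise: Y_t - 2*Y_{t-1} + Y_{t-2}
d2_termwise = Y[2:] - 2 * Y[1:-1] + Y[:-2]
print(d2, d2_termwise)   # both give [1. 1. 1.]
```

For this series the first differences (the local trend) are 2, 3, 4, 5, so the second differences (the local acceleration) are a constant 1.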

In terms of y, the general forecasting equation is:

ŷt = μ + ϕ1 yt-1 +…+ ϕp yt-p - θ1et-1 -…- θqet-q

Here the moving average parameters (θ’s) are defined so that their signs are negative in the equation, following the convention introduced by Box and Jenkins. Some authors and software (including the R programming language) define them so that they have plus signs instead. When actual numbers are plugged into the equation, there is no ambiguity, but it’s important to know which convention your software uses when you are reading the output. Often the parameters are denoted there by AR(1), AR(2), …, and MA(1), MA(2), etc.

To identify the appropriate ARIMA model for Y, you begin by determining the order of differencing (d) needed to stationarize the series and remove the gross features of seasonality, perhaps in conjunction with a variance-stabilizing transformation such as logging or deflating. If you stop at this point and predict that the differenced series is constant, you have merely fitted a random walk or random trend model. However, the stationarized series may still have autocorrelated errors, suggesting that some number of AR terms (p ≥ 1) and/or some number of MA terms (q ≥ 1) are also needed in the forecasting equation.

The process of determining the values of p, d, and q that are best for a given time series will be discussed in later sections of the notes (whose links are at the top of this page), but a preview of some of the types of nonseasonal ARIMA models that are commonly encountered is given below.

ARIMA(1,0,0) = first-order autoregressive model: if the series is stationary and autocorrelated, perhaps it can be predicted as a multiple of its own previous value, plus a constant. The forecasting equation in this case is

Ŷt = μ + ϕ1Yt-1

…which is Y regressed on itself lagged by one period. This is an “ARIMA(1,0,0)+constant” model. If the mean of Y is zero, then the constant term would not be included.

If the slope coefficient ϕ1 is positive and less than 1 in magnitude (it must be less than 1 in magnitude if Y is stationary), the model describes mean-reverting behavior in which next period’s value should be predicted to be ϕ1 times as far away from the mean as this period’s value. If ϕ1 is negative, it predicts mean-reverting behavior with alternation of signs, i.e., it also predicts that Y will be below the mean next period if it is above the mean this period.
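A quick numerical sketch of this mean-reverting behavior (μ and ϕ1 are made-up example values, not from the notes): each successive forecast moves ϕ1 times as far from the long-run mean as the previous one.

```python
# Hypothetical AR(1) parameters for illustration.
mu, phi = 2.0, 0.5
mean = mu / (1 - phi)        # long-run mean implied by the model (= 4.0)

y = 10.0                     # current observation, 6 units above the mean
forecasts = []
for _ in range(5):
    y = mu + phi * y         # next forecast: phi times as far from the mean
    forecasts.append(y)

print(forecasts)  # [7.0, 5.5, 4.75, 4.375, 4.1875] -> converging toward 4.0
```

With ϕ1 = 0.5 the deviation from the mean (6, then 3, 1.5, 0.75, …) halves at every step; a negative ϕ1 would make the deviations alternate in sign instead.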

In a second-order autoregressive model (ARIMA(2,0,0)), there would be a Yt-2 term on the right as well, and so on. Depending on the signs and magnitudes of the coefficients, an ARIMA(2,0,0) model could describe a system whose mean reversion takes place in a sinusoidally oscillating fashion, like the motion of a mass on a spring that is subjected to random shocks.

ARIMA(0,1,0) = random walk: If the series Y is not stationary, the simplest possible model for it is a random walk model, which can be considered as a limiting case of an AR(1) model in which the autoregressive coefficient is equal to 1, i.e., a series with infinitely slow mean reversion. The prediction equation for this model can be written as:

Ŷt - Yt-1 = μ

or equivalently

Ŷt = μ + Yt-1

...where the constant term is the average period-to-period change (i.e. the long-term drift) in Y. This model could be fitted as a no-intercept regression model in which the first difference of Y is the dependent variable. Since it includes (only) a nonseasonal difference and a constant term, it is classified as an "ARIMA(0,1,0) model with constant." The random-walk-without-drift model would be an ARIMA(0,1,0) model without constant
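A minimal sketch of that fitting procedure (the data values are invented): regressing the first difference on a constant alone just sets μ equal to the mean of the first differences, and the forecast adds it to the last observation.

```python
import numpy as np

Y = np.array([100.0, 103.0, 101.0, 106.0, 108.0, 111.0])

mu = np.diff(Y).mean()       # estimated drift: average period-to-period change
forecast = mu + Y[-1]        # one-step-ahead forecast: Ŷ_t = mu + Y_{t-1}
print(mu, forecast)
```

Dropping μ (setting it to zero) gives the random-walk-without-drift forecast, which is simply the last observed value.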

ARIMA(1,1,0) = differenced first-order autoregressive model: If the errors of a random walk model are autocorrelated, perhaps the problem can be fixed by adding one lag of the dependent variable to the prediction equation--i.e., by regressing the first difference of Y on itself lagged by one period. This would yield the following prediction equation:

Ŷt - Yt-1 = μ + ϕ1(Yt-1 - Yt-2)

which can be rearranged to

Ŷt = μ + Yt-1 + ϕ1 (Yt-1 - Yt-2)

This is a first-order autoregressive model with one order of nonseasonal differencing and a constant term--i.e., an ARIMA(1,1,0) model.

ARIMA(0,1,1) without constant = simple exponential smoothing: Another strategy for correcting autocorrelated errors in a random walk model is suggested by the simple exponential smoothing model. Recall that for some nonstationary time series (e.g., ones that exhibit noisy fluctuations around a slowly-varying mean), the random walk model does not perform as well as a moving average of past values. In other words, rather than taking the most recent observation as the forecast of the next observation, it is better to use an average of the last few observations in order to filter out the noise and more accurately estimate the local mean. The simple exponential smoothing model uses an exponentially weighted moving average of past values to achieve this effect. The prediction equation for the simple exponential smoothing model can be written in a number of mathematically equivalent forms, one of which is the so-called “error correction” form, in which the previous forecast is adjusted in the direction of the error it made:

Ŷt = Ŷt-1 + αet-1

Because et-1 = Yt-1 - Ŷt-1 by definition, this can be rewritten as:

Ŷt = Yt-1 - (1-α)et-1

   =  Yt-1  - θ1et-1

which is an ARIMA(0,1,1)-without-constant forecasting equation with θ1 = 1-α. This means that you can fit a simple exponential smoothing model by specifying it as an ARIMA(0,1,1) model without constant, and the estimated MA(1) coefficient corresponds to 1-minus-alpha in the SES formula. Recall that in the SES model, the average age of the data in the 1-period-ahead forecasts is 1/α, meaning that they will tend to lag behind trends or turning points by about 1/α periods. It follows that the average age of the data in the 1-period-ahead forecasts of an ARIMA(0,1,1)-without-constant model is 1/(1-θ1). So, for example, if θ1 = 0.8, the average age is 5. As θ1 approaches 1, the ARIMA(0,1,1)-without-constant model becomes a very-long-term moving average, and as θ1 approaches 0 it becomes a random-walk-without-drift model.
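The equivalence can be checked step by step. The sketch below (invented data and α, both forecast recursions initialized at the first observation) computes one-step forecasts from the SES error-correction form and from the ARIMA(0,1,1) form with θ1 = 1-α, and confirms they coincide:

```python
import numpy as np

Y = np.array([10.0, 12.0, 11.0, 13.0, 12.5, 14.0])
alpha = 0.3
theta1 = 1 - alpha

# SES, error-correction form: F_t = F_{t-1} + alpha * e_{t-1}
f_ses = [Y[0]]
for t in range(1, len(Y)):
    e_prev = Y[t - 1] - f_ses[-1]
    f_ses.append(f_ses[-1] + alpha * e_prev)

# ARIMA(0,1,1) form: F_t = Y_{t-1} - theta1 * e_{t-1}
f_arima = [Y[0]]
for t in range(1, len(Y)):
    e_prev = Y[t - 1] - f_arima[-1]
    f_arima.append(Y[t - 1] - theta1 * e_prev)

print(np.allclose(f_ses, f_arima))   # True: the two forms are identical
```

Algebraically, Yt-1 - (1-α)et-1 = Ŷt-1 + αet-1, so the two recursions generate exactly the same forecasts and errors.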

What’s the best way to correct for autocorrelation: adding AR terms or adding MA terms? In the previous two models discussed above, the problem of autocorrelated errors in a random walk model was fixed in two different ways: by adding a lagged value of the differenced series to the equation or adding a lagged value of the forecast error. Which approach is best? A rule-of-thumb for this situation, which will be discussed in more detail later on, is that positive autocorrelation is usually best treated by adding an AR term to the model and negative autocorrelation is usually best treated by adding an MA term. In business and economic time series, negative autocorrelation often arises as an artifact of differencing. (In general, differencing reduces positive autocorrelation and may even cause a switch from positive to negative autocorrelation.) So, the ARIMA(0,1,1) model, in which differencing is accompanied by an MA term, is more often used than an ARIMA(1,1,0) model.

ARIMA(0,1,1) with constant = simple exponential smoothing with growth: By implementing the SES model as an ARIMA model, you actually gain some flexibility. First of all, the estimated MA(1) coefficient is allowed to be negative: this corresponds to a smoothing factor larger than 1 in an SES model, which is usually not allowed by the SES model-fitting procedure. Second, you have the option of including a constant term in the ARIMA model if you wish, in order to estimate an average non-zero trend. The ARIMA(0,1,1) model with constant has the prediction equation:

Ŷt = μ + Yt-1 - θ1et-1

The one-period-ahead forecasts from this model are qualitatively similar to those of the SES model, except that the trajectory of the long-term forecasts is typically a sloping line (whose slope is equal to mu) rather than a horizontal line.

ARIMA(0,2,1) or (0,2,2) without constant = linear exponential smoothing: Linear exponential smoothing models are ARIMA models which use two nonseasonal differences in conjunction with MA terms. The second difference of a series Y is not simply the difference between Y and itself lagged by two periods, but rather it is the first difference of the first difference--i.e., the change-in-the-change of Y at period t. Thus, the second difference of Y at period t is equal to (Yt - Yt-1) - (Yt-1 - Yt-2) = Yt - 2Yt-1 + Yt-2. A second difference of a discrete function is analogous to a second derivative of a continuous function: it measures the "acceleration" or "curvature" in the function at a given point in time.

The ARIMA(0,2,2) model without constant predicts that the second difference of the series equals a linear function of the last two forecast errors:

Ŷt - 2Yt-1 + Yt-2 = - θ1et-1 - θ2et-2

which can be rearranged as:

Ŷt = 2 Yt-1 - Yt-2 - θ1et-1 - θ2et-2

where θ1 and θ2 are the MA(1) and MA(2) coefficients. This is a general linear exponential smoothing model, essentially the same as Holt’s model, and Brown’s model is a special case. It uses exponentially weighted moving averages to estimate both a local level and a local trend in the series. The long-term forecasts from this model converge to a straight line whose slope depends on the average trend observed toward the end of the series.
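To see why the long-term forecasts are trend-following, note that the 2Yt-1 - Yt-2 part of the equation is exactly straight-line extrapolation of the last two values; the MA terms adjust that extrapolation using recent errors. A sketch with invented numbers:

```python
def arima022_forecast(y1, y2, e1, e2, theta1, theta2):
    # y1 = Y_{t-1}, y2 = Y_{t-2}; e1, e2 = the last two forecast errors.
    return 2 * y1 - y2 - theta1 * e1 - theta2 * e2

# With zero recent errors, the forecast just continues the line through the
# last two points: 2*12 - 10 = 14.
print(arima022_forecast(12.0, 10.0, 0.0, 0.0, 0.5, 0.2))   # -> 14.0

# Nonzero recent errors shift the extrapolation: 14 - 0.5*1.0 - 0.2*0.5 = 13.4.
print(arima022_forecast(12.0, 10.0, 1.0, 0.5, 0.5, 0.2))   # -> 13.4
```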

ARIMA(1,1,2) without constant = damped-trend linear exponential smoothing:

Ŷt = Yt-1 + ϕ1 (Yt-1 - Yt-2) - θ1et-1 - θ2et-2

This model is illustrated in the accompanying slides on ARIMA models. It extrapolates the local trend at the end of the series but flattens it out at longer forecast horizons to introduce a note of conservatism, a practice that has empirical support. See the article on "Why the Damped Trend works" by Gardner and McKenzie and the "Golden Rule" article by Armstrong et al. for details.

It is generally advisable to stick to models in which at least one of p and q is no larger than 1, i.e., do not try to fit a model such as ARIMA(2,1,2), as this is likely to lead to overfitting and "common-factor" issues that are discussed in more detail in the notes on the mathematical structure of ARIMA models.

Spreadsheet implementation: ARIMA models such as those described above are easy to implement on a spreadsheet. The prediction equation is simply a linear equation that refers to past values of original time series and past values of the errors. Thus, you can set up an ARIMA forecasting spreadsheet by storing the data in column A, the forecasting formula in column B, and the errors (data minus forecasts) in column C. The forecasting formula in a typical cell in column B would simply be a linear expression referring to values in preceding rows of columns A and C, multiplied by the appropriate AR or MA coefficients stored in cells elsewhere on the spreadsheet.
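The same three-column layout translates directly into code. Below is a sketch of the spreadsheet scheme (the coefficients and data are illustrative; the model shown is ARIMA(0,1,1) with constant): list A plays the role of column A (the data), B the forecasts, and C the errors.

```python
# Hypothetical coefficients, as if estimated elsewhere on the "spreadsheet".
mu, theta1 = 0.5, 0.4

A = [10.0, 11.2, 10.8, 12.1, 12.9]   # column A: the data
B = [A[0]]                           # column B: forecasts (seeded with the first value)
C = [A[0] - B[0]]                    # column C: errors = data minus forecasts

for t in range(1, len(A)):
    # Each column-B cell is a linear expression in preceding rows of A and C:
    # Ŷ_t = mu + Y_{t-1} - theta1 * e_{t-1}
    B.append(mu + A[t - 1] - theta1 * C[t - 1])
    C.append(A[t] - B[t])

print(B)
print(C)
```

Swapping in a different prediction equation (say, the ARIMA(1,1,0) formula) only changes the one-line expression inside the loop, just as it would only change the cell formula in column B.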

