
Applying an ARIMA Based Prediction Model on S&P500 ETF(SPY)

Posted at 2020-07-13

What is ARIMA?

ARIMA is short for Autoregressive Integrated Moving Average. An ARIMA model takes three parameters:

  • p : the order of the autoregressive part (the number of lagged observations used)
  • d : the degree of differencing
  • q : the order of the moving average part

It is usually fitted to historical time series data and used to forecast future points in that series. Common applications include product demand prediction, price prediction, and visitor estimation. Series with a seasonal pattern are usually handled with the seasonal extension of the model, SARIMA.
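
For reference, here is how the three parameters enter the model in the standard textbook form. The symbols are notation introduced for this sketch and do not come from the source code: L is the lag operator (L X_t = X_{t-1}), the phi_i are the autoregressive coefficients, the theta_j are the moving average coefficients, c is a constant, and epsilon_t is white noise.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) (1 - L)^d X_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j L^j\right) \varepsilon_t
```

With p = d = q = 1, the model used later in this tutorial, this reduces to (1 - phi_1 L)(1 - L) X_t = c + (1 + theta_1 L) epsilon_t.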

Source Code

https://github.com/andy971022/SP500-ARIMA

When to Use ARIMA?

  • When you have time series data.
  • When the series becomes stationary after some order of differencing.
  • When you want to predict the near future.

What Do You Mean by Stationary?

  • Definition
  • In short: the statistical properties of your data, such as its mean and variance, do not change over time, so it shows no time-dependent trend.

Examples

The following images will show up again in our tutorial, but we'll use them here to explain what a stationary process is. The first image below shows the historical prices of the S&P500 ETF (SPY), and the second image shows the second difference of that price from August 2019 to November 2019.

image.png

image.png

The process in the first image is not stationary because it has an obvious upward trend over time. The second image, however, shows a stationary process: the data points fluctuate around a constant value with no visible dependence on time.
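
To make the effect of differencing concrete, here is a minimal sketch in plain Python. The helper name `difference` and the toy series are illustrative assumptions, not part of the article's source code. A series with a linear trend becomes constant after one difference, and a second difference removes even that constant.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: x[t] - x[t-1]."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return list(series)


# A toy series with a steady upward trend (not real SPY data).
trend = [2 * t + 5 for t in range(6)]

first = difference(trend)            # the trend is gone, only the slope remains
second = difference(trend, order=2)  # differences of a constant are zero

print(trend)   # [5, 7, 9, 11, 13, 15]
print(first)   # [2, 2, 2, 2, 2]
print(second)  # [0, 0, 0, 0]
```

Each round of differencing shortens the series by one point, which is why the differenced plots start slightly later than the price plot.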

Steps to Creating an ARIMA Model

  1. Collect a set of time series data.
  2. Obtain the parameter d : run the ADF (augmented Dickey–Fuller) test on the dataset at successive orders of differencing until the series is stationary.
  3. Obtain the parameter p : inspect the autocorrelation (ACF) plot of the series differenced d times.
  4. Obtain the parameter q : inspect the partial autocorrelation (PACF) plot of the series differenced d times. (The textbook Box-Jenkins convention is the reverse: p is read from the PACF and q from the ACF. Both plots point to an order of 1 in this tutorial, so the resulting model is the same.)
  5. Create the ARIMA(p,d,q) model with the deduced parameters.
  6. Fit and forecast.

Step 1

image.png
We'll get the historical closing prices of the S&P500 ETF (SPY) from yfinance and predict the prices for November 2019.
image.png
We plotted the prices and saw that they are far from stationary.
So, let's select just the three months of data before our prediction target.
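
A minimal sketch of this step follows. It is not the code from the repository: the helper names `months_before` and `fetch_spy_close` are hypothetical, and the download assumes yfinance's `Ticker.history` call plus network access. yfinance is imported inside the function so that the date helper can be used on its own.

```python
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the first day of the month that lies `months` months before d."""
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def fetch_spy_close(start: date, end: date):
    """Download SPY closing prices for [start, end) as a pandas Series.

    Requires the third-party yfinance package and a network connection.
    Note that history() returns adjusted prices by default.
    """
    import yfinance as yf

    history = yf.Ticker("SPY").history(start=start.isoformat(), end=end.isoformat())
    return history["Close"]


# The prediction target is November 2019, so training data ends on 2019-11-01.
target_start = date(2019, 11, 1)
train_start = months_before(target_start, 3)
print(train_start, target_start)
```

Calling `fetch_spy_close(train_start, target_start)` would then return roughly three months of daily closes, from August to the end of October 2019.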

Step 2

image.png
image.png
We'll create the first and second differences and then run the ADF test on each.
The ADF test statistic of the first difference, -8.17674265403581, is less than -2.8853397507076006, the critical value at the 5% significance level, and even less than the critical value at the 1% level. That is, we can reject the null hypothesis of a unit root and conclude that the first difference is stationary.

image.png
We see that the first difference is already stationary enough for the ARIMA model. Therefore, we'll set the parameter d to 1.
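
The decision rule used above can be sketched as follows. The function names are illustrative assumptions. `adf_summary` wraps statsmodels' `adfuller`, imported inside the function so the comparison helper works without statsmodels installed. The two numbers are the ones quoted in this step.

```python
def rejects_unit_root(adf_stat: float, critical_value: float) -> bool:
    """The ADF null hypothesis (a unit root, i.e. non-stationarity) is rejected
    when the test statistic is more negative than the critical value."""
    return adf_stat < critical_value


def adf_summary(series):
    """Run the ADF test and return (statistic, p-value, critical values).

    Requires the third-party statsmodels package. The critical values come
    back as a dict keyed by '1%', '5%' and '10%'.
    """
    from statsmodels.tsa.stattools import adfuller

    stat, p_value, _used_lag, _n_obs, critical_values, _icbest = adfuller(series)
    return stat, p_value, critical_values


# Values reported above for the first difference of the SPY closing price.
ADF_STAT = -8.17674265403581
CRITICAL_5_PERCENT = -2.8853397507076006

print(rejects_unit_root(ADF_STAT, CRITICAL_5_PERCENT))  # True
```

Comparing the statistic with a critical value and comparing the p-value with the significance level are two views of the same test, so either can be used to pick d.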

Step 3

image.png
We plotted the autocorrelation function, and the result suggests that 1 is a suitable value for p. Why? Because only the first lag stands clearly outside the 95% confidence interval, represented by the blue shaded band, while the autocorrelations at the remaining lags fall within it.
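
To show what the plot is measuring, here is a plain-Python sketch of the sample autocorrelation and a rough 95% band. The function names are illustrative assumptions. The band of plus or minus 1.96 / sqrt(n) is the usual white-noise approximation, so a plotting library may draw a slightly different shaded region.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def significant_lags(acf, n):
    """Lags (excluding lag 0) whose autocorrelation lies outside the
    approximate 95% band of +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(n)
    return [k for k, r in enumerate(acf) if k > 0 and abs(r) > bound]


# A toy alternating series (not SPY data): strongly negative at lag 1.
toy = [1, -1, 1, -1, 1, -1, 1, -1]
acf = sample_acf(toy, max_lag=2)
print(acf)                              # [1.0, -0.875, 0.75]
print(significant_lags(acf, len(toy)))  # [1, 2]
```

On the differenced SPY series one would pass the first difference instead of `toy` and look for the last lag that falls outside the band.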

Step 4

image.png

For reasons similar to those in Step 3, we set q to 1 based on the partial autocorrelation plot.
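
The partial autocorrelation can be derived from the autocorrelation with the Durbin-Levinson recursion. The sketch below is plain Python and only illustrative: `pacf_from_acf` is a hypothetical helper, and statistical libraries may default to a different estimator, so values can differ slightly from a plotted PACF.

```python
def pacf_from_acf(acf):
    """Partial autocorrelations for lags 0..len(acf)-1 via Durbin-Levinson.

    `acf` must start with the lag-0 value 1.0.
    """
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1} from the previous step
    for k in range(1, len(acf)):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.5: rho_k = 0.5 ** k.
ar1_acf = [1.0, 0.5, 0.25, 0.125]
print(pacf_from_acf(ar1_acf))  # [1.0, 0.5, 0.0, 0.0]
```

The example shows the characteristic cut-off: the autocorrelation of an AR(1) process decays gradually, while its partial autocorrelation drops to zero after lag 1.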

Step 5

image.png
We have now arrived at an ARIMA(1,1,1) model and will fit it to the closing prices. Be careful to pass in the raw closing prices rather than their first difference: with d = 1, the model performs the differencing itself.
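
The reason the model works on price levels can be seen from how an ARIMA(1,1,1) forecast is assembled: the ARMA(1,1) part predicts the next differences, and those are summed back onto the last observed price. The sketch below assumes no constant term, and the coefficients `phi` and `theta` are made-up illustrative values, not the fitted parameters from this tutorial.

```python
def forecast_arima_111(last_price, last_diff, last_resid, phi, theta, steps):
    """Point forecasts of an ARIMA(1,1,1) model without a constant.

    last_diff  : most recent first difference, x[T] - x[T-1]
    last_resid : most recent one-step forecast error of the differenced series
    """
    forecasts = []
    price = last_price
    # One step ahead, both the AR and the MA term contribute.
    diff = phi * last_diff + theta * last_resid
    for _ in range(steps):
        price += diff  # integrate the forecast difference back into a price
        forecasts.append(price)
        diff = phi * diff  # future shocks have expected value zero
    return forecasts


print(forecast_arima_111(100.0, 2.0, 0.0, phi=0.5, theta=0.3, steps=3))
# [101.0, 101.5, 101.75]
```

If the first difference were passed in as the input series, the model would difference it a second time and the forecasts would come back on the scale of price changes rather than prices.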

Step 6

image.png
image.png

We have now fed the three months of data preceding our prediction window into the model. The second-to-last image is our prediction, while the last one is the actual data. We see that even our crude model predicts the direction of the trend reasonably well.
