Time Series Analysis Dashboard

Interactive Time Series Plot
Learn about differencing and transformation

Why do we difference time series data?

Differencing is a technique used to make a non-stationary time series stationary by removing trends and seasonality:

  • First Differencing: Removes trend by taking the difference between consecutive observations. The formula is:
    \[ \nabla Y_t = Y_t - Y_{t-1} \]
    This helps stabilize the mean by removing linear trends.
  • Seasonal Differencing: Removes seasonality by taking the difference between an observation and the same observation from the previous season. The formula is:
    \[ \nabla_s Y_t = Y_t - Y_{t-s} \]
    where \(s\) is the seasonal period (e.g., 4 for quarterly data, 12 for monthly data).
  • Log Returns: For financial time series, taking logarithmic returns can help stabilize variance. For small changes, the log return is approximately equal to the simple return:
    \[ r_t = \log(Y_t) - \log(Y_{t-1}) \approx \frac{Y_t - Y_{t-1}}{Y_{t-1}} \]
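The three transformations above can be sketched in a few lines of plain Python. This is a minimal illustration, not the dashboard's own implementation: the helper names `difference` and `log_returns` and the quarterly series `y` are made up for the example.

```python
import math

def difference(series, lag=1):
    """Return y_t - y_{t-lag} for t = lag .. len(series) - 1."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def log_returns(prices):
    """Return log(Y_t) - log(Y_{t-1}); prices must be strictly positive."""
    return [math.log(prices[t]) - math.log(prices[t - 1])
            for t in range(1, len(prices))]

# Hypothetical quarterly series: linear trend plus a repeating seasonal pattern.
seasonal_pattern = [5, -2, -4, 1]
y = [10 + 2 * t + seasonal_pattern[t % 4] for t in range(12)]

first_diff = difference(y, lag=1)        # trend removed, seasonal swing remains
seasonal_diff = difference(y, lag=4)     # seasonal pattern removed, trend leaves a constant
both = difference(seasonal_diff, lag=1)  # seasonal difference followed by first difference
```

In this toy series the seasonal difference is constant (the trend adds the same amount every four quarters), and differencing once more leaves only zeros. Note that each differencing step shortens the series by `lag` observations.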

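Stationarity is important because many time series forecasting methods, including the ARMA models that underlie ARIMA, assume that the mean, variance, and autocorrelation of the data do not change over time.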

Stationarity Test Demo

Test your time series data for stationarity using the Augmented Dickey-Fuller (ADF) test.

Test Results:

ADF Test Statistic: -

p-value: -

Is Stationary? -

Learn about stationarity tests

What is stationarity?

A time series is (weakly) stationary if its statistical properties (mean, variance, autocorrelation) do not change over time. Many time series models assume stationarity, so non-stationary data is usually transformed first, for example by differencing.

Augmented Dickey-Fuller (ADF) Test

The ADF test checks for a unit root in the time series. The presence of a unit root indicates non-stationarity.

Null Hypothesis (H₀): The time series has a unit root (non-stationary)

Alternative Hypothesis (H₁): The time series does not have a unit root (stationary)

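If the p-value is less than the significance level (typically 0.05), we reject the null hypothesis of a unit root and treat the time series as stationary. Otherwise we fail to reject the null hypothesis, which means the data does not provide enough evidence against a unit root.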

ADF test regression model:

\[ \Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_p \Delta y_{t-p} + \epsilon_t \]

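The test statistic is the t-statistic for the \(\gamma\) coefficient. Under the null hypothesis (\(\gamma = 0\)) it does not follow the usual t-distribution, so it is compared against Dickey-Fuller critical values. If the statistic is more negative than the critical value, \(\gamma\) is significantly less than zero and the series is treated as stationary (around a trend when the \(\beta t\) term is included).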
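To make the regression concrete, here is a minimal sketch of the simplest special case: constant only, with no trend term (\(\beta = 0\)) and no lagged differences (\(p = 0\)). It is a simplification for illustration, not the full ADF test. The function name `dickey_fuller_t` is hypothetical, and -2.86 is the asymptotic 5% critical value for this constant-only case. In practice a library routine (for example `adfuller` in statsmodels) would normally be used, since it also selects the lag order and reports a p-value.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic of gamma in: dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                        # regressor y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # response: first differences
    n = len(dy)
    x_mean = sum(x) / n
    dy_mean = sum(dy) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (di - dy_mean) for xi, di in zip(x, dy))
    gamma = sxy / sxx                                 # OLS slope
    alpha = dy_mean - gamma * x_mean                  # OLS intercept
    residuals = [di - alpha - gamma * xi for xi, di in zip(x, dy)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # two estimated parameters
    return gamma / math.sqrt(sigma2 / sxx)            # gamma / standard error

# Assumption: asymptotic 5% critical value, constant-only regression.
CRITICAL_5PCT = -2.86

random.seed(0)
white_noise = [random.gauss(0, 1) for _ in range(300)]  # stationary by construction
random_walk = [0.0]                                     # has a unit root by construction
for _ in range(299):
    random_walk.append(random_walk[-1] + random.gauss(0, 1))

t_noise = dickey_fuller_t(white_noise)
t_walk = dickey_fuller_t(random_walk)
noise_rejects_unit_root = t_noise < CRITICAL_5PCT
walk_rejects_unit_root = t_walk < CRITICAL_5PCT
```

For white noise the statistic is typically far below the critical value, so the unit root is rejected. For a random walk the test rejects only about 5% of the time by construction, so a single simulated path may occasionally land on either side.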

ARIMA/SARIMA Builder

ARIMA Parameters

Seasonal Components

ARIMA/SARIMA Model Formula:
\[ \text{ARIMA}(2,1,2) \times (1,1,1)_4 \]
\[ (1 - \phi_1 B - \phi_2 B^2)(1 - \Phi_1 B^4)(1 - B)(1 - B^4)y_t = (1 + \theta_1 B + \theta_2 B^2)(1 + \Theta_1 B^4)\epsilon_t \]

Learn about ARIMA and SARIMA models

ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average) models are used for forecasting time series data. They combine three components:

  • AR (AutoRegressive): Uses the relationship between an observation and a number of lagged observations.
    \[ \text{AR}(p): y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t \]
  • I (Integrated): Represents differencing of observations to make the time series stationary.
    \[ \text{I}(d): (1 - B)^d y_t \]
    where B is the backshift operator: \(B y_t = y_{t-1}\)
  • MA (Moving Average): Uses the dependency between an observation and residual errors from previous observations.
    \[ \text{MA}(q): y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q} \]
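The AR and MA equations above translate directly into recursions. The sketch below is illustrative only: `simulate_ar` and `simulate_ma` are hypothetical helper names, and values before the start of the sample are assumed to be zero.

```python
def simulate_ar(phi, eps, c=0.0):
    """AR(p): y_t = c + sum_i phi[i] * y_{t-1-i} + eps[t]; pre-sample y is taken as 0."""
    y = []
    for t, e in enumerate(eps):
        value = c + e
        for i, coef in enumerate(phi):
            if t - 1 - i >= 0:
                value += coef * y[t - 1 - i]
        y.append(value)
    return y

def simulate_ma(theta, eps, c=0.0):
    """MA(q): y_t = c + eps[t] + sum_j theta[j] * eps[t-1-j]; pre-sample eps is taken as 0."""
    y = []
    for t, e in enumerate(eps):
        value = c + e
        for j, coef in enumerate(theta):
            if t - 1 - j >= 0:
                value += coef * eps[t - 1 - j]
        y.append(value)
    return y

# A single unit shock at t = 0 shows how each component propagates it.
shock = [1.0, 0.0, 0.0, 0.0, 0.0]
ar_response = simulate_ar([0.5], shock)  # decays geometrically and never fully disappears
ma_response = simulate_ma([0.5], shock)  # disappears completely after q = 1 lag
```

The contrast in the responses is the practical difference between the two components: an AR term carries a shock forward indefinitely with shrinking weight, while an MA term remembers it for exactly q periods.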

SARIMA Models

SARIMA (Seasonal ARIMA) models add seasonal components to ARIMA models.

A SARIMA model is denoted as \(\text{SARIMA}(p,d,q)(P,D,Q)_s\), where:

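  • (p,d,q): Non-seasonal components (AR order, differencing order, MA order)
  • (P,D,Q): Seasonal components (seasonal AR order, seasonal differencing order, seasonal MA order)
  • s: The seasonal period, i.e. the number of observations per seasonal cycle (e.g., 4 for quarterly data, 12 for monthly data)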

The general form of a SARIMA model is:

\[ \Phi_P(B^s)\phi_p(B)(1 - B)^d(1 - B^s)^D y_t = \Theta_Q(B^s)\theta_q(B)\epsilon_t \]

Where:

  • \(\phi_p(B)\): Non-seasonal AR component
  • \((1-B)^d\): Non-seasonal differencing
  • \(\theta_q(B)\): Non-seasonal MA component
  • \(\Phi_P(B^s)\): Seasonal AR component
  • \((1-B^s)^D\): Seasonal differencing
  • \(\Theta_Q(B^s)\): Seasonal MA component
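Because \(B\) behaves like an ordinary algebraic variable, the differencing part \((1 - B)^d (1 - B^s)^D\) can be expanded by polynomial multiplication. The following sketch does this with coefficient lists. The function names are hypothetical and chosen only for this example.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists (index = power of B)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def differencing_operator(d, D, s):
    """Coefficients of (1 - B)^d (1 - B^s)^D as a polynomial in B."""
    nonseasonal = [1, -1]                   # 1 - B
    seasonal = [1] + [0] * (s - 1) + [-1]   # 1 - B^s
    result = [1]
    for _ in range(d):
        result = poly_mul(result, nonseasonal)
    for _ in range(D):
        result = poly_mul(result, seasonal)
    return result

def apply_operator(coeffs, y):
    """Apply sum_k coeffs[k] * B^k to y, dropping the first len(coeffs) - 1 points."""
    k_max = len(coeffs) - 1
    return [sum(c * y[t - k] for k, c in enumerate(coeffs))
            for t in range(k_max, len(y))]

# d = 1, D = 1, s = 4: (1 - B)(1 - B^4) = 1 - B - B^4 + B^5
coeffs = differencing_operator(1, 1, 4)
```

Here `coeffs` corresponds to \(y_t - y_{t-1} - y_{t-4} + y_{t-5}\), which is the series the AR and MA polynomials are fitted to when \(d = 1\), \(D = 1\), and \(s = 4\).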

Model Comparison Dashboard

Compare the forecasting performance of different time series models.

Model          RMSE   MAE   MAPE (%)
ARIMA          -      -     -
SARIMA         -      -     -
Holt-Winters   -      -     -
Learn about model evaluation

Model Evaluation Metrics

To compare time series forecasting models, we use several metrics:

Root Mean Square Error (RMSE): Measures the square root of the average squared differences between predicted and actual values.

\[ \text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \]

Mean Absolute Error (MAE): Measures the average absolute differences between predicted and actual values.

\[ \text{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i| \]

Mean Absolute Percentage Error (MAPE): Measures the average absolute percentage difference between predicted and actual values. It is undefined when any actual value \(y_i\) is zero.

\[ \text{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \]

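Lower values of these metrics indicate better model performance, provided the models are evaluated on the same data. RMSE and MAE are expressed in the units of the series, while MAPE is a scale-free percentage.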
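As a quick check of the three formulas, they can be written out in plain Python. This is a sketch for illustration: the function names and the sample values in `actual` and `forecast` are invented for the example and are not dashboard results.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; undefined if any actual value is 0."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is 0")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical hold-out values and forecasts (errors: -10, 20, 0).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

scores = {
    "RMSE": rmse(actual, forecast),
    "MAE": mae(actual, forecast),
    "MAPE (%)": mape(actual, forecast),
}
```

In this example RMSE comes out larger than MAE because squaring gives the single error of 20 more weight, which is why RMSE is the more sensitive of the two to occasional large misses.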

About the Models

  • ARIMA: Good for non-seasonal data; the differencing step (d) handles trends.
  • SARIMA: Adds seasonal components to ARIMA, making it suitable for seasonal data.
  • Holt-Winters: Uses exponential smoothing to capture level, trend, and seasonality.