Time Series Analysis Dashboard

Interactive Time Series Plot

Learn about differencing and transformation

Why do we difference time series data?

Differencing is a technique used to make a non-stationary time series stationary by removing trends and seasonality:

First Differencing: Removes trend by taking the difference between consecutive observations. The formula is:
\[ \nabla Y_t = Y_t - Y_{t-1} \]
This helps stabilize the mean by removing linear trends.
Seasonal Differencing: Removes seasonality by taking the difference between an observation and the same observation from the previous season. The formula is:
\[ \nabla_s Y_t = Y_t - Y_{t-s} \]
where \(s\) is the seasonal period (e.g., 4 for quarterly data, 12 for monthly data).
Log Returns: For financial time series, taking logarithmic returns can help stabilize variance. For small relative changes, the log return is close to the simple return:
\[ r_t = \log(Y_t) - \log(Y_{t-1}) \approx \frac{Y_t - Y_{t-1}}{Y_{t-1}} \]

Stationarity matters because many time series forecasting models, including the ARMA part of ARIMA, assume that the series they are fitted to is stationary.
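The three transformations above can be sketched in a few lines of plain Python. This is a minimal illustration, not the dashboard's own code: the helper names `difference` and `log_returns` and the quarterly series `y` are hypothetical, chosen so that the effect of each step is easy to read off.

```python
import math

def difference(y, lag=1):
    """Return y_t - y_{t-lag} for every t where both values exist."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def log_returns(y):
    """Return log(y_t) - log(y_{t-1}); requires strictly positive values."""
    return [math.log(b) - math.log(a) for a, b in zip(y, y[1:])]

# Hypothetical quarterly series: a trend of +1 per quarter plus a
# repeating seasonal pattern with period s = 4.
season = [5, -2, -4, 1]
y = [100 + t + season[t % 4] for t in range(12)]

first_diff = difference(y)            # lag 1: trend removed, seasonal pattern remains
seasonal_diff = difference(y, lag=4)  # lag s = 4: seasonal pattern removed
both = difference(seasonal_diff)      # seasonal difference, then first difference

# Hypothetical prices for the log-return example.
prices = [100.0, 101.0, 99.99]
returns = log_returns(prices)

print(first_diff)
print(seasonal_diff)
print(both)
print(returns)
```

In this constructed series the seasonal difference leaves a constant (the trend accumulated over four quarters), and applying a first difference on top of it leaves zeros. Real data will of course leave noise behind rather than exact constants.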

Stationarity Test Demo

Test your time series data for stationarity using the Augmented Dickey-Fuller (ADF) test.

Test Results:

ADF Test Statistic: -

p-value: -

Is Stationary? -

Learn about stationarity tests

What is stationarity?

A time series is stationary if its statistical properties (mean, variance, autocorrelation) do not change over time. Most time series models assume stationarity, so we often need to transform non-stationary data.

Augmented Dickey-Fuller (ADF) Test

The ADF test checks for a unit root in the time series. The presence of a unit root indicates non-stationarity.

Null Hypothesis (H₀): The time series has a unit root (non-stationary)

Alternative Hypothesis (H₁): The time series does not have a unit root (stationary)

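If the p-value is less than the chosen significance level (typically 0.05), we reject the null hypothesis and conclude that the time series is stationary. A larger p-value means the test found no evidence against a unit root; it does not prove that one exists.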

ADF test regression model:

\[ \Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_p \Delta y_{t-p} + \epsilon_t \]

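The test statistic is the t-statistic of the estimated \(\gamma\) coefficient. Under the null hypothesis \(\gamma = 0\) it does not follow the usual t-distribution, so it is compared against Dickey-Fuller critical values. If \(\gamma\) is significantly less than zero, the unit root is rejected; with the trend term \(\beta t\) included, this means the series is stationary around a deterministic trend.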
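To make the regression concrete, here is a deliberately simplified sketch: a plain Dickey-Fuller test with a constant only, that is, the regression above with \(\beta = 0\) and no lagged differences (\(p = 0\)). The function name `dickey_fuller_stat`, the simulated series, and the seed are assumptions for illustration. The cutoff of -2.86 is the approximate asymptotic 5% critical value for the constant-only case. In practice a library routine such as `adfuller` in statsmodels also handles lag selection and p-values.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-statistic of gamma in  dy_t = alpha + gamma * y_{t-1} + e_t.

    Simplified case: constant only, no trend term, no lagged differences.
    """
    x = y[:-1]                                   # y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx                            # OLS slope
    alpha = mean_dy - gamma * mean_x             # OLS intercept
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)    # standard error of the slope
    return gamma / se_gamma

CRIT_5PCT = -2.86  # approximate asymptotic 5% critical value, constant-only case

random.seed(0)  # arbitrary seed so the run is repeatable
white_noise = [random.gauss(0, 1) for _ in range(300)]

# A random walk is the cumulative sum of the same shocks (it has a unit root).
random_walk, level = [], 0.0
for e in white_noise:
    level += e
    random_walk.append(level)

stat_noise = dickey_fuller_stat(white_noise)
stat_walk = dickey_fuller_stat(random_walk)
for name, stat in [("white noise", stat_noise), ("random walk", stat_walk)]:
    print(f"{name}: statistic = {stat:.2f}, rejects unit root at 5%: {stat < CRIT_5PCT}")
```

For white noise the statistic is typically far below the critical value, while for a random walk it usually is not, which matches the hypotheses stated above.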

ARIMA/SARIMA Builder

ARIMA Parameters

p (Autoregressive order): 2

d (Differencing order): 1

q (Moving average order): 2

Seasonal Components

P (Seasonal AR order): 1

D (Seasonal differencing): 1

Q (Seasonal MA order): 1

s (Seasonal period): 4

ARIMA/SARIMA Model Formula:

\[ \text{ARIMA}(2,1,2) \times (1,1,1)_4 \]

\[ (1 - \phi_1 B - \phi_2 B^2)(1 - \Phi_1 B^4)(1 - B)(1 - B^4)y_t = (1 + \theta_1 B + \theta_2 B^2)(1 + \Theta_1 B^4)\epsilon_t \]
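As a worked step, the two differencing factors in this example can be multiplied out, using the fact that the backshift operator satisfies \(B^k y_t = y_{t-k}\):

\[ (1 - B)(1 - B^4) y_t = (1 - B - B^4 + B^5) y_t = y_t - y_{t-1} - y_{t-4} + y_{t-5} \]

So with \(d = 1\), \(D = 1\) and \(s = 4\), the AR and MA polynomials describe this doubly differenced series rather than \(y_t\) itself.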

Learn about ARIMA and SARIMA models

ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average) models are used for forecasting time series data. They combine three components:

AR (AutoRegressive): Uses the relationship between an observation and a number of lagged observations.
\[ \text{AR}(p): y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t \]
I (Integrated): Represents differencing of observations to make the time series stationary.
\[ \text{I}(d): (1 - B)^d y_t \]
where B is the backshift operator: \(B y_t = y_{t-1}\)
MA (Moving Average): Uses the dependency between an observation and residual errors from previous observations.
\[ \text{MA}(q): y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q} \]
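The three components can be illustrated with a small simulation. This is a sketch under stated assumptions: `simulate_arma` and `integrate` are hypothetical helper names, values before the start of the series are treated as zero, and the shock sequence is a single unit impulse so that the output can be checked by hand.

```python
def simulate_arma(phi, theta, shocks, c=0.0):
    """Generate y_t = c + sum(phi_i * y_{t-i}) + e_t + sum(theta_j * e_{t-j}).

    Values before the start of the series are treated as zero (an assumption).
    """
    y = []
    for t, e in enumerate(shocks):
        ar = sum(p * y[t - i] for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma = sum(q * shocks[t - j] for j, q in enumerate(theta, start=1) if t - j >= 0)
        y.append(c + ar + e + ma)
    return y

def integrate(w, start=0.0):
    """Undo one round of differencing with a cumulative sum (the 'I' with d = 1)."""
    out, level = [], start
    for v in w:
        level += v
        out.append(level)
    return out

impulse = [1.0, 0.0, 0.0, 0.0]  # a single shock at t = 0

ar1 = simulate_arma(phi=[0.5], theta=[], shocks=impulse)  # [1.0, 0.5, 0.25, 0.125]
ma1 = simulate_arma(phi=[], theta=[0.5], shocks=impulse)  # [1.0, 0.5, 0.0, 0.0]
arima_110 = integrate(ar1)                                # [1.0, 1.5, 1.75, 1.875]

print(ar1, ma1, arima_110)
```

The impulse responses show the difference between the components: an AR term lets a shock decay gradually, an MA term carries it for exactly \(q\) steps, and integration makes its effect on the level permanent.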

SARIMA Models

SARIMA (Seasonal ARIMA) models add seasonal components to ARIMA models.

A SARIMA model is denoted as SARIMA\((p,d,q) \times (P,D,Q)_s\), where:

(p,d,q): Non-seasonal components
(P,D,Q): Seasonal components
s: The number of observations per seasonal cycle (e.g., 4 for quarterly data, 12 for monthly data)

The general form of a SARIMA model is:

\[ \Phi_P(B^s)\phi_p(B)(1 - B)^d(1 - B^s)^D y_t = \Theta_Q(B^s)\theta_q(B)\epsilon_t \]

Where:

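\(\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p\): Non-seasonal AR component
\((1-B)^d\): Non-seasonal differencing
\(\theta_q(B) = 1 + \theta_1 B + \cdots + \theta_q B^q\): Non-seasonal MA component
\(\Phi_P(B^s) = 1 - \Phi_1 B^s - \cdots - \Phi_P B^{Ps}\): Seasonal AR component
\((1-B^s)^D\): Seasonal differencing
\(\Theta_Q(B^s) = 1 + \Theta_1 B^s + \cdots + \Theta_Q B^{Qs}\): Seasonal MA component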
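Because every factor on the left-hand side is a polynomial in \(B\), the whole product can be expanded mechanically into a single lag polynomial. The sketch below does this with coefficient lists (index = power of \(B\)). The function names and the coefficient value 0.5 are hypothetical, and the sign convention assumed is the usual one in which AR polynomials have the form \(1 - \phi_1 B - \cdots\).

```python
def poly_mul(a, b):
    """Multiply two lag polynomials given as coefficient lists (index = power of B)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def lag_poly(coeffs, s, sign):
    """Build 1 + sign * (c_1 B^s + c_2 B^(2s) + ...) as a coefficient list."""
    poly = [0.0] * (len(coeffs) * s + 1)
    poly[0] = 1.0
    for k, c in enumerate(coeffs, start=1):
        poly[k * s] = sign * c
    return poly

def sarima_lhs(phi, Phi, d, D, s):
    """Expand Phi_P(B^s) * phi_p(B) * (1 - B)^d * (1 - B^s)^D into one polynomial."""
    poly = poly_mul(lag_poly(Phi, s, -1), lag_poly(phi, 1, -1))
    for _ in range(d):
        poly = poly_mul(poly, [1.0, -1.0])           # non-seasonal difference
    for _ in range(D):
        poly = poly_mul(poly, lag_poly([1.0], s, -1))  # seasonal difference
    return poly

# ARIMA(1,1,0) with a hypothetical phi_1 = 0.5: (1 - 0.5B)(1 - B)
lhs = sarima_lhs(phi=[0.5], Phi=[], d=1, D=0, s=4)
print(lhs)  # [1.0, -1.5, 0.5]
```

Reading the result as \(1 - 1.5B + 0.5B^2\) gives the recursion \(y_t = 1.5\,y_{t-1} - 0.5\,y_{t-2} + \epsilon_t\), which is how a fitted model is actually applied to the undifferenced series.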

Model Comparison Dashboard

Compare the forecasting performance of different time series models.

Model	RMSE	MAE	MAPE (%)
ARIMA	-	-	-
SARIMA	-	-	-
Holt-Winters	-	-	-

Learn about model evaluation

Model Evaluation Metrics

To compare time series forecasting models, we use several metrics:

Root Mean Square Error (RMSE): Measures the square root of the average squared differences between predicted and actual values.

\[ \text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \]

Mean Absolute Error (MAE): Measures the average absolute differences between predicted and actual values.

\[ \text{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i| \]

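Mean Absolute Percentage Error (MAPE): Measures the average absolute percentage difference between predicted and actual values. It is undefined when any actual value \(y_i\) is zero and becomes unstable when actual values are close to zero.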

\[ \text{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \]

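Lower values of these metrics indicate better model performance, provided all models are evaluated on the same held-out data. RMSE and MAE are expressed in the units of the series, while MAPE is a scale-free percentage.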
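The three formulas translate directly into code. The following is a minimal sketch in plain Python; the function names and the sample values are hypothetical, and raising an error for zero actual values in `mape` is a design choice made here, not something the dashboard specifies.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical actual values and forecasts.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"MAE  = {mae(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.2f}%")
```

Here the two non-zero errors have the same size (10), yet they contribute differently to MAPE because they are measured relative to actual values of 100 and 200.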

About the Models

ARIMA: Suited to non-seasonal data; the differencing order d handles trends.
SARIMA: Adds seasonal components to ARIMA, making it suitable for seasonal data.
Holt-Winters: Uses exponential smoothing to capture level, trend, and seasonality.
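To show what "level, trend, and seasonality" means in practice, here is a minimal sketch of the additive Holt-Winters recursions. Several things are assumptions: the additive (rather than multiplicative) form, the simple initialization from the first two seasonal cycles, the function name, and the smoothing parameters, which are fixed by hand here instead of being optimized.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast with seasonal period m.

    Initial level, trend and seasonal terms come from the first two
    seasonal cycles (a simple heuristic, assumed for this sketch).
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasonal cycles")
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonal = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        s_prev = seasonal[t % m]   # seasonal term from one cycle ago
        prev_level = level
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + trend)
        new_trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y[t] - prev_level - trend) + (1 - gamma) * s_prev
        trend = new_trend

    n = len(y)
    # h-step-ahead forecast: level + h * trend + matching seasonal term
    return [level + (h + 1) * trend + seasonal[(n + h) % m] for h in range(horizon)]

# Hypothetical quarterly series: a pure seasonal pattern with no trend.
series = [10, 20, 30, 40] * 3
forecast = holt_winters_additive(series, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
print(forecast)
```

For this purely periodic input the forecast repeats the seasonal pattern (up to floating-point error), since the level stays constant and the trend stays at zero. On real data the smoothing parameters would normally be chosen by minimizing forecast error.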