Week 7 Recap: Differencing

Definition

Differencing is a technique used to transform a non-stationary time series into a stationary one by computing the differences between consecutive observations.

First Difference: Δy_t = y_t - y_{t-1}
Second Difference: Δ²y_t = Δy_t - Δy_{t-1} = y_t - 2y_{t-1} + y_{t-2}

Removes trend from time series data; seasonal differencing (y_t - y_{t-s}) removes seasonality
Essential preprocessing step for ARIMA models
Helps achieve stationarity required for forecasting
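
Worked Example (sketch)

A minimal sketch of the two difference formulas above, using only the Python standard library. The function name `difference` and the sample series are illustrative assumptions, not part of the course material.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: each pass maps y_t to y_t - y_{t-1}."""
    values = list(series)
    for _ in range(order):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical series with a quadratic trend: 1, 4, 9, 16, 25
y = [1, 4, 9, 16, 25]
first = difference(y, order=1)   # a linear trend remains
second = difference(y, order=2)  # constant values: the trend is gone
print(first)
print(second)
```

Each pass shortens the series by one observation, which is one reason to difference no more often than stationarity requires.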

Week 7 Recap: ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average)

ARIMA combines autoregression (AR), differencing (I), and moving average (MA) components to model and forecast time series data.

ARIMA(p,d,q) where:
• p = order of autoregression
• d = degree of differencing
• q = order of moving average

AR(p): Current value depends on previous p values
I(d): Differencing applied d times for stationarity
MA(q): Current value depends on previous q forecast errors
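
Worked Example (sketch)

To show how the three components combine, here is a rough one-step forecast for an ARIMA(1,1,1) with coefficients supplied by hand. The function name, the coefficient values, and the zero start-up values for the first error and difference are all assumptions made for illustration; real software estimates the coefficients from data.

```python
def arima_111_forecast(y, phi, theta):
    """One-step-ahead forecast from an ARIMA(1,1,1) with given coefficients.

    Assumes zero start-up values for the pre-sample difference and error.
    """
    # I(1): work with the differenced series w_t = y_t - y_{t-1}
    w = [curr - prev for prev, curr in zip(y, y[1:])]

    w_prev, eps_prev = 0.0, 0.0
    for w_t in w:
        # AR(1) term uses the previous difference, MA(1) term the previous error
        w_hat = phi * w_prev + theta * eps_prev
        eps_prev = w_t - w_hat  # forecast error at time t
        w_prev = w_t

    # Forecast the next difference, then undo the differencing
    w_next = phi * w_prev + theta * eps_prev
    return y[-1] + w_next


y = [10, 12, 13]  # hypothetical observations
forecast_ar_only = arima_111_forecast(y, phi=0.5, theta=0.0)
forecast_full = arima_111_forecast(y, phi=0.5, theta=0.25)
print(forecast_ar_only, forecast_full)
```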

Week 7 Recap: SARIMA Models

SARIMA (Seasonal ARIMA)

SARIMA extends ARIMA by adding seasonal components to handle periodic patterns in time series data.

SARIMA(p,d,q)(P,D,Q)_s where:
• (P,D,Q) = seasonal AR, differencing, MA orders
• s = seasonal period (e.g., 12 for monthly data)

Captures both non-seasonal and seasonal patterns
Ideal for data with recurring seasonal behavior
Commonly used for monthly or quarterly sales forecasting
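
Worked Example (sketch)

The seasonal differencing order D works at lag s rather than lag 1. The sketch below assumes a hypothetical quarterly series (s = 4) built from a repeating pattern plus steady yearly growth; the function name is illustrative.

```python
def seasonal_difference(series, period):
    """Return y_t - y_{t-period} for every t with a full season of history."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Hypothetical quarterly data: a repeating seasonal pattern plus +4 growth per year
pattern = [10, 20, 15, 5]
y = [pattern[t % 4] + 4 * (t // 4) for t in range(12)]
seasonal_diff = seasonal_difference(y, period=4)
print(seasonal_diff)
```

The seasonal pattern cancels out and only the year-over-year change is left.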

Week 7 Recap: Unit Root Testing (ADF)

Augmented Dickey-Fuller (ADF) Test

Statistical test to determine if a time series has a unit root (is non-stationary) or is stationary.

H₀: Time series has unit root (non-stationary)
H₁: Time series is stationary

If p-value < 0.05: Reject H₀ (series is stationary)

Interpretation Example

ADF Statistic = -4.2, p-value = 0.001
→ Reject H₀: Series is stationary
ADF Statistic = -1.5, p-value = 0.523
→ Fail to reject H₀: Series is non-stationary
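
Decision Rule (sketch)

The decision rule above can be written as a small helper. This sketch only interprets a p-value that has already been computed; in practice the statistic and p-value would come from a library routine such as `adfuller` in statsmodels. The function name `interpret_adf` is an assumption for illustration.

```python
def interpret_adf(p_value, alpha=0.05):
    """Translate an ADF p-value into the decision used on the slides."""
    if p_value < alpha:
        return "Reject H0: series is stationary"
    # Failing to reject is not proof of a unit root, only a lack of evidence against it
    return "Fail to reject H0: treat series as non-stationary"


decision_a = interpret_adf(0.001)  # first example above
decision_b = interpret_adf(0.523)  # second example above
print(decision_a)
print(decision_b)
```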

Key Point

Always test for stationarity before fitting ARIMA models; the result guides the choice of d. Regressing non-stationary series on one another can produce spurious results.

Quiz 1: Differencing and Stationarity

Question: What is the primary purpose of differencing in time series analysis?

A) To increase the data size

B) To make the time series stationary

C) To remove outliers from the data

D) To improve data visualization

Quiz 2: ARIMA Components

Question: In ARIMA(2,1,3), what does the middle parameter '1' represent?

A) Moving average order

B) Degree of differencing

C) Autoregressive order

D) Seasonal period

Quiz 3: ADF Test Interpretation

Question: If ADF test p-value = 0.001, what can we conclude?

A) The series is non-stationary

B) The series is stationary

C) We need more data

D) The test is inconclusive

Week 8 Agenda: From Time Series to Relationships

Today's Learning Objectives

Understand the difference between correlation and causation
Learn about Granger causality testing
Introduction to Vector Autoregressive (VAR) models
Apply VAR models for multivariate forecasting

Connection to Previous Week

Last week we learned how to model and forecast single time series using ARIMA/SARIMA. This week, we extend our analysis to multiple interconnected time series and examine how they influence each other over time.

Class Structure

Part 1: Correlation vs. Causation concepts
Part 2: Granger causality testing
Part 3: VAR models and applications

Understanding Correlation

Correlation

Correlation measures the strength and direction of a linear relationship between two variables. It ranges from -1 to +1.

Positive Correlation (+0.8)

As X increases, Y increases

No Correlation (0.0)

No linear relationship

Negative Correlation (-0.7)

As X increases, Y decreases
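
Worked Example (sketch)

A minimal sketch of the Pearson correlation coefficient, written out from its definition. The three small datasets are made up for illustration; the last one shows that a perfectly dependent but non-linear relationship can still have r = 0.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises
r_nonlinear = pearson_r(x, [4, 1, 0, 1, 4])   # y = (x - 3)^2, no linear relationship
print(r_positive, r_negative, r_nonlinear)
```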

Correlation Case Study

Case Study: Student Study Time vs. Exam Scores

Correlation coefficient: r = +0.85
Strong positive correlation observed
Students who study more hours tend to achieve higher exam scores
However, this doesn't prove that studying causes higher scores

Important Note

High correlation does not imply causation. There might be other factors influencing both variables.

Spurious Correlation: When Correlation Misleads

Example: Real Estate Sales and Baldness

Observation: Areas with higher percentage of bald people tend to have higher real estate sales.

The Reality

Both variables are correlated with a hidden third variable: age
Older populations → more baldness
Older populations → higher income → more real estate transactions
Baldness itself doesn't cause real estate sales!
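
Simulation (sketch)

The hidden-variable story can be checked with a small simulation. All numbers below are invented: age drives both baldness and sales, and neither affects the other. The partial correlation (correlation after controlling for age) should then be close to zero even though the raw correlation is clearly positive.

```python
import math
import random


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
age = [rng.uniform(20, 80) for _ in range(2000)]
baldness = [0.01 * a + rng.gauss(0, 0.2) for a in age]  # depends on age only
sales = [0.5 * a + rng.gauss(0, 10) for a in age]       # depends on age only

raw = pearson_r(baldness, sales)
controlled = partial_r(baldness, sales, age)
print(round(raw, 2), round(controlled, 2))
```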

Cautions in Correlation Analysis

Common Misunderstandings

Confounding variables: Third factors affecting both variables
Sample size effects: Small samples can show misleading correlations
Non-linear relationships: Correlation only measures linear relationships
Temporal effects: Correlation can change over time

Famous Spurious Correlations

Ice cream sales and drowning deaths (both peak in summer)
Shoe size and reading ability in children (both increase with age)
Number of firefighters and fire damage (larger fires need more firefighters)

Consequences of Misinterpreting Correlation

Business and Policy Implications

Wasted Resources: Investing in factors that don't actually drive outcomes
Wrong Strategic Decisions: Basing strategies on spurious relationships
Missed Opportunities: Ignoring true causal factors
Regulatory Issues: Creating policies based on false assumptions

Real-World Example

Scenario: A company notices high correlation between employee coffee consumption and productivity.

Wrong Action: Providing free coffee to all employees expecting productivity boost.

Reality: Productive employees work longer hours → drink more coffee. Coffee itself may not cause productivity.

Introduction to Causation

Causation

Causation implies that one event (cause) directly produces or brings about another event (effect). It establishes a directional relationship where changes in the cause lead to changes in the effect.

Correlation

X ↔ Y

Bidirectional association

Causation

X → Y

Unidirectional influence

Temporal precedence: Cause must precede effect in time
Covariation: Changes in cause must relate to changes in effect
Non-spuriousness: Relationship not due to third variables

Causation Example: Weather and Ice Cream Sales

Causal Relationship Example

Temperature → Ice Cream Sales

Temporal precedence: Temperature changes precede sales changes
Mechanism: Hot weather increases desire for cooling products
Intervention test: Artificially cooling an area would reduce ice cream sales

Correlation vs. Causation: The Key Difference

Understanding the Directionality

Correlation: Symmetric, bidirectional relationship (X ↔ Y)
Causation: Asymmetric, unidirectional influence (X → Y)

Correlation Example

Height and Weight

Taller people tend to weigh more, and heavier people tend to be taller. The relationship works in both directions due to shared genetic and developmental factors.

Causation Example

Exercise and Fitness

Regular exercise causes improved fitness. While fitness might motivate more exercise, the primary causal direction is exercise → fitness improvement.

Another Causation Example

Education Investment and Economic Growth

Causal mechanism: Education → Skilled workforce → Higher productivity → Economic growth
Time lag: Education investments show economic returns after several years
Direction matters: While economic growth can fund more education, the primary causal flow is education → growth

Quiz 4: Correlation vs. Causation

Question: Which statement best describes the difference between correlation and causation?

A) Correlation is stronger than causation

B) Causation implies direction, correlation does not

C) Correlation requires more data than causation

D) They are essentially the same concept

Determining Causation: Counterfactual Reasoning

Counterfactual Analysis

Counterfactual reasoning asks what would have happened under conditions that did not actually occur. Causal claims can be explained in terms of counterfactual conditionals of the form: "If A had not occurred, C would not have occurred."

Example

"If kangaroos had no tails, they would not be upright."

This statement implies that kangaroo tails cause their upright posture. We test causation by imagining the absence of the proposed cause.

Two Types of Reasoning for Determining Causation:

"But for" reasoning (counterfactual): But for the cause, the effect would not have occurred
"Substantial factor" reasoning: Was one event a substantial factor in causing another event?

Determining Causation with Evidence

Understanding Causation Through Evidence

Evidence-Based Approaches

Temporal sequence: Cause must precede effect
Controlled experiments: Manipulate cause, observe effect
Natural experiments: Observe quasi-experimental conditions
Statistical controls: Account for confounding variables

Data-Driven Control Experiments

Randomized Controlled Trial Example

Research Question: Does a new teaching method improve student performance?

Control Group: Traditional teaching method (Average: 72%)
Treatment Group: New teaching method (Average: 84%)
Random assignment eliminates selection bias
Conclusion: If the difference is statistically significant, the new method causes an estimated 12-point improvement
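
Significance Check (sketch)

One way to check whether a 12-point gap could arise from random assignment alone is a permutation test. The individual scores below are hypothetical, chosen only so that the group averages match the 72% and 84% above; the function names are illustrative.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(control, treatment, n_permutations=5000, seed=0):
    """Share of random relabelings with a gap at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend group labels were assigned differently
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical exam scores (group means 72 and 84)
control = [70, 68, 75, 72, 74, 71, 73, 73]
treatment = [82, 85, 80, 88, 84, 83, 86, 84]
observed, p_value = permutation_p_value(control, treatment)
print(observed, p_value < 0.05)
```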

Current Data-Driven Causation Methods

Modern Approaches to Overcome Traditional Limitations

A/B Testing: Randomized experiments for digital products
Granger Causality: Time-series based causation testing
Synthetic Control Methods: Build a counterfactual from comparison series (e.g., Google's CausalImpact package, which uses Bayesian structural time series)

Synthetic Control Example

Scenario: Measuring impact of a marketing campaign

Blue line: Actual sales with campaign
Red dashed: Predicted sales without campaign (synthetic control)
Difference shows causal impact of campaign
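
Counterfactual Calculation (sketch)

The idea behind the chart can be sketched with a plain regression stand-in. This is not the CausalImpact algorithm itself: it fits a straight line between a control market and the target market before the campaign, then uses that line to predict what sales would have been without it. All figures and names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


# Hypothetical pre-campaign data: control market vs. target market sales
control_pre = [100, 110, 105, 120, 115]
target_pre = [2 * c + 10 for c in control_pre]  # exact relation, for easy checking

# Hypothetical post-campaign data (the campaign ran only in the target market)
control_post = [118, 125, 130]
target_post = [266, 285, 290]

intercept, slope = fit_line(control_pre, target_pre)
counterfactual = [intercept + slope * c for c in control_post]  # "red dashed line"
impact = [actual - cf for actual, cf in zip(target_post, counterfactual)]
print(counterfactual, impact, sum(impact))
```

The estimate is only as good as the assumption that the control market was unaffected by the campaign and that the pre-period relationship would have continued.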

Quiz 5: Causation Methods

Question: Which method provides the strongest evidence for establishing causation?

A) Simple correlation analysis

B) Descriptive statistics

C) Randomized controlled experiments

D) Data visualization

Clive Granger (1934-2009)

Nobel Prize Winner in Economics (2003)

British econometrician who revolutionized time series analysis
Key contribution: Developed methods for analyzing causal relationships in time series data
Granger Causality: Statistical concept of causality based on prediction improvement
Co-integration theory: Methods for analyzing long-run relationships between variables

Impact on Time Series Analysis

Granger's work bridged the gap between pure statistical correlation and meaningful economic causality, providing practical tools for economists and data scientists to identify directional relationships in time series data.

Granger Causality

Granger Causality Definition

Variable X is said to "Granger-cause" variable Y if past values of X provide statistically significant information about future values of Y, beyond what is already contained in past values of Y alone.

Key Principle:

If including lagged values of X improves the prediction of Y compared to using only lagged values of Y, then X Granger-causes Y.

Mathematical Framework

Model 1 (restricted): Y_t = α + β₁Y_{t-1} + β₂Y_{t-2} + ... + ε_t
Model 2 (unrestricted): Y_t = α + β₁Y_{t-1} + β₂Y_{t-2} + ... + γ₁X_{t-1} + γ₂X_{t-2} + ... + ε_t
Test: If an F-test rejects H₀: γ₁ = γ₂ = ... = 0 (Model 2 fits significantly better than Model 1), then X Granger-causes Y
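
Worked Example (sketch)

The comparison of Model 1 and Model 2 can be sketched from scratch with ordinary least squares. The code below computes only the F statistic, F = ((RSS₁ - RSS₂) / q) / (RSS₂ / (n - k)), where q is the number of added X lags and k the number of coefficients in Model 2; turning it into a p-value needs the F distribution, which is left out here. The helper names and the simulated data (X leads Y by one period) are assumptions for illustration.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(rows, target):
    """Residual sum of squares from an OLS fit of target on the columns of rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, target))


def granger_f(x, y, lags):
    """F statistic for H0: lagged x adds nothing to a model of y on its own lags."""
    times = range(lags, len(y))
    target = [y[t] for t in times]
    own = [[1.0] + [y[t - i] for i in range(1, lags + 1)] for t in times]
    full = [row + [x[t - i] for i in range(1, lags + 1)] for row, t in zip(own, times)]
    rss_restricted = rss(own, target)     # Model 1
    rss_unrestricted = rss(full, target)  # Model 2
    dof = len(target) - len(full[0])
    return ((rss_restricted - rss_unrestricted) / lags) / (rss_unrestricted / dof)


# Simulated data in which x leads y by one period (hypothetical)
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.5) for t in range(1, 300)]

f_x_to_y = granger_f(x, y, lags=2)
f_y_to_x = granger_f(y, x, lags=2)
print(round(f_x_to_y, 1), round(f_y_to_x, 1))
```

A large F in one direction and a small F in the other is the pattern expected when X helps predict Y but not the reverse.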

Granger Causality Example

Example: Money Supply and Inflation

Testing Process:

Model 1: Inflation_t = α + β₁Inflation_{t-1} + β₂Inflation_{t-2} + ε_t
Model 2: Inflation_t = α + β₁Inflation_{t-1} + β₂Inflation_{t-2} + γ₁MoneySupply_{t-1} + γ₂MoneySupply_{t-2} + ε_t
F-test result: p-value = 0.023 < 0.05
Conclusion: Money supply Granger-causes inflation (past money supply helps predict inflation; this is predictive evidence, not proof of true causation)

Using Granger Causality

Applications in Forecasting

Forecasting performance improvement: Identify which variables help predict others
Variable selection: Choose relevant predictors for multivariate models
Policy analysis: Understand lead-lag relationships between economic indicators

Prerequisites

Data must be stationary: Apply differencing if necessary
Appropriate lag selection: Use information criteria (AIC, BIC)
Sample size considerations: Need sufficient observations for reliable tests

Practical Steps

Test for stationarity (ADF test)
Select optimal lag length
Perform Granger causality test
Interpret results in context

Granger Causality vs. Correlation in Forecasting

Correlation-Based Forecasting

Uses contemporary relationships
X_t and Y_t at same time period
Good for nowcasting
Limited predictive power

Granger Causality Forecasting

Uses temporal relationships
X_{t-1}, X_{t-2} predict Y_t
True forecasting capability
Can predict future values

Key Insight

Granger causality provides actionable forecasting relationships, while simple correlation may not have predictive power for future periods.

Vector Autoregressive (VAR) Models

VAR Model Definition

A Vector Autoregressive (VAR) model is a multivariate time series model where each variable is modeled as a linear function of past values of itself and past values of all other variables in the system.

VAR(p) Model Structure:

Y_t = c + Φ₁Y_{t-1} + Φ₂Y_{t-2} + ... + Φ_pY_{t-p} + ε_t

Where:

Y_t = vector of endogenous variables at time t
c = vector of constants
Φᵢ = coefficient matrices for lag i
ε_t = vector of error terms

Simple Example: 2-variable VAR(1)

GDP_t = α₁ + β₁GDP_{t-1} + γ₁Inflation_{t-1} + ε_{1,t}

Inflation_t = α₂ + β₂GDP_{t-1} + γ₂Inflation_{t-1} + ε_{2,t}
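
Worked Example (sketch)

To make the two equations concrete, the sketch below evaluates them once with made-up coefficients. None of the numbers are estimates from real data, and the error terms are set to their expected value of zero.

```python
def var1_step(c, phi, y_prev):
    """Compute c + Phi @ y_prev for a VAR(1) written with plain lists."""
    return [c_i + sum(p * v for p, v in zip(row, y_prev)) for c_i, row in zip(c, phi)]


# Hypothetical coefficients; variable order is [GDP, Inflation]
c = [0.5, 0.2]        # alpha_1, alpha_2
phi = [[0.6, -0.1],   # beta_1, gamma_1 (GDP equation)
       [0.2, 0.7]]    # beta_2, gamma_2 (Inflation equation)
y_prev = [2.0, 3.0]   # GDP growth and inflation at t-1, in percent

y_next = var1_step(c, phi, y_prev)
print(y_next)  # roughly [1.4, 2.7]
```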

Understanding the Lag Parameter 'p'

What does 'p' mean? The lag parameter 'p' tells us how many past time periods to include in our model.

VAR(1): Uses only 1 period back (yesterday's values predict today)
VAR(2): Uses 2 periods back (yesterday's + day before yesterday's values)
VAR(4): Uses 4 periods back (last 4 periods predict current period)

How to Select Best 'p' Value:

Test multiple values: Try p = 1, 2, 3, 4, 5...
Use Information Criteria: Calculate AIC or BIC for each model
Choose lowest BIC: The model with lowest BIC value wins
Balance complexity: Higher p = more parameters = need more data

Rule of thumb: Start with p = 1 to 4 for most economic data. Monthly data might need p = 12 for seasonality.
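
Parameter Count (sketch)

The "higher p = more parameters" point is easy to quantify. Assuming a VAR with k variables, p lags, and one intercept per equation, each of the k equations has k×p + 1 coefficients. The function name is illustrative.

```python
def var_parameter_count(n_variables, lags):
    """Coefficients in a VAR(p) with intercepts: k equations with k*p + 1 terms each."""
    return n_variables * (n_variables * lags + 1)


counts = {p: var_parameter_count(3, p) for p in (1, 2, 4, 12)}
for p, count in counts.items():
    print(f"VAR({p}) with 3 variables: {count} coefficients")
```

With three variables, going from p = 1 to p = 12 raises the count from 12 to 111, which is why long lag lengths need long samples.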

Characteristics of VAR Models

Key Properties

Endogenous variables: All variables are treated as interdependent
Symmetric treatment: Each equation has the same lag structure
Reduced form: No contemporaneous variables on the right-hand side; same-period links show up only in the correlation of the error terms
Dynamic system: Captures evolving relationships over time

Advantages of VAR Models

Capture complex interdependencies between multiple time series
Useful for forecasting multiple variables simultaneously
Can analyze impulse responses and variance decomposition
Less restrictive than structural equation models

Limitations

Require large sample sizes due to many parameters
All variables must be stationary; co-integrated series are usually modeled with a vector error correction model (VECM) instead
Interpretation can be challenging with many variables
Curse of dimensionality with increasing variables

How VAR Models Calculate Forecasts: Step-by-Step

Economic Indicators VAR Model

Variables: GDP Growth, Inflation Rate, Interest Rate

Step-by-Step VAR Forecasting Process

Step 1 - Collect Historical Data:
Gather past values for all variables (e.g., last 20 quarters of GDP, Inflation, Interest rates)
Step 2 - Estimate the Model:
The computer learns how each variable depends on past values of ALL variables using statistical methods
Step 3 - Create Forecast Equations:
• GDP_next = 0.8×GDP_current + 0.3×GDP_previous - 0.2×Inflation_current + ...
• Inflation_next = 0.6×Inflation_current + 0.1×GDP_current + 0.2×Interest_current + ...
• Interest_next = 0.7×Interest_current + 0.3×Inflation_current + ...
Step 4 - Make First Forecast:
Plug in the most recent actual values to predict next period's values for all three variables
Step 5 - Continue Forecasting:
Use the forecasted values from Step 4 as inputs to predict the period after that, and so on
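
Worked Example (sketch)

Steps 4 and 5 can be sketched as a loop that feeds each forecast back in as the next input. For brevity the example uses a hypothetical two-variable VAR(1) rather than the three-variable model above; the coefficients are invented and chosen so the arithmetic is easy to follow.

```python
def var_forecast(c, phis, history, steps):
    """Iterate a VAR(p) forward, reusing each forecast as an input.

    phis[i] is the coefficient matrix for lag i + 1; history is oldest-first.
    """
    p = len(phis)
    path = [list(v) for v in history[-p:]]
    forecasts = []
    for _ in range(steps):
        nxt = list(c)
        for lag, phi in enumerate(phis, start=1):
            past = path[-lag]  # actual values first, forecasts later
            for i, row in enumerate(phi):
                nxt[i] += sum(a * b for a, b in zip(row, past))
        path.append(nxt)
        forecasts.append(nxt)
    return forecasts


# Hypothetical two-variable VAR(1)
c = [1.0, 0.0]
phis = [[[0.5, 0.0],
         [0.5, 0.5]]]
history = [[2.0, 4.0]]  # most recent actual values

forecasts = var_forecast(c, phis, history, steps=3)
print(forecasts)
```

Because later steps rely on earlier forecasts rather than actual values, forecast uncertainty grows with the horizon.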

Why VAR is Powerful

Key Advantage: Unlike ARIMA (which looks at one variable at a time), VAR considers how GDP growth in this period affects inflation in the next period, which then affects interest rates in the period after, which feeds back to affect GDP growth again!

Model Selection: BIC and Schwarz Criterion

Bayesian Information Criterion (BIC)

BIC, also known as Schwarz Criterion (SC), is a model selection criterion that balances model fit with model complexity. It helps determine the optimal number of lags in VAR models.

BIC Formula:

BIC = -2ln(L) + k×ln(n)

Where:

L = maximized likelihood of the model
k = number of parameters
n = number of observations

Simple Explanation

Think of BIC as a "smart judge" that evaluates models on two criteria:

How well does it fit? (Lower error = better)
How complex is it? (Simpler models preferred)

Rule: Choose the model with the lowest BIC value

Why BIC for VAR Models?

BIC penalizes complexity more heavily than AIC, making it ideal for VAR models where the number of parameters grows quickly with additional lags.
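
Worked Example (sketch)

A short sketch of lag selection with the BIC formula above, alongside AIC (-2ln(L) + 2k) for comparison. The log-likelihoods and parameter counts are hypothetical values for three candidate lag orders, not results from a fitted model.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 ln(L) + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def aic(log_likelihood, n_params):
    """AIC = -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical candidates: lag order -> (log-likelihood, number of parameters)
candidates = {1: (-250.0, 12), 2: (-240.0, 21), 3: (-236.0, 30)}
n_obs = 100

bic_by_lag = {p: bic(ll, k, n_obs) for p, (ll, k) in candidates.items()}
aic_by_lag = {p: aic(ll, k) for p, (ll, k) in candidates.items()}

best_bic = min(bic_by_lag, key=bic_by_lag.get)
best_aic = min(aic_by_lag, key=aic_by_lag.get)
print(best_bic, best_aic)
```

With n = 100, each extra parameter costs ln(100) ≈ 4.6 under BIC but only 2 under AIC, so in this made-up case BIC picks the one-lag model while AIC picks two lags.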

Final Quiz: VAR and Granger Causality

Question: What is the main advantage of VAR models over univariate ARIMA models?

A) VAR models are simpler to interpret

B) VAR models capture interdependencies between multiple variables

C) VAR models require less data

D) VAR models are always more accurate

Course Summary

Week 7: Individual time series analysis (ARIMA, SARIMA)
Week 8: Relationships between time series (Correlation, Causation, VAR)
Next week: Advanced forecasting techniques and model validation