Week 7 Recap: Differencing

Definition

Differencing is a technique used to transform a non-stationary time series into a stationary one by computing the differences between consecutive observations.

First Difference: Δy_t = y_t - y_{t-1}
Second Difference: Δ²y_t = Δy_t - Δy_{t-1}
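
A minimal sketch of the two formulas above. The series is made up for illustration (a quadratic trend) and the helper name `difference` is an assumption, not part of the course material.

```python
def difference(series):
    """Return the first difference: y_t - y_{t-1} for each consecutive pair."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Hypothetical series with a quadratic trend (not from the slides).
y = [3, 5, 8, 12, 17]

first_diff = difference(y)            # still trending upward
second_diff = difference(first_diff)  # constant, so the trend is gone

print(first_diff)   # [2, 3, 4, 5]
print(second_diff)  # [1, 1, 1]
```

Each round of differencing shortens the series by one observation, which is one reason to use the smallest d that achieves stationarity.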

Week 7 Recap: ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average)

ARIMA combines autoregression (AR), differencing (I), and moving average (MA) components to model and forecast time series data.

ARIMA(p,d,q) where:
• p = order of autoregression
• d = degree of differencing
• q = order of moving average

Week 7 Recap: SARIMA Models

SARIMA (Seasonal ARIMA)

SARIMA extends ARIMA by adding seasonal components to handle periodic patterns in time series data.

SARIMA(p,d,q)(P,D,Q)s where:
• (P,D,Q) = seasonal AR, differencing, MA orders
• s = seasonal period (e.g., 12 for monthly data)
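
Seasonal differencing (the D in the notation above) subtracts the value from one full season earlier rather than the previous observation. A small sketch, assuming a hypothetical quarterly series (s = 4); the function name is illustrative only.

```python
def seasonal_difference(series, s):
    """Return y_t - y_{t-s}; the first s observations are lost."""
    return [series[t] - series[t - s] for t in range(s, len(series))]

# Hypothetical quarterly data: a repeating within-year pattern
# plus a steady rise of 2 units per year.
quarterly = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]

deseasonalized = seasonal_difference(quarterly, 4)
print(deseasonalized)  # [2, 2, 2, 2, 2, 2, 2, 2]
```

The repeating pattern cancels out, leaving only the year-over-year change.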

Week 7 Recap: Unit Root Testing (ADF)

Augmented Dickey-Fuller (ADF) Test

Statistical test to determine if a time series has a unit root (is non-stationary) or is stationary.

H₀: Time series has unit root (non-stationary)
H₁: Time series is stationary

If p-value < 0.05: Reject H₀ (series is stationary)

Interpretation Example

  • ADF Statistic = -4.2, p-value = 0.001
    → Reject H₀: Series is stationary
  • ADF Statistic = -1.5, p-value = 0.523
    → Fail to reject H₀: No evidence against a unit root, so treat the series as non-stationary
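
To show where the ADF statistic comes from, here is a simplified sketch of the basic (non-augmented) Dickey-Fuller regression using NumPy. It omits the extra lagged-difference terms that the full ADF test adds and does not compute a p-value; the critical value of about -2.86 is the commonly quoted large-sample 5% value for the constant-only case. The simulated series are assumptions for illustration. In practice a library routine such as `adfuller` in statsmodels would normally be used instead.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic on rho in: diff(y)_t = a + rho * y_{t-1} + e_t (no extra lags)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
noise = rng.normal(size=500)      # stationary by construction
random_walk = np.cumsum(noise)    # has a unit root by construction

stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(random_walk)

CRITICAL_5PCT = -2.86  # approximate 5% critical value, constant-only case
print("white noise:", round(stat_noise, 2), "reject H0:", stat_noise < CRITICAL_5PCT)
print("random walk:", round(stat_walk, 2), "reject H0:", stat_walk < CRITICAL_5PCT)
```

A strongly negative statistic is evidence against a unit root, which matches the -4.2 versus -1.5 contrast in the interpretation example.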

Key Point

Always perform stationarity testing before applying ARIMA models. Non-stationary data can lead to spurious regression results.

Quiz 1: Differencing and Stationarity

Question: What is the primary purpose of differencing in time series analysis?

A) To increase the data size
B) To make the time series stationary
C) To remove outliers from the data
D) To improve data visualization

Quiz 2: ARIMA Components

Question: In ARIMA(2,1,3), what does the middle parameter '1' represent?

A) Moving average order
B) Degree of differencing
C) Autoregressive order
D) Seasonal period

Quiz 3: ADF Test Interpretation

Question: If ADF test p-value = 0.001, what can we conclude?

A) The series is non-stationary
B) The series is stationary
C) We need more data
D) The test is inconclusive

Week 8 Agenda: From Time Series to Relationships

Today's Learning Objectives

  • Understand the difference between correlation and causation
  • Learn about Granger causality testing
  • Introduction to Vector Autoregressive (VAR) models
  • Apply VAR models for multivariate forecasting

Connection to Previous Week

Last week we learned how to model and forecast single time series using ARIMA/SARIMA. This week, we extend our analysis to multiple interconnected time series and examine how they influence each other over time.

Class Structure

Understanding Correlation

Correlation

Correlation measures the strength and direction of a linear relationship between two variables. It ranges from -1 to +1.

Positive Correlation (+0.8)

As X increases, Y increases

No Correlation (0.0)

No linear relationship

Negative Correlation (-0.7)

As X increases, Y decreases
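
A short sketch of the Pearson correlation coefficient on three made-up data sets. The third case is chosen so that Y depends perfectly on X but not linearly, which still gives r = 0.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # Y rises with X
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # Y falls as X rises
r_zero = pearson_r(x, [4, 1, 0, 1, 4])   # Y = (X - 3)^2: dependent, but not linearly

print(r_pos, r_neg, r_zero)
```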

Correlation Case Study

Case Study: Student Study Time vs. Exam Scores

  • Correlation coefficient: r = +0.85
  • Strong positive correlation observed
  • Students who study more hours tend to achieve higher exam scores
  • However, this doesn't prove that studying causes higher scores

Important Note

High correlation does not imply causation. There might be other factors influencing both variables.

Spurious Correlation: When Correlation Misleads

Example: Real Estate Sales and Baldness

Observation: Areas with a higher percentage of bald people tend to have higher real estate sales.

The Reality

  • Both variables are correlated with a hidden third variable: age
  • Older populations → more baldness
  • Older populations → higher income → more real estate transactions
  • Baldness itself doesn't cause real estate sales!
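
The hidden-variable story above can be reproduced with simulated data. In this sketch all numbers are invented: age drives both baldness and sales, and the two have no direct link. Once age is controlled for (by correlating the residuals left after a straight-line fit on age), the apparent relationship should largely vanish.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

age = rng.normal(size=n)                          # standardized age (the confounder)
baldness = age + rng.normal(scale=0.5, size=n)    # driven by age only
sales = age + rng.normal(scale=0.5, size=n)       # driven by age only

def residualize(v, control):
    """Remove the part of v that a straight-line fit on control explains."""
    slope, intercept = np.polyfit(control, v, 1)
    return v - (slope * control + intercept)

raw_r = np.corrcoef(baldness, sales)[0, 1]
partial_r = np.corrcoef(residualize(baldness, age),
                        residualize(sales, age))[0, 1]

print("raw correlation:", round(raw_r, 2))
print("after controlling for age:", round(partial_r, 2))
```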

Cautions in Correlation Analysis

Common Misunderstandings

  • Confounding variables: Third factors affecting both variables
  • Sample size effects: Small samples can show misleading correlations
  • Non-linear relationships: Correlation only measures linear relationships
  • Temporal effects: Correlation can change over time

Famous Spurious Correlations

  • Ice cream sales and drowning deaths (both peak in summer)
  • Shoe size and reading ability in children (both increase with age)
  • Number of firefighters and fire damage (larger fires need more firefighters)

Consequences of Misinterpreting Correlation

Business and Policy Implications

  • Wasted Resources: Investing in factors that don't actually drive outcomes
  • Wrong Strategic Decisions: Basing strategies on spurious relationships
  • Missed Opportunities: Ignoring true causal factors
  • Regulatory Issues: Creating policies based on false assumptions

Real-World Example

Scenario: A company notices high correlation between employee coffee consumption and productivity.

Wrong Action: Providing free coffee to all employees expecting productivity boost.

Reality: Productive employees work longer hours → drink more coffee. Coffee itself may not cause productivity.

Introduction to Causation

Causation

Causation implies that one event (cause) directly produces or brings about another event (effect). It establishes a directional relationship where changes in the cause lead to changes in the effect.

Correlation

X ↔ Y

Bidirectional association

Causation

X → Y

Unidirectional influence

Causation Example: Weather and Ice Cream Sales

Causal Relationship Example

Temperature → Ice Cream Sales

  • Temporal precedence: Temperature changes precede sales changes
  • Mechanism: Hot weather increases desire for cooling products
  • Intervention test: Artificially cooling an area would reduce ice cream sales

Correlation vs. Causation: The Key Difference

Understanding the Directionality

  • Correlation: Symmetric, bidirectional relationship (X ↔ Y)
  • Causation: Asymmetric, unidirectional influence (X → Y)

Correlation Example

Height and Weight

Taller people tend to weigh more, and heavier people tend to be taller. The relationship works in both directions due to shared genetic and developmental factors.

Causation Example

Exercise and Fitness

Regular exercise causes improved fitness. While fitness might motivate more exercise, the primary causal direction is exercise → fitness improvement.

Another Causation Example

Education Investment and Economic Growth

  • Causal mechanism: Education → Skilled workforce → Higher productivity → Economic growth
  • Time lag: Education investments show economic returns after several years
  • Direction matters: While economic growth can fund more education, the primary causal flow is education → growth

Quiz 4: Correlation vs. Causation

Question: Which statement best describes the difference between correlation and causation?

A) Correlation is stronger than causation
B) Causation implies direction, correlation does not
C) Correlation requires more data than causation
D) They are essentially the same concept

Determining Causation: Counterfactual Reasoning

Counterfactual Analysis

Counterfactual reasoning asks what would have happened had things been different. Causal claims can be expressed as counterfactual conditionals of the form: "If A had not occurred, C would not have occurred."

Example

"If kangaroos had no tails, they would not be upright."

This statement implies that kangaroo tails cause their upright posture. We test causation by imagining the absence of the proposed cause.

Two Types of Reasoning for Determining Causation:

  • "But for" reasoning (counterfactual): But for the cause, the effect would not have occurred
  • "Substantial factor" reasoning: Was one event a substantial factor in causing another event?

Determining Causation with Evidence

Understanding Causation Through Evidence

Watch: Causation vs Correlation Explained

Evidence-Based Approaches

  • Temporal sequence: Cause must precede effect
  • Controlled experiments: Manipulate cause, observe effect
  • Natural experiments: Observe quasi-experimental conditions
  • Statistical controls: Account for confounding variables

Data-Driven Control Experiments

Randomized Controlled Trial Example

Research Question: Does a new teaching method improve student performance?

  • Control Group: Traditional teaching method (Average: 72%)
  • Treatment Group: New teaching method (Average: 84%)
  • Random assignment eliminates selection bias
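  • Conclusion: Given random assignment, the 12-point gap (84% vs. 72%) can be attributed to the new method, provided the difference is statistically significant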

Current Data-Driven Causation Methods

Modern Approaches to Overcome Traditional Limitations

  • A/B Testing: Randomized experiments for digital products
  • Granger Causality: Time-series based causation testing
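  • Synthetic Control Methods: Build a "no-intervention" baseline from unaffected series (e.g., Google's CausalImpact package, which uses Bayesian structural time series)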

Synthetic Control Example

Scenario: Measuring impact of a marketing campaign

  • Blue line: Actual sales with campaign
  • Red dashed: Predicted sales without campaign (synthetic control)
  • Difference shows causal impact of campaign
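
The idea behind the chart can be sketched with a single control series and an ordinary straight-line fit. This is a much-simplified stand-in, not the CausalImpact algorithm itself, and every number below is invented, including the lift of 15 that is built into the fake data so the estimate can be checked.

```python
import numpy as np

# Hypothetical weekly sales: a control market with no campaign,
# and a target market where the campaign starts at week 8.
control = np.array([100, 104, 98, 110, 105, 99, 108, 102,
                    107, 111, 103, 109], dtype=float)
campaign_start = 8
true_lift = 15.0

target = 20 + 1.5 * control          # "actual sales" (blue line)
target[campaign_start:] += true_lift

# Learn the target-vs-control relationship from pre-campaign weeks only.
slope, intercept = np.polyfit(control[:campaign_start],
                              target[:campaign_start], 1)

# Predicted sales without the campaign (red dashed line).
counterfactual = intercept + slope * control[campaign_start:]
estimated_lift = (target[campaign_start:] - counterfactual).mean()

print("estimated weekly lift:", round(estimated_lift, 2))
```

Real data would include noise, so the estimate would come with an uncertainty interval rather than a single number.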

Quiz 5: Causation Methods

Question: Which method provides the strongest evidence of causation?

A) Simple correlation analysis
B) Descriptive statistics
C) Randomized controlled experiments
D) Data visualization

Clive Granger (1934-2009)

Nobel Prize Winner in Economics (2003)

  • British econometrician who revolutionized time series analysis
  • Key contribution: Developed methods for analyzing causal relationships in time series data
  • Granger Causality: Statistical concept of causality based on prediction improvement
  • Co-integration theory: Methods for analyzing long-run relationships between variables

Impact on Time Series Analysis

Granger's work bridged the gap between pure statistical correlation and meaningful economic causality, providing practical tools for economists and data scientists to identify directional relationships in time series data.

Granger Causality

Granger Causality Definition

Variable X is said to "Granger-cause" variable Y if past values of X provide statistically significant information about future values of Y, beyond what is already contained in past values of Y alone.

Key Principle:

If including lagged values of X improves the prediction of Y compared to using only lagged values of Y, then X Granger-causes Y.

Mathematical Framework

  • Model 1: Y_t = α + β₁Y_{t-1} + β₂Y_{t-2} + ... + ε_t
  • Model 2: Y_t = α + β₁Y_{t-1} + β₂Y_{t-2} + ... + γ₁X_{t-1} + γ₂X_{t-2} + ... + ε_t
  • Test: If Model 2 significantly outperforms Model 1, then X Granger-causes Y
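
The comparison of Model 1 and Model 2 is usually carried out as an F-test on the residual sums of squares. The sketch below implements that comparison with NumPy on simulated data in which X really does lead Y; the helper names and coefficients are assumptions for illustration. It returns only the F-statistic; a p-value would come from the F distribution with (p, residual degrees of freedom).

```python
import numpy as np

def lagged(v, p):
    """Columns v_{t-1}, ..., v_{t-p}, aligned with targets v_p, ..., v_{n-1}."""
    return np.column_stack([v[p - k: len(v) - k] for k in range(1, p + 1)])

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def granger_f(x, y, p):
    """F-statistic for H0: lags of x add nothing to a p-lag autoregression of y."""
    target = y[p:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lagged(y, p)])         # Model 1
    unrestricted = np.hstack([restricted, lagged(x, p)])  # Model 2
    rss1, rss2 = rss(restricted, target), rss(unrestricted, target)
    df_resid = len(target) - unrestricted.shape[1]
    return ((rss1 - rss2) / p) / (rss2 / df_resid)

# Simulated data: y depends on last period's x, but x ignores y.
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(scale=0.5)

f_x_to_y = granger_f(x, y, p=2)
f_y_to_x = granger_f(y, x, p=2)
print("X -> Y:", round(f_x_to_y, 1), " Y -> X:", round(f_y_to_x, 1))
```

Running the test in both directions matters: Granger causality is directional, so X → Y and Y → X are separate hypotheses.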

Granger Causality Example

Example: Money Supply and Inflation

Testing Process:

  • Model 1: Inflation_t = α + β₁Inflation_{t-1} + β₂Inflation_{t-2} + ε_t
  • Model 2: Inflation_t = α + β₁Inflation_{t-1} + β₂Inflation_{t-2} + γ₁MoneySupply_{t-1} + γ₂MoneySupply_{t-2} + ε_t
  • F-test result: p-value = 0.023 < 0.05
  • Conclusion: Money supply Granger-causes inflation

Using Granger Causality

Applications in Forecasting

  • Forecasting performance improvement: Identify which variables help predict others
  • Variable selection: Choose relevant predictors for multivariate models
  • Policy analysis: Understand lead-lag relationships between economic indicators

Prerequisites

  • Data must be stationary: Apply differencing if necessary
  • Appropriate lag selection: Use information criteria (AIC, BIC)
  • Sample size considerations: Need sufficient observations for reliable tests

Practical Steps

  1. Test for stationarity (ADF test)
  2. Select optimal lag length
  3. Perform Granger causality test
  4. Interpret results in context

Granger Causality vs. Correlation in Forecasting

Correlation-Based Forecasting

  • Uses contemporaneous relationships
  • X_t and Y_t in the same time period
  • Good for nowcasting
  • Limited predictive power

Granger Causality Forecasting

  • Uses temporal relationships
  • X_{t-1}, X_{t-2} predict Y_t
  • True forecasting capability
  • Can predict future values

Key Insight

Granger causality provides actionable forecasting relationships, while simple correlation may not have predictive power for future periods.

Vector Autoregressive (VAR) Models

VAR Model Definition

A Vector Autoregressive (VAR) model is a multivariate time series model where each variable is modeled as a linear function of past values of itself and past values of all other variables in the system.

VAR(p) Model Structure:

Y_t = c + Φ₁Y_{t-1} + Φ₂Y_{t-2} + ... + Φ_pY_{t-p} + ε_t

Where:

  • Y_t = vector of endogenous variables at time t
  • c = vector of constants
  • Φᵢ = coefficient matrix for lag i
  • ε_t = vector of error terms

Simple Example: 2-variable VAR(1)

GDP_t = α₁ + β₁GDP_{t-1} + γ₁Inflation_{t-1} + ε₁,t

Inflation_t = α₂ + β₂GDP_{t-1} + γ₂Inflation_{t-1} + ε₂,t
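
Because every equation in a VAR uses the same right-hand-side variables, the whole system can be estimated by ordinary least squares, one equation at a time or all at once. The sketch below simulates a two-variable VAR(1) with made-up coefficients and recovers them with NumPy; the "true" values are assumptions chosen only so the estimates can be checked.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000

# Hypothetical "true" VAR(1) used only to generate data.
c_true = np.array([0.5, 0.2])
A_true = np.array([[0.6, -0.2],    # equation 1: own lag, other variable's lag
                   [0.1,  0.4]])   # equation 2: other variable's lag, own lag

Y = np.zeros((n, 2))
for t in range(1, n):
    Y[t] = c_true + A_true @ Y[t - 1] + rng.normal(scale=0.3, size=2)

# Regress Y_t on [1, Y_{t-1}]; both equations are solved in one call.
X = np.column_stack([np.ones(n - 1), Y[:-1]])
coef, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)

c_hat = coef[0]      # estimated constants (the two alphas)
A_hat = coef[1:].T   # row i holds the lag coefficients of equation i

print("estimated constants:", np.round(c_hat, 2))
print("estimated lag matrix:\n", np.round(A_hat, 2))
```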

Understanding the Lag Parameter 'p'

What does 'p' mean? The lag parameter 'p' tells us how many past time periods to include in our model.

  • VAR(1): Uses only 1 period back (yesterday's values predict today)
  • VAR(2): Uses 2 periods back (yesterday's + day before yesterday's values)
  • VAR(4): Uses 4 periods back (last 4 periods predict current period)

How to Select Best 'p' Value:

  1. Test multiple values: Try p = 1, 2, 3, 4, 5...
  2. Use Information Criteria: Calculate AIC or BIC for each model
  3. Choose lowest BIC: The model with lowest BIC value wins
  4. Balance complexity: Higher p = more parameters = need more data

Rule of thumb: Start with p = 1 to 4 for most economic data. Monthly data might need p = 12 for seasonality.
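
The selection procedure above can be sketched as a loop over candidate lags. This example simulates a two-variable VAR(2) with invented coefficients, fits VAR(1) through VAR(5) by least squares on the same sample, and scores each with a Gaussian BIC (n·ln|Σ| + k·ln n, which equals -2ln(L) + k·ln(n) up to a constant that does not depend on p). Function and variable names are assumptions.

```python
import numpy as np

def var_bic(Y, p, max_p):
    """BIC of a VAR(p) fitted by least squares, using the same sample for every p."""
    n_total, m = Y.shape
    target = Y[max_p:]
    n = len(target)
    lags = [Y[max_p - k: n_total - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n)] + lags)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma = resid.T @ resid / n          # residual covariance matrix
    k = m * (m * p + 1)                  # coefficients across all equations
    return n * np.log(np.linalg.det(sigma)) + k * np.log(n)

# Simulated VAR(2) with hypothetical coefficient matrices.
rng = np.random.default_rng(3)
n_obs = 1500
A1 = np.array([[0.4, 0.1], [0.0, 0.3]])
A2 = np.array([[0.3, 0.0], [0.1, 0.3]])
Y = np.zeros((n_obs, 2))
for t in range(2, n_obs):
    Y[t] = A1 @ Y[t - 1] + A2 @ Y[t - 2] + rng.normal(size=2)

max_p = 5
bic_by_lag = {p: var_bic(Y, p, max_p) for p in range(1, max_p + 1)}
best_p = min(bic_by_lag, key=bic_by_lag.get)

for p, value in bic_by_lag.items():
    print(p, round(value, 1))
print("lag with lowest BIC:", best_p)
```

Dropping the first `max_p` observations for every candidate keeps the comparison fair, since all models are then scored on identical data.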

Characteristics of VAR Models

Key Properties

  • Endogenous variables: All variables are treated as interdependent
  • Symmetric treatment: Each equation has the same lag structure
  • Reduced form: Only lagged values appear on the right-hand side; contemporaneous links show up only through correlated error terms
  • Dynamic system: Captures evolving relationships over time

Advantages of VAR Models

  • Capture complex interdependencies between multiple time series
  • Useful for forecasting multiple variables simultaneously
  • Can analyze impulse responses and variance decomposition
  • Less restrictive than structural equation models

Limitations

  • Require large sample sizes due to many parameters
  • All variables must be stationary (or co-integrated)
  • Interpretation can be challenging with many variables
  • Curse of dimensionality with increasing variables

How VAR Models Calculate Forecasts: Step-by-Step

Economic Indicators VAR Model

Variables: GDP Growth, Inflation Rate, Interest Rate

Step-by-Step VAR Forecasting Process

  1. Collect Historical Data:
    Gather past values for all variables (e.g., last 20 quarters of GDP, Inflation, Interest rates)
  2. Estimate the Model:
    The computer learns how each variable depends on past values of ALL variables using statistical methods
  3. Create Forecast Equations:
    • GDP_next = 0.8×GDP_current + 0.3×GDP_previous - 0.2×Inflation_current + ...
    • Inflation_next = 0.6×Inflation_current + 0.1×GDP_current + 0.2×Interest_current + ...
    • Interest_next = 0.7×Interest_current + 0.3×Inflation_current + ...
  4. Make First Forecast:
    Plug in the most recent actual values to predict next period's values for all variables
  5. Continue Forecasting:
    Use the forecasted values from Step 4 as inputs to predict the period after that, and so on
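
The last two steps (forecast one period, then feed the forecast back in) can be sketched as a short loop. To keep the arithmetic easy to follow, this uses a two-variable VAR(1) with hypothetical coefficients rather than the three-variable equations above, whose remaining terms are not shown.

```python
import numpy as np

def var1_forecast(c, A, last_obs, steps):
    """Iterate y_{t+1} = c + A @ y_t, feeding each forecast back in as input."""
    path = []
    current = np.asarray(last_obs, dtype=float)
    for _ in range(steps):
        current = c + A @ current
        path.append(current)
    return np.array(path)

# Hypothetical coefficients for illustration only.
c = np.array([0.0, 0.0])
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])

forecasts = var1_forecast(c, A, last_obs=[1.0, 2.0], steps=2)
print(forecasts)
# Row 1 uses actual values:     0.5*1.0 + 0.1*2.0 = 0.7 and 0.2*1.0 + 0.3*2.0 = 0.8
# Row 2 uses forecasted values: 0.5*0.7 + 0.1*0.8 = 0.43 and 0.2*0.7 + 0.3*0.8 = 0.38
```

Because later steps are built on earlier forecasts, errors compound and uncertainty grows with the horizon.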

Why VAR is Powerful

Key Advantage: Unlike ARIMA (which models one variable at a time), VAR captures how GDP growth in one period affects inflation in the next, which then affects interest rates in the period after, which in turn feeds back into GDP growth.

Model Selection: BIC and Schwarz Criterion

Bayesian Information Criterion (BIC)

BIC, also known as Schwarz Criterion (SC), is a model selection criterion that balances model fit with model complexity. It helps determine the optimal number of lags in VAR models.

BIC Formula:

BIC = -2ln(L) + k×ln(n)

Where:

  • L = maximized likelihood of the model
  • k = number of parameters
  • n = number of observations

Simple Explanation

Think of BIC as a "smart judge" that evaluates models on two criteria:

  • How well does it fit? (Lower error = better)
  • How complex is it? (Simpler models preferred)

Rule: Choose the model with the lowest BIC value

Why BIC for VAR Models?

BIC penalizes complexity more heavily than AIC, making it ideal for VAR models where the number of parameters grows quickly with additional lags.
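
A quick worked calculation of how fast the parameter count grows and how the two penalties compare. It assumes a VAR with a constant in every equation, so an m-variable VAR(p) has m×(m×p + 1) coefficients, and uses the standard AIC penalty of 2k. The sample size of 80 is a hypothetical choice (20 years of quarterly data).

```python
import math

def var_param_count(m, p):
    """Each of the m equations has m*p lag coefficients plus one constant."""
    return m * (m * p + 1)

def aic_penalty(k):
    return 2 * k

def bic_penalty(k, n):
    return k * math.log(n)

n = 80  # hypothetical sample size
for p in (1, 2, 4):
    k = var_param_count(3, p)   # three variables, as in the GDP example
    print(f"p={p}: k={k}, AIC penalty={aic_penalty(k)}, "
          f"BIC penalty={bic_penalty(k, n):.1f}")
```

Since ln(n) exceeds 2 once n is 8 or more, the BIC penalty per parameter is larger than AIC's for any realistic sample, so BIC tends to pick shorter lag lengths.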

Final Quiz: VAR and Granger Causality

Question: What is the main advantage of VAR models over univariate ARIMA models?

A) VAR models are simpler to interpret
B) VAR models capture interdependencies between multiple variables
C) VAR models require less data
D) VAR models are always more accurate

Course Summary

  • Week 7: Individual time series analysis (ARIMA, SARIMA)
  • Week 8: Relationships between time series (Correlation, Causation, VAR)
  • Next week: Advanced forecasting techniques and model validation