M1 · Lesson 1 — Math Notation Literacy

Greek Letters &
Common Symbols

Every recommender systems (RS) paper is written in two languages: English and math symbols.
This lesson builds your symbol vocabulary — the alphabet of equations.

01
M1 · L1 — Why This Matters

The cost of symbol blindness

Most equation confusion
comes from unfamiliar symbols

When you hit an unfamiliar symbol mid-equation, your reading stops. You lose the thread. The math starts to feel impossible — but it's not the math that's hard, it's the alphabet.

Once you build symbol fluency, you'll realise most RS equations are built from the same 20–30 symbols in different arrangements.

The test: You should be able to read the symbols in this equation out loud, even before understanding what it means:
\( \mathcal{L} = -\sum_{(u,i,j)\in\mathcal{D}} \ln\sigma(\hat{y}_{ui} - \hat{y}_{uj}) + \lambda\|\theta\|^2 \)

  • Greek letters are used as variables, constants, and hyperparameters
  • Each letter has a conventional meaning — not fixed, but strongly conventional in RS
  • Capital and lowercase forms of the same letter usually mean completely different things
  • Always check the paper's notation table — usually in Section 3 (Preliminaries)
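To connect the symbols in the test equation to something concrete, here is a minimal Python sketch that evaluates it term by term. It assumes NumPy, a small made-up score matrix `y_hat`, a made-up list of `(u, i, j)` triples standing in for 𝒟, and that ‖θ‖² means the sum of squared parameter values. The function and variable names are illustrative only and do not come from any particular paper.

```python
import numpy as np

def sigmoid(x):
    """sigma(x) = 1 / (1 + e^(-x))"""
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_loss(y_hat, triples, theta, lam):
    """L = -sum_{(u,i,j) in D} ln sigma(y_ui - y_uj) + lambda * ||theta||^2"""
    ranking_term = 0.0
    for u, i, j in triples:                      # (u, i, j) in D
        ranking_term += np.log(sigmoid(y_hat[u, i] - y_hat[u, j]))
    # lambda * ||theta||^2, assumed here to be the sum of squared entries
    reg_term = lam * sum(np.sum(w ** 2) for w in theta)
    return -ranking_term + reg_term

# Toy inputs (assumptions for illustration): 2 users, 2 items
y_hat = np.array([[2.0, 0.0],
                  [0.5, 1.5]])                   # y_hat[u, i] = predicted score
triples = [(0, 0, 1), (1, 1, 0)]                 # user u prefers item i over item j
theta = [np.eye(2)]                              # stand-in for the model parameters
loss = pairwise_loss(y_hat, triples, theta, lam=0.1)
print(round(float(loss), 4))
```

Each piece of the code maps to one symbol: the loop is Σ over 𝒟, `np.log` is ln, `sigmoid` is σ, and `lam` is λ.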
02
M1 · L1 — Greek Letters

The core vocabulary — part 1

Greek letters most used
in RS optimisation

α
alpha
Learning rate
\(\theta \leftarrow \theta - \alpha \nabla\mathcal{L}\)
β
beta
Regularisation weight or
secondary hyperparameter
γ
gamma
Discount factor
in RL-based RS
λ
lambda
Regularisation strength
\(\lambda\|\theta\|^2\)
θ
theta
All model parameters
(the learnable set)
σ
sigma (lower)
Sigmoid function
\(\sigma(x)=\frac{1}{1+e^{-x}}\)
μ
mu
Mean of a
distribution
ε
epsilon
Noise term or
exploration rate
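The update rule θ ← θ − α∇ℒ and the regulariser λ‖θ‖² can be seen working together in a short sketch. The one-parameter loss below, ℒ(θ) = (θ − 3)² + λθ², is an assumption chosen only because its minimum is easy to check by hand (θ = 3 / (1 + λ)); it is not taken from any RS model.

```python
def grad(theta, lam):
    # Gradient of the toy loss L(theta) = (theta - 3)^2 + lam * theta^2
    return 2.0 * (theta - 3.0) + 2.0 * lam * theta

alpha = 0.1   # alpha: learning rate (step size)
lam = 0.5     # lambda: regularisation strength
theta = 0.0   # theta: the (single) learnable parameter

for _ in range(200):
    theta = theta - alpha * grad(theta, lam)   # theta <- theta - alpha * grad L

print(theta)  # close to 3 / (1 + lam) = 2.0
```

Without the λ term the minimum would sit at θ = 3. The regulariser pulls the solution towards zero, which is what "regularisation strength" refers to.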
03
M1 · L1 — Greek Letters

The core vocabulary — part 2

More Greek letters —
functions and structures

φ
phi
Feature mapping
or encoder function
ψ
psi
Decoder or
projection function
η
eta
Step size or
learning rate (alternative to α)
ρ
rho
Correlation or
density parameter
Σ
Sigma (CAPITAL)
Summation
\(\sum_{i=1}^n x_i\)
Δ
Delta (CAPITAL)
Change in quantity
or difference
Π
Pi (CAPITAL)
Product notation
\(\prod_{i=1}^n x_i\)
Ω
Omega (CAPITAL)
Sample space or
parameter domain

Critical: Σ (capital, summation) and σ (lowercase, sigmoid) are the same Greek letter in different cases, and they mean completely different things. Always check case.
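A quick way to keep the capital operators and the lowercase function apart is to see what each does to the same numbers. The sketch below assumes NumPy and an arbitrary example vector `x`.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

total = np.sum(x)        # capital Sigma: x_1 + x_2 + x_3 + x_4
product = np.prod(x)     # capital Pi:    x_1 * x_2 * x_3 * x_4

def sigma(z):
    # lowercase sigma: the sigmoid function, applied element by element
    return 1.0 / (1.0 + np.exp(-z))

squashed = sigma(x)      # same length as x, every value strictly between 0 and 1

print(total, product)    # 10.0 24.0
print(squashed)
```

Σ and Π collapse many numbers into one, while σ transforms each number separately.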

04
M1 · L1 — Common Confusion

The most confusing pairs in RS papers

Same letter, completely
different meaning

Σ

Capital Sigma
Summation operator
\(\sum_{i=1}^n x_i\)

σ

Lowercase sigma
Sigmoid function
\(\sigma(x) = \frac{1}{1+e^{-x}}\)

Π

Capital Pi
Product notation
\(\prod_{i=1}^n p_i\)

π

Lowercase pi
Policy in RL-RS or
probability vector

θ

Theta (lowercase)
All model parameters
as a set

Θ

Theta (capital)
Big-Theta notation
(complexity)

φ

Phi (lowercase)
Feature mapping
\(\phi: \mathcal{X} \to \mathbb{R}^d\)

Φ

Phi (capital)
Full parameter matrix
or CDF of Gaussian

05
M1 · L1 — Operators

The logical operators

Operators that appear
in every RS paper

Symbol   Read as                Example or meaning
∈        "in" / "element of"    u ∈ 𝒰 → u is a user in the user set
∉        "not in"               j ∉ 𝒩_u → j is not in u's neighbourhood
∀        "for all"              ∀u ∈ 𝒰 → for every user
∃        "there exists"         ∃i such that r_{ui} > 0
∝        "proportional to"      P(u|i) ∝ P(i|u)·P(u)
≜, :=    "defined as"           Introduces a new definition
≈        "approximately"        Used when simplifying or approximating
ℝ        "real numbers"         ℝ^{m×n} = real matrix, m rows × n cols
ℕ        "natural numbers"      k ∈ ℕ → k is a positive integer
⊆        "subset of"            𝒮 ⊆ ℐ → 𝒮 is a subset of items
←        "is assigned"          θ ← θ - α∇ℒ → update rule
·        "dot product"          p_u · q_i = scalar similarity score
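Most of these operators have a direct counterpart in code, which can make them easier to read. The sketch below uses Python sets and NumPy. The users, items, and ratings are made-up examples, and the variable names are illustrative.

```python
import numpy as np

users = {"u1", "u2", "u3"}                      # the user set
items = {"i1", "i2", "i3", "i4"}                # the item set
ratings = {("u1", "i1"): 5, ("u1", "i3"): 0, ("u2", "i2"): 3}   # r_ui values

is_member = "u1" in users                       # u ∈ U
not_member = "i9" not in items                  # j ∉ I
for_all = all(0 <= r <= 5 for r in ratings.values())   # ∀: every rating is in [0, 5]
exists = any(r > 0 for r in ratings.values())           # ∃: some rating is positive
is_subset = {"i1", "i2"} <= items               # S ⊆ I

p_u = np.array([1.0, 2.0])                      # user vector
q_i = np.array([3.0, 0.5])                      # item vector
score = float(p_u @ q_i)                        # dot product: 1*3 + 2*0.5

print(is_member, not_member, for_all, exists, is_subset, score)
```

The assignment arrow ← corresponds to plain `=` in code: the right-hand side is computed first, then stored under the name on the left.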
06
M1 · L1 — Font Conventions

The font conventions every RS paper follows

What the font tells you
before you read the symbol

Calligraphic Capital 𝒰 𝒜 ℐ

= A Set

𝒰 = user set, ℐ = item set, 𝒟 = training data, 𝒢 = graph, 𝒩 = neighbourhood. One common exception: ℒ is the loss function, not a set.

LaTeX: \mathcal{U}

Bold Uppercase 𝐏 𝐐 𝐑

= A Matrix

𝐏 ∈ ℝ^{m×d} = user embedding matrix, 𝐐 = item matrix, 𝐑 = rating matrix, 𝐀 = adjacency matrix

LaTeX: \mathbf{P}

Bold Lowercase 𝐩 𝐪 𝐞

= A Vector

𝐩_u = user u's embedding vector, 𝐪_i = item i's vector, 𝐞 = embedding, 𝐡 = hidden state

LaTeX: \mathbf{p}

Plain lowercase r ŷ x

= A Scalar (a single number)

r_{ui} = one rating value, ŷ_{ui} = one predicted score, x_k = one feature value. A plain lowercase Latin letter almost always denotes a scalar.
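The four font classes correspond to four kinds of object you would create in code. The sketch below assumes NumPy, with arbitrary sizes m = 4 users, n = 6 items, d = 3 latent dimensions and random values, purely to show the shapes.

```python
import numpy as np

m, n, d = 4, 6, 3                      # sizes chosen only for illustration
rng = np.random.default_rng(0)

users = set(range(m))                  # calligraphic capital: a set
P = rng.normal(size=(m, d))            # bold uppercase: a matrix, shape (m, d)
Q = rng.normal(size=(n, d))            # bold uppercase: a matrix, shape (n, d)
p_u = P[1]                             # bold lowercase: a vector, shape (d,)
q_i = Q[2]                             # bold lowercase: a vector, shape (d,)
y_hat_ui = float(p_u @ q_i)            # plain lowercase: a single number

print(len(users), P.shape, p_u.shape, type(y_hat_ui).__name__)
```

Reading the font first tells you the shape to expect, which is often enough to catch a misread equation before you work through it.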

07
M1 · L1 — Worked Example

Decoding a real paper sentence

Reading a parameter
definition from scratch

From a Matrix Factorisation paper
"Let θ = {P, Q} denote the model parameters, where P ∈ ℝ^{m×d} and Q ∈ ℝ^{n×d}, and λ > 0 controls the regularisation strength."
θ
All learnable parameters — both embedding matrices together
{P, Q}
θ is the set containing exactly two matrices: user embeddings and item embeddings
P ∈ ℝ^{m×d}
P is a real-valued matrix with m rows (one per user) and d columns (latent dims)
Q ∈ ℝ^{n×d}
Q has the same d columns as P, but n rows (one per item)
λ > 0
λ (lambda) is a positive scalar hyperparameter controlling regularisation strength
Plain English
Our model has two learnable matrices — one row per user, one per item — plus a regularisation strength hyperparameter.
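The decoded sentence can be written out as a small sketch. It assumes NumPy, toy sizes (m = 3, n = 5, d = 2), constant fill values, and that the regularisation term is λ times the sum of squared entries of both matrices. None of these numbers come from the quoted paper sentence.

```python
import numpy as np

m, n, d = 3, 5, 2                # users, items, latent dimensions (toy sizes)
lam = 0.01                       # lambda > 0: regularisation strength

P = np.full((m, d), 0.5)         # P in R^{m x d}: one row per user
Q = np.full((n, d), 1.0)         # Q in R^{n x d}: one row per item
theta = {"P": P, "Q": Q}         # theta = {P, Q}: all learnable parameters

n_params = sum(w.size for w in theta.values())           # m*d + n*d
reg = lam * sum(np.sum(w ** 2) for w in theta.values())  # lambda * ||theta||^2

print(n_params, reg)
```

Note that λ is not inside `theta`: it is a hyperparameter set by the practitioner, not a parameter learned from data.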
08
M1 · L1 — RS Conventions

Conventions specific to recommender systems

The RS-specific
symbol vocabulary

Symbol  Name              Meaning in RS                     Example
𝒰       User set          Set of all users in the system    |𝒰| = m users total
ℐ       Item set          Set of all items in the system    |ℐ| = n items total
𝒪       Observed set      All known user-item interactions  (u,i) ∈ 𝒪 means u interacted with i
r_{ui}  Rating            Score user u gave item i          r_{ui} ∈ {1,2,3,4,5} or {0,1}
ŷ_{ui}  Predicted score   Model's predicted preference      ŷ_{ui} = p_u^⊤ q_i
ℐ⁺_u    Positive items    Items user u has interacted with  ℐ⁺_u = {i : r_{ui} > 0}
𝒩_u     Neighbourhood     Users or items connected to u     In graph-based RS: direct neighbours
d       Latent dimension  Size of embedding vectors         p_u ∈ ℝ^d, typically d=64 or 128
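Several of these symbols can be derived from a single rating matrix. The sketch below assumes NumPy, a made-up 3 × 3 matrix `R`, and the convention that an entry of 0 means "no interaction". Papers differ on that convention, so treat it as an assumption.

```python
import numpy as np

# Toy rating matrix: rows are users, columns are items, 0 = no interaction (assumed)
R = np.array([[5, 0, 3],
              [0, 0, 1],
              [4, 2, 0]])

m, n = R.shape                                   # |U| = m users, |I| = n items

# O: the observed set of (u, i) pairs
observed = {(u, i) for u in range(m) for i in range(n) if R[u, i] > 0}

# I+_u = {i : r_ui > 0}: positive items for each user u
positive_items = {u: {i for i in range(n) if R[u, i] > 0} for u in range(m)}

print(m, n, len(observed))
print(positive_items)
```

Here `(0, 2) in observed` is the code form of (u,i) ∈ 𝒪 for user 0 and item 2.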
09
M1 · L1 — Important Warning

The most important habit to build

Same symbol, different
meaning in different papers

Conventions are strong but not fixed. These are all valid usages in different RS papers:

Symbol  Paper A                       Paper B
β       Regularisation weight         KL penalty in β-VAE
k       Number of items to recommend  Latent dimension size
L       Loss function                 Number of GNN layers
d       Embedding dimension           Degree of a graph node
𝒩_u     Neighbour users               Neighbour items
Rule #1

Always find the notation section

Usually Table 1 or the start of Section 3. Read it first before reading equations.

Rule #2

First use = definition

The first time a symbol appears in the text, the paper defines it. If you're confused mid-paper — search backwards for the first occurrence.

The habit: Before reading equations, spend 5 minutes reading the notation table and the first paragraph of Section 3. It saves you 30 minutes of confusion later.

10
M1 · L1 — Key Takeaways

What to remember

01 · Greek letters

α, β, γ, λ, θ, σ, μ, ε

Learn their conventional RS meaning. α = learning rate. λ = regularisation. θ = all parameters. σ = sigmoid.

02 · Font = type

𝒰 = set · 𝐏 = matrix · 𝐩 = vector · r = scalar

The font tells you what kind of mathematical object you're looking at, before you read what it represents.

03 · Context first

Read the notation table before equations

Same symbol, different paper, different meaning. Check every time. First occurrence = definition.

Next: M1 · L2 — Subscripts, Superscripts & Indexing

11