Every recommender systems (RS) paper is written in two languages: English and mathematical symbols.
This lesson builds your symbol vocabulary — the alphabet of equations.
When you hit an unfamiliar symbol mid-equation, your reading stops. You lose the thread. The math starts to feel impossible — but it's not the math that's hard, it's the alphabet.
Once you build symbol fluency, you'll realise most RS equations are built from the same 20–30 symbols in different arrangements.
The test: You should be able to read the symbols in this equation out loud, even before understanding what it means:
\( \mathcal{L} = -\sum_{(u,i,j)\in\mathcal{D}} \ln\sigma(\hat{y}_{ui} - \hat{y}_{uj}) + \lambda\|\theta\|^2 \)
Critical: Σ (capital, summation) vs σ (lowercase, sigmoid) look similar and mean completely different things. Always check case.
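One way to practise reading the equation is to translate it into code one symbol at a time. The sketch below is only an illustration: the function name `pairwise_loss`, the dictionary layout for ŷ, and all toy values are assumptions, not part of any particular paper. The equation has the shape of a pairwise ranking loss (the form commonly called BPR), with capital Σ becoming `sum(...)` and lowercase σ becoming `sigmoid(...)`.

```python
import math


def sigmoid(x: float) -> float:
    """Lowercase sigma: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pairwise_loss(triples, y_hat, theta, lam):
    """Evaluate the equation term by term.

    triples : the set D of (u, i, j) training triples
    y_hat   : dict mapping (user, item) -> predicted score (assumed layout)
    theta   : flat list of all model parameters
    lam     : lambda, the regularisation weight
    """
    # Capital Sigma: one term per (u, i, j) in D, then negate the total.
    data_term = -sum(
        math.log(sigmoid(y_hat[(u, i)] - y_hat[(u, j)]))
        for (u, i, j) in triples
    )
    # lambda * ||theta||^2: squared L2 norm of the parameters.
    reg_term = lam * sum(t * t for t in theta)
    return data_term + reg_term


# Hypothetical toy values, chosen only for illustration.
D = [("u1", "a", "b"), ("u2", "b", "c")]
y_hat = {("u1", "a"): 2.0, ("u1", "b"): 0.5, ("u2", "b"): 1.0, ("u2", "c"): 1.0}
theta = [0.1, -0.2, 0.3]

loss = pairwise_loss(D, y_hat, theta, lam=0.01)
print(loss)
```

Swapping `sum` for `sigmoid` (or the reverse) would not even type-check here, which is a useful reminder of how different the two sigmas are.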

| Symbol | Name | Meaning | Example |
|---|---|---|---|
| Σ | Capital Sigma | Summation operator | \(\sum_{i=1}^n x_i\) |
| σ | Lowercase sigma | Sigmoid function | \(\sigma(x) = \frac{1}{1+e^{-x}}\) |
| Π | Capital Pi | Product notation | \(\prod_{i=1}^n p_i\) |
| π | Lowercase pi | Policy in RL-RS, or probability vector | |
| θ | Theta (plain) | All model parameters as a set | |
| Θ | Theta (capital) | Big-Theta notation (complexity) | |
| φ | Phi | Feature mapping | \(\phi: \mathcal{X} \to \mathbb{R}^d\) |
| Φ | Phi (capital) | Full parameter matrix, or CDF of a Gaussian | |

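
The capital/lowercase pairs are easier to keep apart once each one is tied to a concrete operation. The following is a minimal sketch using only Python's standard library; the list `p` and the helper names are made up for illustration. It also shows why Σ and Π so often appear together: the log of a product equals the sum of the logs.

```python
import math

p = [0.9, 0.8, 0.5]  # hypothetical probabilities p_1 ... p_n

total = sum(p)          # capital Sigma: 0.9 + 0.8 + 0.5
product = math.prod(p)  # capital Pi:    0.9 * 0.8 * 0.5 (Python 3.8+)


def sigmoid(x: float) -> float:
    """Lowercase sigma: a function of one number, not an operator over many."""
    return 1.0 / (1.0 + math.exp(-x))


def gaussian_cdf(x: float) -> float:
    """Capital Phi in its 'CDF of a standard Gaussian' sense."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# ln(prod p_i) == sum(ln p_i): products of probabilities turn into sums of logs.
log_of_product = math.log(product)
sum_of_logs = sum(math.log(p_i) for p_i in p)

print(total, product, sigmoid(0.0), gaussian_cdf(0.0))
print(log_of_product, sum_of_logs)
```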

| Symbol | Read as | Example |
|---|---|---|
| ∈ | "in" / "element of" | u ∈ 𝒰 → u is a user in the user set |
| ∉ | "not in" | j ∉ 𝒩_u → j is not in u's neighbourhood |
| ∀ | "for all" | ∀u ∈ 𝒰 → for every user |
| ∃ | "there exists" | ∃i such that r_{ui} > 0 |
| ∝ | "proportional to" | P(u\|i) ∝ P(i\|u)·P(u) |
| ≜ | "defined as" | Introduces a new definition |
| ≈ | "approximately" | Used when simplifying or approximating |
| ℝ | "real numbers" | ℝ^{m×n} = real matrix, m rows × n cols |
| ℕ | "natural numbers" | k ∈ ℕ → k is a natural number (in RS papers, usually a positive integer) |
| ⊆ | "subset of" | 𝒮 ⊆ ℐ → 𝒮 is a subset of items |
| ← | "is assigned" | θ ← θ - α∇ℒ → update rule |
| · | "dot product" | p_u · q_i = scalar similarity score |
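
Most of the symbols in this table have a direct counterpart in everyday Python, which can make them feel less foreign. The snippet below is a rough correspondence rather than a formal definition, and every variable name and value in it is hypothetical.

```python
users = {"u1", "u2", "u3"}       # the user set
items = {"a", "b", "c", "d"}     # the item set
ratings = {("u1", "a"): 5, ("u1", "b"): 0, ("u2", "c"): 3}  # r_ui values

is_member = "u1" in users                        # ∈  "element of"
is_absent = "z" not in items                     # ∉  "not in"
for_all = all(u.startswith("u") for u in users)  # ∀  "for all"
exists = any(                                    # ∃  "there exists"
    r > 0 for (u, i), r in ratings.items() if u == "u1"
)
is_subset = {"a", "b"} <= items                  # ⊆  "subset of"

# ∝ "proportional to": correct up to a constant, so normalise to get probabilities.
weights = [2.0, 6.0]
probs = [w / sum(weights) for w in weights]

# ← "is assigned": the gradient-descent update from the table, in one dimension.
theta, alpha, grad = 1.0, 0.1, 2.0
theta = theta - alpha * grad

# · "dot product": multiply matching entries, then add them up.
p_u, q_i = [1.0, 2.0], [0.5, -1.0]
dot = sum(a * b for a, b in zip(p_u, q_i))

print(is_member, is_absent, for_all, exists, is_subset)
print(probs, theta, dot)
```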
Calligraphic capitals (sets and named objects): 𝒰 = user set, ℐ = item set, 𝒟 = training data, ℒ = loss function, 𝒢 = graph, 𝒩 = neighbourhood
LaTeX: \mathcal{U}
Bold capitals (matrices): P ∈ ℝ^{m×d} = user embedding matrix, Q = item matrix, R = rating matrix, A = adjacency matrix
LaTeX: \mathbf{P}
Bold lowercase (vectors): p_u = user u's embedding vector, q_i = item i's vector, e = embedding, h = hidden state
LaTeX: \mathbf{p}
Plain font (scalars): r_{ui} = one rating value, ŷ_{ui} = one predicted score, x_k = one feature value. Plain font = scalar in almost every paper.
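
The font conventions map onto data shapes: a `\mathcal` symbol is a collection, a bold capital is a 2-D array, a bold lowercase letter is one row of it, and a plain symbol is a single number. The sketch below uses plain Python lists to keep it dependency-free; the sizes and values are invented for illustration.

```python
# Calligraphic U: a set of user ids (m = 3 users).
U = {0, 1, 2}

# Bold capital P: user embedding matrix in R^{m x d}, one row per user (d = 2).
P = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

# Bold capital Q: item embedding matrix, one row per item (n = 2 items).
Q = [
    [1.0, 0.0],
    [0.0, 1.0],
]

# Bold lowercase p_u and q_i: single rows, i.e. vectors of length d.
u, i = 1, 0
p_u = P[u]
q_i = Q[i]

# Plain y_hat_ui: one number, the predicted score for the pair (u, i).
y_hat_ui = sum(a * b for a, b in zip(p_u, q_i))

print(len(U), len(P), len(P[0]))  # m, m, d
print(p_u, q_i, y_hat_ui)
```

Checking the shape of each object before reading the rest of an equation is the code-level version of checking the font.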
| Symbol | Name | Meaning in RS | Example |
|---|---|---|---|
| 𝒰 | User set | Set of all users in the system | \|𝒰\| = m users total |
| ℐ | Item set | Set of all items in the system | \|ℐ\| = n items total |
| 𝒪 | Observed set | All known user-item interactions | (u,i) ∈ 𝒪 means u interacted with i |
| r_{ui} | Rating | Score user u gave item i | r_{ui} ∈ {1,2,3,4,5} or {0,1} |
| ŷ_{ui} | Predicted score | Model's predicted preference | ŷ_{ui} = p_u^⊤ q_i |
| ℐ⁺_u | Positive items | Items user u has interacted with | ℐ⁺_u = {i : r_{ui} > 0} |
| 𝒩_u | Neighbourhood | Users or items connected to u | In graph-based RS: direct neighbours |
| d | Latent dimension | Size of embedding vectors | p_u ∈ ℝ^d, typically d=64 or 128 |
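
The set-valued symbols in this table can all be derived from a single dictionary of ratings. The following is a toy sketch under stated assumptions: the data is invented, users and items are inferred from the ratings alone, and the neighbourhood is defined as "users who share at least one positive item", which is only one of several definitions papers use.

```python
# Hypothetical explicit ratings r_ui, keyed by (user, item).
ratings = {
    ("u1", "a"): 5, ("u1", "b"): 3,
    ("u2", "b"): 4,
    ("u3", "c"): 1,
}

# User set and item set (here inferred from the ratings; a real system
# may also contain users and items with no interactions at all).
users = {u for (u, _) in ratings}
items = {i for (_, i) in ratings}
m, n = len(users), len(items)  # |U| = m, |I| = n

# Observed set O: every (u, i) pair with a known interaction.
observed = set(ratings)


def positive_items(u):
    """I+_u = {i : r_ui > 0}."""
    return {i for (v, i), r in ratings.items() if v == u and r > 0}


def neighbours(u):
    """N_u under the assumed definition: other users sharing a positive item."""
    mine = positive_items(u)
    return {v for v in users if v != u and positive_items(v) & mine}


print(m, n, ("u1", "a") in observed)
print(positive_items("u1"), neighbours("u1"), neighbours("u3"))
```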
Conventions are strong but not fixed. These are all valid usages in different RS papers:
| Symbol | Paper A | Paper B |
|---|---|---|
| β | Regularisation weight | KL penalty in β-VAE |
| k | Number of items to recommend | Latent dimension size |
| L | Loss function | Number of GNN layers |
| d | Embedding dimension | Degree of a graph node |
| 𝒩_u | Neighbour users | Neighbour items |
The notation table is usually Table 1 or sits at the start of Section 3. Read it before you read any equations.
A paper normally defines each symbol the first time it appears in the text. If you're confused mid-paper, search backwards for the first occurrence.
The habit: Before reading equations, spend 5 minutes reading the notation table and the first paragraph of Section 3. It saves you 30 minutes of confusion later.
Learn the conventional RS meaning of the common Greek letters: α = learning rate, λ = regularisation weight, θ = all parameters, σ = sigmoid.
The font tells you what kind of mathematical object you're looking at, before you read what it represents.
Same symbol, different paper, different meaning. Check every time. First occurrence = definition.