Display equations and inline math interleave with prose. KaTeX or MathJax may render math after widget mount; the walker should treat math containers as no-mark zones (they don't contain natural-language prose).
Per plan/01_renderer.md, math blocks render as <div class="math-display" data-block-id="…"> and inline math is wrapped in <span class="math-inline">. Both should be skipped by the term walker.
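The skip rule can be sketched with Python's standard-library `html.parser`. This is a minimal illustration, not the walker itself: only the class names `math-display` and `math-inline` come from the text above, while `ProseCollector` and `prose_outside_math` are hypothetical names, and the sketch assumes well-formed markup.

```python
from html.parser import HTMLParser

# Class names are taken from the text above; everything else is hypothetical.
MATH_CLASSES = {"math-display", "math-inline"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ProseCollector(HTMLParser):
    """Collects text nodes that sit outside any math container."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one bool per open element: is it a math container?
        self.math_depth = 0  # number of math containers currently open
        self.prose = []      # text chunks that are safe to scan for terms

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never get a closing tag
        classes = (dict(attrs).get("class") or "").split()
        is_math = any(c in MATH_CLASSES for c in classes)
        self.stack.append(is_math)
        self.math_depth += is_math

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        self.math_depth -= self.stack.pop()

    def handle_data(self, data):
        # Text nested anywhere inside a math container is skipped.
        if self.math_depth == 0 and data.strip():
            self.prose.append(data.strip())


def prose_outside_math(html):
    parser = ProseCollector()
    parser.feed(html)
    parser.close()
    return parser.prose
```

Tracking a depth counter rather than a single flag means text nested several elements deep inside a math container is still skipped, which matters because KaTeX and MathJax both emit deeply nested spans.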
Gradients and Jacobians
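The gradient of a scalar function $f: \mathbb{R}^n \to \mathbb{R}$ is the vector of partial derivatives:
$$\nabla f(x) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)^\top$$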
The Hessian is the matrix of second-order partials. In the LaTeX source above, the words "gradient", "Jacobian", and "Hessian" do not appear — they're only in the surrounding prose, which is where they should be marked.
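As a quick numeric illustration of the gradient as a vector of partial derivatives, the sketch below approximates each partial with a central finite difference. The function name `numerical_gradient` and the step size `h` are assumptions made for this example.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x, one partial derivative per coordinate."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        # Central difference: (f(x + h*e_i) - f(x - h*e_i)) / (2h)
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Example: f(x) = x0^2 + 3*x1 has the exact gradient (2*x0, 3).
example = numerical_gradient(lambda x: x[0] ** 2 + 3 * x[1], [2.0, 1.0])
```

At the point (2, 1) the result should be close to (4, 3), up to floating-point error.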
Eigenvalues and SVD
An eigenvalue $\lambda$ and eigenvector $v$ satisfy $Av = \lambda v$. The SVD factors any matrix $A$ as:
$$A = U \Sigma V^\top$$
where $\Sigma$ holds the singular values. PCA is the special case where we use the SVD of a centered data matrix to find directions of maximum variance.
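The eigenvalue relation can be checked numerically. The sketch below uses power iteration, a technique swapped in for illustration and not mentioned in the text, to find the dominant eigenpair of a small matrix; the helper names are hypothetical.

```python
import math


def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration(A, steps=200):
    """Return an approximate dominant eigenvalue and unit eigenvector of A.

    Assumes A has a single eigenvalue of largest magnitude and that the
    all-ones start vector is not orthogonal to its eigenvector.
    """
    v = [1.0] * len(A)
    for _ in range(steps):
        w = matvec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = matvec(A, v)
    # Rayleigh quotient v.Av; v already has unit length.
    lam = sum(a * b for a, b in zip(Av, v))
    return lam, v
```

For the symmetric matrix `[[2.0, 1.0], [1.0, 3.0]]`, whose eigenvalues are (5 ± √5)/2, the returned pair should satisfy the eigenvalue equation to within floating-point error.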
Norms
The L2 norm of $v$ is $\|v\|_2 = \sqrt{\sum_i v_i^2}$. The L1 norm is $\|v\|_1 = \sum_i |v_i|$. The generic norm in prose refers to whichever is in scope.
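Both norms are one-liners in plain Python. The function names below are chosen for this example only.

```python
import math


def l2_norm(v):
    """Euclidean norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in v))


def l1_norm(v):
    """Sum of absolute values of the entries."""
    return sum(abs(x) for x in v)
```

For `v = [3.0, -4.0]` the L2 norm is 5 and the L1 norm is 7.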
The inverse of the sigmoid is the logit. Cross-entropy loss between a target distribution $p$ and a prediction $q$ is $H(p, q) = -\sum_i p_i \log q_i$. KL divergence is $D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \log \frac{p_i}{q_i}$.
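A small sketch of these quantities, assuming natural logarithms, the standard logistic sigmoid, and the usual convention that terms with $p_i = 0$ contribute nothing. The function names are illustrative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of the sigmoid, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i log q_i, skipping terms where p_i == 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i log(p_i / q_i), skipping terms where p_i == 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The two losses are linked by the identity $H(p, q) = H(p, p) + D_{\mathrm{KL}}(p \,\|\, q)$, where $H(p, p)$ is the entropy of $p$; this makes a convenient sanity check.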
Attention
In self-attention, queries $Q$, keys $K$, and values $V$ combine via scaled dot product:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right) V$$
The numerator $QK^\top$ needs no further normalization to be a cosine similarity when the rows of $Q$ and $K$ are unit-norm. The whole operation acts on tensors of shape (batch, heads, seq, d_k). einsum notation compresses the math: bhqd, bhkd -> bhqk.
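The scaled dot-product step can be sketched in plain Python for a single head with no batch dimension, a simplification of the (batch, heads, seq, d_k) layout described above. The names `softmax` and `attention` are illustrative, and matrices are lists of rows.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Return (output, weights) for one head: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    # scores[q][k] is the "qd, kd -> qk" contraction, scaled by sqrt(d_k).
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        for q in Q
    ]
    weights = [softmax(row) for row in scores]
    # Each output row is a weighted average of the rows of V.
    output = [
        [sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights
```

When every key is identical the weights are uniform, so each output row is the plain mean of the value rows.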
Positional encoding adds a sinusoidal vector to each token embedding so the dot-product attention can distinguish positions:
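One common choice, assumed here rather than taken from the text, is the sine/cosine scheme with base 10000: even dimensions use $\sin(pos / 10000^{2i/d_{\text{model}}})$ and odd dimensions use the matching cosine. The function name and the `base` parameter are illustrative.

```python
import math


def positional_encoding(pos, d_model, base=10000.0):
    """Sinusoidal encoding vector for one position (assumed formulation)."""
    pe = []
    for j in range(d_model):
        # Dimensions 2i and 2i+1 share the exponent 2i / d_model.
        angle = pos / base ** ((j - j % 2) / d_model)
        pe.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
    return pe
```

At position 0 every sine entry is 0 and every cosine entry is 1, and each sine/cosine pair always has unit length, so the encoding changes direction with position but not scale.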
Inline math in dense paragraphs: when $f(x) = x^2$ and $g(x) = e^x$, the chain rule gives $\frac{d}{dx} g(f(x)) = g'(f(x)) f'(x) = e^{x^2} \cdot 2x$. The terms "gradient" and "attention" appear in this very paragraph and should be marked in the prose, but the math expressions $f$, $g$, $e^{x^2}$ should NOT be touched by the walker.
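The chain-rule result can be sanity-checked numerically. This sketch compares the closed form $e^{x^2} \cdot 2x$ against a central finite difference; the helper names and the step size are assumptions for the example.

```python
import math


def f(x):
    return x ** 2


def g(u):
    return math.exp(u)


def chain_rule_derivative(x):
    """g'(f(x)) * f'(x) for f(x) = x^2 and g(u) = e^u."""
    return math.exp(f(x)) * 2 * x


def central_difference(fn, x, h=1e-6):
    """Symmetric finite-difference estimate of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)
```

Evaluating both at the same point, for example `x = 1.5`, should give values that agree to several decimal places.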