RMSNorm

Implement RMSNorm from “Root Mean Square Layer Normalization” (Zhang & Sennrich, 2019).

RMSNorm is a simpler alternative to LayerNorm used in LLaMA, Gemma, and other modern LLMs. Unlike LayerNorm, it skips mean-centering and normalizes by the root mean square alone:

$$\text{RMSNorm}(x) = \frac{x}{\text{RMS}(x) + \epsilon} \cdot \gamma$$

where $\text{RMS}(x) = \sqrt{\frac{1}{d} \sum_{i=1}^{d} x_i^2}$

Input:

  • x: shape (batch, d)
  • gamma: shape (d,) — learnable scale parameter
  • eps: float, small constant (default 1e-6)

Output: Tensor of shape (batch, d).
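The spec above can be sketched in NumPy as follows (a minimal reference sketch, not an optimized solution; the function name `rmsnorm` is illustrative, and `eps` is added to the RMS outside the square root, matching the formula above):

```python
import numpy as np

def rmsnorm(x, gamma, eps=1e-6):
    # x: (batch, d), gamma: (d,)
    # RMS(x) = sqrt(mean(x_i^2)) over the feature dimension d;
    # keepdims=True keeps shape (batch, 1) so division broadcasts.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True))
    # Normalize, then apply the learnable per-feature scale gamma.
    return x / (rms + eps) * gamma
```

With `gamma` set to ones, each output row has an RMS of approximately 1 (up to `eps`), which is a quick sanity check for an implementation.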
