Implement LoRA (Low-Rank Adaptation) from “LoRA: Low-Rank Adaptation of Large Language Models” (Hu et al., 2021).
LoRA freezes the pre-trained weight matrix W and adds a low-rank update: $$h = (W + \frac{\alpha}{r} B \cdot A) \cdot x$$
Where:
- W: shape (d_out, d_in) — frozen pre-trained weights
- A: shape (r, d_in) — low-rank down-projection (trainable)
- B: shape (d_out, r) — low-rank up-projection (trainable)
- alpha: scaling factor
- r: rank of the adaptation
- x: shape (d_in,) — input
Output: Tensor of shape (d_out,).
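A minimal NumPy sketch of the forward pass described above (the function name `lora_forward` and the initialization values are illustrative, not part of the spec). Following Hu et al., A is initialized with small random values and B with zeros, so the adapted layer initially matches the frozen model exactly:

```python
import numpy as np

def lora_forward(W, A, B, x, alpha):
    """LoRA forward pass: h = (W + (alpha / r) * B @ A) @ x.

    W: (d_out, d_in) frozen pre-trained weights
    A: (r, d_in)     trainable down-projection
    B: (d_out, r)    trainable up-projection
    x: (d_in,)       input vector
    """
    r = A.shape[0]                 # rank of the adaptation
    delta = (alpha / r) * (B @ A)  # low-rank update, shape (d_out, d_in)
    return (W + delta) @ x         # output shape (d_out,)

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 4, 6, 2, 8.0
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01  # small init, per the paper
B = np.zeros((d_out, r))                   # zero init, per the paper

x = rng.standard_normal(d_in)
h = lora_forward(W, A, B, x, alpha)

# With B = 0 the low-rank update vanishes, so the output
# equals the frozen model's output.
assert h.shape == (d_out,)
assert np.allclose(h, W @ x)
```

In practice `B @ A` would be applied as two small matrix-vector products, `B @ (A @ x)`, which costs O(r(d_in + d_out)) instead of materializing the full (d_out, d_in) update.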