Prefix Tuning

Implement prefix tuning from “Prefix-Tuning: Optimizing Continuous Prompts for Generation” (Li & Liang, 2021).

In prefix tuning, learnable prefix vectors are prepended to the key and value sequences in attention. The original model weights are frozen; only prefix parameters are trained.
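
A minimal sketch of this setup, assuming PyTorch (the dimensions and initialization below are illustrative, not taken from the paper):

    import torch
    import torch.nn as nn

    d_k, prefix_len = 64, 8

    # Frozen projections standing in for the pretrained model's attention weights.
    W_q, W_k, W_v = nn.Linear(d_k, d_k), nn.Linear(d_k, d_k), nn.Linear(d_k, d_k)
    for proj in (W_q, W_k, W_v):
        proj.requires_grad_(False)  # original model weights receive no gradients

    # Only these prefix parameters are trained.
    prefix_K = nn.Parameter(torch.randn(prefix_len, d_k))
    prefix_V = nn.Parameter(torch.randn(prefix_len, d_k))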

Given:

  • Q: shape (seq_len, d_k) — queries (from frozen model)
  • K: shape (seq_len, d_k) — keys (from frozen model)
  • V: shape (seq_len, d_k) — values (from frozen model)
  • prefix_K: shape (prefix_len, d_k) — learnable prefix keys
  • prefix_V: shape (prefix_len, d_k) — learnable prefix values

Steps:

  1. Prepend prefix_K to K: full_K has shape (prefix_len + seq_len, d_k)
  2. Prepend prefix_V to V: full_V has shape (prefix_len + seq_len, d_k)
  3. Compute standard attention: softmax(Q @ full_K^T / sqrt(d_k)) @ full_V

Output: Tensor of shape (seq_len, d_k).
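
Putting the three steps together, a minimal reference implementation might look like this (PyTorch, single head, no batching; the function name prefix_attention is illustrative):

    import math
    import torch

    def prefix_attention(Q, K, V, prefix_K, prefix_V):
        # Steps 1-2: prepend the learnable prefixes along the sequence axis.
        full_K = torch.cat([prefix_K, K], dim=0)  # (prefix_len + seq_len, d_k)
        full_V = torch.cat([prefix_V, V], dim=0)  # (prefix_len + seq_len, d_k)

        # Step 3: standard scaled dot-product attention over the extended keys/values.
        d_k = Q.shape[-1]
        scores = Q @ full_K.T / math.sqrt(d_k)   # (seq_len, prefix_len + seq_len)
        weights = torch.softmax(scores, dim=-1)  # each query attends to prefix + sequence
        return weights @ full_V                  # (seq_len, d_k)

    # Quick shape check with random inputs (seq_len=10, prefix_len=4, d_k=64).
    out = prefix_attention(torch.randn(10, 64), torch.randn(10, 64),
                           torch.randn(10, 64), torch.randn(4, 64), torch.randn(4, 64))
    assert out.shape == (10, 64)

Note that the queries themselves are unchanged; only the set of positions each query can attend to grows by prefix_len, which is why the output keeps the shape (seq_len, d_k).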
