hard
research
Prefix Tuning
Implement prefix tuning from “Prefix-Tuning: Optimizing Continuous Prompts for Generation” (Li & Liang, 2021).
In prefix tuning, learnable prefix vectors are prepended to the key and value sequences in attention. The original model weights are frozen; only prefix parameters are trained.
Given:
- Q: shape (seq_len, d_k), queries (from frozen model)
- K: shape (seq_len, d_k), keys (from frozen model)
- V: shape (seq_len, d_k), values (from frozen model)
- prefix_K: shape (prefix_len, d_k), learnable prefix keys
- prefix_V: shape (prefix_len, d_k), learnable prefix values
Steps:
- Prepend prefix_K to K: full_K has shape (prefix_len + seq_len, d_k)
- Prepend prefix_V to V: full_V has shape (prefix_len + seq_len, d_k)
- Compute standard attention: softmax(Q @ full_K^T / sqrt(d_k)) @ full_V
Output: Tensor of shape (seq_len, d_k).
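The steps above can be sketched in a few lines. This is a minimal illustration, not the reference solution: it assumes NumPy arrays rather than framework tensors, a single attention head, no batch dimension, and no attention mask. The function name prefix_attention and the random example inputs are hypothetical.

```python
import numpy as np


def prefix_attention(Q, K, V, prefix_K, prefix_V):
    """Single-head attention with prefix keys/values prepended.

    Assumptions (not from the problem statement): NumPy inputs,
    no batch dimension, no causal or padding mask.
    """
    d_k = Q.shape[-1]
    # Prepend the learnable prefixes along the sequence axis.
    full_K = np.concatenate([prefix_K, K], axis=0)  # (prefix_len + seq_len, d_k)
    full_V = np.concatenate([prefix_V, V], axis=0)  # (prefix_len + seq_len, d_k)
    # Scaled dot-product scores: (seq_len, prefix_len + seq_len).
    scores = Q @ full_K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ full_V  # (seq_len, d_k)


# Example with arbitrary sizes chosen for illustration.
rng = np.random.default_rng(0)
seq_len, prefix_len, d_k = 4, 2, 8
Q = rng.normal(size=(seq_len, d_k))
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))
prefix_K = rng.normal(size=(prefix_len, d_k))
prefix_V = rng.normal(size=(prefix_len, d_k))

out = prefix_attention(Q, K, V, prefix_K, prefix_V)
print(out.shape)  # (4, 8)
```

The output keeps the query length because the prefix only extends the key and value axes. With an empty prefix (prefix_len = 0) the function reduces to ordinary scaled dot-product attention, which makes a convenient sanity check. In a training setup, only prefix_K and prefix_V would receive gradient updates, while Q, K, and V come from the frozen model.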
prefix-tuning
li-liang-2021
parameter-efficient
fine-tuning