Rotary Position Embeddings

Implement Rotary Position Embeddings (RoPE) from “RoFormer: Enhanced Transformer with Rotary Position Embedding” (Su et al., 2021).

RoPE encodes position by rotating pairs of dimensions. For a vector x of even dimension d at position pos, each pair of dimensions (2i, 2i+1), for i = 0, ..., d/2 - 1, is rotated by the angle pos * theta_i, where theta_i = 1 / 10000^(2i/d):

$$x'_{2i} = x_{2i} \cos(pos \cdot \theta_i) - x_{2i+1} \sin(pos \cdot \theta_i)$$

$$x'_{2i+1} = x_{2i} \sin(pos \cdot \theta_i) + x_{2i+1} \cos(pos \cdot \theta_i)$$
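
The two rotation equations can be sketched in plain Python for a single vector. This is a minimal illustration, not a reference implementation: the function name `apply_rope`, the `base` parameter (defaulting to the 10000 in the frequency formula), and the use of a plain list for the vector are all assumptions made for this example.

```python
import math


def apply_rope(x, pos, base=10000.0):
    """Rotate each pair (x[2i], x[2i+1]) by the angle pos * theta_i.

    `x` is a list of floats with even length d; `pos` is the token position.
    The name and signature are hypothetical, chosen for this sketch.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("embedding dimension must be even")

    out = [0.0] * d
    for i in range(d // 2):
        theta = base ** (-2.0 * i / d)  # theta_i = 1 / base^(2i/d)
        angle = pos * theta
        c, s = math.cos(angle), math.sin(angle)
        x_even, x_odd = x[2 * i], x[2 * i + 1]
        out[2 * i] = x_even * c - x_odd * s      # x'_{2i}
        out[2 * i + 1] = x_even * s + x_odd * c  # x'_{2i+1}
    return out
```

Because each pair undergoes a pure rotation, the vector's norm is unchanged, and the dot product of two rotated vectors depends only on the difference of their positions. Both properties make convenient sanity checks for an implementation.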