
Implement Positional Encoding

Implement sinusoidal positional encoding as described in “Attention Is All You Need”.

$$PE_{(pos, 2i)} = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

$$PE_{(pos, 2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

where $pos$ is the position and $i$ indexes dimension *pairs*, so $2i$ and $2i+1$ together range over all $d_{model}$ dimensions (sine in the even dimensions, cosine in the odd ones).

Input:

  • seq_len: length of the sequence (integer)
  • d_model: dimension of the model (integer, must be even)

Output: A tensor of shape (seq_len, d_model) with positional encodings
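One possible solution is sketched below in NumPy; the function name `positional_encoding` is chosen for illustration, and the exercise may expect a different tensor framework. It computes one angle matrix of shape (seq_len, d_model/2) and interleaves sine and cosine into the even and odd columns:

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding of shape (seq_len, d_model)."""
    assert d_model % 2 == 0, "d_model must be even"
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]         # (1, d_model/2) pair index
    angles = pos / (10000 ** (2 * i / d_model))  # broadcast to (seq_len, d_model/2)
    pe = np.empty((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions: sin
    pe[:, 1::2] = np.cos(angles)  # odd dimensions: cos
    return pe
```

At position 0 every angle is 0, so row 0 alternates 0 (sin) and 1 (cos), which is a quick sanity check on the interleaving.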
