Implement sinusoidal positional encoding as described in “Attention Is All You Need”.
$$PE_{(pos, 2i)} = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$ $$PE_{(pos, 2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
where $pos$ is the position ($0 \le pos < seq\_len$) and $i$ indexes the dimension pair ($0 \le i < d_{model}/2$), so each pair of adjacent dimensions shares one frequency.
Input:
seq_len: length of the sequence (integer)
d_model: dimension of the model (integer, must be even)
Output: A tensor of shape (seq_len, d_model) with positional encodings
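One possible solution is sketched below using NumPy (the function name `positional_encoding` and the choice of NumPy over a framework tensor are assumptions; the same indexing works with PyTorch or TensorFlow):

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) array of sinusoidal positional encodings."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    pos = np.arange(seq_len)[:, np.newaxis]            # shape (seq_len, 1)
    i = np.arange(d_model // 2)[np.newaxis, :]         # shape (1, d_model/2)
    # Angle for dimension pair i at position pos: pos / 10000^(2i/d_model)
    angles = pos / np.power(10000.0, 2 * i / d_model)  # shape (seq_len, d_model/2)
    pe = np.empty((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even indices 2i get sine
    pe[:, 1::2] = np.cos(angles)  # odd indices 2i+1 get cosine
    return pe
```

Broadcasting `pos` against `i` computes all angles in one step; interleaving with the strided slices `0::2` and `1::2` places sine and cosine in the even and odd dimensions respectively, matching the formulas above.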