Implement a single step of vanilla gradient descent.
$$w_{t+1} = w_t - \eta \cdot \nabla L(w_t)$$

where $\nabla L(w_t)$ is the gradient of the loss with respect to the weights at step $t$.
Input:
- `weights`: current parameter tensor
- `gradients`: gradient tensor (same shape as `weights`)
- `lr`: learning rate $\eta$

Output:
- Updated weights after one gradient descent step
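
A minimal sketch in NumPy, assuming `weights` and `gradients` are arrays of the same shape; the function name `gradient_descent_step` is illustrative, not specified by the problem:

```python
import numpy as np

def gradient_descent_step(weights: np.ndarray,
                          gradients: np.ndarray,
                          lr: float) -> np.ndarray:
    # One vanilla update: w_{t+1} = w_t - eta * gradient.
    # Returns a new array rather than mutating the input.
    return weights - lr * gradients

# Example usage with hypothetical values:
w = np.array([0.5, -1.0, 2.0])
g = np.array([0.1, -0.2, 0.4])
print(gradient_descent_step(w, g, lr=0.1))  # [ 0.49 -0.98  1.96]
```

Because NumPy broadcasting handles `weights - lr * gradients` elementwise, the same one-liner works for parameter tensors of any shape.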