Difficulty: easy · Category: primitives

Implement Gradient Descent Step

Implement a single step of vanilla gradient descent.

$$w_{t+1} = w_t - \eta \cdot \nabla L(w_t)$$

where $\nabla L(w_t)$ is the gradient of the loss with respect to the weights (the `gradients` input below).

Input:

  • weights: current parameter tensor
  • gradients: gradient tensor (same shape as weights)
  • lr: learning rate $\eta$

Output: Updated weights after one gradient descent step
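The update rule can be sketched in a few lines of NumPy; the function name and the element-wise formulation are one reasonable reference implementation, not the only accepted one:

```python
import numpy as np

def gradient_descent_step(weights, gradients, lr):
    """Apply one vanilla gradient descent step: w <- w - lr * grad."""
    weights = np.asarray(weights, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    # Same shape is assumed, so the update is purely element-wise.
    return weights - lr * gradients

# Example: w = [1.0, 2.0], grad = [0.5, -0.5], lr = 0.1
updated = gradient_descent_step([1.0, 2.0], [0.5, -0.5], 0.1)
# -> array([0.95, 2.05])
```

Note that the function returns a new array rather than mutating `weights` in place; some solutions update in place instead, which is also valid as long as the returned tensor holds the post-step values.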


Tags: optimization, gradient-descent, basics