medium primitives

Implement Momentum Update

Implement a single step of SGD with momentum.

$$v_{t+1} = \mu \cdot v_t + \nabla w_t$$

$$w_{t+1} = w_t - \eta \cdot v_{t+1}$$

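where $\nabla w_t$ is the gradient of the loss with respect to the weights $w_t$, $v_t$ is the velocity (momentum buffer), $\mu$ is the momentum coefficient, and $\eta$ is the learning rate.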
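
As a quick sanity check, here is one step worked by hand for a single scalar parameter. The values are chosen purely for illustration and are not part of the problem statement: $\mu = 0.9$, $\eta = 0.1$, $w_t = 5$, $v_t = 1$, $\nabla w_t = 2$.

$$v_{t+1} = 0.9 \cdot 1 + 2 = 2.9$$

$$w_{t+1} = 5 - 0.1 \cdot 2.9 = 4.71$$

Note that the velocity is updated first and the new velocity, not the old one, is used to update the weights.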

Input:

weights: current parameters
gradients: current gradients
velocity: current velocity (momentum buffer)
lr: learning rate
momentum: momentum coefficient

Output: A map/tuple with new_weights and new_velocity
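
One possible shape for a solution is sketched below. It is a minimal sketch under stated assumptions, not the site's reference solution: the function name `momentum_update`, the use of flat Python lists for the parameters, and the choice to return a dict are all assumptions made for illustration.

```python
def momentum_update(weights, gradients, velocity, lr, momentum):
    """Perform one SGD-with-momentum step on flat lists of floats.

    Assumes weights, gradients, and velocity have the same length.
    """
    # v_{t+1} = mu * v_t + grad
    new_velocity = [momentum * v + g for v, g in zip(velocity, gradients)]
    # w_{t+1} = w_t - eta * v_{t+1}  (uses the freshly updated velocity)
    new_weights = [w - lr * nv for w, nv in zip(weights, new_velocity)]
    return {"new_weights": new_weights, "new_velocity": new_velocity}


# Example with values that are exact in floating point.
result = momentum_update(
    weights=[1.0, 2.0],
    gradients=[0.5, -1.0],
    velocity=[1.0, 2.0],
    lr=0.5,
    momentum=0.5,
)
print(result)  # {'new_weights': [0.5, 2.0], 'new_velocity': [1.0, 0.0]}
```

The function returns new lists rather than mutating its inputs, which keeps the step easy to test. An in-place variant would also satisfy the formulas above.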

Hints

optimization momentum sgd
