We can't find the internet
Attempting to reconnect
Something went wrong!
Attempting to reconnect
easy
end_to_end
Gradient Clipping
Implement gradient clipping by norm.
Given a list of gradient tensors, compute the global norm and clip if it
exceeds max_norm:
-
Global norm: $\|g\| = \sqrt{\sum_i \sum_j g_{ij}^2}$ (L2 norm of all gradients concatenated)
-
Clip: If $\|g\| > \text{max\_norm}$, scale all gradients by $\frac{\text{max\_norm}}{\|g\|}$
Input:
-
gradients: a list of tensors (the gradients) -
max_norm: maximum allowed norm (float)
Output: A dict with:
- “clipped_gradients”: list of clipped gradient tensors (same shapes)
- “global_norm”: the original global norm (scalar)
Hints
gradient-clipping
optimization
training
stability
Sign in to attempt this problem and view the solution.