Difficulty: hard · Category: framework

Custom Activation with Gradient

Implement a custom activation function f(x) = x * sigmoid(x) (SiLU/Swish) and compute its gradient at given input values.

Compute two quantities, output and gradient, where:

  • output = x * sigmoid(x)
  • gradient = d/dx [x * sigmoid(x)]

The derivative is: sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
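As a sketch, the closed-form output and derivative above can be computed directly in plain Python and checked numerically with a central finite difference (the helper name `silu_with_grad` is illustrative, not part of any required API):

```python
import math

def silu_with_grad(xs):
    """Return (output, gradient) lists for f(x) = x * sigmoid(x)."""
    out, grad = [], []
    for x in xs:
        s = 1.0 / (1.0 + math.exp(-x))          # sigmoid(x)
        out.append(x * s)                        # f(x) = x * sigmoid(x)
        # d/dx [x * sigmoid(x)] = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
        grad.append(s + x * s * (1.0 - s))
    return out, grad

xs = [-1.0, 0.0, 2.0]
out, grad = silu_with_grad(xs)

# finite-difference check of the closed-form gradient
f = lambda t: t / (1.0 + math.exp(-t))
h = 1e-6
for x, g in zip(xs, grad):
    numeric = (f(x + h) - f(x - h)) / (2 * h)
    assert abs(numeric - g) < 1e-5
```

At x = 0 the formula gives output 0 and gradient 0.5, a convenient sanity check.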

Input: A 1D tensor x.

Output: A dict with keys "output" and "gradient", each a 1D tensor of the same shape as the input.

API Reference:

  • PyTorch: custom autograd or torch.autograd.grad
  • JAX: jax.value_and_grad or jax.grad
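Using the PyTorch route mentioned above, one possible sketch lets autograd produce the elementwise gradient (via the gradient of the summed output, which for an elementwise function equals dy/dx per element) and compares it against the analytic formula; the variable names here are illustrative:

```python
import torch

def silu(x):
    return x * torch.sigmoid(x)

x = torch.tensor([-2.0, 0.0, 1.5], requires_grad=True)
out = silu(x)

# For an elementwise function, grad of out.sum() w.r.t. x is the
# per-element derivative dy/dx.
(grad,) = torch.autograd.grad(out.sum(), x)

# Cross-check against the closed-form derivative from the statement.
s = torch.sigmoid(x.detach())
analytic = s + x.detach() * s * (1 - s)
assert torch.allclose(grad, analytic)

result = {"output": out.detach(), "gradient": grad}
```

In JAX the same idea would use `jax.grad` of the summed output, or `jax.vmap(jax.value_and_grad(f))` for per-element values and gradients in one pass.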

Tags: custom-activation, silu, swish, autograd, jax.grad