Implement a custom activation function f(x) = x * sigmoid(x) (SiLU/Swish)
and compute its gradient at given input values.
Return two quantities:
- output: f(x) = x * sigmoid(x)
- gradient: f'(x) = d/dx [x * sigmoid(x)]
The derivative is: f'(x) = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
Input: A 1D tensor x.
Output: A dict with keys "output" and "gradient", each a 1D tensor of the same shape as the input.
API Reference:
- PyTorch: torch.autograd.grad
- JAX: jax.value_and_grad or jax.grad
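A minimal reference sketch of the closed-form solution, written in plain NumPy so it is self-contained (the exercise itself expects a torch or jax tensor, but the arithmetic is identical; the function name `silu_with_grad` is illustrative, not part of the problem's API):

```python
import numpy as np

def silu_with_grad(x):
    """Compute SiLU f(x) = x * sigmoid(x) and its analytic gradient.

    Returns a dict with keys "output" and "gradient", each the same
    shape as x, matching the required output format.
    """
    s = 1.0 / (1.0 + np.exp(-x))       # sigmoid(x)
    output = x * s                     # f(x) = x * sigmoid(x)
    # f'(x) = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
    gradient = s + x * s * (1.0 - s)
    return {"output": output, "gradient": gradient}
```

As a sanity check, at x = 0 the sigmoid is 0.5, so the output is 0 and the gradient is 0.5; an autodiff framework (torch.autograd.grad or jax.grad) should reproduce the same values as this closed-form gradient.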