Implement a residual (skip connection) block.
A residual block computes: $$\text{output} = \text{ReLU}(x + F(x))$$
where $F(x)$ is a two-layer transform: $$F(x) = W_2 \cdot \text{ReLU}(W_1 \cdot x + b_1) + b_2$$
This is the core building block of ResNets. The skip connection adds the
input x directly to the output of the transform, giving gradients a direct
path through the block and making very deep networks easier to train.
Input:
- x: input of shape (batch, dim)
- W1: shape (dim, dim), b1: shape (dim,)
- W2: shape (dim, dim), b2: shape (dim,)
Output: Tensor of shape (batch, dim).
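A minimal NumPy sketch of one possible solution. It assumes the batched row-vector convention, so the per-sample product $W \cdot x$ becomes `x @ W.T` for the whole batch; the function name `residual_block` is our own choice, not part of the spec.

```python
import numpy as np

def residual_block(x, W1, b1, W2, b2):
    """Compute ReLU(x + F(x)) with F(x) = W2 @ ReLU(W1 @ x + b1) + b2.

    x has shape (batch, dim); each weight matrix is (dim, dim) and each
    bias is (dim,), so the output shape matches x.
    """
    h = np.maximum(0.0, x @ W1.T + b1)   # first linear layer + ReLU
    fx = h @ W2.T + b2                   # second linear layer: F(x)
    return np.maximum(0.0, x + fx)       # skip connection, then final ReLU
```

With identity weights and zero biases the block reduces to `ReLU(x + ReLU(x))`, so a nonnegative input is simply doubled, which makes for a quick sanity check.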