Implement the Squeeze-and-Excitation (SE) block from “Squeeze-and-Excitation Networks” (Hu et al., 2018).
SE blocks adaptively recalibrate channel-wise feature responses by:
Squeeze: Global average pooling across spatial dimensions $z_c = \frac{1}{H \times W} \sum_{i,j} x_{c,i,j}$
Excitation: Two FC layers to produce channel attention weights $s = \sigma(W_2 \cdot \text{ReLU}(W_1 \cdot z))$
Scale: Multiply original features by attention weights $\tilde{x}_c = s_c \cdot x_c$
For simplicity, the input is 2-D with shape (channels, spatial), i.e., the spatial dimensions $H \times W$ are flattened into a single axis $S = H \cdot W$.
Input:
x: shape (C, S) — C channels, S spatial positions
W1: shape (C//r, C) — squeeze weights (reduction ratio r)
W2: shape (C, C//r) — excitation weights
Output: Tensor of shape (C, S) — recalibrated features.
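The three steps above can be sketched in NumPy as follows. This is a minimal reference sketch, not the required solution: the function name `se_block` is my own, and it assumes bias-free FC layers (as in the formulas above) and NumPy arrays rather than framework tensors.

```python
import numpy as np

def se_block(x, W1, W2):
    """Squeeze-and-Excitation over a (C, S) input.

    x:  (C, S) feature map with flattened spatial axis
    W1: (C//r, C) squeeze weights
    W2: (C, C//r) excitation weights
    Returns a (C, S) array of recalibrated features.
    """
    # Squeeze: global average pool over the spatial axis -> z of shape (C,)
    z = x.mean(axis=1)
    # Excitation: FC -> ReLU -> FC -> sigmoid -> s of shape (C,)
    s = 1.0 / (1.0 + np.exp(-(W2 @ np.maximum(W1 @ z, 0.0))))
    # Scale: broadcast per-channel weights across the spatial positions
    return s[:, None] * x
```

Note the broadcasting in the last line: `s[:, None]` reshapes the (C,) attention vector to (C, 1) so it multiplies every spatial position of each channel.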