hard primitives

Transposed Conv2d

Implement transposed 2D convolution (also called fractionally-strided convolution, or colloquially “deconvolution”).

Transposed convolution computes the gradient of a regular convolution with respect to its input. Where regular conv gathers a kernel-sized patch into one output value, transposed conv scatters one input value across a kernel-sized patch in the output, and overlapping patches are summed. It is used to upsample spatial feature maps in decoders, GANs, and segmentation networks (Long et al., FCN, 2015).
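To make the scatter picture concrete, here is a minimal 1D sketch in NumPy. The helper name `scatter_1d` is hypothetical and used only for illustration. Each input value adds a scaled copy of the kernel to the output, and overlapping copies are summed.

```python
import numpy as np

def scatter_1d(x, w, stride=1):
    """1D transposed convolution: each input value scatters a scaled kernel."""
    out = np.zeros((len(x) - 1) * stride + len(w))
    for i, v in enumerate(x):
        # Input position i lands at output offset i * stride.
        out[i * stride : i * stride + len(w)] += v * w
    return out

x = np.array([1.0, 2.0])
w = np.array([1.0, 1.0, 1.0])
print(scatter_1d(x, w, stride=1))  # [1. 3. 3. 2.]
print(scatter_1d(x, w, stride=2))  # [1. 1. 3. 2. 2.]
```

With stride 1 the two kernel copies overlap in two positions. With stride 2 they overlap in one, and the output grows from 4 to 5 elements.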

“Deconvolution” is a common nickname but a misnomer: the op is not a mathematical inverse of convolution. It is simply a forward op that increases spatial dimensions.

Weight convention (matches PyTorch): The weight tensor w has shape (C_in, C_out, kH, kW), with input channels first. This is the reverse of regular conv2d, whose weight has shape (C_out, C_in, kH, kW).

Output shape:

$$H' = (H - 1) \cdot s + k_H$$

$$W' = (W - 1) \cdot s + k_W$$
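As a quick arithmetic check of the output-shape formula, the sketch below evaluates it for one spatial dimension. The function name `transposed_conv_output_size` is an illustrative assumption, not part of the problem's required interface.

```python
def transposed_conv_output_size(size, kernel, stride=1):
    """Output length along one spatial axis (no padding)."""
    return (size - 1) * stride + kernel

# A 4-pixel axis with a 3-wide kernel:
print(transposed_conv_output_size(4, 3, stride=1))  # 6
print(transposed_conv_output_size(4, 3, stride=2))  # 9
```

When the stride equals the kernel size, the scattered patches tile the output without overlap, so the output length is exactly `size * stride`.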

Algorithm (scatter / accumulate):

$$\text{out}[n, c_{out}, i \cdot s + k_h,\; j \cdot s + k_w] \mathrel{+}= x[n, c_{in}, i, j] \cdot w[c_{in}, c_{out}, k_h, k_w]$$

for every $(n, c_{in}, i, j, c_{out}, k_h, k_w)$.
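The update rule above can be transcribed almost literally into Python. The following is a deliberately slow reference sketch, not a tuned implementation. The function name `conv_transpose2d_loops` is hypothetical, and a float64 output is assumed.

```python
import itertools
import numpy as np

def conv_transpose2d_loops(x, w, stride=1):
    """Direct transcription of the scatter/accumulate rule (slow, for reference)."""
    N, C_in, H, W = x.shape
    _, C_out, kH, kW = w.shape
    out = np.zeros((N, C_out, (H - 1) * stride + kH, (W - 1) * stride + kW))
    index_ranges = (range(N), range(C_in), range(H), range(W),
                    range(C_out), range(kH), range(kW))
    for n, ci, i, j, co, kh, kw in itertools.product(*index_ranges):
        # One term of the sum: scatter x[n, ci, i, j] into the output patch.
        out[n, co, i * stride + kh, j * stride + kw] += x[n, ci, i, j] * w[ci, co, kh, kw]
    return out

x = np.ones((1, 1, 2, 2))
w = np.ones((1, 1, 2, 2))
print(conv_transpose2d_loops(x, w, stride=1)[0, 0])
# [[1. 2. 1.]
#  [2. 4. 2.]
#  [1. 2. 1.]]
```

The center value is 4 because all four input pixels scatter a patch that covers it.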

Inputs:

  • x: shape (N, C_in, H, W).
  • w: shape (C_in, C_out, kH, kW).
  • stride: int (default 1) — applied to both H and W.

Output: shape (N, C_out, H', W'). No padding, no bias.
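One possible way to meet this input/output contract with fewer Python-level loops is to iterate over kernel offsets only and let NumPy handle the batch, channel, and spatial axes. This is a sketch under the stated conventions, not the reference solution. The name `conv_transpose2d` and the error message are assumptions.

```python
import numpy as np

def conv_transpose2d(x, w, stride=1):
    """Transposed 2D convolution, no padding, no bias.

    x: (N, C_in, H, W); w: (C_in, C_out, kH, kW); returns (N, C_out, H', W').
    """
    N, C_in, H, W = x.shape
    C_in_w, C_out, kH, kW = w.shape
    if C_in != C_in_w:
        raise ValueError("x and w disagree on C_in")
    H_out = (H - 1) * stride + kH
    W_out = (W - 1) * stride + kW
    out = np.zeros((N, C_out, H_out, W_out), dtype=np.result_type(x, w))
    for kh in range(kH):
        for kw in range(kW):
            # Contract over input channels for this kernel offset: (N, C_out, H, W).
            contrib = np.einsum("nchw,co->nohw", x, w[:, :, kh, kw])
            # Input pixel (i, j) lands at (i * stride + kh, j * stride + kw).
            rows = slice(kh, kh + (H - 1) * stride + 1, stride)
            cols = slice(kw, kw + (W - 1) * stride + 1, stride)
            out[:, :, rows, cols] += contrib
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5))
w = rng.standard_normal((3, 6, 3, 3))
print(conv_transpose2d(x, w, stride=2).shape)  # (2, 6, 9, 11)
```

Looping over the kernel rather than over input pixels keeps the Python loop count at kH × kW. Each strided slice selects exactly H × W output positions, so the in-place add has no overlapping writes.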
