medium research

Stochastic Depth

Implement Stochastic Depth from “Deep Networks with Stochastic Depth” (Huang et al., 2016).

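During training, each residual block is dropped at random with probability drop_prob: a dropped block contributes nothing, so the output is just x, while a kept block gives x + block_output. During inference, every block is kept and its output is scaled by the survival probability (1 - drop_prob), so the result matches the expected value of the training-time output.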
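
As a quick check of that scaling, write b for a Bernoulli keep indicator with P(b = 1) = 1 - drop_prob. The symbol b is introduced here for illustration only and is not part of the task's interface. The training-time output is then x + b · block_output, and taking the expectation over b gives:

$$\mathbb{E}[x + b \cdot \text{block\_output}] = x + \mathbb{E}[b] \cdot \text{block\_output} = x + (1 - \text{drop\_prob}) \cdot \text{block\_output}$$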

Given:

  • x: shape (batch, d) — input (residual connection)
  • block_output: shape (batch, d) — output of the residual block
  • drop_prob: float — probability of dropping the block
  • training: bool — whether in training mode

During inference (training=False): $$\text{out} = x + (1 - \text{drop\_prob}) \cdot \text{block\_output}$$

Note: For deterministic testing, we only test inference mode (training=False).

Output: Tensor of shape (batch, d).
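
The specification above can be sketched in NumPy as follows. This is a minimal sketch, not a reference solution: the function name `stochastic_depth` and the `rng` argument are assumptions made for illustration, and the training branch assumes a single keep/drop decision for the whole batch, which the task does not test.

```python
import numpy as np


def stochastic_depth(x, block_output, drop_prob, training=False, rng=None):
    """Combine a residual input with a block output under stochastic depth.

    x, block_output: arrays of shape (batch, d).
    drop_prob: probability of dropping the block, in [0, 1].
    training: if False, use the deterministic inference rule.
    rng: optional np.random.Generator (hypothetical extra argument, used
         only in training mode so the random draw can be seeded).
    """
    x = np.asarray(x, dtype=float)
    block_output = np.asarray(block_output, dtype=float)
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must lie in [0, 1]")

    if not training:
        # Inference: keep the block, scaled by the survival probability.
        return x + (1.0 - drop_prob) * block_output

    # Training (assumption: one Bernoulli draw shared by the whole batch).
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random() >= drop_prob  # True with probability 1 - drop_prob
    return x + block_output if keep else x.copy()
```

For example, with x filled with ones, block_output filled with twos, and drop_prob = 0.25, the inference output is 1 + 0.75 · 2 = 2.5 in every entry:

```python
x = np.ones((2, 3))
block_output = np.full((2, 3), 2.0)
out = stochastic_depth(x, block_output, drop_prob=0.25, training=False)
print(out.shape)  # (2, 3)
```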

stochastic-depth huang-2016 regularization residual dropout