easy research

Label Smoothing

Implement label smoothing as introduced in “Rethinking the Inception Architecture for Computer Vision” (Szegedy et al., 2016).

Label smoothing replaces hard one-hot targets with soft targets, which discourages the model from becoming overconfident in its predictions. Each target component becomes:

$$y_i' = (1 - \epsilon) \cdot y_i + \frac{\epsilon}{K}$$

Where:

  • $y_i$ is the $i$-th component of the original one-hot label (1 for the true class, 0 otherwise)
  • $\epsilon$ is the smoothing factor, with $0 \le \epsilon \le 1$
  • $K$ is the number of classes
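
As a quick check of the formula, take the illustrative values $K = 4$ and $\epsilon = 0.1$ (chosen here only as an example; they are not part of the task):

$$y_{\text{true}}' = (1 - 0.1) \cdot 1 + \frac{0.1}{4} = 0.925, \qquad y_{\text{other}}' = (1 - 0.1) \cdot 0 + \frac{0.1}{4} = 0.025$$

The smoothed entries still sum to one, since $0.925 + 3 \cdot 0.025 = 1$, so the result remains a valid probability distribution.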

Given:

  • labels: shape (batch,) — integer class labels
  • n_classes: integer K
  • epsilon: float smoothing factor

Output: Tensor of shape (batch, n_classes) containing the smoothed label distribution; each row sums to 1.
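
One possible way to approach the task is sketched below. The task does not name a tensor framework, so this sketch assumes NumPy arrays, and the function name `smooth_labels` is hypothetical. It one-hot encodes the integer labels and then mixes the result with a uniform distribution over the $K$ classes, following the formula above.

```python
import numpy as np


def smooth_labels(labels, n_classes, epsilon):
    """Return smoothed targets of shape (batch, n_classes).

    Assumes `labels` holds integers in [0, n_classes) and
    0 <= epsilon <= 1; neither is validated in this sketch.
    """
    labels = np.asarray(labels, dtype=np.int64)
    # One-hot encode: row b has a 1 in column labels[b], 0 elsewhere.
    one_hot = np.eye(n_classes, dtype=np.float64)[labels]
    # Scale the one-hot part by (1 - epsilon) and add epsilon / K everywhere.
    return (1.0 - epsilon) * one_hot + epsilon / n_classes


# Example call with illustrative inputs: two samples, three classes.
smoothed = smooth_labels([0, 2], n_classes=3, epsilon=0.1)
print(smoothed.shape)
print(smoothed)
```

Setting `epsilon=0` leaves the one-hot targets unchanged, while `epsilon=1` yields a uniform distribution for every sample; both are convenient sanity checks.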

label-smoothing szegedy-2016 regularization classification