Implement label smoothing from “Rethinking the Inception Architecture” (Szegedy et al., 2016).
Label smoothing replaces hard one-hot targets with soft targets to prevent overconfidence:
$$y_i' = (1 - \epsilon) \cdot y_i + \frac{\epsilon}{K}$$
Where $y_i$ is the hard one-hot target for class $i$, $\epsilon$ is the smoothing factor, and $K$ is the number of classes.

Given:
- `labels`: shape `(batch,)` — integer class labels
- `n_classes`: integer $K$
- `epsilon`: float smoothing factor
Output: Tensor of shape (batch, n_classes) — smoothed label distribution.
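A minimal NumPy sketch of the formula above (the function name `smooth_labels` is illustrative, not part of the spec):

```python
import numpy as np

def smooth_labels(labels: np.ndarray, n_classes: int, epsilon: float) -> np.ndarray:
    """Return the smoothed label distribution, shape (batch, n_classes)."""
    # Build hard one-hot targets y_i by indexing rows of the identity matrix.
    one_hot = np.eye(n_classes)[labels]          # (batch, n_classes)
    # Apply y' = (1 - eps) * y + eps / K elementwise; rows still sum to 1.
    return (1.0 - epsilon) * one_hot + epsilon / n_classes
```

For example, `smooth_labels(np.array([0, 2]), 3, 0.1)` puts $0.9 + 0.1/3 \approx 0.9333$ on the true class and $0.1/3 \approx 0.0333$ on each of the others.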