Nucleus (Top-P) Sampling

Implement the nucleus filtering step of Top-P sampling from “The Curious Case of Neural Text Degeneration” (Holtzman et al., 2020).

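Nucleus sampling restricts the vocabulary to the smallest set of highest-probability tokens whose cumulative probability reaches or exceeds a threshold p. Holtzman et al. report that sampling from this truncated distribution yields text closer to human writing than top-k or pure sampling.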
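
For reference, the definition can be written out as below. This follows the formulation in the paper; the symbols V (full vocabulary), V^(p) (the nucleus), and p' (the probability mass inside the nucleus) are notation chosen here and do not appear elsewhere in this problem statement.

```latex
% Nucleus: the smallest subset of the vocabulary whose mass reaches p
V^{(p)} = \operatorname*{arg\,min}_{S \subseteq V} |S|
\quad \text{subject to} \quad \sum_{x \in S} P(x) \ge p

% Mass actually captured by the nucleus (at least p, usually slightly more)
p' = \sum_{x \in V^{(p)}} P(x)

% Filtered and renormalized distribution
P'(x) =
\begin{cases}
  P(x) / p' & \text{if } x \in V^{(p)} \\
  0         & \text{otherwise}
\end{cases}
```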

Given:

probs: shape (vocab_size,) — probability distribution over vocabulary
p: float — cumulative probability threshold

Steps:

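1. Sort the probabilities in descending order.
2. Compute the cumulative sum of the sorted probabilities.
3. Find the smallest prefix of the sorted tokens whose cumulative sum is >= p. This prefix is the nucleus, and it includes the token that crosses the threshold.
4. Zero out the probabilities of all tokens outside the nucleus.
5. Renormalize the remaining probabilities so they sum to 1.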

Output: Tensor of shape (vocab_size,) — filtered and renormalized distribution.
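
The steps above can be sketched in PyTorch as follows. This is one possible approach, not a reference solution: the function name nucleus_filter is hypothetical, and the sketch assumes probs is a 1-D tensor that already sums to 1 and that 0 < p <= 1. It also assumes the output should be in the original vocabulary order rather than sorted order.

```python
import torch


def nucleus_filter(probs: torch.Tensor, p: float) -> torch.Tensor:
    """Keep the smallest top-probability set whose mass reaches p, then renormalize.

    Hypothetical helper; assumes probs is 1-D, sums to 1, and 0 < p <= 1.
    """
    # Step 1: sort in descending order, remembering each token's original index.
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)

    # Step 2: running total of the sorted probabilities.
    cumulative = torch.cumsum(sorted_probs, dim=-1)

    # Step 3: a token belongs to the nucleus if the mass accumulated *before*
    # it is still below p. This keeps the token that crosses the threshold.
    mass_before = cumulative - sorted_probs
    keep_sorted = mass_before < p
    keep_sorted[0] = True  # safeguard: the most likely token is always kept

    # Step 4: zero out tokens outside the nucleus, then undo the sort so the
    # result lines up with the original vocabulary indices.
    filtered_sorted = torch.where(
        keep_sorted, sorted_probs, torch.zeros_like(sorted_probs)
    )
    filtered = torch.zeros_like(probs)
    filtered[sorted_idx] = filtered_sorted

    # Step 5: renormalize so the surviving probabilities sum to 1.
    return filtered / filtered.sum()


if __name__ == "__main__":
    probs = torch.tensor([0.15, 0.5, 0.05, 0.3])
    print(nucleus_filter(probs, p=0.7))
```

With probs = [0.15, 0.5, 0.05, 0.3] and p = 0.7, the nucleus is the two tokens with probabilities 0.5 and 0.3 (cumulative mass 0.8), so the result is approximately [0, 0.625, 0, 0.375]. Comparing the mass before each token against p, rather than the cumulative sum itself, is a design choice that avoids an off-by-one shift when including the threshold-crossing token.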
