Difficulty: medium · Category: primitives

Implement KL Divergence

Implement the Kullback-Leibler (KL) divergence between two probability distributions.

$$D_{KL}(P \| Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}$$

Input: Two 1D tensors `p` and `q`, each a valid probability distribution (all entries positive, summing to 1)

Output: A scalar, the KL divergence $D_{KL}(P \| Q)$ (always non-negative)

Note: Add a small epsilon (1e-10) to avoid log(0) and division by zero.
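A minimal sketch of one possible solution, written with NumPy arrays for portability (the same elementwise logic carries over to framework tensors); the function name `kl_divergence` and the `eps` parameter are choices made here, not part of the problem statement:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """Compute D_KL(P || Q) = sum_i P(i) * log(P(i) / Q(i))."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    # Add a small epsilon inside the log to avoid log(0) and division by zero,
    # as the note above suggests.
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

For identical distributions the result is (numerically) zero, and for any mismatch it is positive, consistent with the non-negativity stated above.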

Tags: information-theory, divergence, probability