Implement the Dropout regularization technique.
During training, zero out each element independently with probability $p$ and scale the surviving elements by $\frac{1}{1-p}$:
$$y_i = \frac{x_i \cdot m_i}{1 - p}$$
where $m_i \sim \text{Bernoulli}(1 - p)$ is a binary mask.
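As a short check on why the factor $\frac{1}{1-p}$ appears, the expectation of each output can be worked out from the definitions above, treating $x_i$ as fixed and assuming $p < 1$:
$$\mathbb{E}[y_i] = \frac{x_i \cdot \mathbb{E}[m_i]}{1 - p} = \frac{x_i \cdot (1 - p)}{1 - p} = x_i$$
So the scaling keeps the expected value of each element unchanged.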
For this problem, you are given the binary mask directly.
Input:
x: input tensor of any shape
mask: binary tensor of the same shape (1 = keep, 0 = drop)
p: dropout probability (fraction of elements to drop)
Output:
Tensor with dropout applied and properly scaled
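One possible way to approach this is sketched below in plain Python. It is only an illustration under stated assumptions: tensors are represented as numbers or nested lists, the function name `apply_dropout` is hypothetical, and rejecting $p = 1$ (where the scale factor is undefined) is a choice made here, not something the problem statement specifies.

```python
def apply_dropout(x, mask, p):
    """Apply a given binary mask to x and rescale kept values by 1 / (1 - p).

    Assumption: x and mask are numbers or nested lists of identical shape.
    """
    if not 0.0 <= p < 1.0:
        # Assumption: p = 1 would divide by zero, so it is rejected here.
        raise ValueError("p must satisfy 0 <= p < 1")
    scale = 1.0 / (1.0 - p)

    def recurse(xs, ms):
        # Walk both structures in parallel until scalar leaves are reached.
        if isinstance(xs, (list, tuple)):
            if not isinstance(ms, (list, tuple)) or len(xs) != len(ms):
                raise ValueError("x and mask must have the same shape")
            return [recurse(a, b) for a, b in zip(xs, ms)]
        # Leaf: y_i = x_i * m_i / (1 - p)
        return xs * ms * scale

    return recurse(x, mask)


x = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1, 0], [0, 1]]
print(apply_dropout(x, mask, 0.5))  # [[2.0, 0.0], [0.0, 8.0]]
```

With an array library such as NumPy, the same computation is the elementwise expression `x * mask / (1 - p)`.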