Build a binary classifier that outputs a probability using sigmoid.
The forward pass computes: $$p = \sigma(xW + b), \quad \sigma(z) = \frac{1}{1 + e^{-z}}$$
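Also compute the Binary Cross-Entropy (BCE) loss, averaged over the $N$ examples in the batch: $$\text{BCE} = -\frac{1}{N}\sum_{i=1}^{N} [y_i \log(p_i) + (1-y_i)\log(1-p_i)]$$ Here $y_i$ is the target label and $p_i$ the predicted probability for example $i$.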
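As a quick sanity check of the formula, assuming the natural logarithm and purely illustrative values: take a batch of two examples with labels $y = (1, 0)$ and predicted probabilities $p = (0.9, 0.2)$. The first example contributes $\log(0.9)$ and the second contributes $\log(1 - 0.2)$, so:

$$\text{BCE} = -\frac{1}{2}\left[\log(0.9) + \log(0.8)\right] \approx 0.164$$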
Input:
- x: input tensor of shape (batch, features)
- W: weight matrix of shape (features, 1)
- b: bias scalar of shape (1,)
- y: target labels of shape (batch, 1) with values 0 or 1
Output: A dict with keys "prediction" (shape (batch, 1)) and "loss" (scalar).
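The specification above can be sketched in plain Python as follows. This is a minimal, dependency-free illustration rather than a reference solution: nested lists stand in for tensors, the function names `sigmoid` and `forward` are hypothetical, and the `eps` clamp that keeps the logarithm finite is an added assumption not stated in the problem. In a tensor library the same steps would likely be a matrix multiply followed by elementwise operations.

```python
import math


def sigmoid(z):
    # Logistic function for a single float, written to avoid overflow
    # in math.exp for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, W, b, y, eps=1e-12):
    """Return predictions and mean BCE loss.

    x: nested list of shape (batch, features)
    W: nested list of shape (features, 1)
    b: list of shape (1,)
    y: nested list of shape (batch, 1) with values 0 or 1
    eps: clamp for log stability (an assumption, not part of the spec)
    """
    prediction = []
    total = 0.0
    for row, target in zip(x, y):
        # Linear score for one example: dot(row, W[:, 0]) + b
        z = sum(xi * wi[0] for xi, wi in zip(row, W)) + b[0]
        p = sigmoid(z)
        prediction.append([p])  # keep the (batch, 1) shape

        # Clamp away from 0 and 1 so log() never receives 0.
        p_safe = min(max(p, eps), 1.0 - eps)
        t = target[0]
        total += t * math.log(p_safe) + (1 - t) * math.log(1.0 - p_safe)

    loss = -total / len(x)  # average over the batch
    return {"prediction": prediction, "loss": loss}


# Illustrative inputs: both examples have a linear score of 0.
x = [[0.0, 0.0], [1.0, 2.0]]
W = [[0.5], [-0.25]]
b = [0.0]
y = [[1], [0]]
out = forward(x, W, b, y)
print(out["prediction"], out["loss"])
```

With these inputs both linear scores are 0, so each predicted probability is 0.5 and the loss equals log 2, which makes the example easy to check by hand.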