medium end_to_end

Train Binary Classifier End-to-End

Train a binary linear classifier end-to-end using sigmoid + binary cross-entropy (BCE) loss and full-batch gradient descent.

The model

Given a feature matrix x of shape (N, d) and binary labels y of shape (N,), the model applies a weight vector w of shape (d,) to each row x_i and computes the probability of class 1 via:

$$p_i = \sigma(x_i^\top w) = \frac{1}{1 + e^{-x_i^\top w}}$$
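The formula above can overflow in floating point when the logit is a large negative number, because e^{-z} becomes huge. The following is a minimal sketch of one common way to avoid that, assuming NumPy; the helper names `sigmoid` and `predict_proba` are illustrative and not part of the problem statement.

```python
import numpy as np

def sigmoid(z):
    """Numerically stable logistic function, applied element-wise."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the textbook form cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)),
    # so exp never receives a large positive argument.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def predict_proba(x, w):
    """Return p with p[i] = sigmoid(x[i] @ w); x is (N, d), w is (d,)."""
    return sigmoid(x @ w)

x = np.array([[0.0, 0.0], [1.0, 2.0], [-1000.0, 0.0]])
w = np.array([1.0, -0.5])
p = predict_proba(x, w)  # one probability per row of x
```

For moderate logits the plain expression 1 / (1 + exp(-z)) gives the same values; the split only matters for extreme inputs.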

Gradient of mean BCE

The gradient of the mean BCE loss w.r.t. w has a clean closed form (here X denotes the feature matrix x and p the vector of predicted probabilities p_i):

$$\nabla_w \mathcal{L} = \frac{1}{N} X^\top (p - y)$$

This is the residual (p - y) back-projected through the feature matrix: one line of code.
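The loss itself is not written out above. Assuming the standard definition of mean BCE,

$$\mathcal{L}(w) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$$

the closed form follows from $\sigma'(z) = \sigma(z)(1 - \sigma(z))$: the derivative of the loss with respect to the logit $z_i = x_i^\top w$ collapses to $(p_i - y_i)/N$, and the chain rule through $z = Xw$ gives $\frac{1}{N} X^\top (p - y)$. The sketch below checks this numerically against central finite differences; `bce_loss` and `bce_grad` are hypothetical helper names, and the data is random.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, x, y):
    """Mean binary cross-entropy of the linear-sigmoid model (assumed definition)."""
    p = sigmoid(x @ w)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def bce_grad(w, x, y):
    """Closed-form gradient: (1/N) * x^T (p - y)."""
    p = sigmoid(x @ w)
    return x.T @ (p - y) / x.shape[0]

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
y = (rng.random(20) < 0.5).astype(float)
w = rng.normal(size=3)

# Central finite differences, one coordinate of w at a time.
eps = 1e-6
numeric = np.zeros_like(w)
for j in range(w.size):
    step = np.zeros_like(w)
    step[j] = eps
    numeric[j] = (bce_loss(w + step, x, y) - bce_loss(w - step, x, y)) / (2 * eps)

max_abs_diff = np.max(np.abs(numeric - bce_grad(w, x, y)))
```

If the closed form is right, `max_abs_diff` should be tiny compared with the gradient entries themselves.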

Algorithm

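w = w0                               # start from the supplied weights
N = x.shape[0]                       # number of samples
for epoch in range(n_epochs):
    z    = x @ w                     # logits, shape (N,)
    p    = sigmoid(z)                # predicted probabilities, shape (N,)
    grad = (1/N) * x.T @ (p - y)     # gradient of mean BCE, shape (d,)
    w    = w - lr * grad             # gradient-descent step
return w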

Note: w0 is passed explicitly as an argument so tests are deterministic; there is no random initialisation inside the function.
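The loop above can be turned into a runnable function along the following lines. This is a sketch under assumptions, not the reference solution: the function name `train_binary_classifier` is made up for illustration, NumPy is assumed, and the sigmoid is written inline in its plain form.

```python
import numpy as np

def train_binary_classifier(x, y, w0, lr, n_epochs):
    """Full-batch gradient descent on mean BCE; returns weights of shape (d,)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.array(w0, dtype=float)        # copy, so the caller's w0 is not modified
    n = x.shape[0]
    for _ in range(n_epochs):
        z = x @ w                        # logits, shape (N,)
        p = 1.0 / (1.0 + np.exp(-z))     # predicted probabilities
        grad = x.T @ (p - y) / n         # (1/N) * x^T (p - y)
        w = w - lr * grad                # gradient-descent step
    return w

# Toy data: the label is 1 when the first feature is positive;
# the second column is a constant 1 acting as a bias feature.
x = np.array([[-2.0, 1.0], [-1.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w0 = np.zeros(2)

w = train_binary_classifier(x, y, w0, lr=0.5, n_epochs=200)
pred = (1.0 / (1.0 + np.exp(-(x @ w))) >= 0.5).astype(float)
```

On this separable toy set the thresholded predictions match the labels. With n_epochs=0 the loop body never runs, and with lr=0 every update subtracts zero, so in both cases the function returns a copy of w0.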

Inputs:

  • x: shape (N, d), feature matrix.
  • y: shape (N,), binary labels in {0.0, 1.0}.
  • w0: shape (d,), initial weights (deterministic).
  • lr: float, learning rate.
  • n_epochs: int, number of full-batch gradient steps.

Output: final weight vector w of shape (d,).
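The shape contract above can be checked up front before training. The problem statement does not ask for validation, so the helper below (`check_inputs`, a hypothetical name) is only a sketch of how the documented shapes and label values could be asserted, assuming NumPy arrays.

```python
import numpy as np

def check_inputs(x, y, w0):
    """Raise ValueError if the arrays do not match the documented shapes."""
    x, y, w0 = np.asarray(x), np.asarray(y), np.asarray(w0)
    if x.ndim != 2:
        raise ValueError(f"x must have shape (N, d), got {x.shape}")
    n, d = x.shape
    if y.shape != (n,):
        raise ValueError(f"y must have shape ({n},), got {y.shape}")
    if w0.shape != (d,):
        raise ValueError(f"w0 must have shape ({d},), got {w0.shape}")
    if not np.all((y == 0) | (y == 1)):
        raise ValueError("y must contain only 0.0 and 1.0")
    return n, d

n, d = check_inputs(np.zeros((5, 3)), np.array([0.0, 1.0, 1.0, 0.0, 1.0]), np.zeros(3))
```

Since the returned weight vector has the same shape as w0, checking w0 against d also fixes the output shape (d,).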

Edge cases: lr=0 or n_epochs=0 → return w0 unchanged.

Tags: classification, training, logistic
