easy end_to_end

Logistic Regression From Scratch

Implement logistic regression trained by gradient descent from scratch.

Logistic regression is one of the simplest linear classification models. It outputs the probability of the positive class by applying a sigmoid to a linear combination of the features:

$$p_i = \sigma(x_i^\top w) = \frac{1}{1 + e^{-x_i^\top w}}$$
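
As a rough illustration, the sigmoid can be written with NumPy as follows. The function name `sigmoid` and the overflow-avoiding formulation are assumptions of this sketch, not part of the problem statement; the direct expression `1 / (1 + np.exp(-z))` gives the same values for moderate inputs.

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid that avoids overflow for large |z| (illustrative helper)."""
    z = np.asarray(z, dtype=float)
    e = np.exp(-np.abs(z))  # always in (0, 1], so exp never overflows
    # For z >= 0 use 1 / (1 + e^{-z}); for z < 0 use e^{z} / (1 + e^{z}).
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))
```

With a feature matrix `X` and weights `w`, the probabilities would then be `sigmoid(X @ w)`.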

The model is trained by minimising the binary cross-entropy (BCE) loss:

$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \left[y_i \log p_i + (1 - y_i) \log(1 - p_i)\right]$$
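
A minimal sketch of this loss in NumPy is shown below. The name `bce_loss` and the clipping constant `eps` are assumptions added here to keep `log` finite when a probability is exactly 0 or 1; the problem itself does not prescribe them.

```python
import numpy as np

def bce_loss(p, y, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and labels y (illustrative helper)."""
    # Clip so that log(0) never occurs; eps is an assumption of this sketch.
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

When every prediction is 0.5, the loss equals $\log 2$ regardless of the labels, which is the value at the zero-weight starting point.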

The gradient of this loss with respect to $w$ has a clean closed form:

$$\nabla_w \mathcal{L} = \frac{1}{N} X^\top (p - y)$$
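
For readers who want to see where this form comes from, here is a short derivation sketch. It relies on the standard identity $\sigma'(z) = \sigma(z)(1 - \sigma(z))$, which is not stated in the text above. First, the derivative of each probability:

$$\frac{\partial p_i}{\partial w} = p_i (1 - p_i)\, x_i$$

Applying the chain rule to a single term of the loss:

$$\frac{\partial}{\partial w}\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right] = y_i (1 - p_i)\, x_i - (1 - y_i)\, p_i\, x_i = (y_i - p_i)\, x_i$$

Averaging over the samples and including the leading minus sign:

$$\nabla_w \mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} (y_i - p_i)\, x_i = \frac{1}{N}\sum_{i=1}^{N} (p_i - y_i)\, x_i = \frac{1}{N} X^\top (p - y)$$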

Each gradient descent step is then:

$$w \leftarrow w - \eta \cdot \nabla_w \mathcal{L}$$

where $\eta$ is the learning rate (lr).
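
To make the update concrete, here is one step worked through on a tiny made-up dataset (the values of `X`, `y`, and `lr` are chosen purely for illustration). Starting from $w = \mathbf{0}$, every probability is 0.5.

```python
import numpy as np

# Toy data, invented for this example: two samples, two features.
X = np.array([[1.0, 2.0],
              [1.0, -1.0]])
y = np.array([1.0, 0.0])
w = np.zeros(2)
lr = 0.1

p = 1.0 / (1.0 + np.exp(-(X @ w)))  # [0.5, 0.5] because w is all zeros
grad = X.T @ (p - y) / len(y)       # X^T [-0.5, 0.5] / 2 = [0.0, -0.75]
w_new = w - lr * grad               # [0.0, 0.075]
```

The second weight moves in the positive direction because the positive example has the larger value of the second feature.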

No bias term is fit separately. If you want a bias, augment x by prepending a column of ones; the first weight then acts as the intercept.
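
One possible way to do the augmentation with NumPy is sketched here. The helper name `add_bias_column` is hypothetical and is not part of the required interface.

```python
import numpy as np

def add_bias_column(x):
    """Return a copy of x with a leading column of ones (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    ones = np.ones((x.shape[0], 1))
    return np.hstack([ones, x])  # shape (N, d + 1)
```

After this step the weight vector has length d + 1, and its first entry plays the role of the intercept.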

Algorithm:

  1. Initialise $w = \mathbf{0}_d$
  2. For n_steps iterations:
    • $p = \sigma(Xw)$
    • $\text{grad} = X^\top (p - y) / N$
    • $w \leftarrow w - \text{lr} \cdot \text{grad}$
  3. Return the final $w$
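
The algorithm above can be sketched in NumPy as follows. This is one possible implementation under stated assumptions, not a reference solution: the function name `logistic_regression` is hypothetical, and the overflow-avoiding sigmoid is a choice of this sketch rather than a requirement of the problem.

```python
import numpy as np

def logistic_regression(x, y, lr, n_steps):
    """Fit weights by full-batch gradient descent on the BCE loss (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = x.shape
    w = np.zeros(d)  # step 1: start from the zero vector
    for _ in range(n_steps):
        z = x @ w
        # Sigmoid written so that exp never receives a large positive argument.
        e = np.exp(-np.abs(z))
        p = np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))
        grad = x.T @ (p - y) / n  # gradient of the mean BCE loss
        w = w - lr * grad         # gradient descent update
    return w
```

Calling it with `n_steps=0` returns the zero vector, since no update is applied.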

Inputs:

  • x: feature matrix of shape (N, d); augment with a column of 1s yourself for a bias term
  • y: binary labels of shape (N,), with values in $\{0, 1\}$
  • lr: float, the learning rate (step size)
  • n_steps: int, the number of gradient steps

Output: weight vector w of shape (d,).

