easy primitives

Ridge Regression

Implement ridge regression (L2-regularized least squares) in closed form.

Ridge regression minimises the regularised objective:

$$\hat{w} = \arg\min_w \|y - Xw\|^2 + \lambda \|w\|^2$$

The closed-form solution adds a scaled identity to $X^\top X$ before inverting:

$$\hat{w} = (X^\top X + \lambda I)^{-1} X^\top y$$
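As a short derivation sketch, the closed form follows from setting the gradient of the objective to zero. The label $J(w)$ for the objective is introduced here for convenience and does not appear in the original statement:

$$J(w) = \|y - Xw\|^2 + \lambda \|w\|^2$$

$$\nabla_w J(w) = -2 X^\top (y - Xw) + 2 \lambda w = 0$$

$$(X^\top X + \lambda I)\, w = X^\top y$$

For $\lambda > 0$ the objective is strictly convex, so this stationary point is the unique minimiser $\hat{w}$.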

Why ridge?

  • Reduces overfitting by penalising large weights (shrinking them toward zero).
  • Fixes the singular or near-singular matrix problem when $N < d$ (underdetermined) or when columns of $X$ are highly correlated: for any $\lambda > 0$, the $\lambda I$ term makes $X^\top X + \lambda I$ positive-definite, so the system is uniquely solvable.
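The second point can be checked numerically. The sketch below assumes PyTorch, a random toy matrix with $N = 3$ and $d = 5$, and an arbitrary $\lambda = 0.1$; none of these values come from the problem statement.

```python
import torch

torch.manual_seed(0)

# Underdetermined toy problem: N = 3 samples, d = 5 features (N < d).
N, d = 3, 5
x = torch.randn(N, d, dtype=torch.float64)
lam = 0.1  # illustrative value, not prescribed by the text

gram = x.T @ x                                      # (d, d), rank at most N
A = gram + lam * torch.eye(d, dtype=torch.float64)  # regularised system matrix

# The Gram matrix is rank-deficient when N < d, so it has no inverse.
rank_gram = torch.linalg.matrix_rank(gram).item()

# Adding lam * I shifts every eigenvalue up by lam, so all are at least lam.
min_eig_A = torch.linalg.eigvalsh(A).min().item()

print(f"rank of X^T X: {rank_gram} (d = {d})")
print(f"smallest eigenvalue of X^T X + lam*I: {min_eig_A:.6f}")
```

Because $X^\top X$ is positive semi-definite, its eigenvalues are non-negative, and the shift by $\lambda$ keeps the smallest eigenvalue of the regularised matrix at or above $\lambda$.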

Algorithm:

  1. Let $d$ = number of features (x.shape[1]).
  2. Build the $d \times d$ matrix $A = X^\top X + \lambda I$.
  3. Build the $d$-dim vector $b = X^\top y$.
  4. Solve $A w = b$ for $w$ using torch.linalg.solve (do not explicitly invert $A$; solve is more numerically stable and faster).
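The four steps above can be sketched in PyTorch as follows. This is one possible implementation, not a reference solution; the function name `ridge_regression` and the synthetic data in the usage example are assumptions made for illustration.

```python
import torch


def ridge_regression(x: torch.Tensor, y: torch.Tensor, lam: float) -> torch.Tensor:
    """Closed-form ridge weights for x of shape (N, d) and y of shape (N,)."""
    d = x.shape[1]                                      # step 1: number of features
    eye = torch.eye(d, dtype=x.dtype, device=x.device)
    A = x.T @ x + lam * eye                             # step 2: (d, d) system matrix
    b = x.T @ y                                         # step 3: (d,) right-hand side
    return torch.linalg.solve(A, b)                     # step 4: solve A w = b


# Small usage example with synthetic, noise-free data (values are illustrative).
torch.manual_seed(0)
x = torch.randn(50, 3, dtype=torch.float64)
w_true = torch.tensor([1.0, -2.0, 0.5], dtype=torch.float64)
y = x @ w_true
w = ridge_regression(x, y, lam=1e-3)
print(w)
```

With a small `lam` and noise-free targets, the recovered weights should land close to `w_true`, shrunk very slightly toward zero.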

Numerical note: use torch.linalg.solve(A, b) / jnp.linalg.solve(A, b), not torch.linalg.inv(A) @ b. Both are mathematically equivalent but solve avoids forming the explicit inverse.
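A quick way to see the equivalence is to compute the weights both ways on a small system. The sketch below uses random data and an arbitrary `lam`, both chosen only for illustration.

```python
import torch

torch.manual_seed(0)

# Synthetic system; the shapes and lam are illustrative assumptions.
x = torch.randn(20, 4, dtype=torch.float64)
y = torch.randn(20, dtype=torch.float64)
lam = 0.5

A = x.T @ x + lam * torch.eye(4, dtype=torch.float64)
b = x.T @ y

w_solve = torch.linalg.solve(A, b)  # solves A w = b without forming an inverse
w_inv = torch.linalg.inv(A) @ b     # forms the full (d, d) inverse first

# On a well-conditioned system the two results agree to floating-point tolerance.
print(torch.allclose(w_solve, w_inv))
```

The gap between the two tends to matter more as `A` becomes ill-conditioned, for example when `lam` is very small and the columns of `x` are nearly collinear.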

Inputs:

  • x: feature matrix of shape (N, d)
  • y: target vector of shape (N,)
  • lam: float, regularisation strength ($\lambda$)

Output: weight vector w of shape (d,).

