Implement a simple linear regression model with MSE loss.
The model predicts: $\hat{y} = x \cdot w + b$
The MSE loss is: $L = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
Input:
x: input tensor of shape (N, 1)
w: weight scalar of shape (1, 1)
b: bias scalar of shape (1,)
y: target values of shape (N, 1)
Output: A dict with "prediction" (shape (N, 1)) and "loss" (scalar).
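
One possible way to satisfy this specification is sketched below. It assumes NumPy arrays stand in for the tensors, and the function name `linear_regression_forward` is hypothetical, since the problem statement names neither a framework nor a function signature. The prediction is computed as a matrix product so that the (N, 1) input and (1, 1) weight yield an (N, 1) result, and the (1,) bias is added by broadcasting.

```python
import numpy as np


def linear_regression_forward(x, w, b, y):
    """Forward pass of a one-feature linear model with MSE loss.

    Assumed shapes (from the problem statement):
      x: (N, 1), w: (1, 1), b: (1,), y: (N, 1)
    The function name and the use of NumPy are illustrative assumptions.
    """
    x = np.asarray(x, dtype=np.float64)
    w = np.asarray(w, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    # (N, 1) @ (1, 1) -> (N, 1); the (1,) bias broadcasts across rows.
    prediction = x @ w + b

    # Mean of squared residuals over all N samples.
    loss = float(np.mean((y - prediction) ** 2))

    return {"prediction": prediction, "loss": loss}


# Small usage example with made-up values.
x = np.array([[1.0], [2.0], [3.0]])
w = np.array([[2.0]])
b = np.array([0.5])
y = np.array([[2.0], [5.0], [7.0]])

out = linear_regression_forward(x, w, b, y)
print(out["prediction"].shape)  # (3, 1)
print(out["loss"])              # 0.25
```

In this example the predictions are 2.5, 4.5, and 6.5, so each residual has magnitude 0.5 and the mean squared error is 0.25. Using `x @ w` rather than `x * w` gives the same values here, but the matrix product makes the intended (N, 1) output shape explicit.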