Implement a simple SGD training loop for linear regression.
Given initial weight w and bias b, perform n_steps of full-batch gradient descent on the mean squared error (MSE) loss.
For each step:
1. Compute predictions: y_pred = x @ w + b.
2. Compute the loss: mean((y_pred - y)^2).
3. Compute the gradients of the loss with respect to w and b.
4. Update the parameters: w -= lr * grad_w, b -= lr * grad_b.
Input:
- x: shape (N, 1)
- y: shape (N, 1)
- w_init: shape (1, 1)
- b_init: shape (1,)
- lr: learning rate (float)
- n_steps: number of steps (int)
Output: A dict with final "w" (shape (1, 1)), "b" (shape (1,)), and "final_loss" (scalar).
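A minimal NumPy sketch of the loop described above. The function name `train_linear_regression` is an assumption (the spec does not name the function); it follows the input/output contract given: full-batch gradient descent on the MSE loss, returning the final parameters and loss in a dict.

```python
import numpy as np

def train_linear_regression(x, y, w_init, b_init, lr, n_steps):
    """Full-batch gradient descent for linear regression (name assumed).

    x: (N, 1), y: (N, 1), w_init: (1, 1), b_init: (1,).
    Returns {"w": (1, 1), "b": (1,), "final_loss": float}.
    """
    w = w_init.copy()
    b = b_init.copy()
    N = x.shape[0]
    for _ in range(n_steps):
        y_pred = x @ w + b                     # (N, 1) predictions
        err = y_pred - y                       # (N, 1) residuals
        grad_w = (2.0 / N) * (x.T @ err)       # (1, 1) d(MSE)/dw
        grad_b = (2.0 / N) * err.sum(axis=0)   # (1,)   d(MSE)/db
        w = w - lr * grad_w
        b = b - lr * grad_b
    final_loss = float(np.mean((x @ w + b - y) ** 2))
    return {"w": w, "b": b, "final_loss": final_loss}
```

With a small enough learning rate, the loop converges to the least-squares fit; for example, data generated from y = 2x + 1 should recover w ≈ 2 and b ≈ 1.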