Eval Loop with Metrics
Run a binary-classifier evaluation loop over a list of batches and return both mean BCE loss and accuracy.
Why eval is different from training
During training you call loss.backward() and update parameters.
During evaluation you skip all of that: no gradients, no updates. You
just forward-pass each batch, accumulate metrics, and report aggregates.
In PyTorch you would wrap the whole loop in torch.no_grad() for
efficiency; here the weights are already frozen so the result is the same.
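As a small illustration of that point, the sketch below runs the same forward pass with and without torch.no_grad(). The tensors w and x are hypothetical placeholders, and w is deliberately marked as trainable so the difference is visible; with frozen weights, as in this problem, nothing is tracked either way.

```python
import torch

# Hypothetical trainable weights and one batch of inputs.
w = torch.ones(3, requires_grad=True)
x = torch.ones(4, 3)

# Ordinary forward pass: autograd records the operation.
z_tracked = x @ w

# Evaluation-style forward pass: no graph is recorded.
with torch.no_grad():
    z_eval = x @ w

print(z_tracked.requires_grad)  # True
print(z_eval.requires_grad)     # False
```

The values of z_tracked and z_eval are identical; only the bookkeeping differs.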
Critical: average over all examples, not over batches
If you average the per-batch averages you get the wrong answer whenever
batches are unequal in size. The correct approach is to keep running sums
(total_loss, total_correct, total_examples) and divide once at the
end.
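To see the difference concretely, here is a minimal sketch using made-up per-example losses for two batches of unequal size (the numbers are illustrative only). Averaging the batch means gives about 0.70, while the running-sum approach gives the true per-example mean of about 0.55.

```python
# Hypothetical per-example losses: a batch of 3 and a batch of 1.
batches = [[0.2, 0.4, 0.6], [1.0]]

# Wrong: the single example in the second batch gets half the weight.
batch_means = [sum(b) / len(b) for b in batches]
mean_of_batch_means = sum(batch_means) / len(batch_means)

# Right: keep running sums and divide once at the end.
total_loss = 0.0
total_examples = 0
for b in batches:
    total_loss += sum(b)
    total_examples += len(b)
mean_over_examples = total_loss / total_examples

print(mean_of_batch_means, mean_over_examples)
```

The same reasoning applies to accuracy: sum the correct predictions across batches, then divide by the total example count.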
Numerical stability for BCE with logits
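The naive form log(sigmoid(z)) breaks down when |z| is large: sigmoid(z)
saturates to exactly 0 or 1 in floating point, so the log of it (or of
1 - sigmoid(z)) returns -inf. Use the stable form instead: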
loss_per_example = max(z, 0) - z*y + log1p(exp(-|z|))
where z = x @ weights. This computes the same value as PyTorch's
binary_cross_entropy_with_logits and stays finite for large positive or
negative logits.
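The following scalar sketch, using only Python's math module, compares the stable form with a naive sigmoid-then-log version. The function names naive_bce and stable_bce are hypothetical, chosen for this illustration. For moderate logits the two agree; for a logit of 1000 the naive version fails while the stable one returns a finite loss.

```python
import math

def naive_bce(z, y):
    # Sigmoid followed by log: fails once sigmoid saturates.
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def stable_bce(z, y):
    # max(z, 0) - z*y + log1p(exp(-|z|)); exp never sees a positive argument.
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# Moderate logit: both forms give the same loss.
agree = abs(naive_bce(2.0, 1.0) - stable_bce(2.0, 1.0)) < 1e-9

# Extreme logit: the naive form raises, the stable form does not.
try:
    naive_bce(1000.0, 0.0)
    naive_failed = False
except (ValueError, OverflowError):
    naive_failed = True

big_loss = stable_bce(1000.0, 0.0)
print(agree, naive_failed, big_loss)
```

In plain Python the naive failure surfaces as an exception; with tensor libraries the same situation typically shows up as inf or nan instead.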
Output format
Return a 1-D tensor of shape (2,) containing [mean_bce_loss, accuracy].
Inputs
- weights: shape (d,), frozen linear-classifier weights.
- xs: shape (num_batches, B, d), stacked batch inputs.
- ys: shape (num_batches, B), stacked binary labels in {0.0, 1.0}.
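Putting the pieces together, one possible sketch of the loop is shown below. It is not the site's reference solution. The function name eval_loop is hypothetical, and the decision rule (predict 1 when the logit is strictly positive, that is, sigmoid(z) > 0.5) is an assumption, since the statement does not specify a threshold or how to treat z = 0.

```python
import torch

def eval_loop(weights, xs, ys):
    # Running sums over all examples, divided once at the end.
    total_loss = torch.zeros((), dtype=weights.dtype)
    total_correct = torch.zeros((), dtype=weights.dtype)
    total_examples = 0

    with torch.no_grad():
        for x, y in zip(xs, ys):          # x: (B, d), y: (B,)
            z = x @ weights               # logits, shape (B,)
            # Stable BCE with logits, per example.
            loss = torch.clamp(z, min=0) - z * y + torch.log1p(torch.exp(-z.abs()))
            total_loss = total_loss + loss.sum()
            # Assumed decision rule: logit > 0 means class 1.
            preds = (z > 0).to(y.dtype)
            total_correct = total_correct + (preds == y).to(weights.dtype).sum()
            total_examples += y.numel()

    # 1-D tensor of shape (2,): [mean_bce_loss, accuracy].
    return torch.stack([total_loss, total_correct]) / total_examples

# Small illustrative inputs: 2 batches, B = 2, d = 2.
weights = torch.tensor([1.0, -1.0])
xs = torch.tensor([[[2.0, 0.0], [0.0, 2.0]],
                   [[1.0, 1.0], [3.0, 0.0]]])
ys = torch.tensor([[1.0, 0.0],
                   [1.0, 1.0]])
result = eval_loop(weights, xs, ys)
print(result.shape)
```

With these inputs the logits are [2, -2] and [0, 3], so three of the four predictions match their labels under the assumed rule.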