Bidirectional RNN
Implement a bidirectional vanilla RNN that runs two passes over a sequence, one forward and one backward, and concatenates their hidden states at every timestep.
Why bidirectional?
A standard (unidirectional) RNN can only see past context: at time
step t the hidden state summarises x_0, ..., x_t. A bidirectional
RNN fixes this by also running a second RNN from t = T-1 down to
t = 0. Concatenating the two hidden states gives each output token
access to both past and future context, which matters for tasks like
named-entity recognition, part-of-speech tagging, and the encoder side
of classic seq2seq models.
Cell rule (vanilla RNN)
$$h_t = \tanh\!\left([x_t;\, h_{t-1}]\, W\right)$$
where $[x_t; h_{t-1}]$ is the concatenation of $x_t \in \mathbb{R}^{d_{in}}$ and $h_{t-1} \in \mathbb{R}^{d_h}$, giving a vector of length $d_{in} + d_h$, and $W \in \mathbb{R}^{(d_{in}+d_h) \times d_h}$.
The same rule applies for both directions, each with its own weight
matrix (w_fwd and w_bwd).
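As a rough illustration of the cell rule, here is a minimal NumPy sketch of a single step. The function name `rnn_cell` and the random test values are assumptions made for this example, not part of the problem statement. It also checks that the concatenated form matches the more familiar two-matrix form, where the first $d_{in}$ rows of $W$ act on $x_t$ and the remaining $d_h$ rows act on $h_{t-1}$.

```python
import numpy as np


def rnn_cell(x_t, h_prev, w):
    """One vanilla RNN step: h_t = tanh([x_t; h_prev] @ w).

    x_t:    (N, d_in)
    h_prev: (N, d_h)
    w:      (d_in + d_h, d_h)
    """
    xh = np.concatenate([x_t, h_prev], axis=-1)  # (N, d_in + d_h)
    return np.tanh(xh @ w)                       # (N, d_h)


# Illustrative sizes and random values (assumptions for this sketch).
rng = np.random.default_rng(0)
N, d_in, d_h = 2, 3, 4
x_t = rng.normal(size=(N, d_in))
h_prev = rng.normal(size=(N, d_h))
w = rng.normal(size=(d_in + d_h, d_h))

h_t = rnn_cell(x_t, h_prev, w)

# Equivalent split form: the top d_in rows multiply x_t, the rest multiply h_prev.
h_split = np.tanh(x_t @ w[:d_in] + h_prev @ w[d_in:])
```

The same function would serve both directions; only the weight matrix passed in (w_fwd or w_bwd) differs.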
Algorithm
Forward pass: starting from h_fwd = h0_fwd, iterate $t = 0, 1, \ldots, T-1$,
updating h_fwd with the cell rule and w_fwd. Store each h_fwd_t.
Backward pass: starting from h_bwd = h0_bwd, iterate $t = T-1, T-2, \ldots, 0$,
updating h_bwd with the cell rule and w_bwd. Store each h_bwd_t.
Output: for each timestep $t$, concatenate
[h_fwd_t, h_bwd_t] along the last dimension. Return a tensor of
shape (N, T, 2 * d_h).
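The three steps above can be sketched in NumPy as follows. This is one possible implementation under stated assumptions, not the reference solution: the function name `bidirectional_rnn` is hypothetical, the argument names follow the problem's input list, and the output is written into a preallocated float64 array instead of stacking per-step lists.

```python
import numpy as np


def bidirectional_rnn(x, w_fwd, w_bwd, h0_fwd, h0_bwd):
    """Bidirectional vanilla RNN without bias.

    x:      (N, T, d_in)
    w_fwd:  (d_in + d_h, d_h)
    w_bwd:  (d_in + d_h, d_h)
    h0_fwd: (N, d_h)
    h0_bwd: (N, d_h)
    returns (N, T, 2 * d_h)
    """
    N, T, _ = x.shape
    d_h = h0_fwd.shape[1]
    out = np.zeros((N, T, 2 * d_h))

    # Forward pass: t = 0, 1, ..., T-1. Fills the first d_h channels.
    h = h0_fwd
    for t in range(T):
        h = np.tanh(np.concatenate([x[:, t], h], axis=-1) @ w_fwd)
        out[:, t, :d_h] = h

    # Backward pass: t = T-1, ..., 0. Fills the last d_h channels.
    # The result is stored at index t, so both halves stay time-aligned.
    h = h0_bwd
    for t in range(T - 1, -1, -1):
        h = np.tanh(np.concatenate([x[:, t], h], axis=-1) @ w_bwd)
        out[:, t, d_h:] = h

    return out


# Small usage example with random inputs (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
N, T, d_in, d_h = 2, 5, 3, 4
x = rng.normal(size=(N, T, d_in))
w_fwd = rng.normal(size=(d_in + d_h, d_h))
w_bwd = rng.normal(size=(d_in + d_h, d_h))
h0_fwd = np.zeros((N, d_h))
h0_bwd = np.zeros((N, d_h))

out = bidirectional_rnn(x, w_fwd, w_bwd, h0_fwd, h0_bwd)
```

Note the asymmetry at the sequence ends: the forward half at t = 0 has seen only x_0, while the backward half at t = 0 has seen the whole sequence, and the reverse holds at t = T-1.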
Inputs
- x: shape (N, T, d_in), batch of sequences.
- w_fwd: shape (d_in + d_h, d_h), forward direction weights.
- w_bwd: shape (d_in + d_h, d_h), backward direction weights.
- h0_fwd: shape (N, d_h), initial hidden state for forward pass.
- h0_bwd: shape (N, d_h), initial hidden state for backward pass.
Output: shape (N, T, 2 * d_h).
No bias (simplifies the interface; a bias can always be baked into the weight matrix by appending a constant feature).
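To make the bias remark concrete, here is a small NumPy sketch showing one way to fold a bias into the weight matrix. The bias vector `b` and all sizes are hypothetical values chosen for illustration. A constant 1 is appended to $x_t$, and `b` is inserted as the matching extra row of the weight matrix, so the augmented bias-free cell reproduces the biased one.

```python
import numpy as np

# Illustrative sizes and random values (assumptions for this sketch).
rng = np.random.default_rng(0)
N, d_in, d_h = 2, 3, 4
x_t = rng.normal(size=(N, d_in))
h_prev = rng.normal(size=(N, d_h))
w = rng.normal(size=(d_in + d_h, d_h))
b = rng.normal(size=(d_h,))  # hypothetical bias, not part of the problem

# Cell with an explicit bias term.
h_bias = np.tanh(np.concatenate([x_t, h_prev], axis=-1) @ w + b)

# Same result without a bias term: append a constant feature to x_t ...
x_aug = np.concatenate([x_t, np.ones((N, 1))], axis=-1)  # (N, d_in + 1)
# ... and place b in the row that multiplies that constant feature.
w_aug = np.concatenate([w[:d_in], b[None, :], w[d_in:]], axis=0)  # (d_in + 1 + d_h, d_h)
h_aug = np.tanh(np.concatenate([x_aug, h_prev], axis=-1) @ w_aug)
```

With this trick the input dimension becomes d_in + 1, and the rest of the interface stays unchanged.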