A Deep Recurrent Neural Network (RNN) stacks multiple recurrent hidden layers on top of a basic single-layer RNN: each layer keeps its own hidden state and passes it to the layer above as input.
How does a Deep RNN work?
Walkthrough
[1] Given
↳ A sequence of four inputs X1, X2, X3, X4 ⬛️
↳ Weights and biases for hidden layers a 🟩, b 🟪, c 🟧, and the output layer y 🟦.
[2] Initialize Hidden States
↳ Set a0, b0, c0 to zeros
— Process X1 (t = 1)—
[3] First Hidden Layer (a) 🟩: a0 → a1
↳ The transformation matrix is the horizontal concatenation of the input weights, hidden-state weights, and biases, visualized as [⬛️ | 🟩 | ⬜️].
↳ The state matrix is the vertical concatenation of the input X1, the previous hidden state a0, and an extra 1, visualized as [⬛️ ; 🟩 ; 1].
↳ Multiply the two matrices to obtain the new hidden state a1 = [0 ; 1].
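The concatenation in this step can be checked numerically. The sketch below uses plain Python and made-up numbers: `W_x`, `W_h`, `bias`, and `x1` are illustrative assumptions, not the values from the figure, so the result will not match a1 above. It shows that multiplying [input weights | hidden weights | bias] by [x ; h ; 1] gives the same result as adding the three terms separately. The walkthrough does not mention an activation function, so none is applied.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by column vector v (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical values for illustration only (not taken from the figure).
W_x = [[1, 0], [0, 1]]     # input weights, 2x2
W_h = [[1, -1], [0, 1]]    # hidden-state (recurrent) weights, 2x2
bias = [0, 1]              # one bias per hidden unit
x1 = [2, 1]                # input at t = 1
a0 = [0, 0]                # initial hidden state

# Horizontal concatenation: each row becomes [input weights | hidden weights | bias].
T = [wx + wh + [b] for wx, wh, b in zip(W_x, W_h, bias)]
# Vertical concatenation: [x ; h ; 1].
s = x1 + a0 + [1]

# One matrix-vector product yields the new hidden state.
a1 = matvec(T, s)

# The same update written as three separate terms: W_x x + W_h h + bias.
a1_separate = [p + q + b
               for p, q, b in zip(matvec(W_x, x1), matvec(W_h, a0), bias)]

print(a1, a1_separate)
```

The extra 1 at the bottom of the state matrix is what lets the bias ride along as one more column of the transformation matrix.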
[4] Second Hidden Layer (b) 🟪: b0 → b1
↳ The first layer's new state a1 🟩 becomes the input.
↳ The transformation matrix is visualized as [🟩 | 🟪 | ⬜️].
↳ The state matrix is the combination of a1, b0, and 1, visualized as [🟩; 🟪 ; 1].
↳ Multiply the two matrices to obtain new hidden state b1 = [1; -1].
[5] Third Hidden Layer (c) 🟧: c0 → c1
↳ The second layer's new state b1 🟪 becomes the input.
↳ The transformation matrix is visualized as [🟪 | 🟧 | ⬜️].
↳ The state matrix is the combination of b1, c0, and 1, visualized as [🟪; 🟧; 1].
↳ Multiply the two matrices to obtain the new hidden state c1.
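Steps [3] to [5] apply the same update three times, each layer reading the state just produced by the layer below. A minimal sketch in plain Python, assuming two units per layer and made-up transformation matrices (`T_a`, `T_b`, `T_c`, and `x1` are hypothetical, so the numbers will not match the figure):

```python
def step(T, inp, h_prev):
    """One layer update: T times the stacked column [inp ; h_prev ; 1]."""
    s = inp + h_prev + [1]
    return [sum(t * v for t, v in zip(row, s)) for row in T]

# Hypothetical matrices, each laid out as [input weights | recurrent weights | bias].
T_a = [[1, 0, 1, -1, 0], [0, 1, 0, 1, 1]]
T_b = [[1, -1, 1, 0, 0], [0, 1, 0, 1, -1]]
T_c = [[1, 1, 1, 0, 0], [1, 0, 0, 1, 1]]

x1 = [2, 1]                            # hypothetical input at t = 1
a0, b0, c0 = [0, 0], [0, 0], [0, 0]    # hidden states initialized to zeros

a1 = step(T_a, x1, a0)   # layer a reads the raw input X1
b1 = step(T_b, a1, b0)   # layer b reads a1
c1 = step(T_c, b1, c0)   # layer c reads b1

print(a1, b1, c1)
```

Each layer pairs the new state from below (time t) with its own previous state (time t - 1), which is why the order a, then b, then c matters within a time step.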
[6] Output Layer (Y) 🟦
↳ The transformation matrix is visualized as [🟧 | ⬜️].
↳ The state matrix is the combination of c1 and 1, visualized as [🟧; 1].
↳ Multiply the two matrices to obtain output Y1 = [3; 0; 3].
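The output layer has no recurrent part, so its state matrix only stacks the top hidden state and a 1. A small sketch in plain Python, with a hypothetical `T_y` (three outputs, two hidden units plus a bias column) and a hypothetical top-layer state `c1`; neither is taken from the figure:

```python
def output_layer(T_y, c):
    """Output update: T_y times the stacked column [c ; 1]."""
    s = c + [1]
    return [sum(t * v for t, v in zip(row, s)) for row in T_y]

# Hypothetical matrix laid out as [output weights | bias]: 3 rows, 2 + 1 columns.
T_y = [[1, 1, 0],
       [0, 1, -1],
       [2, 0, 1]]
c1 = [1, 1]   # hypothetical top-layer hidden state

y1 = output_layer(T_y, c1)
print(y1)
```

The number of rows in `T_y` sets the output size, which is how a two-unit hidden state can produce a three-element output like the ones in this walkthrough.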
— Process X2 (t = 2)—
[7] Previous Hidden States
↳ Copy the values of a1, b1, c1.
[8] Hidden 🟩🟪🟧 + Output 🟦
↳ Repeat [3]-[6] to obtain output Y2 = [5; 0; 4]
— Process X3 (t = 3)—
[9] Previous Hidden States
↳ Copy the values of a2, b2, c2.
[10] Hidden 🟩🟪🟧 + Output 🟦
↳ Repeat [3]-[6] to obtain output Y3 = [13; -1; 9]
— Process X4 (t = 4)—
[11] Previous Hidden States
↳ Copy the values of a3, b3, c3.
[12] Hidden 🟩🟪🟧 + Output 🟦
↳ Repeat [3]-[6] to obtain output Y4 = [15; 7; 2]
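The whole walkthrough reduces to two nested loops: an outer loop over time steps and an inner loop up the stack of layers, with the hidden states carried from one time step to the next. A minimal sketch in plain Python; the function name `deep_rnn`, the matrices, and the input sequence are all hypothetical, so the outputs differ from Y1 to Y4 above, and no activation is applied because the walkthrough does not mention one:

```python
def affine(T, parts):
    """Multiply T by the vertical concatenation of `parts` plus a trailing 1."""
    s = [v for part in parts for v in part] + [1]
    return [sum(t * v for t, v in zip(row, s)) for row in T]

def deep_rnn(xs, layers, T_y, hidden_size):
    """Run a stacked RNN over the sequence xs and return one output per input."""
    states = [[0] * hidden_size for _ in layers]   # a0, b0, c0 set to zeros
    outputs = []
    for x in xs:                                   # one time step per input
        inp = x
        for i, T in enumerate(layers):             # climb the stack: a, b, c
            states[i] = affine(T, [inp, states[i]])
            inp = states[i]                        # new state feeds the next layer
        outputs.append(affine(T_y, [inp]))         # output layer reads the top state
    return outputs

# Hypothetical matrices, each laid out as [input | recurrent | bias].
layers = [
    [[1, 0, 1, -1, 0], [0, 1, 0, 1, 1]],    # layer a
    [[1, -1, 1, 0, 0], [0, 1, 0, 1, -1]],   # layer b
    [[1, 1, 1, 0, 0], [1, 0, 0, 1, 1]],     # layer c
]
T_y = [[1, 1, 0], [0, 1, -1], [2, 0, 1]]    # output layer, laid out as [weights | bias]
xs = [[2, 1], [1, 0], [0, 1], [1, 1]]       # hypothetical X1..X4

ys = deep_rnn(xs, layers, T_y, hidden_size=2)
for t, y in enumerate(ys, start=1):
    print(f"Y{t} = {y}")
```

Overwriting `states[i]` in place plays the role of the "copy the values" steps: by the time the next input arrives, each slot already holds the previous time step's state.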