Last week, we looked at doing classification using text, training a model to understand positive and negative sentiment in movie reviews.

We finished by looking at the effect of tokenizing words,

and saw that our classifier failed

to get any meaningful results.

The main reason for this was that the context of words was hard to follow when the words were broken down into sub-words, and the sequence in which the tokens for the sub-words appear became very important in understanding their meaning.

Let's take a look at that now.

A neural network is like a function: when you feed it data and labels, it infers the rules from them, and then you can use those rules.

So it could be seen as a function a little bit like this,

you take the data and you take the labels,

and you get the rules.

But this doesn't take any kind of sequence into account.

To understand why sequences can be important,

consider this set of numbers.

If you've never seen them before,

they're called the Fibonacci sequence.

So let's replace the actual values

with variables such as n_0,

n_1 and n_2, etc., to denote them.

Then the sequence itself can be derived

where a number is the sum of the two numbers before it.

So 3 is 2 plus 1,

5 is 2 plus 3,

8 is 3 plus 5, etc.

So our n_x = n_(x-1) + n_(x-2), where x is the position in the sequence.

Visualized, it might also look like this,

one and two feed into

the first function and three comes out.

Two gets carried over to the next,

where it's fed in along with the three to give us a five.

The three is carried on to the next where it's fed into

the function along with the five to

get an eight and so on.
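That carry-forward step can be sketched in a few lines of code. This is just an illustrative sketch, not anything from the course; the function name and the choice to start the sequence at 1 and 2 (as in the walkthrough above) are assumptions.

```python
def fibonacci(count):
    """Return the first `count` values of the sequence, starting from 1, 2."""
    values = [1, 2]
    while len(values) < count:
        # Each new value is the sum of the two before it:
        # n_x = n_(x-1) + n_(x-2)
        values.append(values[-1] + values[-2])
    return values[:count]

print(fibonacci(7))  # → [1, 2, 3, 5, 8, 13, 21]
```

Note how each step only needs the previous two values carried forward, which is exactly the "carried over" behavior in the diagram.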

This is similar to the basic idea of

a recurrent neural network or RNN,

which is often drawn a little like this.

You have your x as an input and your y as an output.

But there's also an element that's fed

into the function from a previous function.

That becomes a little more clear when

you chain them together like this,

x_0 is fed into the function, returning y_0. An output from that function is then fed into the next function along with x_1 to get y_1, whose output is in turn fed into the next function along with x_2 to get y_2, producing an output and continuing the sequence.

As you can see,

there's an element of x_0

fed all the way through the network,

similar with x_1 and x_2 etc.

This forms the basis of

the recurrent neural network or RNN.
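To make that recurrence concrete, here's a minimal sketch of the chained cells in plain Python with NumPy. This is not a trained network and not code from the course; the tanh activation, the random weights, and the vector sizes are all illustrative assumptions.

```python
import numpy as np

def rnn_step(x, state, Wx, Wh, b):
    # One cell: combine the current input x with the state carried
    # over from the previous step (tanh is a common activation here).
    return np.tanh(x @ Wx + state @ Wh + b)

rng = np.random.default_rng(0)
Wx = rng.normal(size=(3, 4))   # input-to-hidden weights (assumed sizes)
Wh = rng.normal(size=(4, 4))   # hidden-to-hidden weights
b = np.zeros(4)

state = np.zeros(4)            # initial state, before x_0
outputs = []
for x in rng.normal(size=(5, 3)):      # a sequence x_0 .. x_4
    state = rnn_step(x, state, Wx, Wh, b)  # output also becomes next state
    outputs.append(state)

print(len(outputs), outputs[0].shape)
```

Because each step's output is fed back in as the next step's state, an element of x_0 influences every later output, which is the chaining shown in the diagram.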

I'm not going to go into detail on how they work, but you can learn much more about them in this course from Andrew.