0:30

Let's begin by asking: what can a linear recurrent network do?

Here's the equation for a linear recurrent network. If there are N output neurons, then the output vector v is going to be N by 1, and the feedforward input to these output neurons is given by W times u, which is again an N by 1 vector. We can call this N by 1 vector h, so we don't have to write W times u each time.
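To make the setup concrete, here is a minimal sketch of these quantities in NumPy. The sizes and values are illustrative assumptions, not the ones on the slide.

```python
import numpy as np

N = 5   # number of output neurons (illustrative)
K = 3   # number of input neurons (illustrative)

u = np.ones(K)            # input vector, K by 1
W = np.full((N, K), 0.2)  # feedforward weight matrix, N by K
M = np.zeros((N, N))      # recurrent connection matrix, N by N

h = W @ u                 # effective feedforward input, N by 1
v = np.zeros(N)           # output vector, N by 1

assert h.shape == (N,) and v.shape == (N,)
```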

And the feedback to the output neurons is given by M times v, where M is the recurrent connection matrix. What we want to find out is how the output of the network, v(t), behaves for different values of the recurrent connection matrix M. This is where eigenvectors come to our rescue.

Here is the differential equation that we are trying to solve to understand how v(t) behaves. And this equation, as you can see, contains a mix of vectors and this matrix times a vector. So that's a pretty complicated equation to solve. Fortunately, we can use eigenvectors to solve this particular differential equation.

How do we do that? Well, suppose the connection matrix, the N by N recurrent connection matrix M, is symmetric. What does that mean? It means that for any particular pair of output neurons, say neuron number one and neuron number two, the fact that the recurrent connection matrix is symmetric just means that if one connects to two with some particular value or strength A, then two connects to one also with the same value A. In other words, M(1,2) is equal to M(2,1), which is equal to the value A. And that's what it means for this matrix M to be symmetric.

Now why is it useful to have the connection matrix M symmetric? Well, it turns out that if M is symmetric, then M has N orthogonal eigenvectors, and the corresponding eigenvalues satisfy the standard eigenvector-eigenvalue equation shown here. Now what does it mean for these eigenvectors to be orthogonal? Well, if you take any two of these eigenvectors ei and ej, as long as i is not equal to j, the fact that they're orthogonal just means that the dot product of these two eigenvectors is going to be, you guessed it, 0.
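A quick numerical check of this fact, as a sketch with a randomly chosen symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
M = (A + A.T) / 2           # symmetrize: M[i, j] == M[j, i]

lam, E = np.linalg.eigh(M)  # eigenvalues and eigenvectors (columns of E)

# Any two distinct eigenvectors are orthogonal: e_i . e_j == 0 for i != j.
# eigh also normalizes them, so e_i . e_i == 1 (orthonormality, discussed next).
assert np.allclose(E.T @ E, np.eye(4))
```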

Now we can further make these eigenvectors orthonormal. Orthonormal means that these eigenvectors are not only orthogonal but also have a length of 1. We can do that by dividing each of these eigenvectors by its length, and then we have the fact that ei dot ei is going to equal 1. And if that's satisfied, then we say that we have a set of eigenvectors which are orthonormal to each other.

Why is it useful to have these eigenvectors which are orthonormal to each other? Well, it turns out that we can now write any N-dimensional vector, including our output vector v(t), as simply a linear combination of our orthonormal eigenvectors. So these eigenvectors now form a new basis, or a new coordinate system, for expressing N-dimensional vectors such as v(t).

To drive home the point, let's look at the special case of a three-dimensional space. So here's x, y, and z, and let's suppose that this is our vector v(t). All we're doing now is expressing this vector v(t) in a new coordinate system given by our orthonormal eigenvectors e1, e2, and e3. In the x, y, z system we were writing v(t) as simply the linear combination of the first component of v times (1,0,0), which was our vector for x, plus v2 times (0,1,0), which is our y component, and finally, for the z component, v3 times (0,0,1). So all we are doing now is, instead of expressing v(t) in the coordinate system given by the x, y, and z vectors, we are writing v as a different linear combination: c1 times e1, plus c2 times e2, plus c3 times e3.
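In code, finding each coefficient ci is just a dot product with the corresponding eigenvector, because the basis is orthonormal. A sketch with an arbitrary symmetric 3 by 3 matrix:

```python
import numpy as np

M = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])   # an arbitrary symmetric matrix
_, E = np.linalg.eigh(M)          # columns e1, e2, e3 are orthonormal

v = np.array([3.0, -1.0, 2.0])    # some vector expressed in x, y, z coordinates

c = E.T @ v                       # c_i = v . e_i  (coordinates in the eigenbasis)
v_rebuilt = E @ c                 # c1*e1 + c2*e2 + c3*e3

assert np.allclose(v_rebuilt, v)  # same vector, new coordinate system
```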

5:14

Now why go through all this trouble? Well, it turns out that if you substitute the equation for v(t) in terms of the ei's into the differential equation for v, and then further use the eigenvector equation as well as the orthonormality of the ei's, then we can solve for ci as a function of time. And so here is the equation for ci as a function of time. Once you have a closed-form expression for ci as a function of time, you can substitute that value for ci into our equation for v. And therefore we have solved the differential equation, and we now have a complete expression that characterizes how v changes as a function of time. If you want to get into all the mathematical detail of how we derived this expression for ci(t), I would encourage you to go to the supplementary materials on the course website.
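For reference, assuming the usual firing-rate dynamics tau dv/dt = -v + h + Mv, the standard derivation gives a closed form of this shape (notation on the slide may differ slightly):

```latex
\tau \frac{d\mathbf{v}}{dt} = -\mathbf{v} + \mathbf{h} + M\mathbf{v},
\qquad
\mathbf{v}(t) = \sum_{i=1}^{N} c_i(t)\, \mathbf{e}_i
\;\Longrightarrow\;
c_i(t) = \frac{\mathbf{h} \cdot \mathbf{e}_i}{1-\lambda_i}
         \left(1 - e^{-t(1-\lambda_i)/\tau}\right)
       + c_i(0)\, e^{-t(1-\lambda_i)/\tau}
```

Each mode relaxes with an effective time constant tau/(1 - lambda i), which is why the sign of 1 - lambda i is about to matter for stability.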

We can now show that the eigenvalues of the recurrent connection matrix determine whether the network is stable or not. To see this, suppose one of the lambda i is bigger than 1. What happens to the output of the network, given by v(t), which is a linear combination of the eigenvectors weighted by these coefficients ci? Well, if one of the lambda i's is bigger than 1, let's say that this lambda i here is equal to 2, then this term ends up being a growing exponential function of time. As time goes on, this term becomes larger and larger, and therefore ci(t) also becomes larger and larger. The output of the network then grows without any bound, which means that v(t) explodes, and what you end up getting is an unstable network.

On the other hand, if all the eigenvalues are less than 1, then you should be able to convince yourself, by plugging values of lambda i less than 1 into our equation for ci(t), that the network is stable, because v(t) is going to converge to some steady-state value. That steady state is given simply by the linear combination of all of these coefficients, which have now converged to this particular value, multiplied by each of the corresponding eigenvectors.

Now we can answer the question that we posed earlier in the lecture: what can a recurrent network do? One thing that a linear recurrent network can do is amplify its inputs. To see this, suppose that all the eigenvalues lambda i are less than 1. We showed in the previous slide that the output of the network in the steady state is going to look like this. If one of these eigenvalues, let's say lambda 1, is very close to 1, and all the other eigenvalues are much, much smaller, then the lambda 1 term is going to dominate the sum. So the steady-state output of the network is going to be basically the projection of the input onto the first eigenvector, divided by 1 minus lambda 1, multiplied by e1. What we have, then, is a network that is amplifying its input projection. For example, if lambda 1 is equal to 0.9, which is close to 1, then 1 over 1 minus lambda 1 is going to be 10, and so we have an amplification factor of 10 for this projection of the input onto e1.

Now let's look at an example of a linear recurrent network. Let's assume that each of these output neurons codes for some angle between minus 180 degrees and plus 180 degrees. So instead of labeling these neurons 1, 2, 3, 4, and 5, we can label them according to angles. For example, this could be minus 180 degrees, this neuron could be minus 90, this neuron could be labeled with 0, this with plus 90, and this with 180. Now, why are we labeling neurons with angles? It's because we can now define the connection matrix M as, for example, a cosine function of the relative angle labeling the neurons. In other words, M(theta, theta prime) could be proportional to cosine of theta minus theta prime.

What does this type of connectivity look like? Well, it results in neurons exciting other neurons that are nearby, and inhibiting other neurons that are further away. Here's a graphical depiction of the cosine-based connectivity function: for neurons that are close to any given neuron, you have excitation, and for neurons that are further away, you have inhibition.

Now let's ask the question: is M, defined by such a connectivity function, symmetric? In other words, is M(theta, theta prime) equal to M(theta prime, theta)? Well, that's the same as asking whether cosine of x is equal to cosine of minus x, which we know is true. So yes, the connectivity matrix is indeed symmetric. This type of connectivity function is interesting because there's some evidence that such connectivity is also found in the cerebral cortex. Neurons in the cerebral cortex tend to excite other neurons that are near them, and inhibit neurons that are further away.
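Here's a small sketch of such a cosine connectivity matrix. The number of neurons and the overall scale are illustrative assumptions; with this scale the nonzero eigenvalues come out at 0.9 (one for the cosine direction and one for the sine direction) and all the others are 0.

```python
import numpy as np

N = 100
theta = np.linspace(-np.pi, np.pi, N, endpoint=False)  # each neuron's preferred angle

# M(theta, theta') proportional to cos(theta - theta')
M = 1.8 / N * np.cos(theta[:, None] - theta[None, :])

# Symmetric, since cos(x) == cos(-x)
assert np.allclose(M, M.T)

# Excitation for nearby neurons, inhibition for distant ones
assert M[0, 1] > 0       # neighboring angle: excitatory
assert M[0, N // 2] < 0  # opposite side of the circle: inhibitory

# Largest eigenvalue is 0.9 with this scaling
assert np.isclose(np.linalg.eigvalsh(M).max(), 0.9)
```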

10:44

Now suppose we choose the connectivity matrix of a linear recurrent network to be proportional to the cosine function, scaled such that all the eigenvalues are 0 except an eigenvalue equal to 0.9. Then, as we showed earlier, we would expect to see an amplification of the input by a factor of 10. So let's see if that really happens when we simulate such a network.
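A sketch of that simulation, using Euler integration of the assumed dynamics tau dv/dt = -v + h + Mv. The time constant, input, and noise level are illustrative assumptions:

```python
import numpy as np

N = 100
theta = np.linspace(-np.pi, np.pi, N, endpoint=False)
M = 1.8 / N * np.cos(theta[:, None] - theta[None, :])  # largest eigenvalue 0.9

rng = np.random.default_rng(1)
h = 2.0 * np.cos(theta) + 0.5 * rng.normal(size=N)     # noisy input with a peak at 0

tau, dt = 10.0, 0.5                                    # ms (illustrative)
v = np.zeros(N)
for _ in range(2000):                                  # 1000 ms, enough to settle
    v = v + dt / tau * (-v + h + M @ v)

e1 = np.cos(theta) / np.linalg.norm(np.cos(theta))     # eigenvector with lambda = 0.9

# The projection of the input onto e1 is amplified by 1 / (1 - 0.9) = 10
gain = (v @ e1) / (h @ e1)
assert abs(gain - 10.0) < 0.1
```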

And, not surprisingly, the answer is yes. When we present the network with a noisy input, we do get an output that is an amplified version of the input, where the peak of the noisy input has been amplified and the smaller peaks have been suppressed.

So what else can a linear recurrent network do? Well, recall our earlier remark that if all the eigenvalues are less than 1, then the network is stable. Now suppose one of these eigenvalues, let's say lambda 1, is exactly equal to 1. In that case, we can show that we have a different kind of equation for how the coefficient c1 evolves. It's given by this differential equation.
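With lambda 1 exactly 1, the leak term -c1 is exactly canceled by the feedback term +lambda 1 times c1, so (assuming dynamics tau dv/dt = -v + h + Mv) the mode equation reduces to tau dc1/dt = h dot e1: a pure integrator. A minimal sketch with illustrative numbers:

```python
tau, dt = 20.0, 1.0              # ms (illustrative)
c1 = 0.0

def h_proj(t):
    # Projection of the input onto e1: off, on at 5 for 200 ms, off again
    return 5.0 if 100 <= t < 300 else 0.0

trace = []
for step in range(600):
    t = step * dt
    c1 += dt / tau * h_proj(t)   # tau * dc1/dt = h . e1
    trace.append(c1)

# After the input turns off, c1 holds the integral of the past input:
# (1 / tau) * 5 * 200 ms = 50, and it stays there -- a memory.
assert abs(trace[-1] - 50.0) < 1e-9
assert abs(trace[-1] - trace[400]) < 1e-9
```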

And here's something interesting that happens. Suppose that the input was initially 0, then it was turned on, and then it was turned off. So we have the input h, which was initially 0, then was turned on to some value, and then turned off again. Here's what happens: even after the input has been turned off, so even after h is equal to 0, the network maintains an output. The network now maintains a memory of the integral of the past inputs, as given by this integral shown here.

Interestingly, there's evidence for integrator neurons in the brain. In particular, in the medial vestibular nucleus, there are neurons that maintain a memory of eye position. When the input to these neurons comes in the form of bursts, so here's one burst of spikes that changes the eye position, and here's another burst of spikes from a different neuron that decreases the eye position, we note that the integrator neuron maintains persistent activity, a memory of the eye position, by changing its firing rate. This is very similar to what we had in the previous slide, where we had the neuron maintaining a memory of the integral of past inputs. So what this goes to show, once again, is that the brain can do calculus. In this case, we've shown that it can do integration, and we already showed that it can do differentiation in the previous lecture. So once again, sorry Newton and Leibniz, looks like the brain has beaten you to the punch.

13:36

Let's conclude our tour of recurrent networks by looking at nonlinear recurrent networks. We can make the network nonlinear by applying a nonlinear function F to the sum of the input and the recurrent feedback. Perhaps the simplest kind of nonlinearity is the rectification nonlinearity, which takes any input x and sets it equal to x if x is greater than 0, and sets it equal to 0 otherwise. This nonlinearity is quite useful because, if you recall, the vector v represents the firing rates of neurons, and the rectification nonlinearity makes sure that the firing rates never go below 0.

So what can nonlinear recurrent networks do? They can perform amplification, similar to linear recurrent networks. Here is the input to the nonlinear network, which is a noisy input with a peak near 0, and here is the output of the nonlinear network. You can see how the network has amplified the input, but it has also cleaned up the input and suppressed some of the other peaks in the input.
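A sketch of the rectified network, using Euler integration of the assumed dynamics tau dv/dt = -v + F(h + Mv) with F(x) = max(x, 0). The connection scale, time constant, and input are illustrative assumptions; note that the recurrent gain here is deliberately bigger than 1, which is discussed next:

```python
import numpy as np

N = 100
theta = np.linspace(-np.pi, np.pi, N, endpoint=False)
M = 3.8 / N * np.cos(theta[:, None] - theta[None, :])    # largest eigenvalue 1.9

rng = np.random.default_rng(2)
h = 4.0 * np.cos(theta) + 0.5 * rng.normal(size=N)       # noisy input, peak near 0

tau, dt = 10.0, 0.5                                      # ms (illustrative)
v = np.zeros(N)
for _ in range(4000):                                    # 2000 ms
    v = v + dt / tau * (-v + np.maximum(h + M @ v, 0.0)) # rectification F

assert np.all(v >= 0)                 # firing rates never go below 0
assert np.all(np.isfinite(v))         # rectification keeps the network bounded
assert np.max(v) > np.max(h)          # the main peak has been amplified...
assert abs(theta[np.argmax(v)]) < np.pi / 4  # ...and it stays near 0
```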

14:44

Now, the interesting thing here is that although the recurrent connections were again the cosine-type recurrent connections, with excitation nearby and inhibition further away, the eigenvalues in this case were all 0 except one that was actually bigger than 1: lambda 1 was 1.9. In the linear recurrent network case, this would have led to an unstable network. But since we have the rectification nonlinearity, it saves the day, and the network is in fact stable and gives us this kind of amplification.

Now here's something else that the nonlinear recurrent network can do: it can perform selective attention. That is, it can select one part of the input and suppress the other part. Here's an input that contains two peaks. If you look at the output of the nonlinear network, it has essentially focused only on the peak at minus 90 degrees and has suppressed the other peak. So the network is performing a type of winner-take-all input selection.

Some might say that the network is implementing the capitalist credo of "the rich get richer, and the poor get poorer." And some people might even say that the moral of the story here is that you have to be nonlinear to be a capitalist. But I think we digress.

The same nonlinear network can also perform something called gain modulation. What does that mean? Well, suppose the inputs look like this, where you're adding a constant amount to a particular input, which basically means you're shifting the input additively from one level to the other. The effect on the output is multiplicative. So the change in the input multiplies the output, and you get this type of modulation, also called gain modulation, of the output firing rate of the neuron. This is interesting because this type of gain modulation of neural responses has also been observed in the brain, specifically in area 7a of the parietal cortex.

Finally, the same nonlinear network also maintains a memory of past inputs, just like the linear recurrent network that we considered a while ago. Here is the input to the nonlinear network: it's basically a bump centered around 0. That's the local input, along with some background input, which is about 0. The output of the network, as you might expect, is just an amplified version of the input, with the background suppressed. What happens to this output when we turn off the local input? Here's what we get: when the local input is turned off, you still have an output in this network, and the output has a peak at 0, which is exactly where the peak of the local input was. So this memory of the input is being maintained in this network by recurrent activity. What we have here, then, is a network that maintains a memory of past activity after the input has been turned off. This is quite similar to the short-term memory, or working memory, of past inputs that is maintained by neurons in the prefrontal cortex in the brain.

We have so far been looking at networks with symmetric recurrent connections. What about non-symmetric recurrent networks? Well, the simplest form of non-symmetric recurrent network would be a network of excitatory and inhibitory neurons. For example, if you had one excitatory neuron and one inhibitory neuron, you could have the excitatory neuron exciting the inhibitory neuron, and the inhibitory neuron then inhibiting the excitatory neuron. And perhaps there is also a connection from each neuron onto itself; these are called autapses. So this one will again be excitatory, and this one will be inhibitory. You can see why the connections cannot be symmetric: you cannot have the excitatory connection be positive and the inhibitory connection also be positive. It has to be a negative, or inhibitory, connection.

19:01

Here are the differential equations for our two neurons. Here is the differential equation for the firing rate of the excitatory neuron, and here is the differential equation for the firing rate of the inhibitory neuron. And these are all the different parameters: the excitatory connection from the neuron onto itself, the connection from the inhibitory neuron onto the excitatory neuron, and so on. You also see that we've added these parameters for thresholds that we apply, and then that in turn is passed through a nonlinearity, which is the rectification nonlinearity.
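In a standard textbook form (the symbols on the slide may differ slightly), these two equations can be written as:

```latex
\tau_E \frac{dv_E}{dt} = -v_E + \left[\, M_{EE}\, v_E + M_{EI}\, v_I - \gamma_E \,\right]_+ \\[4pt]
\tau_I \frac{dv_I}{dt} = -v_I + \left[\, M_{IE}\, v_E + M_{II}\, v_I - \gamma_I \,\right]_+
```

Here [x]_+ = max(x, 0) is the rectification, M_EI is negative because the inhibitory-to-excitatory connection is inhibitory, and gamma_E, gamma_I are the thresholds.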

And just to make things concrete, let's assign some values. So these are some values for each of the parameters, for the connections and the thresholds. Finally, we will leave one particular parameter, which we're calling tau I, the time constant for the inhibitory neuron, unassigned. We will vary this parameter to study the behavior of this nonlinear and non-symmetric recurrent network.
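Here's a sketch of such a simulation. The parameter values below are illustrative assumptions in the spirit of the lecture (they follow the classic textbook example, and are not necessarily the values on the slide). With tau I at 30 ms, the rates oscillate but settle into a fixed point:

```python
# Illustrative parameters (assumed, textbook-style)
M_EE, M_EI, M_IE, M_II = 1.25, -1.0, 1.0, 0.0
g_E, g_I = -10.0, 10.0     # thresholds gamma_E, gamma_I (Hz)
tau_E, tau_I = 10.0, 30.0  # ms
dt = 0.1                   # ms

vE, vI = 0.0, 0.0
for _ in range(20000):     # 2000 ms of Euler integration
    dvE = (-vE + max(M_EE * vE + M_EI * vI - g_E, 0.0)) / tau_E
    dvI = (-vI + max(M_IE * vE + M_II * vI - g_I, 0.0)) / tau_I
    vE, vI = vE + dt * dvE, vI + dt * dvI

# With these assumed values the fixed point is vE = 80/3, vI = 50/3 (Hz),
# reached through damped oscillations
assert abs(vE - 80 / 3) < 0.5
assert abs(vI - 50 / 3) < 0.5
```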

So how do we analyze the dynamics of such a nonlinear and non-symmetric network? Well, hold on to your eigenhats, because we're going to need to use eigenvectors and eigenvalues again. To understand the dynamic behavior of this network, we can perform linear stability analysis. What does that mean? It means we can determine how stable the network is near a fixed point. A fixed point is obtained by looking for values of vE and vI that make dvE/dt and dvI/dt go to 0. When both of these are 0, we have values for vE and vI which are fixed, and which do not change as a function of time, and that gives you a fixed point of this network.

So how do we perform linear stability analysis? Well, we take the derivatives of the right-hand sides of both of these equations with respect to vE and vI. What we get is a matrix, which is called the stability matrix, or, if you want to be cool, you can call it the Jacobian matrix. Since the Jacobian matrix is not symmetric, the eigenvalues of the matrix can have both real and imaginary parts. So the eigenvalues can be complex, and these real and imaginary parts of the eigenvalues in turn determine the dynamics of the nonlinear network near a fixed point. They determine whether the network is stable or not.
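Using the same illustrative parameter values as before (assumed, not necessarily the slide's), the Jacobian in the region where both rectifications are active, and its eigenvalues, can be sketched as:

```python
import numpy as np

# Illustrative parameters (assumed, textbook-style)
M_EE, M_EI, M_IE, M_II = 1.25, -1.0, 1.0, 0.0
tau_E = 10.0  # ms

def jacobian(tau_I):
    # Derivatives of the right-hand sides with respect to vE and vI,
    # valid near the fixed point where both rectifications are active
    return np.array([[(M_EE - 1) / tau_E, M_EI / tau_E],
                     [M_IE / tau_I, (M_II - 1) / tau_I]])

# tau_I = 30 ms: complex eigenvalues with negative real parts, so damped
# oscillations into a stable fixed point
lam30 = np.linalg.eigvals(jacobian(30.0))
assert np.all(lam30.real < 0) and np.any(lam30.imag != 0)

# tau_I = 50 ms: positive real parts, so the fixed point is unstable
# (the rectification then bounds the trajectory into a limit cycle)
lam50 = np.linalg.eigvals(jacobian(50.0))
assert np.all(lam50.real > 0)
```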

Now, we've assigned values for all of the parameters except tau I. So what we can do is choose different values for tau I, which will in turn give different eigenvalues for J, and then we can look at the effect of the different eigenvalues of J on the stability and behavior of this nonlinear network.

First, let's look at what happens when we set tau I equal to 30 milliseconds. This makes the real parts of the two eigenvalues of the stability matrix negative. And as we show in the supplementary materials for this lecture on the course website, the real parts being negative causes the network to be stable near the fixed point. Here's a pictorial depiction of what happens when we set tau I equal to 30 milliseconds. The x axis is vE and the y axis is vI. If we start out at some particular location, that is, some particular value for vE and vI, then the network essentially converges to the fixed point, which is the point at which dvE/dt equals 0 and dvI/dt equals 0. So both vE and vI are not changing at this location in this particular plot. If we look at what's happening as a function of time, you can see that both vE and vI oscillate. The oscillations are damped, and eventually the oscillations die out and the network converges to a specific value for vE and a specific value for vI; that is the stable fixed point of the network. This stable fixed point is also called a point attractor in the terminology of dynamical systems.

Now look at what happens when you choose tau I to be 50 milliseconds. That makes the real parts of the eigenvalues of the stability matrix positive. And as we show in the supplementary materials for this lecture, when the real parts of the eigenvalues turn out to be positive, the network is unstable. So if you start out in this plot of vE and vI at some location near the fixed point, so here is the fixed point, and if you start out here with some value for vE and vI, then the network moves away from the fixed point; the network is unstable and diverges away from the fixed point. But luckily, the rectification nonlinearity comes to the rescue. How is that? Well, as the value of vE tends to go negative, the rectification nonlinearity stops it from going negative and puts it back on track. And so we have the network looping around on this limit cycle.

24:28

Here's another way to look at this limit cycle. If you plot vE and vI as a function of time, you'll observe that initially the vE and vI values start to increase. But then, once you hit the rectification nonlinearity, you have a stable oscillation. So both vE and vI oscillate in a stable manner, and that corresponds to going around on this limit cycle.

So let's summarize what we saw in the previous slide and in this slide. When you change the parameter tau I from 30 to 50 milliseconds, the nonlinear network made a transition from having a stable fixed point to becoming unstable and settling into a limit cycle. In dynamical systems theory, such a transition is known as a Hopf bifurcation.

Well, I think it's time now for our own Hopf bifurcation. That wraps up our journey into the land of networks. Next week, we learn about how the brain learns, by changing the connections between neurons in its networks. Until then, adios and goodbye.
