So, that's try number two now, okay? The ideas that we get to normalize by the
strata of importance. It turns out that by this point, we're
sort of close to what Google does. In the next of three modules, we will have
to quantify this intuition and then, provide two more enhancements about what
are getting there. The idea is that, look at all the
importance scores, add them up across the links pointing to you, but make sure you
normalize by the outgoing number of links. Now, you may wonder, if I have a node
here, number two with important square pi two, I can write pi two as some
combination of the important scores normalized from the incoming neighbors.
And each of these, say, this involves pi one, can also be written as the summation
of normalized important scores from its incoming neighbors.
So, the question is, on the left-hand side, I will have pi one, pi two, and all
the pis. On the right-hand side, I also have all
these pis, divided by their outgoing number of neighbors.
So, I've got pis both on the left-hand side and the right-hand side of this, this
linear equation system. Can they actually make sense collectively?
Is there is a set of consistent scores pi? Is there a vector pi that, can satisfy all
these linear equations simultaneously? If so, then we have found a particular
meaningful and self-consistent way to assign importance scores.
If not, then we're in trouble. So, the question now is how can we modify
this basic intuition in our try number two now, in order to guarantee both the
existence of such a set of consistent scores and an efficient way to compute
these scores? But before we go into the general space,
let's look at a simple, very small example.
There are only four nodes and six directed links.
Nodes one and two points to three, three points to four, and then four points to
back to one, two, and three, okay? Now, what is the intuition of the
importance score vector pi, which should be a vector of length four here?
If we assume that such a pi consistent set of scores actually exist.
Here's one particular view. We can think of nodes three and four as
collectively acting like a supernode, okay?
Let's call this supernode three+4. And here's node one.
Here's node two. One points to supernode three + five, so
does two, and a supernode points back, okay?
So now, intuitively, you look at the supernode and say, this supernode got two
links pointing inwards. But each of these nodes, one and two, only
has one length pointing inward. So, intuitively, you'll think that nodes
one and two should each, has a smaller important score than the supernode three +
four. And furthermore, by symmetry of this graph
here, nodes one and two should have the same importance score.
And then, the supernode three + four. Need to distribute its score in some way
between nodes three and four. So, at least we know intuitively that pi
one should equal to pi two, and each of them should be smaller than pi3 + four,
the supernode score, okay? So, intuitively that's what we should.
Anticipate. Now, we can write down the math a little
further and calculate in this small example very easily the result.
Now, first of all you'll also notice that, in this node of three pointing to node
four, it says, that basically all the important score of pi three that we get
goes into node four. And node four has only got this one
incoming link. So, pi four should equal to pi three, or
pi three divided by one, cause there's only one outgoing link from node three.
And plus what? There's no other terms.
There's only one link pointing inwards to node four.
So, we can just say pi four equals pi three, and pi one equals pi two.
Now, once we know that, the job is very simple now.
Let's, say, pi one equals pi two, this number is x.
Pi three equals pi four, this number is y, okay?
Now, we know that there is one link from node four pointing to node one, right?