So just like in the Hopcroft-Ullman analysis, we're going to think about the total amount of work done by our union-find data structure as comprising two pieces: work done visiting good nodes and work done visiting bad nodes.
The previous quiz says that we have a bound on visits to good nodes, even on a per-operation basis: at most alpha(n) good nodes are visited in every single operation, where alpha is the inverse Ackermann function. What remains is a global bound on the number of visits to bad nodes. The argument will be to show that, over an arbitrary sequence of m find and union operations, the total number of visits to bad nodes is bounded above by O(n times alpha(n)).
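In symbols, the plan amounts to the following accounting (just a restatement of the two bounds described above, writing m for the number of operations and n for the number of objects):

$$\text{total work} \;=\; \underbrace{\sum_{\text{operations}} \#\{\text{good visits}\}}_{\le\, m \cdot O(\alpha(n))} \;+\; \underbrace{\#\{\text{bad visits, in total}\}}_{O(n \cdot \alpha(n))} \;=\; O\big((m+n)\,\alpha(n)\big).$$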
So here is the crux of the argument; here is why, when you make a visit to a bad node, the subsequent path compression massively increases the gap between that object's rank and the rank of its parent. Let's freeze the data structure at the moment when a find operation makes a visit to a bad object; call that bad object x. Let's think about what the data structure must look like in this scenario.
So we've got our bad object x. It may or may not have nodes pointing to it; we're not going to care. By virtue of it being bad and meeting the third criterion, we know the rank of x is at least 2. Its delta value could be anything; whatever it is, let's call it k.
So because x is not a root, it has some parent, call that parent p.
And not only does x have ancestors, but because it meets the fourth criterion, we
actually know it has an ancestor y who has the same delta value as x.
That is, it has an ancestor y with delta of y equal to k.
It is possible that p and y are the same object, or they could be different; it's not going to matter in the analysis. Recalling that the statistic delta is only defined for non-roots, we can conclude that y is also not a root. It must then have some parent, call it p prime. P prime might be a root or it might not; we don't care.
So now let's understand the effect of the path compression that is going to happen subsequent to this find operation.
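As a reminder of what path compression (and union by rank) does operationally, here is a minimal sketch in Python. It follows the standard scheme; the names parent, rank, find, and union are illustrative rather than taken from the lecture's own pseudocode.

def find(parent, x):
    # First pass: walk up parent pointers to the root of x's tree.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass (path compression): rewire every object on the path
    # to point directly at the root.
    while parent[x] != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root

def union(parent, rank, x, y):
    # Union by rank: the root of smaller rank becomes a child of the
    # root of larger rank; a rank only increases on a tie.
    rx, ry = find(parent, x), find(parent, y)
    if rx == ry:
        return
    if rank[rx] < rank[ry]:
        rx, ry = ry, rx
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1

Here parent and rank would be dictionaries or arrays indexed by object, initialized with parent[x] = x and rank[x] = 0 for every object x.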
X's parent pointer is going to be rewired to point to the root of this tree. The root of this tree is either P prime itself, or some node even higher up than that. Given that fact, let's understand how much bigger the rank of x's new parent (P prime or higher) is compared to the rank of its old parent p.
So the rank of x's new parent is at least the rank of P prime.
If p' is in fact the root, then the rank of x's new parent is exactly the rank of p'. Otherwise, the new parent is even higher up than p', and because ranks only increase going up a tree, its rank can only be bigger than the rank of p'.
How does the rank of P prime compare to that of its child y? Here is a key point: delta of y is equal to k. Remember what the definition of delta is: it quantifies the gap between the rank of an object and that of its parent. Applying that here to y and its parent p', it means the rank of p' is at least the function A_k applied to y's rank.
Our third and final inequality is justified in the same way as the first one. It could be the case that y is actually the same as p, x's old parent; in that case the rank of x's old parent is precisely the rank of y. Otherwise, y is even higher up in the tree than x's old parent p, and therefore, by the monotonicity of ranks, the rank of y is even bigger than the rank of x's old parent.
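Putting the three inequalities together, in the notation of this lecture (this is just a compact restatement of the argument above):

$$\mathrm{rank}(x\text{'s new parent}) \;\ge\; \mathrm{rank}(p') \;\ge\; A_k(\mathrm{rank}(y)) \;\ge\; A_k(\mathrm{rank}(p)) \;=\; A_k(\mathrm{rank}(x\text{'s old parent})).$$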
So now in this chain of inequalities, I want you to focus on the left-most
quantity, and the right-most quantity. What does this say, at the end of the
day? It says when you apply path compression to X, it acquires a new
parent. And the rank of that new parent is at least the function A_k applied to the rank of its old parent.
Again, path compression at the very least applies the function A_k to the rank of x's parent. So now suppose you visit some bad object x not just once, but over and over and over again while it's bad. This argument says that every such visit to x while it's bad applies the function A_k to the rank of its
parent. So in particular, let's use r to denote
the rank of this bad object x and, again, by virtue of it being bad, r has to be at
least 2. Imagine we do a sequence of r visits to this object x while it's bad. Each of those visits results in x acquiring a new parent, and that new parent's rank is much bigger than the rank of the previous parent: it's at least A_k applied to the rank of the previous parent. So after r visits, we have applied the function A_k r times to the rank of x's original parent, whose rank was at least that of x, namely at least r.
Well, we have another name for applying the function A_k r times to r: by the definition of the Ackermann function, that is just A_{k+1} applied to r. So what does this mean? It means that after r visits to a bad object x of rank r, the rank of x's parent has grown so much that the growth is reflected in our statistic delta, which measures the gap in rank between an object and its parent.
So remember how we defined delta of x: it's the largest value of k such that the rank of x's parent is at least A_k applied to the rank of x.
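Writing $\delta(x)$ for that statistic, and $A_k^{(r)}$ for the $r$-fold application of $A_k$ (so that, by definition, $A_{k+1}(r) = A_k^{(r)}(r)$), the conclusion of the last few paragraphs reads:

$$\mathrm{rank}(x\text{'s current parent}) \;\ge\; A_k^{(r)}\big(\mathrm{rank}(x\text{'s original parent})\big) \;\ge\; A_k^{(r)}(r) \;=\; A_{k+1}(r) \;=\; A_{k+1}(\mathrm{rank}(x)).$$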
So what this inequality shows is that every r parent pointer updates to x allow you to take k to be bigger than before: you can bump k up by 1 and still have the defining inequality satisfied. That is, with every r visits to a bad object x of rank r, delta of x has to increase by at least 1. But there are only so many distinct values that the statistic delta of x can take on: it's always a non-negative integer, it can only increase, and it's never bigger than the inverse Ackermann function alpha of n. Therefore, the total number of visits to an object x while it's bad, over an arbitrary sequence of operations, cannot be more than r, the number of visits needed to increment delta of x, times the number of distinct values, which is alpha of n, the inverse Ackermann function. So now we've done all the hard work.
We're almost there; we just need one final slide putting it all together.
All right, so to see how all of this comes together, let's first recall that
all that we need to do is bound the total number of visits to bad objects over this
entire sequence of union and find operations.
So in the previous slide, we showed that for a bad object x, with rank r, the
total number of times it's visited while it's bad, is bounded above by r times the
inverse Ackermann function of n. So let's sum that over all of the objects
to get our initial bound on the total number of visits to bad objects.
Now, we have to be a little careful here, because there are n objects x, and ranks can be as large as log n. So a naive bound would give us something like n times log n times alpha of n, and that's not what we want; we really want n times alpha of n. So we need to use the fact that not too many objects can have large rank, and that's exactly what the rank lemma says. To rewrite things in a way that lets us apply the rank lemma, which bounds the number of objects with a given rank, let's bucket the objects according to their rank: rather than summing over the objects, we're going to sum over the possible ranks r, and multiply by the number of objects that have that rank r.
While I'm at it, let me pull the alpha of n factor, which doesn't depend on the object, out in front of the sum. For every value of r, the rank lemma tells us there are at most n/2^r objects with that rank. The factor of n is independent of the rank r, so let me pull that out in front of the sum as well. When the dust settles, we get n, times the inverse Ackermann function of n, times the sum over non-negative integers r of r divided by 2^r. We've seen this kind of sum without the r in the numerator: it's just a geometric sum of 1/2^r terms, and we know that's bounded by a constant. More generally, throwing an r into the numerator is no match for the exponentially decreasing denominator, so this sum still evaluates to a universal constant. I'll let you check that in the privacy of your own home.
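For the record, here is that check, folded into the full chain from this slide (the identity in the last step is a standard calculation, not specific to this data structure):

$$\sum_{\text{objects } x} \#\{\text{visits to } x \text{ while bad}\} \;\le\; \alpha(n)\sum_{r \ge 0} r \cdot \frac{n}{2^r} \;=\; n\,\alpha(n)\sum_{r \ge 0}\frac{r}{2^r} \;=\; 2\,n\,\alpha(n) \;=\; O(n\,\alpha(n)),$$

using $\sum_{r \ge 0} r x^r = x/(1-x)^2$ evaluated at $x = 1/2$.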
And so that gives us a bound of O(n times alpha(n)) on the total number of visits to bad objects. Since we also have a per-operation bound of alpha(n) on the number of visits to good nodes, combining the work of the two cases we get O((m + n) times alpha(n)) work overall. And again, the interesting case here is when m is Omega(n); otherwise, you can just do this analysis separately for each tree in the data structure.
And, there you have it. You now understand in full detail one of
the most amazing results in the history of algorithms and data structures.
So, Tarjan's bound is unimaginably close to being linear without actually being linear; it's off by this inverse Ackermann function factor.
Now, from a practical perspective, for any imaginable value of n, alpha of n is at most 4, so this gives you a linear-time bound for all imaginable values of n. In fact, even the Hopcroft-Ullman log* bound is in essence a linear-time bound for all practical purposes, since log* is at most 5 for any imaginable value of n. But theoretically speaking, we have not proven a linear-time bound on the amount of work done by the union-find data structure. So the Pavlovian response of any algorithms researcher worth their salt would be to ask: can we do better? We could ask that question in a couple of different senses.
The first sense would be: maybe we could give a sharper analysis of this exact data structure. Maybe union by rank and path compression are sufficient to guarantee a linear amount of work. After all, we didn't need to change the data structure to sharpen the log* bound to an alpha of n bound; we only needed to change the analysis.
This question, remarkably, Tarjan answered in the negative in his original paper. He showed that, if you use this exact data structure, with union by rank and path compression, then there are arbitrarily long sequences of unions and finds on arbitrarily large collections of objects for which the data structure actually performs, asymptotically, m (the number of operations) times the inverse Ackermann function of n (the number of objects) amount of work. That is, keeping the data structure fixed, this analysis cannot be asymptotically improved.
This data structure fundamentally has worst-case performance governed by this insane inverse Ackermann function. That is already a mind-boggling fact; indeed, Tarjan, in the conclusion of his paper, notes that it's remarkable to have such a simple algorithm with such a complicated running time.
But you could also ask: could we do better, perhaps by improving or changing the data structure? After all, by adding path compression, we got a qualitative improvement in the average operation time: it dropped from log n to alpha of n. Perhaps there's yet another optimization out there, waiting to be discovered, that would drop the amount of work over a sequence down to constant time per operation, and linear work overall. Tarjan, in his paper, made the bold
conjecture that there is no linear time method, no matter how clever you are.
And remarkably, that conjecture has since been proved.
It was proved for certain classes of data structures by Tarjan, both in his original paper and in a follow-up paper in 1979.
But the final version of this conjecture was proven by Fredman and Saks back in 1989: no matter how clever you are, no matter what union-find data structure you come up with, there will be arbitrarily long sequences of operations for which the average amount of work you do per operation is the inverse Ackermann function of the number of objects in the data structure. Pretty unbelievable,
so let me just wrap up with a historical comment.
So, full disclosure: I wasn't quite alive when this result came out, but from reading about it and talking to senior researchers, my sense is that it was really a watershed moment for the field of data structures and algorithms. Specifically, it confirmed the belief of people working in algorithms and data structures that the field possessed surprising depth and beauty. There had, of course, been earlier glimpses of this; we mentioned, in the optional material in Part 1, Knuth's analysis of linear probing back in the early 1960s.
But this was really something for the worst-case analysis of algorithms and data structures. The fact that such a practical and naturally arising problem necessarily requires an understanding of the Ackermann function and its inverse, a function, mind you, first proposed and defined back in the 1920s in the service of recursion theory, decades before Tarjan was doing his work, was a real eye-opener. It showed that this field is something that will keep many generations of scientists quite busy for many years to come.