The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

513 ratings

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

From the lesson

Week 2

Kruskal's MST algorithm and applications to clustering; advanced union-find (optional).

- Tim RoughgardenProfessor

Computer Science

The guarantees states that if you implement a union-find data structure

using lazy unions, union by rank, and path impression,

then you get a pretty amazing guarantee.

Do any sequence you want of m union and find operations.

The tool that I worked on, that is data structures, is bound and above by

times log star of n, where n is the number of objects in the data structure.

So on the one hand I hope you're suitably impressed by this result.

So the proof of it that we gave in the last video is frankly totally brilliant.

And secondly the guarantee itself is excellent, what with

log star of n being bounded above by five for any imaginable value of n.

On the other hand, I do hope there's sort of a little voice inside you wondering if

this log star bound could really be optimal.

Can we do better maybe by a sharper analysis of the data structure

we've been discussing, or perhaps by adding further ingenuity,

further optimizations to the data structure?

For all we know, you might be able to get away with a linear amount of work.

So O of M worked to process an arbitrary sequence of finds and unions.

It turns out you can do better than the guarantee.

There is a sharper analysis of the exact same data structure.

That analysis was given by Tarjan, and here is the statement of his guarantee.

So the improved bound states that for an arbitrary sequence of m union and

find operations, the total work done by this data structure union by rank and

path compression, is big O of m times alpha of n.

Now what, you may ask, is alpha of n?

That's a function known as the inverse Ackermann function.

So what then is the inverse Ackermann function?

Well the short answer is that it's a function that,

incredibly, grows still more slowly, in fact, much, much,

much more slowly, than the log star function we discussed in the bound.

The precise definition is non-trivial, and that is the subject of this video.

In the next video, we'll prove this theorem and

explain how the inverse Ackerman function arises

in an optimal analysis of union by rank with path compression.

So I'll first define the Ackerman function,

and then I'll describe its inverse.

One thing I should warn you is you will see slight variations on these two

definitions out there in the literature.

Different authors define them in ways convenient for their own purposes.

That's also what I'm going to do here, I'm going to take one particular definition

convenient for the union find data structure analysis.

All of the variants that you'll see out there in the literature exhibit roughly

the same ridiculous growth.

So the details aren't really important for the analysis of data structures and

algorithms.

So you can think of the Ackermann function as having two arguments,

which I'm going to denote by k and r, both of these are integers.

k should be at least zero, r should be at least one.

The Ackermann function is defined recursively.

The base case is when k equals zero.

For all r, we define a sub zero of r as the successor function,

that is it takes its input r and it outputs r plus one.

In general, when k is strictly positive,

the argument r tells you how many times to apply the operator,

the function a k minus 1, starting from the input r.

That is, it's the r fold composition of a sub k minus 1.

And again applied to the argument r.

So in some sense describing the algorithm function isn't that hard right?

I didn't really need that much of a slide to actually write down its description,

but getting a feel for what on earth this function is, takes some work.

So the first thing,

maybe this is a sanity check, is note that it is a well-defined function.

So, you know, you're all programmers, so

you could easily imagine writing a program which took as input, k and

r, and at least in principle, given enough time, computed this number.

It would be a very simple recursive algorithm with a pseudo-code just

following the mathematical definition.

So it is some function, given k and r,

there is some number that's the result of this definition.

Now, in the next sequence of quizzes, let's get a feel for

exactly how this function behaves.

And so let's start on the first quiz by fixing k to be one.

And now, viewing the Ackermann function with k fixed at one,

is a function purely of r.

So, what function does a1 of r correspond to.

All right, so the correct answer is b, at least a1 is quite simple to understand.

All it does is double the argument.

So if you give it r, it's going to spit out 2r.

So, why is that?

Well, any one of our by definition is just the function a0 applied to r,

r times, a0 by definition is the successor function as one, so if you

apply the successor function r times, starting from r you get 2r as a result.

So let's now move from a1 to a2.

So let's think about exactly the same question, fixing k at 2 and

regarding it as a function of r only.

What function is it?

So answer here is c for

every positive integer are a2 of r is equal to r times 2 to the r.

And the reasoning is the same as before.

So remember a2 of r just means you apply a1 r times.

In the last quiz we discovered that a1 was doubling.

So if you double r times,

it has the effect of multiplying by a 2 to the r factor.

So let's take it up a notch yet again.

Let's bump up k to three and let's think about A sub 3.

But let's not yet think about exactly what a sub 3 is as a function of r.

Let's keep things simple.

What is just a sub 3 evaluated when r equals 2?

So k equals 3, r equals 2, that's a number.

What number is it?

Okay, so the correct answer is the third one.

It's 2048, also known as 2 to the 11th.

Let's see why, well a sub 3 of 2 that just means we apply a sub 2 twice to 2.

Now in the last quiz we evaluated not only a sub 2 of 2 but

a sub 2 of all integers r.

We discovered that it's r times 2 to the r.

So that means that it's a simple matter to evaluate this

application of a sub 2 twice.

First a sub 2 of 2 that's just 2 times 2 to the 2 that's just 8.

Then a sub 2 of 8 is going to be 8 times 2 to the 8, also known as 2 to the 11 or

2048.

So what about in general?

How does the function a sub 3 behave as a function of a general parameter r?

When we answer this question we're going to see for the first time

the ridiculously explosive growth that the Ackerman function exhibits and

this will just be the tip of the iceberg right?

This is just when k equals 3.

So by definition a sub 3 of r you start from r and

you apply the function a sub 2 r times.

So let's remind ourselves what the function a sub 2 is.

We computed that exactly.

That's r times 2 to the r.

So to make out life simple, let's just think about the 2 to the r part.

So in the real function a sub 2,

you also multiply by r, but let's just think about the 2 to the r part.

So imagine you applied that function over and over and over again.

What would you get?

Well now you get a tower of 2's where

the height of the tower is the number of times that you apply that function.

But if you apply it once, you get 2 to the r, If you apply it twice you're going to

get two raised to the 2 to the r, 3 times 2 to 2 to the 2 of the r and so on.

So all applications of a sub 2 gives you something that's even bigger than a tower

of r2's.

So, let's move on to a4, and

this is the point in which personally my brain begins to hurt.

But let's just push a little bit further so

that we appreciate the ridiculous growth of the Ackermann function.

And as with a3, let's punt for the moment on understanding the dependence for

a general r.

Let's just figure out what a sub 4 of 2 is.

So, k equals 4, r equals 2.

Well, so this by definition, you just take 2 and you apply a sub 3 twice.

We computed a sub three of 2 in the previous quiz.

We discovered that was 2,048.

So that's the result of the first application of a3.

And now we find ourselves applying a3 to 2048, so in effect r now is 2048.

And remember we concluded the last slide by saying well a sub 3,

whatever it is, it's at least as big as a tower of r2's.

And here r is 2048.

So, this of course now is the land of truly ridiculous numbers that we're

dealing with.

I encourage you to take a calculator or computer and

see how big of a tower of 2's you can compute, and

it's not going to be anywhere close to 2,000, I can promise you that.

And hey that is just a4 of 2, what about the function a4 for a general value of r?

Well, of course, in general,

a4 of r is going to be r applications of a3 starting at r.

Now, in the last slide,

we argued that a sub 3, this function is bounded below by a tower function.

So a sub 3 takes in some input, some integer,

and it spits out some tower with height equal to that integer, so

what happens when apply that tower function a3 over and over again?

Now you get what's called an iterated tower function.

So when you apply the function a sub 3 for the time the r is going to spit out

a number which is bounded below by a tower of height r, tower of 2's, all of them.

Now we apply a3 a second time, it's output is going to be a tower of 2's,

whose height is lowerbounded by a tower of 2's of height r, and so on.

Every time you apply a sub 3, you're just going to iterate this tower function.

You will sometimes see the iterated tower function referred to as a yowzer function.

You probably think I'm pulling your leg, but I'm not kidding.

You can look it up.

So that is the Ackerman function and

a bunch of examples to appreciate just how ridiculously quickly it is growing.

So now let's move on to define its inverse.

Now that Ackerman function has two parameters k and r excuse me.

So to define an inverse I'm going to fix one of those.

Specifically I'm going to fix r equal to 2.

So the inverse Ackermann function is going to be denoted by alpha.

For simplicity, I'll define it only for values of n that are at least 4 and

it's defined for giving value of n as the smallest k so

that if you apply a sub k to 2, to r equals 2, then the result is at least n.

So again it's simple enough to kind of write down a definition like this, but

you really have to plug in some concrete numbers to get a feel for what's going on.

So lets just write out, you know,

what are the values of n that will give you alpha of n equal 1.

That will give you alpha of n equal 2, alpha of n equal 3, and so on.

Okay so for starters, alpha of 4 is going to be equal to 1, right.

So if you applied a 0 to 2, that's the successor function so you only get 3, so

that's not at least as big as 4.

On the other hand, if you apply a sub 1 to 2,

that's the doubling function, so if you apply it to 2 you get 4.

So a sub 1 does give you a number which is at least as big as n when n is 4.

So next I claim that the values of n for

which alpha of n equals two are the values 5, 6, 7 and 8.

Why is that true?

Well, if you start from 2 and you apply the function a sub 1,

the doubling function you only get 4.

So, a sub 1 is not sufficient to achieve these values of n's.

On the other hand if you apply a sub 2 to 2,

remember that's the function which given an r outputs r times 2 to the r, so

when you apply it to 2, you get 2 times 2 to the 2, also known as 8.

So a sub 2 is sufficient to map 2 to a number at least as big as 8.

By the same reasoning, recall that we computed a sub 3 of 2 and

we computed that to be 2048.

So therefore for all of the bigger values of n starting at 9 and going all the way

up to 2048, their alpha value is going to be equal to 3 because that is

the smallest value of k such that a sub k of 2 is at least those values of n.

And then things start getting a little bit ridiculous.

So remember we also lower bounded a sub 4 of 2.

We said that's at least a tower of 2's of height 2048.

So for any n up to that value, up to a tower of 2's of height 2048,

alpha of those n's is going to be equal to 4.

This of course implies that the inverse Ackermann

function value of any imaginable number is at most 4.

And I don't know about you, but my brain is already hurting too much to think about

which of the values of n, such that alpha is equal to 5.

So as a point of contrast, let's do the same experiment for the log star function,

so that we get some appreciation about how as glacially growing a log star may be,

the inverse Ackermann function is growing still qualitatively slower.

So the log star function remember is the iterated logarithm, so

it's starting from n how many times you need to hit log in your calculator,

let's say base two before the result drops below 1.

So when is log star n going to be equal to 1?

Well that's when n is going to be equal to 2.

Then you hit log once and you drop from 2 to 1 and you're done.

If n equals 3 of 4 then you need more than one application of logarithm to drop below

1, but you only need two applications, so log star of n is going to be 2 for

n equals 3 and 4.

Similarly for n anywhere between 5 and 16 log star n is going to be equal to 3.

Right if we start from 16 that's 2 to the 4, so if you fly a log once you get 4,

a second time you get 2, a third time you get 1.

By analogy, the values of n for which the log star of n equals 4 are going to be

starting at 17 and going up to 2 to the 16, also known as 65536.

So let's just go one more step for which n is log star of n equal to 5.

Well obviously we start at 65537, and we end at 2 raised to

the largest number of the previous block, so 2 raised to the 65536.

So, these numbers are impressive enough on the right hand side, but there is no

imaginable number of importance for which log star is going to be bigger than 5.

Looking at the left and the right hand sides though, we see that there really

is a qualitative difference between the rate of growth of these two functions.

With the inverse Ackermann function,

in fact, growing much, much slower than the log star function.

So perhaps the easiest way to see that is to look at the right hand side, and

on each of the lines in the right hand side, look at the largest value of n, so

that log star is a given number.

So for example, in the third line, the largest n such that log star of n equals

3, is a tower of 2s of height three, 2 to the 2, to the 2, also known as 16.

Similarly on the fourth and

fifth lines the largest n to give values of log star equal to 4 and

5 are a tower of 2s of height 4 and two and a tower of 2s of height 5.

On the other hand look at the fourth line on the left hand side.

So I really when we ask for the largest values of n that have inverse Ackermann

value 4, we're talking about towers of 2s, in fact of height 2048.

So that is the n's which give us alpha of n equal to 4.

We would have to write down 2048 lines on the right hand side before we had values

of n that were equally big.

This indicates how the log star function while growing glacially indeed,

as it never the less moving at light speed compared to the inverse Ackermann

function.

Coursera provides universal access to the world’s best education,
partnering with top universities and organizations to offer courses online.