0:18

So a lot of times in the study of algorithms, we have two variables that we're working with. One is the size of the problem, and the other is the cost. So we're going to wind up with intricate expressions that involve those two variables, and they might vary independently. So we have some definite challenges. The problem, as I indicated when talking about doing sums, is that the relative values of the variables might matter. And that makes it even more challenging to deal with sums over the whole range of relative values.

1:02

And you'll see what this means when I get to some applications. Fortunately, as was the case with the harmonic numbers and the Stirling numbers, where there are a couple of fundamental functions that arise over and over again and take care of most of the work, the same is true of functions of two variables. There are a couple of fundamental functions that arise, and we'll look at the asymptotics of those, since they are the ones that arise most often. And if it's not exactly one of those functions, it's a similar function, and we can model the analysis on the analyses that have been developed classically for these functions.

1:45

So the first example is the binomial distribution. Again, k might be related to the cost, and N might be related to the problem size. How are we going to compute values like that? How are we going to make computations if we wind up with an expression like that? It turns out that, as I just explained for the Catalan numbers, if k is 0, that's close to 1 over the square root of pi N. But if k is large, that's exponentially small, a very, very tiny quantity.
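As a quick sanity check on those two claims, here's a short sketch (the function name and the choice N = 1000 are mine, not from the lecture) that evaluates the centered binomial terms exactly:

```python
from math import comb, pi, sqrt

def binomial_term(N, k):
    """Centered binomial term: (1/4^N) * C(2N, N-k)."""
    return comb(2 * N, N - k) / 4 ** N

N = 1000
print(binomial_term(N, 0), 1 / sqrt(pi * N))  # k = 0: close to 1/sqrt(pi N)
print(binomial_term(N, N // 2))               # large k: exponentially small
```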

2:28

So we're going to have to come up with an analysis that tells us about these things. Another example is called the Ramanujan Q-distribution, which is actually kind of similar. A binomial coefficient is N factorial over (N minus k) factorial times k factorial; this one is N factorial over (N minus k) factorial times N to the k. And that actually arises in the analysis of several classical algorithms. For that one, if k equals 0, it's just N factorial over N factorial, which is 1. But if k is close to N, it's exponentially small: N factorial is way smaller than N to the N. So those are the types of functions and types of challenges that we face with analyzing functions of two variables.
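A minimal sketch of that endpoint behavior, computing the Q-distribution terms exactly (the helper name and N = 100 are my choices):

```python
from math import factorial

def q_dist(N, k):
    """Ramanujan Q-distribution term: N! / ((N-k)! * N^k)."""
    # exact integer falling factorial, then one floating-point division
    return factorial(N) // factorial(N - k) / N ** k

N = 100
print(q_dist(N, 0))   # k = 0: exactly 1
print(q_dist(N, N))   # k = N: N!/N^N, exponentially small
```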

3:41

In a graph, you can't have a variable N, so we draw one curve for each value of N, scaling the actual values by a factor of 2 to the N. So for the binomial coefficients, for N equals 2, it's one-quarter, one-half, one-quarter, or, forgetting about the exponential scaling factor, that's 1, 2, 1; then 1, 3, 3, 1; then 1, 4, 6, 4, 1. So you can see Pascal's triangle in these plots. And what happens, as many people are familiar, is that as N increases, this discrete two-variable distribution converges to the normal distribution,

4:42

to a curve that we can describe with a simple mathematical function. And we're going to see the math that gets us to the place this picture shows us: we ought to be able to have a simple description of this distribution, and we do.

This is what the Ramanujan Q-distribution looks like: again, the same type of plot, where for each value of N we draw a curve. As k goes from 0 to N, the value is very close to 1 when k is small, and by the time k gets close to N over 2, it's extremely small. And there's a curve that describes its shape. So for large N, we ought to be able to use some function that defines that curve for any value of k, and that's what we're after with bivariate asymptotics.

5:44

Now let's take a look first at the Ramanujan Q-distribution. The calculation's a little bit simpler; it's an extension of the calculation that we did for the Catalan numbers, except that now we're carrying along this variable k. So the first step is to use the exp-log technique: N factorial over ((N minus k) factorial times N to the k) is e to the (log of N factorial, minus log of (N minus k) factorial, minus log of N to the k, which is k log N). So that's the first step. Now, how are we going to expand each one of those? Well, the first two are just handled with Stirling, so we'll use Stirling's approximation to log N factorial and plug that in for both of them. That says log N factorial is (N plus one-half) log N, minus N, plus log of the square root of 2 pi, plus O of 1 over N. That's what we'll use for the first term. For the next term, we have that value of k that we have to worry about.
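The Stirling expansion just quoted is easy to check numerically; this sketch compares it against log N! computed with the log-gamma function (lgamma(N + 1) equals log N!):

```python
from math import lgamma, log, pi, sqrt

def stirling_log_factorial(N):
    """(N + 1/2) log N - N + log sqrt(2 pi): Stirling's approximation to log N!."""
    return (N + 0.5) * log(N) - N + log(sqrt(2 * pi))

N = 1000
exact = lgamma(N + 1)                      # log N! without forming N!
print(exact - stirling_log_factorial(N))   # the O(1/N) error, about 1/(12N)
```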

6:59

So we're going to have O of 1 over (N minus k), which we can cover with O of 1 over N. That's plugging in Stirling's formula for the first two terms. And then there's the minus k log N term still left there from the N to the k. So again, we've got a bunch of terms: three terms for log N factorial, three terms for log of (N minus k) factorial, and then the minus k log N. And we've got to do algebra to deal with those terms. But again, there are some cancellations that are going to help us out. The log of the square root of 2 pi cancels. The minus N plus N cancels. Log of (N minus k) is log of N plus log of (1 minus k over N), so we get some cancellations there. So if we collect all those terms, with all the cancellations, this is what we're left with: e to the quantity minus (N minus k plus one-half) log of (1 minus k over N), minus k. You might want to check the math on that, but it's all simple algebra that leaves us with this. And the only technique used is log of (N minus k) equals log N plus log of (1 minus k over N), and we're down to just a few terms.

So now we have to expand the log of (1 minus k over N): that's minus k over N, minus k squared over 2N squared, plus O(k cubed / N cubed). So now we're carrying both variables in the big-O term, which can be a little tricky. We're trying to make this work for as many values of k as we can, but different values of k, particularly when k is proportional to N, are going to give us different asymptotic accuracy. That's one of the challenges with bivariate asymptotics. But we'll carry it in this form and see where we go.

So just plug in that formula and do the math. There's minus k over N, which we're multiplying by N, and again, you can go ahead and do the math; not too much survives. There's k squared over 2N minus k squared over N, so that's minus k squared over 2N. And the k's cancel.

9:36

So all we get is e to the minus k squared over 2N in the end. And then what's left over are the two error terms, and those error terms are just a matter of doing the math: if you multiply N by O(k cubed / N cubed), you get O(k cubed / N squared), and then what's left besides that is the O(k / N). So those are valid formulas, but the interpretation of the formula is going to depend on the value of k. And it tells us a lot: for relatively small k, it says the distribution is going to be close to e to the minus k squared over 2N. That's the curve that we see.
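To see how good e to the minus k squared over 2N is for small k, here's a sketch comparing it with the exact Q-distribution terms (N = 10000 and the sample values of k are my choices):

```python
from math import exp, factorial

def q_dist(N, k):
    """Ramanujan Q-distribution term: N! / ((N-k)! * N^k)."""
    return factorial(N) // factorial(N - k) / N ** k

N = 10_000
for k in (10, 50, 100):   # k up to about sqrt(N)
    print(k, q_dist(N, k), exp(-k * k / (2 * N)))   # exact vs. approximation
```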

10:26

So let's just analyze different values. If, in fact, k is N to the two-fifths, say, which is a pretty good-sized value, then k over N is going to be 1 over N to the three-fifths, and k cubed over N squared is 1 over N to the four-fifths. So those are the error terms. It seems like we're shaving into tiny distinctions, but the main point is that each is 1 over N to a positive power, and that's going to be a big distinction. When k is the square root of N, they're about the same. Now, when k gets bigger, the k cubed over N squared term starts to pick up. But we can work with those different ranges, and really the main point of this derivation is to show that for most of the curve (remember, the plot showed up to the square root of N) we have the description of that curve, and we have it: it's e to the minus k squared over 2N.

11:47

And again, this is just an exercise in using Stirling's formula, carrying through all the error terms, and making sure that you get the cancellations. And it's definitely worthwhile to close the book, turn off the computer, and try to do this derivation yourself, just as an exercise in algebra.

12:14

Because you can see, when you do that, how the cancellations happen and simplify the calculations. And everybody who works in this sort of field knows that if you do a long calculation like this and you miss a term, you're going to get something very exciting at the end that is maybe not relevant, or just wrong, simply from missing a term. Because if you have e to a power and you accidentally forget to cancel out an N term, you've got an e to the N that probably shouldn't be there at all, just as an example. So now we'll do the calculations for the binomial distribution. I'm not going to explain every term in these calculations, because they're very similar to the ones that we just did. You apply Stirling's formula, and that gives three lines, each with four terms, one for each of the main terms.

13:19

And again, initially we can just use O(1/N) for the whole range. Where we start to have to carry terms that involve O(k/N) is when we expand log of (N minus k) to be log N plus log of (1 minus k over N), and log of (N plus k) to be log N plus log of (1 plus k over N). Those k over N's in those asymptotic series are the ones that give us some complication.

13:54

So when we do this and do the algebra, a bunch of things cancel. The 2N cancels with the 2N's, and so forth, and these are the terms that survive. Again, that's pretty much straightforward algebra, with the additional proviso that we're doing the expansion log of (N minus k) equals log N plus log of (1 minus k over N). The terms involving log N mostly go away, and so we're left with that. And then these two terms are kind of similar: one's got a minus k, one's got a plus k. Just rearranging gives terms in k times the difference of log (1 minus k over N) and log (1 plus k over N), and in (N plus one-half) times their sum. Those two, that sum and that difference, were one of the first exercises that we did, and those things collapse down to just one asymptotic term.

15:20

So substituting that in gives us e to the quantity 2N log 2, minus the natural log of the square root of pi N (that's the only place pi is left over from all the math), minus k squared over N, plus the big-O term. Then we just undo the exp-log technique: e to the 2N log 2, that's 4 to the N; e to the minus log of the square root of pi N, that's 1 over the square root of pi N; and what's left is e to the minus k squared over N. So (1 / 4^N) (2N choose N - k) is e to the (-k squared / N) divided by the square root of pi N. And that's the normal approximation to the binomial distribution, and it's accurate for a broad range of values of k.
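That bottom line is easy to test numerically; this sketch compares the exact centered binomial terms with the approximation just derived (N = 10000 and the sample values of k are my choices):

```python
from math import comb, exp, pi, sqrt

def binomial_term(N, k):
    """Exact centered binomial term: (1/4^N) * C(2N, N-k)."""
    return comb(2 * N, N - k) / 4 ** N

def normal_approx(N, k):
    """The approximation derived above: e^(-k^2/N) / sqrt(pi N)."""
    return exp(-k * k / N) / sqrt(pi * N)

N = 10_000
for k in (0, 50, 100):
    print(k, binomial_term(N, k), normal_approx(N, k))
```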

16:21

For this approximation, it's only when k gets bigger than N to the three-fourths that the approximation doesn't work. And again, remembering the curves, when k is that big, we can use independent means to prove that the terms are very, very small.

16:41

So there's certainly a lot of math on this slide. On the other hand, the bottom line is a fundamental classical result in mathematics, and so are the techniques used. What makes it complicated is really just elementary algebra, so it's an exercise in doing elementary algebra well. And it's interesting: these kinds of calculations, even nowadays, most people do by hand, because it's difficult to have automatic calculation carry through on the bivariate asymptotics to the extent that we'd like. So lots of people still do these kinds of calculations by hand. There are plenty of other examples in the book, and I'm not going to go through all of them. Again, these functions arise very often, and we can come back to a table like this to get the approximations that we need later on, when we encounter these functions in applications. Again, there are different arguments depending on different ranges of k. In most cases, we'll have an approximation that's true no matter what the value of k is, and at other times we'll have a more refined approximation that'll give us a little more accuracy near the center. So that's the end result for the normal distribution.

18:20

Then there's the so-called Poisson approximation, which I haven't talked about very much, but which falls to the very same kinds of techniques, and I'll talk about it when we come to it in the context of applications. Again, look at the derivation in the book, or you can just deal with it by using the exp-log trick. A lot of things cancel, and the end result is a famous one: N choose k times the binomial probabilities, for a probability lambda over N that gives a small chance of occurring, is lambda to the k times e to the minus lambda, over k factorial. We'll talk about applications of that later on; right now, just in terms of the math, you should appreciate that the very same techniques that we used on one slide give these kinds of approximations. And the Q-distribution that I talked about is e to the (-k squared / 2N), accurate to within 1 over the square root of N uniformly, or more accurately near the center.
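The lecture doesn't spell out the binomial probabilities here; assuming the standard setup with success probability lambda over N, a sketch of the comparison looks like this:

```python
from math import comb, exp, factorial

def binomial_pmf(N, k, lam):
    """Exact binomial probability with success probability lambda/N (assumed setup)."""
    p = lam / N
    return comb(N, k) * p ** k * (1 - p) ** (N - k)

def poisson_pmf(k, lam):
    """Poisson approximation: lambda^k e^(-lambda) / k!."""
    return lam ** k * exp(-lam) / factorial(k)

N, lam = 10_000, 3.0
for k in range(6):
    print(k, binomial_pmf(N, k, lam), poisson_pmf(k, lam))   # very close for large N
```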

19:35

Those are the kinds of bivariate approximations that we can develop, and they're extremely important not just in the analysis of algorithms, but in many applications. It's far easier to work with standard mathematical functions like e to the (-k squared / N) than it is to work with the factorial representations, which are exact but carry a lot of information that makes calculations difficult.

20:09

So most of the time, we'll be coming back to this table and picking out these kinds of approximations when these sorts of functions arise in the analysis of algorithms. And when similar functions arise, we'll go back to the derivations and see that they can be handled with the same basic method. Really, the exp-log technique plus the application of Stirling's formula is what gives all of these kinds of results.

20:39

So the next challenge that we face is: what do we do when we have a sum that involves one of these functions? The example that we'll do is the so-called Ramanujan Q-function, a critically important function in the analysis of algorithms. It's the Ramanujan Q-distribution summed over all values of k, and it's basically asking: what's the area under the curve? The challenge is that, again, the summand is nearly 1 for small k and negligible for large k, so we're going to need to use different approximations in different parts of the range to get the answer. This is a very typical situation. That's why we need bivariate asymptotics: to make sure that we have good estimates over the whole range, estimates that we can use to estimate the sum.

The general method is referred to as the Laplace method. If you have to approximate a sum (and this is just a schematic representation of the situation), what usually winds up being effective is this: the tails are going to be small, so we restrict the range to the area that holds what counts, and find an approximation that works there. Then we can extend the range back out. The point is that the approximation is usually also very small on the tails, so we take out the original tails and put in the approximation's tails.

22:38

Both of those tails are exponentially small by comparison with the value of the whole sum, so it doesn't matter which one we use, and that gives us a way to get a concise approximation of the whole sum. That's called the Laplace method. And usually, one of the advantages of converting from a representation involving factorials to a representation like e to the (-k squared / N) is that we can approximate a sum of such a function with an integral and use that to get the answer, whereas with the discrete factorial form an integral usually doesn't do the job for us. So let's look at how this method works for the Q-function.
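Here's a tiny numeric illustration of that tail exchange, using the e^(-k^2/2N) summand from earlier (N = 1000, the cutoff k0 = N^(2/3), and the truncation of the infinite tail are my choices):

```python
from math import exp

N = 1000
k0 = round(N ** (2 / 3))   # cutoff well inside the range

# the part of the sum that counts, and the (truncated) far tail beyond k0
head = sum(exp(-k * k / (2 * N)) for k in range(k0 + 1))
tail = sum(exp(-k * k / (2 * N)) for k in range(k0 + 1, 10 * k0))
print(head, tail)   # the tail is negligible next to the head
```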

23:28

And it's actually very straightforward. What we're going to do is just pick a value k0 and split the sum into two parts: values less than k0 and values bigger than k0. Now, remember, small k is where the summand is significant; for large k, it's going to be negligible. And going from the approximations that we developed before, if you take k0 to be N to the two-thirds, for example, then the tail's going to be exponentially small (I don't want to do the details of that). So the tail's exponentially small, and not only that, we have a good approximation of the summand for k less than k0: for all of those values, it's not far from e to the minus k squared over 2N. That's the approximation that we just developed, and it's very close to that, actually. But that one is also exponentially small for large k, so we might as well extend the range back out to an infinite sum, because its tail is exponentially small too. And now that sum we can approximate with an integral, and if you do that, you get the square root of pi N over 2. And again, that's the true value of doing asymptotic calculations. This function Q(N), while it's a precise description of some mathematical quantity, is going to be really difficult to compute. But if you know it's the square root of pi N over 2, then that's something you can work with. And we'll be coming back to applications that use this function later on. That's an introduction to bivariate asymptotics.
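Putting the whole argument to the test: a sketch that computes Q(N) exactly from the definition and compares it with the square root of pi N over 2 (the two agree up to a constant-order term):

```python
from math import factorial, pi, sqrt

def Q(N):
    """Ramanujan Q-function: sum for k >= 1 of N! / ((N-k)! * N^k)."""
    return sum(factorial(N) // factorial(N - k) / N ** k for k in range(1, N + 1))

for N in (100, 1000):
    print(N, Q(N), sqrt(pi * N / 2))
```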

Now I want to finish up by giving a few exercises that you might do before the next lecture to test your understanding of asymptotics. So the first one has to do with where I started: how small is exponentially small? This is just, for a couple of different values of alpha and beta, comparing the values of alpha to the N and beta to the N, and really showing that alpha to the N, the larger one, is going to be a really good approximation to their sum. Here's another one, an opportunity to go through the calculations that I did in the last section. There's another Ramanujan function (actually, there are three famous ones), and this one is called the P function.

26:15

There's also the R function, which is discussed in the book. The P function is this function: the sum over k of (N minus k) to the k times (N minus k) factorial, over N factorial. That one also is approximated by the square root of pi N over 2. Going through the steps that I did for the Q-function, for this function, will be very instructive if you want to test your understanding of this material.
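If my reading of the summand is right, with terms (N minus k) to the k times (N minus k) factorial over N factorial, you can check the claimed approximation numerically before attempting the derivation:

```python
from math import factorial, pi, sqrt

def P(N):
    """P-function as read off the slide (my reading, so treat as an assumption):
    sum for k >= 1 of (N-k)^k * (N-k)! / N!."""
    return sum((N - k) ** k * factorial(N - k) / factorial(N) for k in range(1, N + 1))

N = 1000
print(P(N), sqrt(pi * N / 2))   # agreement up to a constant-order term
```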

So one thing that you might do, just to get started, is write a program that'll print log base 2 of N choose k, and just use what we've talked about in this lecture to get that done. That's an interesting exercise. And write up solutions to the two exercises that I just gave. Again, if you're not comfortable with using TeX, either on paper or with MathJax and HTML, that'll certainly provide some practice; or you can write it up by hand. I don't write stuff up by hand anymore, but some people do.
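One possible sketch of that program (the helper name is mine), using the log-gamma function so the huge factorials never have to be formed explicitly:

```python
from math import comb, lgamma, log, log2

def lg_binomial(N, k):
    """log base 2 of (N choose k), via log-gamma: lgamma(x + 1) = log(x!)."""
    return (lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1)) / log(2)

print(lg_binomial(2000, 1000))    # central coefficient: about 2000 - lg sqrt(1000 pi)
print(log2(comb(2000, 1000)))     # exact value, for comparison on moderate inputs
```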

And then read the chapter on asymptotics in the text for much more detail on all the information we've covered. In the next lecture, we'll put together all the things that we've talked about in the previous lectures and see how they fit together to form the basis of analytic combinatorics, which we can then go on to apply to the analysis of algorithms.
