0:03

We're going to finish off by looking at the idea of.

Â Counting directly with generating functions.

Â This is going to be first step to ease us into really important role that

Â generating functions play, in analytic combinatorics.

Â so it's a, so it's a really an alternate view of what generating functions can do

Â for us, and it is very combinatorial. I will be much more formal, thorough and

Â extensive coverage of this, but still it is good to look at the same problem then,

Â we just looked at from this point of view in this lecture.

Â So what we're going to do is define a class of combinatorial objects that have

Â an associated size function. And the generating function's going to be

Â a sum over all members of the class. We'll still get to the same point of

Â getting a, an equation that's, a generating function must satisfy.

Â And from that point on, the analysis will be the same.

Â but getting that equation is much more natural and fundamental with this kind of

Â correspondence. It seems kind of abstract.

Â But, so, let's look at a specific example.

Â So lets say T is a set of all binary trees.

Â then we're going to define a size function.

Â that looks like absolute value. if you have any binary tree, it's going

Â to be the number of internal nodes in that tree.

Â [COUGH] Now, what we're interested in is a counting sequence and that's what we're

Â interested in before. It's the number of binary trees from the

Â set of all binary trees that have size function equal to n.

Â So, the, that's the number of binary trees with n internal nodes.

Â That what we were counting before. This is just a, a formal definition, tree

Â by tree. Now, what's that good for?

Â Well. If we define the generating function to

Â be the sum over all possible trees, of Z to the size of the tree, then that,

Â treats each term individually [COUGH].

Â Each tree contributes individually to a term in the generating function, but it's

Â exactly the same as if the trees of size N were collected together and counted by

Â T sub N. So that's a fundamental idea on, of this

Â equation we, some over all possible trees.

Â but since it's Z to the size of the tree, all the ones of the same size are going

Â to contribute to the coefficient of that size, say N, and give us T sub N.

Â so it's the same generating function looked at a different way.

Â And so and the reason that's helpful is that [COUGH] we can look at the way that

Â we define the combinatorial structures. to give us the equation that we want for

Â the generating function. and so, this is how it works for binary

Â trees so definition of a binary tree. with [COUGH] is that it's a root node

Â with a binary tree on the left and a binary tree on the right.

Â and say on the left there's T sub L internal nodes,

Â on the right there's T sub R internal nodes.

Â if we take that sum of all possible trees and it's got all possible left nodes.

Â All possible right nodes then but we had [COUGH] Z to the size of the whole tree

Â that's the size of the left, the size of the right, +1.

Â So, the decomposition that we use to define what we mean by, what is a binary

Â tree? immediately gives this equation on the

Â generating function. there's, it starts out with a one.

Â That's, for the empty tree that doesn't compose.

Â So the double sum is for non empty trees. so

Â that's a first key step to understand that the,

Â And the decomposition, the recursive decomposition, the way that we define

Â binary trees, immediately translates to an equation on the generated function.

Â so now that equation, those T sub N and T sub R are just afformal variables and

Â they're independent, so, we can immediately split that sum up into Z

Â times sum overall Teesabellsi to the T sub L over all T sub R the T sub R and

Â then get the answer that T of Z equals, one plus Z, T of Z squared.

Â Just the same as we found before by worrying about all the counting, but this

Â is much more direct. Really the equation says that, a binary

Â tree is empty or it's a root connected to two binary trees.

Â And actually we're going to see a very formal way to really go right from the

Â [COUGH] description of what it is right down to that generating function

Â equation. so here's another way to look at it, just

Â to make sure so the generating function is really got a term for each tree.

Â So those are all the possible trees and it just keeps going.

Â So each tree of size 3 is represented. It's a sum over all trees, Z to that tree

Â size. now when we're worrying about the counts,

Â we're just collecting all the terms for the same exponent, we're just doing the

Â algebra. so

Â now but if you multiply TTfz, of Z * T of Z, it's like taking two trees and

Â composing them. Well that's what we mean by binary tree,

Â take two trees and compose them. .

Â so T1+zT(z^2), of Z = 1 + ZT^2, If we write the things out symbolically as

Â trees then you can see immediately that this tree here comes from one of those,

Â and one of those and so forth. it's in fact, some mathematicians prefer

Â to work with, the symbols in, in some cases in, in commonotorics, people, go

Â very far working with completely symbolic representations.

Â It's a good way to think about it, it's, there's a term in that generating

Â function corresponding to each tree, when we're trying to find out how many there

Â were of each size, we're just doing the algebra of, collecting by size.

Â Now, that's, important, but there's, another idea that is related to this,

Â that, I want to introduce, and that's all about cost.

Â So a lot of times, it's not just about the size, it's about some other property

Â of the, of the structure. and so so, we'll use, we'll do two

Â examples. One's a very easy example.

Â How many one bits are there in a random bit stream?

Â Well everybody knows it should be about half.

Â but still that'll be a warm up. We'll get the right answer for that.

Â A more complicated thing is that's [COUGH] if we're using a binary

Â tree in a computer representation, it's important to know how many of the nodes

Â have both leaves external. you can save space in that way in a lot

Â of situations. so if we have a random binary tree, how

Â many leaves are there? then maybe that's not so easy.

Â in fact with generating functions we can have a unified approach to studying

Â perameters. and again that's one of the key benefits

Â of the way, analytic combinatorics way of looking at things.

Â and I want to do these examples to show the advantages of using generating

Â functions to help us do calculations and analysis like this.

Â so again it's the same kind of idea. We're going to have a class it's, the

Â same idea. We're going to have a class of

Â combinatorial objects. our model is going to be that all objects

Â of each size are equally likely. and that's realistic in a lot of

Â situations. so lets look at how the calculations look

Â when we're trying to find the value of a parameter.

Â so let's say the all, set of all objects in the class is p and then we have a size

Â function which, for every object in the class, we have a defined size.

Â then we have the counting function so that's the number of objects that have a

Â size n. so that's what we've been talking about

Â up to this point. but now let's say we also have a cost

Â associated with each object. And then, in that case we're going to be

Â interested in the number of objects that have a given size and a given cost.

Â And, were, want to know that, so we can do the calculation of the average value.

Â So that is the expected cost of an object of size n, is going to be the probability

Â that the object cost of an object of size n is k.

Â which is the number that F costs K divided by the total number.

Â then we're assuming their all equally likely.

Â times k so that's the definition of the expected cost.

Â and so that's all a familiar calculation if we know all these quantities.

Â But what. This point of view buys for us is, well

Â let's notice that we can factor out the p sub n, and we can just count up the total

Â cost. of all objects of size n.

Â That's called the cumulative, the accumulated cost.

Â so every object's got a cost associated with it.

Â and then we take all the objects of size n.

Â Add up all their costs. And divide that by the number of objects

Â of size n that's the expected cost. It's a trivial calculation but still it's

Â an, an important distinction. so the, the idea is we, if we can compute

Â the accumulated costs, then we can get the expected costs just by dividing the

Â accumulated cost by the number of objects.

Â now what's that have to do with generating functions?

Â Well again [COUGH]. we start out with the, same situation.

Â let's take a look at the generating functions.

Â So we have already talked about the counting generating function.

Â If we sum over all objects in the class Z to their size, then just algebra collects

Â their terms to get the [COUGH] coefficient of Z to the N as the number

Â of objects of size N. but we can have a similar.

Â Generating function to compute the cumulated cost.

Â If we look at the function c of z which is the sum for every object in the class.

Â Its cost times Z to the size. Then, those things are going to

Â collect by size. And so the coefficient of z^n in that is

Â going to be nothing other than for all values of k, the sum of k times p and k.

Â It just collects the objects of cross k by size.

Â And it will get them all. So the coefficient of z^n in that

Â function c(z) is exactly the accumulated cost.

Â So that gives us the average cost. We just extract coefficients from those

Â two generating functions. the bottom line is.

Â If we want to compute the expected value of a cost, we have two GF counting

Â problems to solve. and their similar.

Â we know how to solve we've already done examples where we can solve GF counting

Â problems. And now we can get average values of

Â parameters by solving GF counting problems.

Â so again, it's a, it's a. We, at trivial calculation, to say, we're

Â going to compute the average by computing the total and divide by the number.

Â But still, it's fundamental because now everything is counting.

Â And we're going to have very powerful mechanisms for counting.

Â So let's see how it works for the two examples that I mentioned.

Â How many one bits in a random bit stream? Okay so these are sort of orbit strings.

Â the number of bits is our size function so that's going to be the size function

Â for any bit string then in with the cost functional B will call ones B that's the

Â number of one bits in a given bit string. number of bit strings of size n that's

Â besoben. and that's 2n.

Â [COUGH], and then CeceVan, we can use GS to get that, but, no let's no, no hit,

Â and we're happy with that. And then, the'cumulated cost function

Â Cece Van, is the total number of one bits in all bit strings of size N.

Â So counting g f, so that's b of z, sum over all bitstrings z to its size and

Â that's equal to two to the n, z to the n. One over two, one over 1-2z.

Â so that's, b of z. And actually, we can, get that formally

Â just from the definitions. But, I'm sure you believe that one.

Â what about the cumulative cost GF? So, [COUGH].

Â So that's that's the function. so now, what we want to do is similar to

Â what we did for the Catalan. Is to use a, a recursive description of

Â what a bit string is, to get us a, a formula for this function.

Â And well, what's a bit string? It's either a zero or a one, followed by

Â another bit string. So [COUGH], for if we're going to have

Â for all bits strings the number of 1's. That's going to be e-, equal to.

Â for all bit strings you can put a zero or a one in the front and that, that'll give

Â you all possible bit strings. In this whole set of bit strings there's

Â one, one bit. plus there's two.

Â How ever many one bits there are in the bit string b prime.

Â And that's summed over all b, b prime. And we had, length of b.

Â But it's the length of b prime, plus one. So again, that recursive description

Â immediately gives that formula for the accumulated cost, GF.

Â And so now, just doing the math. that the first term gives us a ZV of Z,

Â and the second term gives us a 2Z C of Z. So that's an equation for the accumulated

Â cost function and we've got the what B of Z is.

Â And so we can just solve that equation. It's

Â [COUGH] Zb of Z is one over one minus 2Z. And then C of Z, bring 2Z Cof C over to

Â the left hand side. And so we divide by another factor of 1 -

Â 2Z. and so that's a proof that C of Z is

Â equal to Z over 1 - 2C squared. So now we have explicit expressions for

Â both the enumerating GF and the accumulated cost GF.

Â And all we need to do is extract coefficients from those two functions in

Â order to get the average cost. 2zz/(1-2*z^2) / 1 - 2z^2 is, that's a

Â standard generating function that we've seen several times before.

Â And so the bottom line is, the coefficient of z^n and the accumulated

Â cost is n*2^n. N minus one, coefficient is z to the n,

Â in [COUGH] and the numeration is to the N,

Â and that gives us the result that the expected cost is N over two.

Â Again that's a warm up we kind of knew its n over two.

Â now lets do a problem that you might have a lot of difficulty solving some other

Â way and that's these in binary trees. So again a leaf in a binary tree is an

Â internal node whose children of both are external.

Â so [COUGH] for example, in the trees of size two, each one of them has one, leaf.

Â So the if we wanted to go down to do all the

Â counts, So, t of n's the number of binary trees

Â with n nodes. TNK is the number of n known binary trees

Â with k leaves. So, T2-1 =two.

Â There's two of them that have one leaf. and CN is the average number of leaves in

Â a random n node binary tree. So that's the ratio of those two.

Â So C21. = 1 so for five there's four of them that

Â have one leaf, so I'm sorry for three, there's four of them

Â that have one leaf, so T31 equals four. There's one, the balanced one, has two

Â leaves, so T31 equals one. And so,

Â [COUGH] if you want to compute the average number of leaves in a binary tree

Â with three nodes, it's four plus two times one divided by five, or 1.2.

Â And similarly for fourteen, we find that eight of them have one leaf and six of

Â them will have two leaves and that gives a solution.

Â So what's the average number of leaves in a, in, in a random binary tree?

Â If we treat them all likely how do we get that counting sequence?

Â and again, that's actually a practical problem that plagued programmers when

Â binary trees first came into use because these things were a wasteful of space and

Â people want to know how much space they could save.

Â O.k. So let's [COUGH] use cumulative counting

Â to solve that problem. So.

Â set of all binary trees internal nodes is a size function leaves is our cost

Â function number of leaves in the tree. the number of binary trees is size n of

Â catalyn numbers as c analyses that we did and now what we want to do is develop a

Â generating function for the cumulative cause which is the total number of leaves

Â in all binary trees of size n. so the counting GF, we already did that

Â one. so that's just summarizing that.

Â and the accumulated cost GF, that's for the [COUGH] at c of z sum over all trees,

Â number of leaves times z to the size. [COUGH], and then, the average number of

Â leaves in a random endo binary tree is just the ratio of the coefficients of

Â those. We already know the coefficient of the

Â number of trees, that's the catalyn number.

Â So, we're looking for coefficient of Z to be in in C of C.

Â So, that's the next thing we're going to do is try to find that cumulative cost

Â function. so [COUGH] that's the, that's the

Â function. Now again we're going to use the same.

Â kind of decomposition that we did when we're enumerating trees.

Â And, and actually usually, that's what happens.

Â The, same formal description, that gave us the number of trees, it's going to

Â give us, the cost. And that's why this methods so powerful.

Â we, we, do the work to figure out the composition, of the way to describe it.

Â and then we get to apply it twice. Once to find out the total number, and

Â the other to find out the total cost. and so.

Â [COUGH] that the composition immediately leads to this equation for the cumulated

Â generating function. So there's the tree that is just a leaf.

Â so that's accounts for the Z term. and then for all the other trees there's

Â a left and a right. So that's T sub L nodes on the left and T

Â sub L nodes on the right. and however many leaves they are on the

Â left [COUGH] they're. In the tree, is the total number of

Â leaves on the left, plus the total number of leaves on the right.

Â The leaf can't, if it's got more than one node in it, the leaf can't cross between

Â the two trees. so, this equation here holds exactly.

Â Again, the plus one for the root but that same decomposition gives us this same

Â equation. And again, T sub L and T sub R are

Â independent so that immediately [COUGH]. allows us to break these break these sums

Â up into independent sums. So we have leaves of t sub l, z to the t

Â c l, t sub l. times the sum on T sub R there's, two

Â terms that are similar, one where we, for the T sub L and one for the, T sub R and

Â then, and then we have the double sum, so we distribute over those, so that's just

Â elementary distribution, and then those things, we have expressions for every one

Â of them, leafs of T sub L Z, to the T sub L that's

Â Z of Z. And sum t sub r t z to the t sub r,

Â that's t of z and we have two of those. So that gives us [COUGH] a simple

Â equation for. Z of Z, the in-cumulative generated

Â function. All that's left is to extract

Â co-coefficient's from that generating function.

Â So this is the summary of the deriration so far.

Â we're looking for that CGF. we did that decomposition, and we have an

Â equation for it. we know the number of tree's is the

Â catalan numbers. and so our accumulated generating

Â function is c of z, z plus 2z, t of z c of z, and we solve for z of c.

Â We get z over one minus 2c t of z. catalan numbers multiplied by 2z then

Â subtract one, you get z over square root of one minus 4z.

Â So that's an explicit expression for the cumulated cost function.

Â C of Z equals Z / 1 - 4Z. And we can extract coefficients from that

Â the same way that we did for the Catalan numbers using the binomial theorem.

Â the end result is, 2N minus two choose N minus one.

Â And very much the same calculations. Just with a slightly different result.

Â that's accumulated cost and then the final thing is to divide those two

Â and if you divide those two almost everything cancels.

Â except for an N plus one times NN on the top.

Â 2N times 2N minus one on the bottom. so that's three N's and two N's and

Â that's an asymptotic 2N over four. So about a quarter of the.

Â [COUGH] Nodes in a random binary tree are leaves.

Â [COUGH], and that's an example of the use of

Â Accumulated generating functions to discover the average cost of a, of a

Â quantity. And we'll be seeing lots and lots of

Â derivations like this and actually many of'em will be much simpler because we

Â have coherent ways to deal with explaining the decomposition in terms of

Â the generating function. and we'll talk about that when we

Â introduce analytic combinatorics. So that's counting with generating

Â functions. And so now I want to finish by

Â Giving you a few exercises that you might do before the next lecture.

Â so the first one is to just practice solving an recurrence, and this is an

Â example that shows that the initial conditions really matter.

Â So solve recurrence with one set of initial conditions, and then solve the

Â same recurrence with just that one initial condition changed.

Â and you can see the impact of just that one change.

Â on the recurrence and also get some practice on sovereign recurrences.

Â another thing that You might do is practice expanding some unknown

Â generating functions. And there's many exercises like this in

Â the book. these are just a couple that you might

Â try. So what about one over square root of

Â 1-Z? natural log of one over one - z.

Â There's a bunch of ways to do that there's a hint or one way that might

Â work. that'll give you some practice in how we

Â extract coefficients from generating functions.

Â And you can try some other exercises there as well.

Â So, what I, think would be useful, for people to do to, make sure they

Â understood the material in this lecture. one thing is to, if you've got access to

Â a symbolic math, math system, [COUGH] or if you're used to using one.

Â do things like, check the initial values on that

Â Equation that we got for accumulated costs for leaves in binary trees.

Â Similar to the check that I did just that the regular Catalan recurrence holds.

Â And if you don't have a symbolic math system, look around to see if you can do

Â that in some way. Some are freely available.

Â I think again the best way to learn this material is after you've listened to the

Â lecture and got some idea of the overview of the material is to read the text

Â carefully.'Cause the text really does tell the story in, in some detail.

Â and then it's certainly worth while to check your understanding by really try to

Â write up full solutions to those exercises.

Â maybe using TEC. Or, or maybe using HTML plus

Â Jack and really seeing that you can create the math that is the solution to

Â those exercises. That's an introduction to generating

Â functions.

Â