Let me quickly remind you of the Optimal Substructure Lemma that we proved in the

previous video. Suppose we have an optimal binary search

tree for a given set of keys, one through N, with given probabilities.

And suppose this binary search tree has the root R.

Well then it has two sub-trees, t1 and t2.

By the search tree property, we know exactly the population of each of those

two sub-trees. T1 has to contain the keys one through r

- 1. As usual we're sorted, we're assuming

these are in sorted order. Whereas the right sub-tree t2, has to

contain exactly the keys r + 1 through n. Moreover, t1 and t2 are, in their own

right, valid search trees for these two sets of keys.

And finally, and this is what we proved in the last video they're optimal for

their respective sub-problems. T1 is optimal for keys one through r

minus one and the corresponding weights or.

Abilities and T2 is optimal for R plus one through N and their corresponding

frequencies. So let's now execute our dynamic

programming recipe. So now that we understand the way in

which an optimal solution must necessarily be composed in a simple way

from solutions to smaller subproblems. Let's take a step back, and ask, well.

Given that, at the end of the day, we care about the optimal solution to the

original problem. Which subproblems are relevant?

Which subproblems are we going to be forced to solve?

For example, with independent sets in line graphs we observed that to solve a

subproblem we needed to know the answers to the subproblems where we pluck either

one or two vertices off of the right hand side.

So overall what we cared about was subproblems corresponding to prefixes of

the graph. In the knapsack problem we needed to

understand subproblems that involved one less item and possibly a resus, reduced

residual knapsack capacity, so that led to us caring about solutions to

subproblems corresponding to all prefixes of the items and all integer

possibilities for the residual capacity of a knapsack.

In sequence alignment, when we looked at subproblems.

As we were plucking a character off of one or possibly both of the strings.

So we cared about subproblems corresponding to prefixes of each of the

two strings. Now, here's one of the things that's

interesting about the binary search tree problem which we haven't seen before.

Is that, when we look at a subproblem. In the optimal structure lemma, there's

two that we might consider. We don't just pluck off from the right.

We care about both the subproblem induced by the left subtree.

And that induced by the right subtree. In the first case, we're looking at a

prefix of the items we started with. And that's like we've seen in our many

examples. But in the second case, the sub problem

corresponding to t sub two. That's actually a suffix of the items

that we started with. So put differently, the sub-problems we

care about are those that can be obtained by either throwing away a prefix from the

items that we started with or throwing away a suffix from the items that we

started with. So in light of this observation, that the

value of an optimal solution depends only immediately on sub problems that you

obtain by throwing out a prefix with a, or a suffix of the items, what I want you

to think about on this quiz is, what is the entire set of relevant sub problems?

That is, for which subsets s of the original items one through n is it

important that we compute the value of an optimal binary search tree on the items

only in s? So before I explain the correct answer

which is the third one, let me talk a little bit about a very natural but

incorrect answer, namely the second one. Indeed, the second answer seems to have

the best correspondence to the optimal substructure lemma.

The optimal substructure lemma states that the optimal solution must be

composed of an optimal solution on some prefix and an optimal solution on some

suffix, united under a common root r. So we definitely care about the solutions

to all prefixes and suffixes of the items but we care about more than just that.

So maybe the easiest way to see that is to think about the recursive application

of the optimal substructure lemma. And again relevant subproblems at the end

of the day are going to correspond to all of the different distinct subproblems

that ever get solved over the entire trajectory of this recursive

implementation. So, I mean just think about one sort of

example path in the recursion tree, right?

So in the outermost level recursion you've got the whole item set, let's say

there's 100 items one through 100, you're going through and trying all

possibilities of the root. So at some point you're trying out root

number 23 to see how it does. You have to recurse once on items one

through 22 to optimally build a search tree for them, and similarly for items 24

through 100. Now let's sort of drill down into this

first recursive call where you recurse on item just one through 22.

Now here again, you're going to be trying all possibilities of the route, those 22

choices. At some point you'll be trying route

number seventeen. There's again going to be two recursive

calls. And the second recursive call is going to

be on items eighteen through 22. It's going to be the items that were

passed through this recursive call. A prefix of the original items.

And then the second recursive call here, locally is going to be on some suffix of

that prefix. So in this case, the items eighteen

through 22. A suffix of the original prefix, one

through 22. So, in general, as you think through this

recursion multiple levels. At every step, what you've got going for

you is, you're either deleting a chunk of items from the beginning, a prefix.

Or you're deleting a chunk of items from the end.

But you might be interleaving these two operations.

So it is not true that you're always going to have a prefix of a suffix of the

original set of items. But.

What is true is that you will have some contiguous set of items.

It's going to be. If you, if you have I as your smallest

item in the subproblem and J is the biggest, you're going to have all of the

subproblems in between. And that's because you were only plucking

off items from the left or from the right.

So that's why C is the correct answer. You need more subproblems than just

prefixes and suffixes. Alright, so that was a little tricky,

identifying the relevant sub problems but now that we've got them in our grubby

little hands the dynamic programming algorithm as usual is just going to fall

in to place, the relevant collection of sub problems unlocks the power in a very

mechanical way of its entire paradigm. So let's now just fill in all the details.

The first step is to formalize the recurrence.

That is, the way in which the optimal solution of a given subproblem depends on

the value of smaller subproblems. This is just going to be a mathematical

formula which encodes what we already proved in the optional substructure

lemma. And then we're going to use this formula

to populate a table in a dynamic programming algorithm to solve,

systematically, the values for all of the subproblems.

So let's have some notation to put in our recurrence, in our formula.

We're going to be indexing sub-problems with two indices I and J and this is

because we have two degrees of freedom where the continuous interval of item

starts I, and where the continuous interval of items ends, J.

So for a given choice I and J, where of course I should be the most J.

I'm going to denote by capital C sub IJ, the weighted search cost of an optimal

binary search tree just in the contiguous set of in, items from I to J.

And of course, the weights or the probabilities are exactly the same as in

the original problem they're just inherited here, PI through PJ.

So now let's state the recurrence. So, for a given sub problem cij, we're

going to express the value of an optimal binary search tree in terms of those of

smaller sub problems. The optimal sub structure lemma tells us

how to do this. The optimal substructure lemma says, that

if we knew the roots, if we know the choice of the root r which here is going

to be somewhere between the items I and j, then in that case, the optimal

solution has to be composed of optimal solutions to the two smaller sub-problems

united under the root. Now we don't know what the root is.

There's a j - I + one possibilities. It could be anything between I and j

inclusive. So as usual, we're just going to do brute

force search over the relatively small set of candidates that we've identified.

So brute force search we encode by just explicitly having a minimum.

So I choose some route R somewhere between I and J inclusive.

And given a choice of R we're going to inherit the weighted search cost of the

optimal solution on just the prefix of items I through R minus one.

So on our notation that would be C of I. R minus one.

Similarly we pick up the weighted search cost of an optimal solution to the suffix

of items R plus one through J. And if you go back to our proof of the

optimal substructure lemma you'll see we did a calculation which gives us a

formula for what, how the weighted search cost of a tree depends on that of its

subtrees. And in addition to the weighted search

cost contributed by each of the two search trees we pick up a constant,

namely the sum of all of the probabilities in the items we're working

with. So here that's sum of.

P sub K, where K ranges from the first item in the sub problem I to the last

item in the sub problem J. So one extra edge case we should deal

with is if we choose the root to be the first item, then the first recursive term

doesn't make sense, then we'll have C, I, I minus one, which is not defined.

Similarly, if we choose the root to be J, then this last term would be C of J plus

1J which is not defined. Remember indices are supposed to be in

order. So in that case, we'll just interpret

these capital C's as zero. And so why is the recurrence correct?

Well all of the heavy lifting was done and our proof of the optimal substructure

lemma. What did we prove there?

We proved the optimal solution has to be one of just J minus I plus one possible

things. It depends only on the choice of the

root. Given the root, the rest is determined

for us. The recurrency is by definition doing

brute force search through the only set of candidates.

So therefore, it is indeed a correct formula for the optimal solution value,

in terms of optimal solutions to smaller sub problems.