0:00

Okay. So we have given this dynamic programming algorithm for the problem of RNA secondary structure, and as I said, it fits within the category of algorithms that we call dynamic programming.

The term "programming" here does not have to do with programming in real programming languages; it is just the name of this class of algorithms.

And dynamic programming, as I said, solves certain problems. For certain optimization problems, we observe that the optimal solution to the big problem can be obtained from optimal solutions to sub-problems.

Like in this case here: the solution to the entire problem, OPT(1,n), can be obtained from OPT(1,n-1), or, when some position t pairs with n, from OPT(1,t-1) and OPT(t+1,n-1).

So the solution to the bigger problem (i,j) can be obtained from solutions to smaller sub-problems.
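Written out, the recurrence being discussed looks like this (my reconstruction from the lecture; the condition t < j - 4, ruling out sharp turns, is assumed from the "n minus five" loop bounds mentioned later):

```latex
\mathrm{OPT}(i,j) = \max\!\left( \mathrm{OPT}(i,j-1),\;\;
  \max_{\substack{i \le t < j-4 \\ b_t,\, b_j \text{ can pair}}}
  \bigl\{\, 1 + \mathrm{OPT}(i,t-1) + \mathrm{OPT}(t+1,j-1) \,\bigr\} \right)
```

The first case drops position j as unpaired; the second pairs some t with j and splits the remaining positions into two independent sub-problems.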

Okay, so when we look at dynamic programming, we have to have this property: the solution to the big problem is obtained from solutions to sub-problems.

And when we have this property, when we compute the solutions to sub-problems, we store them, usually in a matrix. When we come to solve the big problem and encounter these sub-problems, all we need to do, because they have already been computed, is look them up. So looking up this value, this value, this value, is an O(1) operation for each one of them.

Â 1:52

Of course, the gain from dynamic programming comes in cases where the same sub-problem needs to be computed over and over and over, as it would be in a recursive algorithm, for example. So if I get to a situation where the problem is solved from sub-problems, but these sub-problems never overlap, in the sense that OPT(7,19) is needed only once and OPT(7,20) is needed only once,

Â 2:24

then I'm not going to gain much. The gain comes when the same sub-problem needs to be solved over and over and over. With dynamic programming, the way we handle that is by solving it only once, storing it in a matrix, and every time we need to solve it again, all we do is look up the value in the matrix, okay?
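To see the effect, here is a small illustration (my own example, not from the lecture; Fibonacci stands in for any recurrence with overlapping sub-problems): plain recursion solves the same sub-problem hundreds of times, while the memoized version solves each one once and then just looks it up.

```python
# Illustration of overlapping sub-problems (my own example, not the
# lecture's): count how often each sub-problem is solved without a memo,
# then solve the same recurrence with one.

def fib_plain(k, counter):
    """Plain recursion: records every call so the blow-up is visible."""
    counter[k] = counter.get(k, 0) + 1
    if k < 2:
        return k
    return fib_plain(k - 1, counter) + fib_plain(k - 2, counter)

def fib_memo(k, memo):
    """Memoized version: each sub-problem is computed once, then looked up."""
    if k in memo:
        return memo[k]                      # O(1) lookup, no recomputation
    memo[k] = k if k < 2 else fib_memo(k - 1, memo) + fib_memo(k - 2, memo)
    return memo[k]
```

Running `fib_plain(20, counts)` shows sub-problem 5 being re-solved hundreds of times; `fib_memo` touches it exactly once.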

The other thing we have to keep in mind is that the fact that I have written this kind of formula, that I have dynamic programming, does not necessarily mean I have an efficient algorithm, by which we mean a polynomial-time algorithm. For this to be polynomial, the number of sub-problems that we divide into and solve has to be polynomial as well, okay?

Because if the matrix has exponential size, there's nothing much you can do about it, since you are filling the entire matrix. So usually the running time of a dynamic programming algorithm is dominated by the size of the matrix, because we have that many entries and at least we have to fill them out. Even if we do the simplest amount of computation, which is a constant number of operations per entry, the number of entries in the matrix is going to give us a lower bound on the running time.

Now, what is the running time of this algorithm? It's very simple to analyze, actually. This loop here is on the order of n, right; n minus 5 is order of n. This one here is also on the order of n because, in the worst case, it is going to go from 1 to n minus 5. This is O(1).

Here, this is going to be O(1). Of course this is O(1), and the addition is O(1), but there is this t that is ranging. If I am looking, again in the worst case, at position n, then t could be any of these positions here. So in the worst case we might be looking at O(n) values, so this can be O(n). And this O(n) loop is nested within this one, which is nested within this one. So this algorithm is on the order of n cubed.
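As a concrete sketch of this O(n³) fill (my own code, not the lecture's; it assumes Watson-Crick pairing only and a minimum loop length of 4, matching the "n minus 5" bounds just discussed):

```python
# A Nussinov-style dynamic-programming fill (sketch; names are mine).
# opt[i][j] holds the max number of pairs in rna[i..j], 0-based inclusive.

def can_pair(a, b):
    # Watson-Crick pairs only -- a simplifying assumption.
    return {a, b} in ({"A", "U"}, {"C", "G"})

def nussinov(rna, min_loop=4):
    n = len(rna)
    opt = [[0] * n for _ in range(n)]
    # Fill by increasing interval length; intervals of length <= min_loop
    # cannot contain a pair (no sharp turns), so they stay 0.
    for length in range(min_loop + 1, n):
        for i in range(n - length):
            j = i + length
            best = opt[i][j - 1]                    # case 1: j unpaired
            for t in range(i, j - min_loop):        # case 2: pair t with j
                if can_pair(rna[t], rna[j]):
                    left = opt[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + opt[t + 1][j - 1])
            opt[i][j] = best
    return opt
```

The three nested loops (interval length, start position, split point t) are exactly the O(n) · O(n) · O(n) nesting analyzed above.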

Okay? And this is much better than a brute-force approach that would have tried, for example, every possible way of pairing indexes in the secondary structure, then checked feasibility, and then found the one with the maximum number of pairs, okay?

And if you want to think about whether this is faster than the recursive algorithm, I highly encourage you to implement this as dynamic programming, using a matrix, and then implement the recursive version, where OPT is the name of a function instead of a matrix, so that you are calling the function every time. Run them, look at the running times of the two algorithms, and see which one is going to be faster.
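If you try that experiment, a plain recursive version of the same recurrence might look like this (a sketch under my own naming and pairing assumptions; OPT is a function here instead of a matrix):

```python
# Recursive version of the recurrence, with OPT as a function rather than
# a matrix (names and pairing rule are my assumptions, not the lecture's).
# Every call recomputes its sub-problems from scratch.

def can_pair(a, b):
    # Watson-Crick pairs only -- a simplifying assumption.
    return {a, b} in ({"A", "U"}, {"C", "G"})

def opt_recursive(rna, i, j, min_loop=4):
    """Max number of pairs in rna[i..j], recomputed recursively."""
    if j - i <= min_loop:
        return 0                                  # too short to hold a pair
    # Case 1: position j is unpaired.
    best = opt_recursive(rna, i, j - 1, min_loop)
    # Case 2: pair some t with j (no sharp turns: j - t > min_loop).
    for t in range(i, j - min_loop):
        if can_pair(rna[t], rna[j]):
            left = opt_recursive(rna, i, t - 1, min_loop) if t > i else 0
            best = max(best,
                       1 + left + opt_recursive(rna, t + 1, j - 1, min_loop))
    return best
```

Timing this against the matrix version (for example with `time.perf_counter`) on growing strings makes the difference visible: the recursive version re-solves the same intervals over and over, while the table fill stays polynomial.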

Â 5:32

Of course, with a dynamic programming implementation there is always an issue with memory, because we have to build that matrix, okay? Usually the memory is not a big deal when we are looking at small strings, because in this case it's going to be on the order of n squared. But if n gets very, very large, dynamic programming approaches start suffering because of the memory issue. They are still efficient in terms of time, but the amount of memory they need to store becomes very large, and that becomes the bottleneck, okay?

One final word about this algorithm for RNA secondary structure, with respect to dynamic programming: remember that the biologist was not interested in the number. The biologist didn't ask me for the number of pairs that are going to match; the biologist wanted the actual pairs, because they wanted the actual structure. So how would I get the actual pairs themselves, rather than just the number? Because when I say return OPT(1,n), this can be, like, the number five. But the biologist is interested not in five; the biologist is interested in the five pairs that I produced.

That's very easy to obtain if you think about this algorithm carefully. When you look at OPT(1,n), it's the max of this or this. If this is the value that's giving us the max, then we know that n is not part of the solution, and we repeat the exercise on 1 to n minus 1. If this is the part that gives us the max, then we find the t that is maximizing this value. If t is, let's say, 7 and n is 100, then we tell the biologists that 7 and 100 are going to be one of the pairs. Then we repeat the exercise and find the other pairs. Okay?

So this is a technique that's called backtracking, or traceback. When we compute answers using dynamic programming, usually we are computing a number that is related to the solution, which is usually the optimality score; here it's the maximum number of pairs, or something like that.

And the same holds in other algorithms: usually dynamic programming algorithms are implemented in such a way that the entries of the matrix contain the optimality score. But using the traceback technique, we can find the actual solution, not just its score, okay?
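Here is one way that traceback could look for this problem (a sketch under my own naming and pairing assumptions; it fills the matrix with the same recurrence and then replays, at each cell, which case achieved the maximum):

```python
# Sketch (mine, not the lecture's): fill the matrix, then trace back to
# recover the actual pairs, not just the optimality score.

def can_pair(a, b):
    # Watson-Crick pairs only -- a simplifying assumption.
    return {a, b} in ({"A", "U"}, {"C", "G"})

def fill(rna, min_loop=4):
    """Nussinov-style fill: opt[i][j] = max pairs in rna[i..j]."""
    n = len(rna)
    opt = [[0] * n for _ in range(n)]
    for length in range(min_loop + 1, n):
        for i in range(n - length):
            j = i + length
            best = opt[i][j - 1]
            for t in range(i, j - min_loop):
                if can_pair(rna[t], rna[j]):
                    left = opt[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + opt[t + 1][j - 1])
            opt[i][j] = best
    return opt

def traceback(opt, rna, i, j, min_loop=4):
    """Recover the pairs behind opt[i][j] by replaying the recurrence."""
    if j - i <= min_loop:
        return []
    # Case 1: j is unpaired -- move to the sub-problem (i, j-1).
    if opt[i][j] == opt[i][j - 1]:
        return traceback(opt, rna, i, j - 1, min_loop)
    # Case 2: find a t that pairs with j and achieves the maximum,
    # then recurse on the two independent sub-intervals.
    for t in range(i, j - min_loop):
        if can_pair(rna[t], rna[j]):
            left = opt[i][t - 1] if t > i else 0
            if opt[i][j] == 1 + left + opt[t + 1][j - 1]:
                pairs = [(t, j)]
                if t > i:
                    pairs += traceback(opt, rna, i, t - 1, min_loop)
                pairs += traceback(opt, rna, t + 1, j - 1, min_loop)
                return pairs
    return []
```

The traceback does no new optimization: at every cell it only re-checks which of the already-computed cases produced the stored score, so it runs in O(n²) on top of the filled matrix.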

And again, for the traceback, remember the matrix we built: OPT(1,n) was here; this is OPT(1,n).
