0:00
[SOUND]. In this segment, I want to demonstrate
why it's important to avoid doing recursive computations repeatedly and I
want to show you that let expressions are the natural way to avoid that problem.
So, to do this, I'm going to use a small example here.
So let me show you what this code is doing.
it's, it's a bad example. It's not going to work well, so I've
called it bad max. And the idea is to take a list of
integers, and return the maximum number, in the list.
Now, that doesn't make any sense if the list is empty.
So, in that case, I really ought to do something like raise an exception,
or in the next segment, I'll show you another option for how to deal with that.
But since that's not my point here if the list is empty, I'll just return to zero.
Now this is terrible style, but it's not my point and, its not what's bad about
this function that I'm emphasizing here, okay?
So ignore that case. Otherwise, if the element, if the list
has just one element, so the tail is null,
then return that one element, the head of the one element list.
That's the maximum of one element list. Otherwise, if the head is greater than
the max of the rest of the list, return the head.
Otherwise, turn, return the max of the tail of the list.
So this actually ignoring the empty list case does work correctly, okay,
and yet it's still a horrible algorithm. And to show you that I'm going to use a
couple of helper functions I wrote, count up and count down.
So here they are in the file. We don't need to go through them they
just basically include all the numbers from the to inclusive either from smaller
to larger, or from larger to smaller. So just know that when I come here and
test out bad max here. So, here I am in my REPL,
and let's get everything loaded up here. I also already have it fixed, so you're
going to see a good max here as well, that also has type int list arrow int.
So we're still working on bad max. So suppose I do bad max countdown.
let's say from 30 to one. Okay?
So it turns out the maximum of that is 30,
and sure enough, if I said 30 to two, it would still be 30.
And of course, this is a perfectly good function,
I could make a 30,000 element list, and ask the max of it.
Computers are fast these days. I get 30,000, no trouble.
Okay? Now, what if I tried counting up instead?
So, you know, if I want to count up from one to ten,
that's fine. The max is ten.
One to twenty, it's fine. I did test this out ahead of time.
Watch this. 25,
you see a little delay there, just a little bit of one.
How about 28? A little bit more of one.
if you tried 30,000 you would be here for a very, very long time.
In fact even at 30, it seems to have quite a delay.
In fact, I don't even want to wait for it to finish.
Although, we'll finish here in a few seconds.
So what's going on here? Let's look back at our code.
It's that count up-case that isn't going well.
So lets think about what happens when the bigger number's are near the end of the
list. So we come in here, we have like a 30
element list. It's not empty, so we don't do list then
zero. It's not almost empty, so we don't do
this. So what we end up doing, is computing
what's the maximum of a 29 element list. Well that's no problem.
That'll eventually come back and say 29. and then, we'll say oh, it's, sorry, it's
30 because we have 2 through 30. So then when we come back and we have
that 30, we'll end up computing bad max again.
So, naively, you might think oh, so the count up version is going to twice as
slow as the count down version. Well, no that's not the case.
Because in that recursive call, where we have a 29 element list, we're going to
end up calling bad max twice, on 28 element lists.
But if we call it one twice on a 29 element list and each of those calls it
twice on the twenty-eighth element list and each of those call it twice on the 27
element list you can see this is actually doubling at each level so its not twice
as many calls its actually exponentially more calls and that is a huge difference.
So let me try to make this point a little better with some slides.
here is our code here And don't worry about the calls to null
and head and tail. It turns out those all do just a little
bit of work. They take the beginning of the list, and
they either just say whether it's empty or not, or return the first element or
just return a reference to the rest of the list.
So those are all fast. It's those recursive calls that bad max,
we have to be careful about. And here's what it looks like.
In the countdown case, bad max of 50 49 48 calls bad max of 49 48 which calls bad
max of 48 and so on. And if we had a 50 element list, we would
end up doing about 50 calls. But in the count upcase.
We end up, way if we start with our list making two calls.
Right, once for in the, between the if and the then, and once in the else.
And each of those replace the word next two calls and each of those replace the
next two calls. So it looks like one, two, four, eight.
If you ever double the number one 50 times, you'd get an enormous number,
which is why even for 30 we saw a noticeable pause before we finished.
I also want to point out that counting up here is not just the only case that has
this problem. If you just have the biggest number,
more then 50 elements from the beginning of your list, it's going to take forever
to compute. So if you had a 3,000 element list, yes
count down is very fast. But even if those numbers were randomly
shuffled, chances are the maximum element is not going to be close enough to the
beginning of the list for it to compute efficiently.
Okay so it looks like ML has a problem, right?
It looks like recursion is terrible, we should have used a four loop or
something. Of course that's not how I feel, what we
just need to do is avoid doing these repeated computations.
And we know how to avoid doing repeated computation. What we can do is do that
computation and store the result in the variable.
And this is the last use I'm showing in this section of the course of led
expressions, and why they're so important.
So all I want to do is avoid computing bad max twice by remembering its answer.
In a variable. So that's what good max does.
So I really shouldn't call it good max because it's still doing the empty list
case completely wrong. But ignoring that, for nonempty lists it
does the right thing. So if the tail of the list is empty, then
just return the head of the list. In the else case, compute the maximum of
the rest of the list and remember it. Here I've just stored a variable tail
ans, just to remember what it is. And now, let's use that.
They say if the head of the list is greater than tail ants, then return head.
Otherwise don't call good max again, just return tail ends and this will work
absolutely fine. So if I just try good max, oh look my
answer for 30 did finish. Good max I get 30 right away and I can
even count up to 3,000, it will work just fine.
Countdown still works correctly as well. It didn't actually need this, because in
the countdown case, well, it's taking a minute here I think
just because I've created such a large list.
We'll look into that later and I'll fix the code up for you.
Let's go back to the slides. Oh, did I oh I know what's going on here.
I did figure it out. By the way, control C control C to break
an infinite loop. When you call countdown, you're supposed
to put the larger number first. if you do it the other way, countdown
itself is going into an infinite loop because of how I wrote it.
Good backs never got called. There we go all fixed.
Okay back to the slides always fun to mess up on the fly.
Alright. So, why is that happening?
Why is bad max taking so much longer? Well, you can just work out the
arithmetic. Suppose that the amount of work bad max
does, ignoring the recursion, is say, takes one ten millionth of a second.
I have no idea if that's anywhere close to right.
But one ten millionth of a second seems pretty fast.
Right? Well, then in the case where we made 50
recursive calls, it's going to finish in 50 one millionth, ten millionths of a
second. Right.
Or half a millionth of a second, faster than humans can tell.
But if we do it in the count up order. We have to multiply that one in ten
million, by 2 to the 50th. And if you work out the arithmatic, that
answer for a 50-element list, is going to take about three and a half years, and
for a 55-element list it would take over a century.
So I always like to emphasize that it's not about computers getting faster, if
you don't write you code in an efficient way.
And in a language that encourages recursive functions, you need to avoid
doing unnecessary recursion that leads to this exponential blowup where you go one,
two, four, eight, sixteen. Okay.
Once we've fixed our code like you see here, then everything works great,
because we only ever call good max once on the whole list, then the tail of the
list, then the tail of that list, and so on, thanks to our let expression.
So in both the count up case and the count down case and in fact any other
case, any order of elements in any list, we will just have 50 calls so if we guess
that it takes one ten millionth of a second to do one of these calls, it's
going to take 50 ten millionths of a second or one millionth of a second to do
all of them. Okay, so, that is how to use lead
expressions combined with recursion to avoid pathological cases where you're
extremely inefficient, and that concludes our discussion of lead expressions for
now.