0:13

We have a full scientific understanding of the properties of these algorithms, and they've been developed as practical system sorts and application sorts that have been heavily used over the past 50 years. In fact, Quicksort, which we'll consider next time, was honored as one of the top 10 algorithms of the 20th century in science and engineering.

0:57

The idea is very simple. What we're going to do is divide an array into two halves, recursively sort each of the halves, and then merge the results. That's the overview of Mergesort.

1:18

John von Neumann realized that the EDVAC, one of the first general-purpose computers, was going to need a sorting method, and he came up with Mergesort. He's widely credited as being the inventor of Mergesort.

1:49

So we've got an array a whose first half is sorted and whose second half is sorted, and the computation we need to perform is to replace that with the sorted array where those two sorted halves are merged together. Let's look at a demo. The method that we're going to use is based on taking an auxiliary array to hold the data; this is one of the easiest ways to implement the merge. So the first thing we do is copy everything over to the auxiliary array.

2:27

Now, once that's done, what we want to do is copy back to the original array to get it in sorted order. In order to do that, we're going to maintain three indices: i, the current entry in the left half; j, the current entry in the right half; and k, the current entry in the sorted result. So the first thing we do is compare the two entries pointed to by i and j, take the smaller one, and move it to be the next item output. And whichever one is taken, we increment its pointer.

3:10

Now we compare the minimum again; this time the one pointed to by j is smaller, so we move that one to position k, increment the pointer j, and also increment k.

3:50

Now the one pointed to by i, the G, is smallest, so we move that one and increment i and k. We move the M up and increment i and k. Now the last element in the left subarray is the one that gets moved next, and at that point the first subarray is exhausted, so really all we need to do is take the rest of the elements from the right part and move them back in.

4:22

That's an abstract in-place merge: we take the two sorted halves of an array, move them out to an auxiliary array, and then put them back in sorted order. All right, so here's the code for merging, which is quite straightforward from the demo.

4:42

To sort an array of Comparables in this implementation, we pass a reference to the auxiliary array in as well. And we have three arguments: lo, mid, and hi. lo is the beginning of the part of the array to be merged, and mid is the midpoint that divides the first half from the second. So our preconditions are that from lo to mid is sorted, and from mid+1 to hi is sorted.

5:21

And then that sets up for the for loop that accomplishes the merge. We start our i pointer at the left end of the left half, the j pointer at the left end of the right half (that's mid+1), and we start the k pointer at the beginning, at lo. And for every value of k, what we're most often doing is comparing whether aux[j] is less than aux[i].

5:56

If aux[j] is less, we move it over and increment j; otherwise, we move aux[i] over and increment i. In both cases we increment k, and that implements the merge. If the i pointer is exhausted, then we just move over the next jth element; if the j pointer is exhausted, we move over the next ith element. So every time we're moving a new element into position k, and that's the code that implements the abstract in-place merge. Now, with this code we're also introducing the idea of making assertions, just to make it easier to debug our code and to have confidence that it's correct.
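As a sketch, the merge just described might look like this in Java. The class name and the less() and isSorted() helpers here are stand-ins for the course's actual code, not quoted from it:

```java
import java.util.Arrays;

public class Merge {
    private static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            if (less(a[i], a[i - 1])) return false;
        return true;
    }

    // Merge sorted runs a[lo..mid] and a[mid+1..hi], using aux[] as scratch.
    public static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        assert isSorted(a, lo, mid);       // precondition: left half sorted
        assert isSorted(a, mid + 1, hi);   // precondition: right half sorted

        for (int k = lo; k <= hi; k++)     // copy everything to aux[]
            aux[k] = a[k];

        int i = lo, j = mid + 1;           // i: left run, j: right run
        for (int k = lo; k <= hi; k++) {   // k: next slot in the result
            if      (i > mid)              a[k] = aux[j++]; // left exhausted
            else if (j > hi)               a[k] = aux[i++]; // right exhausted
            else if (less(aux[j], aux[i])) a[k] = aux[j++]; // take the smaller entry
            else                           a[k] = aux[i++];
        }

        assert isSorted(a, lo, hi);        // postcondition: whole range sorted
    }

    public static void main(String[] args) {
        Integer[] a = {2, 5, 7, 1, 3, 6};  // two sorted halves
        merge(a, new Integer[a.length], 0, 2, 5);
        System.out.println(Arrays.toString(a));  // [1, 2, 3, 5, 6, 7]
    }
}
```

Copying out first and merging back keeps the loop body to one comparison and one move per output position.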

6:40

In this case, the assertion just says that we want to be sure that a[lo..mid] is sorted and that a[mid+1..hi] is sorted before our code runs, and then we want to check that the whole thing is sorted after our code runs.

6:59

And generally, Java programmers know that it's a good idea to use these assertions. Not only do they help detect bugs, they also document what the code is supposed to do. And that merge code is a good example of this. If you put at the beginning of the code what you expect, in the form of an assertion, which is code itself, and you put at the end of the code what you think it's going to do, again in the form of an assertion, you're both testing that those conditions hold and telling someone reading the code what you're trying to do.

7:36

So in Java it's just an assert statement. It takes a boolean condition. In this case, we're using that isSorted() method that we saw before, which returns true if the array is sorted and false if it's not. And what assert will do is throw an exception unless that condition is true.
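For instance, here's a bare-bones use of assert with a hypothetical isSorted() helper; the check only fires when assertions are enabled at runtime:

```java
public class AssertDemo {
    // Hypothetical helper: true if a[] is in ascending order.
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++)
            if (a[i] < a[i - 1]) return false;
        return true;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 3, 7};
        // Throws AssertionError if the condition is false, but only
        // when assertions are enabled (java -ea AssertDemo).
        assert isSorted(a) : "array is not sorted";
        System.out.println("done");
    }
}
```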

7:58

Now, the thing about assertions in Java is that you can enable or disable them at runtime. And that's really important, because it means you can put them into your code to check things while you're developing, but they don't incur any extra cost at all in production code. So by default, assertions are disabled, and you enable them with the -ea flag while developing, in case something goes wrong.

8:32

So the best practice is to use assertions just as we did in that example with merge, and to assume that they're not going to be there in production code. You shouldn't use them for things like checking whether the input is the way you expect it. All right, so with that merge implementation, the sort implementation is quite a simple recursive procedure, shown here.

8:57

So we use the merge procedure we just showed in our sort procedure. It's recursive, so it first checks whether there's anything to do. Then it computes the value of the midpoint, the same way we did for binary search, sorts the first half, sorts the second half, and then merges them together.
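Putting the pieces together, a minimal sketch of the recursive sort might look like this (the merge body is repeated so the sketch stands alone; identifiers are assumptions):

```java
import java.util.Arrays;

public class MergeSort {
    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)                      a[k] = aux[j++];
            else if (j > hi)                       a[k] = aux[i++];
            else if (aux[j].compareTo(aux[i]) < 0) a[k] = aux[j++];
            else                                   a[k] = aux[i++];
        }
    }

    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        if (hi <= lo) return;            // nothing to do
        int mid = lo + (hi - lo) / 2;    // same midpoint as binary search
        sort(a, aux, lo, mid);           // sort the first half
        sort(a, aux, mid + 1, hi);       // sort the second half
        merge(a, aux, lo, mid, hi);      // merge them together
    }

    public static void sort(Comparable[] a) {
        Comparable[] aux = new Comparable[a.length];  // created once, outside the recursion
        sort(a, aux, 0, a.length - 1);
    }

    public static void main(String[] args) {
        Integer[] a = {5, 2, 4, 7, 1, 3, 2, 6};
        sort(a);
        System.out.println(Arrays.toString(a));  // [1, 2, 2, 3, 4, 5, 6, 7]
    }
}
```

Note that the auxiliary array is allocated once in the public method, not inside the recursion.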

9:30

Now, it's important not to create the auxiliary array in the recursive routine, because that could lead to the extensive cost of extra array creation, and you'll sometimes see Mergesort performing poorly because of that bug. Otherwise this is a very straightforward implementation. And it's actually a prototype for an algorithm design paradigm that we'll see come up again and again, called divide and conquer: solve a problem by dividing it into two halves, solving the two halves, and then putting the solutions together to get the appropriate answer.

10:10

Here's a trace of what Mergesort does, and if you haven't studied a recursive program before, it's worthwhile studying this in some detail. It shows exactly what happens during each of the calls to merge. We start out with a big problem to solve, but we divide it in half, then we divide that in half, and then we divide that in half again. And the very first thing we actually do is just compare, and exchange if necessary, the first two elements. Then we do the same thing for the next two elements, then merge those two pairs together to get the first four done. And then we do the same thing for the next four in the array. So now we have two sorted subarrays of size four, and we merge those together to get one of size eight. Then we do the same thing on the right, and eventually we have two eights that we merge together to get the final result. It's very instructive to study this trace to really understand what this recursive algorithm is doing.

11:59

So you can run Mergesort on huge problems. It's a very efficient algorithm. For example, what this table shows is that if you were to try to use insertion sort for a huge file, say a file with a billion elements, on your PC it'd take a few centuries to finish.

12:23

Even on a supercomputer, if you're using insertion sort, nowadays it'd maybe take a week or more. But if you have a good algorithm like Mergesort and you're trying to do a billion items, you can do it in less than half an hour on your PC, and a supercomputer can do it in an instant. And smaller problems take only an instant, even on your PC. So you can spend a lot of money or a lot of time, or you can use a good algorithm. And that's one of our main themes in this course: a good algorithm is much more effective than wasting money or time on a bad one.

13:06

So let's look at the analysis of Mergesort. That's a bit of math, but it's very instructive, because it really shows the power of the divide-and-conquer method: it allows us to take a problem that was taking quadratic time with methods like insertion and selection sort, and get it done in N log N time with Mergesort. So here's the proposition: Mergesort uses at most N lg N compares and 6 N lg N array accesses to sort any array of size N.

13:42

And the way to prove this proposition is, from examining the code, to write down what's called a recurrence relation. And all that is is a mathematical reflection of what's going on in the code. If we're sorting N items, then let C(N) denote the number of compares that we need to sort the N items.

14:08

In order to get that done, we're sorting the left half and the right half. The notation ceiling of N/2 and floor of N/2 means N/2 rounded up and N/2 rounded down; those are the sizes of the two subarrays. And we're going to call the same routine for each of those sizes, so the number of compares we need for that is C(ceiling of N/2) for the left and C(floor of N/2) for the right. And then for the merge, we need at most N compares.

14:41

If neither half is exhausted early, we need exactly N compares. And that's true as long as N is bigger than 1; if there's only one item, we're not doing any compares at all. So this is a mathematical formula that we derive by examining the code, but it completely describes, mathematically, an upper bound on the number of compares that are going to be needed. And similarly for the number of array accesses: if you count up the number of times you access an array during a merge, it can be at most 6N.
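Written out, the two recurrences just described, C(N) for compares and A(N) for array accesses, are:

```latex
\begin{aligned}
C(N) &\le C\bigl(\lceil N/2\rceil\bigr) + C\bigl(\lfloor N/2\rfloor\bigr) + N
      && \text{for } N > 1, \quad C(1) = 0,\\[2pt]
A(N) &\le A\bigl(\lceil N/2\rceil\bigr) + A\bigl(\lfloor N/2\rfloor\bigr) + 6N
      && \text{for } N > 1, \quad A(1) = 0.
\end{aligned}
```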

15:40

So suppose we have a recurrence like this, which is similar to the ones we're talking about; it's exactly the same when N is a power of 2. Let's look at this one: if D(N) = 2 D(N/2) + N, with D(1) = 0, then D(N) = N lg N. We'll look at three proofs of that, just assuming that N is a power of 2. If N is a power of 2, then N/2 is also a power of 2, so the recurrence makes sense.

16:11

So this is just a graphical representation. If we want to compute D(N), we have to compute D(N/2) twice, and the extra cost for the merge is N. Since we do that twice, we have two N/2s, and then each of those divides into N/4s, and each of those four N/4s has an extra cost for its merge of N/4. Well, 2 times N/2 is N, 4 times N/4 is N, and we keep going down like that until we get to D(2); at every level, the extra cost for the merges adds up to N. And how many levels do we have? It's the number of times you divide N by 2 to get down to 2, and that's exactly log base 2 of N. So the grand total of all the costs for the merges, which is where the compares are, is lg N times N: N lg N. It's a kind of graphical proof, a proof by picture, that the recurrence has that solution.
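In symbols, the picture adds up a cost of N at each of the lg N levels:

```latex
D(N) \;=\; N + 2\cdot\frac{N}{2} + 4\cdot\frac{N}{4} + \cdots
     \;=\; \sum_{k=0}^{\lg N - 1} 2^{k}\cdot\frac{N}{2^{k}}
     \;=\; \underbrace{N + N + \cdots + N}_{\lg N \text{ levels}}
     \;=\; N\lg N .
```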

17:32

The second proof divides both sides of the recurrence by N, which gives D(N)/N = D(N/2)/(N/2) + 1. Now this is a recurrence that telescopes: the first term on the right-hand side has exactly the same form as the left-hand side, so we can apply the same formula to it. All that does is divide by 2 again and throw out another 1. We keep doing that until we get down to D(1), which is 0, and by then we've thrown out lg N ones. So we get D(N)/N = lg N, or D(N) = N lg N. That's a proof by expansion.
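The telescoping steps, written out under the assumption that N is a power of 2:

```latex
\begin{aligned}
\frac{D(N)}{N} &= \frac{D(N/2)}{N/2} + 1 \\
               &= \frac{D(N/4)}{N/4} + 1 + 1 \\
               &\;\;\vdots \\
               &= \frac{D(1)}{1} + \underbrace{1 + 1 + \cdots + 1}_{\lg N \text{ ones}}
                \;=\; \lg N ,
\end{aligned}
```

so \(D(N) = N \lg N\).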

18:14

Or, using either of those techniques, you could just get the idea that D(N) is close to N lg N; or you can write a program to expand the recurrence and discover it. Then, once we have the idea that D(N) = N lg N, we can plug back into the original formula. With the inductive hypothesis that D(N) = N lg N, we want to show that D(2N) = 2N lg 2N, using the recurrence D(2N) = 2 D(N) + 2N. Plugging in N lg N, we get the desired result.
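The induction step, written out:

```latex
D(2N) \;=\; 2\,D(N) + 2N
      \;=\; 2\,N\lg N + 2N
      \;=\; 2N(\lg N + 1)
      \;=\; 2N\lg 2N ,
```

since \(\lg 2N = \lg N + 1\).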

We use the same idea on our initial recurrences for compares and array accesses to show that the number of compares and array accesses is proportional to N lg N for Mergesort. So that's the running time: Mergesort is fast. The other thing we usually want to know about is memory, and one of Mergesort's characteristics is that, in practical applications, it uses extra space proportional to N. That is, we need that extra auxiliary array for the last merge.

19:51

Insertion sort, selection sort, and shellsort are in-place: they don't use any extra memory. But with Mergesort, you can only sort half of what you can fit in memory, because you need that auxiliary array for the other half. Again, if you think the things we're studying are easy, think about the idea of actually doing an in-place merge. People have come up with methods for getting this done, so it's theoretically possible, but the methods are generally too complex to be useful in practice, and they're not used. But there could be, out there, some easy way of doing an in-place merge. That's another great algorithm waiting to be discovered.

Â Now there's a, a number of practical improvements that we can

Â use to make Mergesort even more efficient than the simple one

Â that we've looked at and we'll take a look of those

Â because they're examples of techniques that we can use for other algorithms.

Â First thing is that Mergesort is too complicated to use for tiny arrays.

Â So say the subarrays are only of two, or three, or four there's too

Â much overhead with the recursive calls and so forth to get that done efficiently.

Â And what's worse is, the recursive nature of the sort definitely

Â means that there's going to be lots of subarrays to be sorted.

Â So, one improvement that we can make is to use insertion sort, and just

Â cut off and use insertion sort which is simple and efficient for small subarrays.

Â So that's adding this one line of code to Mergesort will make it quite a bit faster.

Â Maybe 20% faster.
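As a sketch, the cutoff might look like this; the value CUTOFF = 7 and the insertionSort() helper are assumptions for illustration, not the course's exact code:

```java
public class MergeCutoff {
    private static final int CUTOFF = 7;   // assumed threshold, tuned empirically

    // Hypothetical insertion sort on the subarray a[lo..hi].
    private static void insertionSort(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            for (int j = i; j > lo && a[j].compareTo(a[j - 1]) < 0; j--) {
                Comparable t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
            }
    }

    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)                      a[k] = aux[j++];
            else if (j > hi)                       a[k] = aux[i++];
            else if (aux[j].compareTo(aux[i]) < 0) a[k] = aux[j++];
            else                                   a[k] = aux[i++];
        }
    }

    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        if (hi <= lo + CUTOFF - 1) {   // the added cutoff test:
            insertionSort(a, lo, hi);  // small subarray -> insertion sort
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    public static void sort(Comparable[] a) {
        sort(a, new Comparable[a.length], 0, a.length - 1);
    }
}
```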

21:29

The second improvement, which improves performance in cases where the array is partly sorted, is to just stop if it's already sorted. That happens whenever the biggest element in the first half is less than or equal to the smallest item in the second half; that means the merge is done. So that's easy: we just put a test for that in the recursive Mergesort, one line of code, to check whether we're done. That way, for example, if you were to call Mergesort on an array that's already in order, it would just do this test every time and be done in linear time. That's pretty helpful; not totally helpful, but there are a lot of situations where it helps.
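A sketch of that one-line test, added just before the merge call (the class name is hypothetical):

```java
public class MergeSkip {
    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)                      a[k] = aux[j++];
            else if (j > hi)                       a[k] = aux[i++];
            else if (aux[j].compareTo(aux[i]) < 0) a[k] = aux[j++];
            else                                   a[k] = aux[i++];
        }
    }

    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        // The one-line test: if the biggest item of the first half is
        // already <= the smallest item of the second half, skip the merge.
        if (a[mid].compareTo(a[mid + 1]) <= 0) return;
        merge(a, aux, lo, mid, hi);
    }

    public static void sort(Comparable[] a) {
        sort(a, new Comparable[a.length], 0, a.length - 1);
    }
}
```

On an already-sorted input, every merge is skipped, so the sort runs in linear time.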

22:18

The other thing that's possible to do, and it's a little mind-bending, so it's recommended only for experts, is to save a little bit of time by not actually copying over into the auxiliary array. You can switch the roles of the input array and the auxiliary array every time you make a recursive call. You still need that second array, but you can set up the code so that, to sort an array, you put the result in the other one, and to merge, you put the result back in the first one. So it's recursive argument switchery that gets the job done. And it's effective: it means you don't have to actually move items, and that saves a little bit of time.
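A sketch of the role-switching version, for the experts. The invariant is that src[] and dst[] start with the same contents, which the initial clone() establishes; the class name is hypothetical:

```java
public class MergeX {
    // Merge sorted runs src[lo..mid] and src[mid+1..hi] into dst[lo..hi];
    // no copy step is needed, because we read from src and write to dst.
    private static void merge(Comparable[] src, Comparable[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)                      dst[k] = src[j++];
            else if (j > hi)                       dst[k] = src[i++];
            else if (src[j].compareTo(src[i]) < 0) dst[k] = src[j++];
            else                                   dst[k] = src[i++];
        }
    }

    // Sorts src[lo..hi], leaving the result in dst[lo..hi]; each recursive
    // call switches the roles of the two arrays.
    private static void sort(Comparable[] src, Comparable[] dst, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(dst, src, lo, mid);       // roles switched: sorted halves land in src
        sort(dst, src, mid + 1, hi);
        merge(src, dst, lo, mid, hi);  // merge the halves from src into dst
    }

    public static void sort(Comparable[] a) {
        Comparable[] aux = a.clone();  // one initial copy makes both arrays agree
        sort(aux, a, 0, a.length - 1); // sorted result ends up back in a[]
    }
}
```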

23:06

So here's a visualization of what the practical Mergesort might look like, this one with a big cutoff to small subfiles, so you get a visual feeling for how the sort gets the job done. It does the first little chunk, then the next little chunk, then merges those together, and so forth. It's a good visual representation of how Mergesort gets its job done. That's the basic Mergesort algorithm; we're going to look at different versions of it next.