0:00

So far we've developed a divide and conquer approach to count the number of inversions of an array. We split the array into two parts and recursively count inversions on the left and on the right. We've identified the key challenge as counting the number of split inversions quickly, where a split inversion means that the first index lies in the left half of the array and the second index lies in the right half. These are precisely the inversions that are going to be missed by both of our recursive calls. And the crux of the problem is that there might be as many as a quadratic number of split inversions, yet to get the running time we want, we need to count them in linear time.
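For contrast, here is the obviously correct but quadratic-time baseline we're trying to beat (a minimal Python sketch; the function name is my own):

```python
def count_inversions_brute_force(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] in O(n^2) time."""
    count = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            if a[i] > a[j]:
                count += 1
    return count

# The running example from this lecture has 3 inversions:
# (3,2), (5,2), and (5,4).
print(count_inversions_brute_force([1, 3, 5, 2, 4, 6]))  # → 3
```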

So here is the really nice idea which is going to let us do that. The idea is to piggyback on merge sort, by which I mean we're actually going to demand a bit more of our recursive calls to make the job of counting the number of split inversions easier. This is analogous to a proof by induction, where sometimes making the inductive hypothesis stronger is what lets you push through the inductive proof. So we're going to ask our recursive calls to not only count inversions in the array that they're passed, but also, along the way, to sort the array. And hey, why not? We know sorting is fast. Merge sort will do it in O(n log n) time, which is the running time we're shooting for, so why not just throw that in? Maybe it'll help us in the combine step, and as we'll see, it will.

So why is this a good idea? Why should we demand more of our recursive calls? Well, as we'll see in a couple of slides, the merge subroutine almost seems designed just to count the number of split inversions. As you merge two sorted subarrays, you will naturally uncover all of the split inversions.

Â 1:30

So let me be a little more clear about how our previous high-level algorithm is going to be souped up so that the recursive calls sort as well. Here is the high-level algorithm we proposed before, where we just recursively counted inversions on the left side and on the right side, and then we had some currently unimplemented subroutine, CountSplitInv, which is responsible for counting the number of split inversions.

We're going to augment this as follows: instead of being called Count, the algorithm is now going to be called SortAndCount. The recursive calls, again, just invoke SortAndCount, and so now each of them will not only count the number of inversions in its subarray, but also return a sorted version of it. From the first recursive call we get back an array B, the sorted version of the array that we passed it, and from the second recursive call we get back a sorted array C, the sorted version of the array that we passed it. And now CountSplitInv, in addition to counting split inversions, is responsible for merging the two sorted subarrays B and C. So CountSplitInv will be responsible for outputting an array D, which is a sorted version of the original input array A. And I should also rename our unimplemented subroutine to reflect its now more ambitious agenda: we'll call it MergeAndCountSplitInv.
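The recursive structure just described can be sketched as follows (a Python sketch; since MergeAndCountSplitInv hasn't been developed yet at this point in the lecture, it is stubbed here with a deliberately slow but correct placeholder so the skeleton runs):

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    n = len(a)
    if n <= 1:
        return list(a), 0
    b, left = sort_and_count(a[:n // 2])    # recurse on left half
    c, right = sort_and_count(a[n // 2:])   # recurse on right half
    d, split = merge_and_count_split_inv(b, c)
    return d, left + right + split

def merge_and_count_split_inv(b, c):
    """Placeholder: count split inversions the slow way and merge via sorted().
    The linear-time version is developed later in the lecture."""
    split = sum(1 for y in c for x in b if x > y)
    return sorted(b + c), split

print(sort_and_count([1, 3, 5, 2, 4, 6]))  # → ([1, 2, 3, 4, 5, 6], 3)
```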

Â 3:10

So you should again, at this point, have the question: why are we doing this? Why are we making ourselves do more work? And again, the hope is that counting split inversions somehow becomes easier by asking our recursive calls to do the additional work of sorting. To develop some intuition for why that's true, for why merging naturally uncovers the split inversions, let's recall the definition of the original Merge subroutine from merge sort.

Here's the same pseudocode we went through several videos ago; I've renamed the arrays to be consistent with the current notation. We're given two sorted subarrays, which come back from our recursive calls; I'm calling them B and C. They both have length n/2, and we're responsible for producing the sorted combination of B and C, an output array D of length n. Again, the idea is simple: you take the two sorted subarrays B and C and the output array D, which you're responsible for populating, and using an index k you traverse the output D from left to right; that's what the outer for loop does. You maintain pointers i and j into the sorted subarrays B and C, respectively. The only observation is that whatever the minimum element you haven't yet copied over to D is, it's got to be either the leftmost element of B that you haven't seen yet or the leftmost element of C that you haven't seen yet; with B and C sorted, the minimum element remaining has to be the next one available in either B or C. So you just proceed in the obvious way: you compare the two candidates for the next element to copy over. You look at B(i), you look at C(j), and whichever one is smaller, you copy over. The first part of the if statement handles the case where B contains the smaller one; the else branch handles the case where C contains the smaller one, okay? So that's how Merge works: you go down B and C in parallel, populating D in sorted order from left to right.
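That pseudocode translates to something like the following (a Python sketch, using 0-based indices rather than the lecture's 1-based ones, and explicitly handling the leftover elements when one subarray runs out, a detail the pseudocode glosses over):

```python
def merge(b, c):
    """Merge two sorted lists b and c into one sorted output list d."""
    d = []
    i = j = 0
    # Repeatedly copy over the smaller of the two front elements.
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            d.append(b[i])
            i += 1
        else:
            d.append(c[j])
            j += 1
    # One of the subarrays is exhausted; append whatever remains.
    d.extend(b[i:])
    d.extend(c[j:])
    return d

print(merge([1, 3, 5], [2, 4, 6]))  # → [1, 2, 3, 4, 5, 6]
```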

Now, to get some feel for what on earth any of this has to do with the split inversions of an array, I want you to think about an input array A with the following property: there are no split inversions whatsoever. So every inversion in this input array A is either a left inversion, where both indices are at most n/2, or a right inversion, where both indices are strictly greater than n/2. The question is: given such an array A, once you reach the merge step, what do the sorted subarrays B and C look like for an input array that has no split inversions?

Â 5:37

The correct answer is the second one: if you have an array with no split inversions, then everything in the first half is less than everything in the second half. Why? Well, consider the contrapositive. Suppose you had even one element in the first half which was bigger than some element in the second half; that pair of elements alone would constitute a split inversion, okay? So if you have no split inversions, then everything in the left half is smaller than everything in the right half of the array.

Now, more to the point, think about the execution of the Merge subroutine on an array with this property, an input array A where everything in the left half is less than everything in the right half. What is Merge going to do? Remember, it's always looking for whichever is smaller, the first element remaining in B or the first element remaining in C, and that's what it copies over. When everything in B is less than everything in C, all of B is going to get copied over into the output array D before C ever gets touched. Okay, so Merge has an unusually trivial execution on input arrays with zero split inversions: first it just goes through B and copies it over, then it just concatenates C. There's no interleaving between the two. So no split inversions means nothing is copied from C until it absolutely has to be, until B is exhausted. This suggests that, perhaps, copying elements over from the second subarray C has something to do with the number of split inversions in the original array, and that is in fact the case. We're going to see a general pattern: copies from the second array C to the output array expose split inversions in the original input array A.

So let's look at a more detailed example to see what that pattern is. Let's return to the example from the previous video, which is an array with six elements, ordered 1, 3, 5, 2, 4, 6. We do our recursive calls, and in fact the left half of the array is already sorted and the right half is already sorted, so no sorting needs to be done, and we get zero inversions from both of our recursive calls. Remember, in this example it turns out all of the inversions are split inversions. So now let's trace through the Merge subroutine invoked on these two sorted subarrays and try to spot a connection with the number of split inversions in the original six-element array.

We initialize indices i and j to point to the first element of each subarray; the left one is B, the right one is C, and the output is D. The first thing we do is copy the 1 over from B into the output array D, so 1 goes there, and we advance the index i over to the 3. Here, nothing really interesting happens; there's no reason to count any split inversions, and indeed the element 1 is not involved in any split inversions, because it's smaller than all the other elements and it's also at the first index.

Things are much more interesting when we copy over the element 2 from the second array C. Notice that at this point we have diverged from the trivial execution we would see on an array with no split inversions: now we're copying over something from C before we've exhausted B. So we're hoping this will expose some split inversions. We copy over the 2 and advance the second pointer j into C, and the thing to notice is that this exposes two split inversions, the two split inversions that involve the element 2: (3,2) and (5,2). So why did this happen? Well, the reason we copied the 2 over is that it's smaller than all the elements we haven't yet looked at in both B and C. So in particular, 2 is smaller than the remaining elements in B, the 3 and the 5. But also, because B is the left array, the indices of the 3 and the 5 have to be less than the index of the 2. So these are inversions: 2 is further to the right in the original input array, and yet it's smaller than these remaining elements in B. There are two elements remaining in B, and those give exactly the two split inversions that involve the element 2.

So now let's go back to the Merge subroutine and see what happens next. Well, next we make a copy from the first array, and as we've come to realize, nothing really interesting happens when we copy from the first array, at least with respect to split inversions. Then we copy the 4 over, and yet again we discover a split inversion, the remaining one, which is (5,4). Again, the reason is that since 4 was copied over before what's left in B, it's got to be smaller than what's left in B; but by virtue of coming from the right array, 4 has the bigger index, so it's got to be a split inversion. Now the rest of the Merge subroutine executes without any real incident: 5 gets copied over, and we know copies from the left array are boring; then we copy the 6 over, and copies from the right array are generally interesting, but not if the left array is empty, so that doesn't expose any split inversions. And you'll recall from the earlier video that these were exactly the inversions in the original array: (3,2), (5,2), and (5,4). We discovered them all in an automated way, just by keeping an eye out when we copy from the right array C.

Â 10:30

So this is indeed a general principle, and let me state the general claim. The claim is that, not just in this specific example or this specific execution, but no matter what the input array is and no matter how many split inversions there might be, the split inversions that involve an element of the second half of the array are precisely those formed with the elements remaining in the first array when that element gets copied over to the output array.

This is exactly the pattern we saw in the example. In the right array C we have the elements 2, 4, and 6. Remember, every split inversion has to, by definition, involve one element from the first half and one element from the second half. So to count the split inversions, we can just group them according to which element of the second array they involve. Out of the 2, 4, and 6: the 2 is involved in the split inversions (3,2) and (5,2), and the 3 and the 5 were exactly the elements remaining in B when we copied over the 2. The split inversion involving 4 is exactly (5,4), and 5 is exactly the element remaining in B when we copied over the 4. There are no split inversions involving 6, and indeed the array B was empty when we copied the 6 over to the output array D.

So what's the general argument? Well, it's quite simple. Let's zoom in and fixate on a particular element x that belongs to the first half of the array, that is, one of the first n/2 elements. And let's examine which y's, which elements of the second half of the original input array, are involved in split inversions with x. There are two cases, depending on whether x is copied over to the output array D before or after y. If x is copied to the output before y, then since the output is populated in sorted order, x has got to be less than y, so there's no split inversion. On the other hand, if y is copied to the output D before x, then again, because we populate D left to right in sorted order, y has got to be less than x. And x, still hanging out in the left array, has a smaller index than y, which comes from the right array, so (x, y) is a split inversion. Putting these two cases together: the elements x of the array B that form split inversions with y are precisely those that get copied to the output array after y, and those are exactly the elements remaining in B when y gets copied over. So that proves the general claim.

Â 12:48

So this slide was really the key insight. Now that we understand exactly why counting split inversions is easy as we merge together two sorted subarrays, it's a simple matter to translate this into code and get a linear-time implementation of a subroutine that both merges and counts the number of split inversions, which then, in the overall recursive algorithm, gives an n log n running time, just as in merge sort. So let's spend a quick minute filling in those details.

I'm not going to write out full pseudocode; I'm just going to describe how to augment the Merge pseudocode discussed a few slides ago in order to count split inversions as you're doing the merging. This follows immediately from the previous claim, which tells us how split inversions relate to the number of elements remaining in the left array as you're doing the merge. The idea is the natural one: as you're merging the two sorted subarrays according to the previous pseudocode, you just keep a running total of the number of split inversions that you encounter. You've got your sorted subarray B, you've got your sorted subarray C, and you're merging these into an output array D. As you traverse through D, with k going from 1 to n, you start the running total at zero and increment it by something each time you copy over an element from either B or C. So what's the increment? Well, what did we just see? Copies from B don't count; we don't pick up split inversions when we copy over from B, only when we copy from C, right? Every split inversion involves exactly one element from each of B and C, so we may as well count them via the elements of C. And how many split inversions involve a given element of C? Exactly the number of elements of B remaining when it gets copied over. So that tells us how to increment the running count, and it follows immediately from the claim on the previous slide that this running total is precisely the number of split inversions that the original input array A possesses.
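In code, the augmentation is essentially a one-line change to Merge: whenever an element is copied from C, add the number of elements still remaining in B (a Python sketch; the function name is mine):

```python
def merge_and_count_split_inv(b, c):
    """Merge sorted lists b and c; also count split inversions.

    Each time an element of c is copied out, every element still
    remaining in b is larger than it yet has a smaller original
    index, so each such pair is one split inversion.
    """
    d = []
    split_inv = 0
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            d.append(b[i])
            i += 1
        else:
            d.append(c[j])
            j += 1
            split_inv += len(b) - i  # elements of b remaining right now
    d.extend(b[i:])
    d.extend(c[j:])
    return d, split_inv

print(merge_and_count_split_inv([1, 3, 5], [2, 4, 6]))
# → ([1, 2, 3, 4, 5, 6], 3)
```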

And recall that the left inversions are counted by the first recursive call and the right inversions are counted by the second recursive call. Every inversion is either left, right, or split; it's exactly one of those three types. So with our three subroutines, the two recursive ones and this one here, we successfully count all of the inversions of the original input array. That's the correctness of the algorithm.

What's the running time? Recall that for merge sort, we began by analyzing the running time of Merge, and then we discussed the running time of the entire merge sort algorithm. Let's do the same thing here, briefly. So what's the running time of the subroutine that does this merging and simultaneously counts the number of split inversions? Well, there's the work that we do in the merging, and we already know that's linear. The only additional work is incrementing the running count, and that's constant time for each element of D, right? Each time we do a copy, we do a single addition to our running count. So that's constant time per element of D, or linear time overall.

Now, I'm being a little sloppy here, sloppy in a very conventional way, but still a little sloppy, by writing O(n) + O(n) = O(n). Be careful when you make statements like that: if you added O(n) to itself n times, it would not be O(n), but if you add O(n) to itself a constant number of times, it is still O(n). As an exercise, you might want to write out a formal version of what this means. Basically, there's some constant c1 so that the merge step takes at most c1·n steps, and a constant c2 so that the rest of the work takes at most c2·n steps. So when you add them, the total is at most (c1 + c2)·n steps, which is still O(n), because c1 + c2 is a constant, okay? So: linear work for the merge, linear work for the running count, and therefore linear work in the subroutine overall.

And now, by exactly the same argument we used for merge sort, because we have two recursive calls on inputs of half the size and linear work outside the recursive calls, the overall running time is O(n log n). So we really just piggybacked on merge sort, bumping up the constant factor a little bit to do the counting along the way, but the running time remains O(n log n).
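Putting the pieces together, the complete O(n log n) algorithm can be sketched as a single self-contained Python function (function name is mine; the merge-and-count step is inlined in the combine phase):

```python
def sort_and_count(a):
    """Return (sorted copy of a, total number of inversions in a)."""
    n = len(a)
    if n <= 1:
        return list(a), 0
    b, left = sort_and_count(a[:n // 2])
    c, right = sort_and_count(a[n // 2:])
    # Merge b and c, counting split inversions along the way:
    # each copy from c pairs it with every element still left in b.
    d, split = [], 0
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            d.append(b[i])
            i += 1
        else:
            d.append(c[j])
            j += 1
            split += len(b) - i
    d.extend(b[i:])
    d.extend(c[j:])
    return d, left + right + split

print(sort_and_count([1, 3, 5, 2, 4, 6])[1])  # → 3
```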
