0:16

So how do we go about implementing the API? The API says we should just be able to create a stack, and it should be able to grow and shrink to any size. So how are we going to grow and shrink the array?

0:28

Well, the first thing you might think of is: when the client pushes a new item onto the stack, increase the size of the array by 1, and when it pops, decrease the size of the array by 1. That's easy to code up, but it's not worth it, because it's much too expensive. The reason is that you have to create a new array, one size bigger, and copy all the items over to that new array.

0:50

So inserting item k would take time proportional to k: when the stack is of size k-1, you have to copy those k-1 items into the new array and then insert the new one. Inserting the first N items would therefore take the sum of the first N integers, which we know is about N²/2. Quadratic time to insert N items into a stack. That kind of performance is unacceptable for large problems, as we've seen and as we will see many times.

So the challenge is to do the resizing, but somehow ensure that it happens infrequently. The well-known technique for doing that, called repeated doubling, is this: when the array fills up, create a new array of twice the size and copy all the items over. Then we don't create new arrays all that often.

So here's an implementation of that. We start with an array of size 1. If we have a full stack, which we know by comparing n, the number of items in the stack, with the array length, then we just resize the array into one of twice the length before inserting the item. And how do we resize to a new capacity? We create a new array of that capacity, copy our current stack into the first half of it, and then return it.

2:24

And that will reset our instance variable, which is our stack, to this new, bigger array.
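In code, the push-with-doubling scheme just described looks roughly like this (a minimal sketch with illustrative names, not the lecture's exact code):

```java
// Resizing-array stack: double the array when it fills up.
public class ResizingArrayStackOfStrings {
    private String[] s = new String[1];  // start with an array of size 1
    private int n = 0;                   // number of items on the stack

    public void push(String item) {
        if (n == s.length) resize(2 * s.length);  // full: double before inserting
        s[n++] = item;
    }

    private void resize(int capacity) {
        // create a new array and copy the current stack into its first half
        String[] copy = new String[capacity];
        for (int i = 0; i < n; i++)
            copy[i] = s[i];
        s = copy;  // reset the instance variable to the new, bigger array
    }

    public int size()     { return n; }
    public int capacity() { return s.length; }

    public static void main(String[] args) {
        ResizingArrayStackOfStrings stack = new ResizingArrayStackOfStrings();
        for (int i = 0; i < 5; i++) stack.push("item" + i);
        // capacity has doubled 1 -> 2 -> 4 -> 8 along the way
        System.out.println(stack.size() + " " + stack.capacity());  // prints 5 8
    }
}
```

Starting from an array of length 1, five pushes trigger doublings to 2, 4, and 8, so the array ends up with capacity 8 while only 5 slots are occupied.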

So the consequence of this is: if you insert N items into a stack with this array representation, the time will be proportional to N, not N². The reason is that you only create a new array each time the size doubles, and by the time it doubles, you've already inserted that many items into the stack. So on average, it's like adding a cost of one to each operation. If you calculate the cost of inserting the first N items, instead of the sum of the integers from 1 to N, you have the sum of the powers of 2 up to N, and that gives a total cost of about 3N.

3:23

That's measured in array accesses: each copy takes two array accesses, a read and a write, so inserting N items takes about three array accesses per item.
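Tallied in array accesses (my own bookkeeping of the ~3N figure): one access to write each pushed item, plus two accesses, a read and a write, for each item copied during the doublings at sizes 1, 2, 4, ..., N/2:

```latex
\underbrace{N}_{\text{pushes}} \;+\; \underbrace{2\,(1 + 2 + 4 + \cdots + N/2)}_{\text{copies}} \;\approx\; N + 2N \;=\; 3N
```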

This plot is another way of looking at it: the number of array accesses taken as you perform push operations. Every time you hit a power of 2, you take that many array accesses, but in a sense, you've already paid for them by putting those items on the stack.

3:51

So that's called amortized analysis, where we consider the total cost averaged over all operations. And this is a fine and useful example of amortized analysis to get efficiency in a stack implementation. Now, what about pop? We have to think about how to shrink the array.

4:13

So you might think: well, we doubled it when it was full, why don't we cut it in half when it gets to be half full? We don't want the array to get too empty.

4:22

Well, that one doesn't quite work, because of a phenomenon called thrashing. If the client happens to alternate push-pop-push-pop when the array is full, then it's going to be doubling, halving, doubling, halving, creating new arrays on every operation. Each operation takes time proportional to N, and therefore the whole sequence takes quadratic time. So we don't want to do that.

4:51

The efficient solution is to wait until the array gets to be one-quarter full before you halve it. And that's very easy to implement: we just test whether the array is one-quarter full, and if it is, we resize it to half its length. At that point it's half full, so it can either grow by pushing items or shrink by popping them, but there won't be another resizing operation until it either gets completely full or one-quarter full again.
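A sketch of pop with that quarter-full rule (again with illustrative names; push and resize are as before):

```java
// Resizing-array stack that shrinks: halve the array when it is one-quarter full.
public class ShrinkingStack {
    private String[] s = new String[1];
    private int n = 0;

    public void push(String item) {
        if (n == s.length) resize(2 * s.length);  // full: double
        s[n++] = item;
    }

    public String pop() {
        String item = s[--n];
        s[n] = null;  // avoid loitering: let the garbage collector reclaim it
        // halve only at one-quarter full, so the array ends up half full,
        // leaving room to grow or shrink before the next resize
        if (n > 0 && n == s.length / 4) resize(s.length / 2);
        return item;
    }

    private void resize(int capacity) {
        String[] copy = new String[capacity];
        for (int i = 0; i < n; i++) copy[i] = s[i];
        s = copy;
    }

    public int size()     { return n; }
    public int capacity() { return s.length; }

    public static void main(String[] args) {
        ShrinkingStack stack = new ShrinkingStack();
        for (int i = 0; i < 8; i++) stack.push("x" + i);  // array grows to length 8
        for (int i = 0; i < 6; i++) stack.pop();          // halves when 2 items remain
        System.out.println(stack.size() + " " + stack.capacity());  // prints 2 4
    }
}
```

Note that a push-pop-push-pop sequence at the boundary no longer thrashes: after a double the array is half full, and after a halve it is also half full, so resizes stay far apart.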

5:23

So the invariant is, number one, that the array is always between 25% and 100% full. And number two, every time you resize, you've already paid for it in an amortized sense by pushing or popping.

5:39

So here's just what happens to the array for our small client example. You can see at the beginning it doubles from one to two to four, but once it gets to eight, it stays at that size for a while, even though there are some operations. And it doesn't shrink back to four until there are only two items in there, and then it shrinks, and so forth. So array resizing doesn't happen that often, but it's a very effective way of implementing the stack API with an array, where the client does not have to provide the maximum capacity of the stack.

But still, we're guaranteed that the amount of memory we use is always only a constant multiple of the number of items actually on the stack. So the analysis now says that the average running time per operation, for whatever sequence of operations the client performs, is proportional to a constant.

6:53

Now, there is a worst case: at the point when the stack doubles, that one operation takes time proportional to N. So it's not quite as good a performance guarantee as we might like. But the advantage we get is very fast pushes and pops, just an array access and an increment, which is very efficient for most operations. And for many, many clients, that's an effective tradeoff to make.

7:22

So what about memory usage? Well, this is the analysis of memory usage for stacks, and it's actually less memory than for the linked-list version. The amount used is between 8N and 32N bytes, depending on how full the array is; that's just a quick analysis of the amount of space that arrays take in Java. So again, this analysis is just for the stack itself, not for the strings, which the client owns.
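As a rough count, assuming 8 bytes per array entry (a reference) and ignoring the constant object overhead: when the array is full it holds exactly N references, and just before a shrink it has length 4N while holding only N items:

```latex
\text{full array: } 8N \text{ bytes} \qquad \text{one-quarter full array of length } 4N: \; 8 \cdot 4N = 32N \text{ bytes}
```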

7:56

So what are the tradeoffs between using a resizing array versus a linked list? They're two different implementations of the same API, and the client can use them interchangeably. Which one is better? In many situations, we're going to have multiple implementations of APIs, and depending on properties of the client program, we're going to have to choose which one is the better one to use. For linked lists, every operation takes constant time in the worst case; that's a guarantee. But we have to use a little extra time and space to deal with the links, so it's going to be slower.

8:52

And for some clients, maybe that makes a difference. Perhaps you wouldn't want to use a resizing-array implementation at the moment your plane is coming in for a landing; you wouldn't want it to suddenly fail to complete some operation quickly. If you need that kind of guarantee, maybe in an internet switch where packets are coming through at a great rate, you wouldn't want to be in a situation where you're missing some data because something got slow all of a sudden. So that's a tradeoff the client can make. If I want that guarantee, if I want to be sure that every operation is going to be fast, I'll use a linked list. And if I don't need that guarantee, if I just care about the total amount of time, I'll probably use the resizing array, because the total will be much less, since individual operations are fast. So even with these simple data structures, we have really important tradeoffs that actually make a difference in lots of practical situations.
