Not all programs are created equal. Â In this course, we'll focus on writing quality code that runs correctly and efficiently. Â We'll design, code and validate our programs and learn how to compare programs that are addressing the same task.

Loading...

From the course by University of Toronto

Learn to Program: Crafting Quality Code

279 ratings

Not all programs are created equal. Â In this course, we'll focus on writing quality code that runs correctly and efficiently. Â We'll design, code and validate our programs and learn how to compare programs that are addressing the same task.

From the lesson

Week 3

- Jennifer CampbellAssociate Professor, Teaching Stream

Department of Computer Science - Paul GriesAssociate Professor, Teaching Stream

Department of Computer Science

You've learned about two ways to search for a value in a list.

Â One of them, linear search, will work on any list, regardless of whether the lists

Â are sorted. Binary search requires that the list is

Â sorted but, as you've seen, it looks like it can be a lot faster than linear

Â search. Let's poke at these two algorithms a

Â little bit more, so that we can get a feel for how good they are and when we

Â would use one or the other. Computers play a lot of tricks in order

Â to make code really fast, so it's hard to know exactly what to attempt.

Â However, we do know that there is a comparison that happens each time through

Â the linear search while loop, and that there's an assignment statement.

Â So let's count these comparisons and the assignment statements that happen each

Â time through a loop. For linear search, we see that there are

Â two that happens each iteration. One comparison and one assignment

Â statement. Again, this is a rough, rough estimate of

Â what goes on in code, but it turns out that this rough estimate is going to be

Â good enough for us to conclude that binary search is just wildly faster than

Â linear search. At least when lists are large.

Â In binary search, we have a much simpler comparison.

Â But we'll still count it as just one. We have one assignment statement, a

Â second comparison, and then the result of that comparison will lead us to one

Â assignment statement or the other. So, we have one comparison here, one

Â assignment statement, another comparison. And then one of these is going to happen.

Â So in binary search, we actually have sort of four steps on each iteration of

Â the loop. Because the initialization and the if

Â statement, at the end of each, happen only once, no matter how many items there

Â are on the list. And there are about the same number of

Â steps in linearly search and binary search.

Â We're going to ignore these in our analysis.

Â In the linear search then, we figured there was roughly 2 steps per iteration,

Â and in binary steps there was roughly 4 steps per iteration.

Â We're going to compare these algorithms in the worst case.

Â That happens when the value is not in the list, for the linear search.

Â Here's an example for linear search if my list is 1, 2, 3, 4, 5, so there are 5

Â items in the list and am looking for value maybe 6.

Â Then I look here, here, here, here, and here.

Â And then I'm done. So that looked at all 5 items.

Â Let's make a table of the number of items and the number of iterations.

Â If there's one item, then there's one iteration.

Â If there's two items, there's two iterations.

Â In general if there are k items, there are k iterations.

Â And now for binary search. If there is one item, then on the first

Â iteration, we said m, we do a comparison, we move either b or e.

Â And then the loop terminates. Let's start building our table.

Â If there is one item then there's one iteration.

Â If there are two items, then were going to throw away one half of the first time

Â to the loop leaving us with one item. And we know how many steps that's going

Â to take. So if there were two items, it takes us

Â two iterations. Here's where things start to get kind of

Â interesting. I'm actually going to skip 3 and go right

Â to 4. If I have 4 items in my list, then in one

Â comparison, I'm throwing away about half. And that means that I'm, I end up with a

Â two item list. Well, it took me one step to get to this

Â point, so that means that I just have one more iteration than I had when I had two

Â items in my list. This means that I can process twice as

Â many items with just one more iteration. Using this knowledge, let's fill in a few

Â more rows of the table. 8s twice 4, 16 is twice 8, 32 is twice

Â 16, 64 is twice 32, and 128 is twice 64. With 128 values, we can find the index of

Â v in this list in only 8 iterations. Linear search would take 128.

Â Let's plot this on a graph. Our axes are the size of the list, or the

Â number of items, and the number of iterations.

Â If the list has 1 item in it then there is 1 iteration.

Â 2 items, 2 iterations. 4 items, 3 iterations.

Â 8 items and 4 iterations. When we double the size of the list, we

Â increment the number of iterations by 1. So thinking about this backwards, if I

Â have 128 items in my list, then after one comparison, one iteration of binary

Â search, I end up with only 64 items left to consider.

Â When I have 64 items left to consider, then in one iteration I'm throwing away

Â half of them so that after that iteration I only have 32 items to consider, and so

Â on. This is awfully powerful.

Â Mathematicians have a name for this function.

Â It's called the logarithm base 2 of the number of items.

Â Here's one more way to think about logarithms.

Â It's the number of times you can divide by 2 in order to reach 1.

Â So, how many times can I divide by 2 by 2 in order to reach 1?

Â Once. What about 4?

Â I can divide 4 by 2 to get 2. And 2 by 2 to get 1.

Â That was 2 divides. 8 is going to be 1 division by 2 you got

Â 4. 4 divided by 2 is 2.

Â 2 divided by 2 is 1, so that gives us 3 divisions.

Â 16 is 4 divisions. 32 is 5 divisions.

Â And so on. So, let's try to calculate the logarithm

Â of 1,024. Well, 1,024 is 512 times 2.

Â When we divide 512 by 2, we get 256. Then, 128.

Â Then 64, then 32, then 16, then 8, then 4, then 2, and then 1.

Â So there's one division, two, three divisions, four divisions, five

Â divisions, six divisions, seven divisions, eight divisions, nine

Â divisions, ten divisions. So the log base 2 of 1024 is going to

Â produce, what did I say, one, two, three, four, five, six, seven, eight, nine, ten.

Â Module math has a log function. This function returns the logarithm of a

Â number with a given base. Here are base is 2, because we are

Â dealing with binary. If I ask what the logarithm of 2 is in

Â base 2 it tells me 1. If I ask what the logarithm of 2, of 4 in

Â base 2 it tells me 2. 8 is 3, the logarithm is 16 and base 2 is

Â 4. The logarithm of 32 and base 2 is 5.

Â So let's see if we can get an impression for just how much faster binary search is

Â by looking at the number, well that's 1 million.

Â Let's do 1 billion. So I'm just going to delete these

Â comments, because I'm not allowed to use them when I'm writing an integer.

Â Now linear search on a billion items is going to take a billion iterations.

Â What is the logarithm of 1 billion, in base 2?

Â Just under 30. So, it's going to take 30 iterations of

Â binary search in order to find a number in a billion items.

Â Let's call a binary search and linear search on a long list of numbers, to see

Â just how different these 2 are. We'll search for a value we know is not

Â in the list. So, with 10 million items, binary search

Â is nearly instantaneous. Linear search is noticeably slower, here

Â we go. cProfile is a module that contains code

Â that allows us to profile other code. To profile means to measure in some way.

Â To see how much memory it uses, how much time it takes to run.

Â Function run in module cProfile takes a statement to execute, and then presents

Â profiling information. It uses something called exec, which

Â takes an object, which can be a string or a code object, and then executes it.

Â A code object is something we haven't encountered yet, nor do we need it

Â because we can just use a string. Let's try calling exec.

Â if we're, if we give it something that isn't a string, we're told that argument

Â one must be a string, lights, or code object.

Â Okay let's put 3 plus 4 inside quotes. There's no ever, but we still don't know

Â exactly what's going on. Well, 1 picks a statement, perhaps exec

Â can pick a statement as well, maybe an assignment statement.

Â When we do the function call, indeed variable x is created.

Â Let's try assigning a string to variable y.

Â Remember that l refers to a list with 10 million items in it.

Â Let's call function run in module cProfile with a call on binary search

Â looking in that list for a value that isn't there.

Â We see a table. There were 6 function calls.

Â It took very little time. There was one call that took the total

Â time of no seconds and it was on the string that we passed in.

Â There was one call on binary search, there was one call on exec, one call on

Â len, and one call on a method disable of the profile.

Â That last one is completely irrelevant to our analysis.

Â Here we see that in linear search there was one call on the function linear

Â search. It took nearly 3 seconds to complete.

Â There were ten million and two calls on method len.

Â That took nearly a second just by itself. That gives us a hint that we could

Â optimize linear search. Let's call len on l outside of the y

Â loop, and then use variable length everywhere instead of the call on len.

Â When we run this again and profile it, we see that we went from nearly 3 seconds

Â for a linear search down to just over 1 and a half.

Â That's a remarkable speedup. However, that doesn't even come close to

Â how fast binary search is. This is all if the list is sorted and all

Â in the worst case. In the best case, where we search for the

Â first item in the list, linear search is also blazingly fast.

Â Mathematical analysis of algorithms and profiling can give us significant insight

Â into how algorithms work and why they are so good or bad.

Â This analysis of algorithms is actually a discipline within computer science.

Â It's a sub-field of computer science and people study this kind of thing for a

Â living.

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.