Lecture 14 | Programming Abstractions (Stanford)
-
0:10 - 0:27This presentation is delivered by the Stanford Center for Professional Development.
-
0:27 - 0:30Good afternoon. Apparently, Winter Quarter is over and it's now Spring.
-
0:30 - 0:34I don't know what happened to the weather, but I'm pretty happy about it. I hope you are too. Let's see. Where
-
0:34 - 0:39are we at? We are picking back up with the text. Looking at Chapter 7. Chapter 7 is gonna
-
0:39 - 0:43be the theme for pretty much the whole week. Talking about algorithms and
-
0:43 - 0:46different ways of analyzing them, formalizing their
-
0:46 - 0:50performance, and then talking about sorting as an example problem to kind of
-
0:50 - 0:55discuss in terms of algorithms and options for solving that problem.
-
0:55 - 0:57Okay. So let's talk about algorithms. Algorithm is
-
0:57 - 1:03one of the most interesting things to think about from a CS perspective. Kind of one of the most
-
1:03 - 1:04intellectually
-
1:04 - 1:08interesting areas to think about things is that often, right? We're trying to solve a particular
-
1:08 - 1:08problem.
-
1:08 - 1:09We want to
-
1:09 - 1:13put this data into sorted order. We want to count these scores and place them into a
-
1:13 - 1:14histogram.
-
1:14 - 1:19We want to search a maze to find a path from the start to the goal.
-
1:19 - 1:20For any of those tasks,
-
1:20 - 1:24right? There may be multiple ways we could go about solving that problem
-
1:24 - 1:28that approach it from different angles, using different kinds of data structure, solving
-
1:28 - 1:31it in forward versus solving it in reverse, is it easier to get from
-
1:31 - 1:35the goal back to the start? In the end the path is invertible, so maybe it doesn't matter which
-
1:35 - 1:36end we start from.
-
1:36 - 1:39Would it be better to use an array for this or a vector? Would a set help us
-
1:39 - 1:41out? Could we use a map?
-
1:41 - 1:45Could I use iteration? Could I use recursion? Lots of different ways we could do this. Often sometimes the
-
1:45 - 1:47decision-making will be about, well,
-
1:47 - 1:52do I need to get the perfect best answer? Am I willing to take some approximation? Some estimation
-
1:52 - 1:54of a good answer
-
1:54 - 1:58that gives me a rough count of something that might be faster to do than doing a real
-
1:58 - 1:59precise count?
-
1:59 - 2:03So what we're gonna be looking at, right? Is taking a particular problem, sorting is gonna be the one
-
2:03 - 2:05that we spend the most time on,
-
2:05 - 2:08and think about, well, some of the different ways we can go about putting
-
2:08 - 2:09data in sorted order
-
2:09 - 2:13and then talk about what the strengths and weaknesses of those algorithms are compared
-
2:13 - 2:14to one another.
-
2:14 - 2:16Do they differ in how much
-
2:16 - 2:21time it will take to sort a particular data set? Does it matter if the data is almost sorted?
-
2:21 - 2:22Does it actually
-
2:22 - 2:26affect the performance of the algorithm? What happens if the data set is very large? How much memory is
-
2:26 - 2:28gonna be used by it?
-
2:28 - 2:30And we'll be thinking about as we do this, right? Is often
-
2:30 - 2:34that probably the most important thing is how does it run? How much time does it take?
-
2:34 - 2:37How much memory does it use? How accurate is its answer?
-
2:37 - 2:38But also
-
2:38 - 2:41given that brainpower is often in short supply,
-
2:41 - 2:45it's worth thinking about, well, is the code easy to develop, to write, to get correct, and kind
-
2:45 - 2:46of debug?
-
2:46 - 2:49It may be that a simpler algorithm
-
2:49 - 2:53that doesn't run as fast, but that's really easy to get working, might actually be the one you
-
2:53 - 2:57need to solve this particular problem and save the fancy thing for some day when really
-
2:57 - 3:00you needed that extra speed boost.
-
3:00 - 3:04So I'm gonna do a little brainstorming exercise with you. Just to kind of get you thinking about
-
3:04 - 3:08there's lots of different ways to solve what seems like the same problem.
-
3:08 - 3:12All right. I'd like to know how many students are sitting in this room right now.
-
3:12 - 3:14All right. I look around.
-
3:14 - 3:17Michelle,
-
3:17 - 3:18what's the easiest way
-
3:18 - 3:20and most reliable way
-
3:20 - 3:28to just get a count, an accurate count, of everybody in this room? Start by identifying what kind of
-
3:28 - 3:31[inaudible]. Yeah. Just have a strategy. Some people will use the front row. And I see one,
-
3:31 - 3:35two, three, four, five and then I go back and I just work my way through the room
-
3:35 - 3:37all the way to the end, right?
-
3:37 - 3:41What I've done, if I actually remember my grade school counting, right? Then I should have
-
3:41 - 3:42come up with a number
-
3:42 - 3:45that matched the number of people in the room
-
3:45 - 3:48and tells me, okay, there's 100 people, let's say, that are here.
-
3:48 - 3:52So that will definitely work. That's the easiest, most likely approach you'd think
-
3:52 - 3:54of. You need to count something, well, then count them, right?
-
3:54 - 3:56Now, I'm gonna talk about ways that we could do it
-
3:56 - 4:00that maybe aren't so obvious, but that might have different properties
-
4:00 - 4:05in terms of trading it off. Let's say that the goal was to count all of the people in Stanford Stadium,
-
4:05 - 4:08right? And so I've got a whole bunch of people sitting there.
-
4:08 - 4:11That count-each-person-in-turn, walking around the stadium, doesn't
-
4:11 - 4:13scale that well.
-
4:13 - 4:16However long it took me to work my way through the rows in this room, right?
-
4:16 - 4:20When you multiply it by the size of Stanford Stadium it's gonna take a long
-
4:20 - 4:23time and as people get up and go to the bathroom and move around and whatnot I might
-
4:23 - 4:25lose track of where I'm at and
-
4:25 - 4:29all sorts of complications that doesn't really work well in the large.
-
4:29 - 4:31So here's another option. Anyone
-
4:31 - 4:35else want to give me just another way of thinking about this? Yeah? Multiply
-
4:35 - 4:36yourself
-
4:36 - 4:38into more Julies.
-
4:38 - 4:40Because we can only -- one Julie's not really enough, right? So
-
4:40 - 4:44recursion. Can recursion help us out here, right? Can I do some delegating,
-
4:44 - 4:47right? Some subcounting to other people?
-
4:47 - 4:50So in the case of this room, it might be that I pick somebody to count the left side, somebody to count
-
4:50 - 4:56the right side, and somebody to count the middle. And maybe even they themselves decide to use a little delegation in
-
4:56 - 4:57their work and say, well, how about I
-
4:57 - 5:02get some volunteers on each row? That's sort of idea would work extremely well in the Stanford Stadium
-
5:02 - 5:05as well because you just divide it up even further. You say, well, here's this row, this
-
5:05 - 5:06row,
-
5:06 - 5:11that row and kind of having all of these delegates working in parallel to count the stadium, right? Getting recursion to kind of
-
5:11 - 5:14solve that same problem.
-
5:14 - 5:15What if I wanted
-
5:15 - 5:18-- I was willing to tolerate a little bit of inaccuracy?
-
5:18 - 5:22What if I wanted an estimate of the people that were in this room and I was willing
-
5:22 - 5:24to accept a little bit of sampling or
-
5:24 - 5:27estimation error? Maybe you could take,
-
5:27 - 5:29like, a
-
5:29 - 5:35little area of seats and count how many people are in those seats and then multiply it by how many of
-
5:35 - 5:36those areas are in the room?
-
5:36 - 5:40Yeah. So it's like this capacity. Somewhere there's a sign, right? On one of the doors here that
-
5:40 - 5:43will say, oh, this room seats 180 people.
-
5:43 - 5:46So I know that there are 180 seats without doing any counting.
-
5:46 - 5:48So if I took a section.
-
5:48 - 5:50So I take ten seats, let's say,
-
5:50 - 5:55or 18 seats for that matter. I take 18 seats and I count those 18 seats how many are occupied
-
5:55 - 5:59and let's say half of the 18 that I look at were occupied, then I can say, well, the room's
-
5:59 - 6:04about half full, right? And then I say, well, okay, 180 seats, half of them, right? Means 90 people are here.
-
6:04 - 6:07So in that case it's not guaranteed to be accurate, right? If I happen to pick a particularly
-
6:07 - 6:11dense or sparsely populated section, right? I'd be getting some sampling error based on
-
6:11 - 6:14that, but that would also work in the Stanford Stadium, right? We know how many seats are in the Stanford
-
6:14 - 6:17Stadium, right? You pick a section
-
6:17 - 6:20and you count it. You count 100 seats worth to get a percentage of how full
-
6:20 - 6:24it is and then just multiply it out to see what you get.
-
6:24 - 6:28There's a variation on that that's kind of a -- maybe not the first thing that would occur to you, right? Is just that
-
6:28 - 6:31the idea of counting some subsections is kind of interesting
-
6:31 - 6:35as a way to kind of divide the problem and then say, well, in the small can I do some
-
6:35 - 6:37counting that will then scale up?
-
6:37 - 6:39So, for example, if I had everybody in this room and I said, okay, think of
-
6:39 - 6:43a number between one and ten. Everybody just think of a number independently on your own. Okay.
-
6:43 - 6:47If you were thinking of the number four, raise your hand.
-
6:47 - 6:51I got one, two, three, four, five, six, seven. Seven people. And said, okay, well, if I figure that everybody
-
6:51 - 6:55was just as likely to pick a number between one and ten as four,
-
6:55 - 6:56then those seven people, right?
-
6:56 - 7:01Represent 10 percent of the population. So maybe there were 70 of you. Again, totally based on randomness
-
7:01 - 7:05and sampling error, right? If I happen to pick a number that's very unpopular with my crowd, right?
-
7:05 - 7:08Or very popular, I get kind of an artificial estimate.
-
7:08 - 7:11But it does tell you a little bit about the nature of just solving this
-
7:11 - 7:15problem. It's like how much error am I willing to tolerate and how much time I'm willing to invest in it, how big
-
7:15 - 7:17is the problem I'm trying to solve?
-
7:17 - 7:19What kind of things can I do that might,
-
7:19 - 7:22in the end, give me some estimate
-
7:22 - 7:25or accurate count, right? Depending on what I'm willing to invest in the time
-
7:25 - 7:28spent.
-
7:28 - 7:31So random thought for you. Here's another one. I can take the Axess
-
7:31 - 7:33class list, right?
-
7:33 - 7:35And I can take the first ten people
-
7:35 - 7:37and call out their names and find out if they're here
-
7:37 - 7:40and on that, right? I would get this estimate of, well, what percentage of my students actually
-
7:40 - 7:41come?
-
7:41 - 7:44So if my class list says that there's 220 students, which it does because where are
-
7:44 - 7:45they?
-
7:45 - 7:48And I take the first ten students, right? And I call them out and I discover three of them are
-
7:48 - 7:52here I can say, oh, about 30 percent of my 220 students
-
7:52 - 7:54come, 66 of you.
-
7:54 - 7:57And then I would also have the advantage of knowing who was a slacker and I could write it
-
7:57 - 7:59down in my book.
-
7:59 - 8:05No, no, no. I'm sure you guys are all at home watching in your bunny slippers. All right. So
-
8:05 - 8:07let's talk about in terms of computer things. Like
-
8:07 - 8:12that the situation often is you have a problem to solve, right? You need to do this thing.
-
8:12 - 8:14You need to solve this maze or
-
8:15 - 8:20compute this histogram or find the mode score in your class or something like that
-
8:20 - 8:23and you have these different options about how you might proceed. Which data structure you use
-
8:23 - 8:25and how you're gonna approach it and stuff like that.
-
8:25 - 8:29Certainly one way to do it, and not a very practical way to do it, would be to say, well, I'll
-
8:29 - 8:32just go implement the five different ways I could do this. For example, when I was running your
-
8:32 - 8:36maze assignment that's what I did. I wrote, like, ten different maze solvers until I decided which one I was
-
8:36 - 8:37gonna give you.
-
8:37 - 8:41That's not very practical, right? If your boss said to you, you know, you need to solve task A
-
8:41 - 8:45and your response was I'll solve it ten different ways and then I'll come back to you with the
-
8:45 - 8:46best one.
-
8:46 - 8:48That might take more time than you've got.
-
8:49 - 8:52So you would have to write it, you'd debug it, you'd test it, and then you could kind of time it using
-
8:52 - 8:56your stopwatch to find out how good did it do on these data sets.
-
8:56 - 9:00Yeah. Well, there's some issues with that. First of all, you have to go do it, right?
-
9:00 - 9:01Which is kind of a hassle.
-
9:01 - 9:04And then also it's subject to these variations, like, well, what computer were you running on? What
-
9:04 - 9:07OS were you running? What other tasks were running? Like did it --
-
9:07 - 9:10it may not be completely reliable whatever results you saw.
-
9:10 - 9:14What we're gonna be more interested in is doing it in a more formal abstract way, which is just analyzing
-
9:14 - 9:18the algorithm from a pseudocode standpoint. Knowing what the code does and the steps that
-
9:18 - 9:19it takes, right?
-
9:19 - 9:24What can we predict about how that algorithm will behave? What situations will it
-
9:24 - 9:28do well? What situations will it do poorly? When will it get the accurate answer or an estimate?
-
9:28 - 9:31How much time and space can we
-
9:31 - 9:34predict it will use based on the size of its input?
-
9:34 - 9:37That would tell us based on just some descriptions of the algorithm
-
9:37 - 9:41which one might be the best one that's gonna work out in practice. Then we can go off and really implement
-
9:41 - 9:44the one that we've chosen.
-
9:44 - 9:49So this is actually working more the mathematical sense, less on the stopwatch sense. But
-
9:49 - 9:55helps us to analyze things before we've gone through all the process of writing the code. So what I'm gonna do
-
9:55 - 10:04with you today is a little bit of just analysis of some code we didn't write to talk about how it behaves and then we're gonna go on and talk about the sorting problem and some of the options for that.
-
10:04 - 10:09So this one is a function that converts the temperature. You give it a temperature in Celsius and
-
10:09 - 10:11it converts it to the Fahrenheit equivalent by
-
10:11 - 10:14doing the multiplication and addition.
-
10:14 - 10:19So the basic, sort of, underpinning of kind of looking at it formally is to basically realize
-
10:19 - 10:22that we're gonna count the statements that are executed.
-
10:22 - 10:25Assuming, right? In this very gross overgeneralization
-
10:25 - 10:26that each
-
10:26 - 10:29action that you take costs the same amount.
-
10:29 - 10:33That doing the multiply and the divide and the addition and a return, that
-
10:33 - 10:35all those things are, like, one unit. In this
-
10:35 - 10:37case, I'm gonna call it a penny, right?
-
10:37 - 10:39It took a penny to do each of those operations.
-
10:39 - 10:43That's not really accurate to be fair. There's some operations that are more expensive at the
-
10:43 - 10:47low level than others, but we're gonna be pretty rough about our estimate
-
10:47 - 10:48here
-
10:48 - 10:50in this first step here.
-
10:50 - 10:54So there's a multiply, there's a divide, there's an add, there's a return, right? And so I have, like, four statements
-
10:54 - 10:59worth of stuff that goes into asking a temperature to be converted.
-
10:59 - 11:01The other thing you'll note is that, does it
-
11:01 - 11:03matter if the temperature is high or low? If
-
11:03 - 11:07we ask it to convert Celsius of zero, Celsius of 100, Celsius of 50
-
11:07 - 11:09it does the same amount of work no matter what.
-
11:09 - 11:13So actually -- for whatever inputs, right, you give it -- this function pretty much
-
11:13 - 11:16will take the same amount of time.
-
11:16 - 11:19That's good to know, right? There's certain functions that will be like that.
-
11:19 - 11:22So we would call this a constant time function. It's, like, no matter what you ask it to convert
-
11:22 - 11:24it takes about the same amount of time.
-
11:24 - 11:30That doesn't mean that it takes no time, but it is a reliable performer regardless
-
11:30 - 11:33of inputs.
-
11:33 - 11:35So let's look at this guy.
-
11:35 - 11:38This is one that takes in a vector
-
11:38 - 11:40and then it sums all the values in the vector
-
11:40 - 11:46and then divides that sum by the total number of elements to compute its average. Okay.
-
11:46 - 11:50So, again, kind of using this idea that will everything you do cost you a penny?
-
11:50 - 11:52How much is it gonna cost you
-
11:52 - 11:56to make a call to the average function? Well, this
-
11:56 - 11:59is a little tricky because, in fact, there's a loop in here, right? That's executed a variable
-
11:59 - 12:02number of times depending on the size of the input.
-
12:02 - 12:05So first let's look at the things that are outside the loop, right? Outside the loop we do things
-
12:05 - 12:07like initialize the sum,
-
12:07 - 12:10we initialize I before we enter the loop, right?
-
12:10 - 12:14We do this division here at the bottom and this returns. There are about four things we do
-
12:14 - 12:19outside of getting into and then iterating in the loops. So just four things we pay no
-
12:19 - 12:20matter what.
-
12:20 - 12:23Then we get into this loop body
-
12:23 - 12:28and each iteration through the loop body does a test against the I of the size to make sure that we're in
-
12:28 - 12:29range, right?
-
12:29 - 12:31Does this addition into the sum
-
12:31 - 12:33and then increments I.
-
12:33 - 12:36So every iteration has kind of three statements that happen
-
12:36 - 12:41for every element in the vector. Zero, one, two, three, and whatnot.
-
12:41 - 12:42So what we get here is
-
12:42 - 12:45a little bit of constant work that's done no matter what
-
12:45 - 12:49and then another term that varies with the size of the input. If the vector has ten elements,
-
12:49 - 12:50right? We'll do
-
12:50 - 12:5230 steps worth of stuff inside the loop.
-
12:52 - 12:55If it has 100 elements we do
-
12:55 - 12:56300.
-
12:56 - 12:57In this case, right?
-
12:57 - 13:05For different inputs, right? We would expect the average to take a different amount of time.
-
13:05 - 13:07Yeah? If n, the vector size
-
13:07 - 13:08MP though we still initialize I?
-
13:08 - 13:12We still initialize I. So I was actually counted in here, right? The init I, right? That was in there.
-
13:12 - 13:16We actually have one more test though. We do a test and then we don't enter the loop body. So there's
-
13:16 - 13:19actually one little piece maybe we could say that there's actually like three n plus five. There's one additional
-
13:19 - 13:20test
-
13:20 - 13:22that doesn't enter into the loop body.
-
13:22 - 13:23We're gonna see that actually that
-
13:23 - 13:27we're gonna be a little bit fast and loose about some of the
-
13:27 - 13:31things that are in the noise in the end and look more at the big leading term, but you are right.
-
13:31 - 13:36There is one more test than there are iterations of the loop. That last test has to fail.
-
13:36 - 13:38The idea of being that, right? We have
-
13:38 - 13:42three statements per thing. A little bit of extra stuff tacked onto it if we double the size of
-
13:42 - 13:47the vectors. So if we compute the average of the vector that has 100 elements and it took us two seconds.
-
13:47 - 13:50If we put in a vector that's 200 elements
-
13:50 - 13:51long,
-
13:51 - 13:53we expect that it should about double, right?
-
13:53 - 13:56If it took two seconds to do this half-sized vector,
-
13:56 - 14:00then if we'd have to do a vector that's twice as long we expect it's gonna take about
-
14:00 - 14:01four seconds.
-
14:01 - 14:04And that prediction is actually at the heart of what we're looking at today.
-
14:04 - 14:06Is trying to get this idea of,
-
14:06 - 14:10well, if we know something about how it performs relative to its input
-
14:10 - 14:14can we describe if we were to change the size of that input to make the maze much bigger or to make the
-
14:14 - 14:16vector much smaller?
-
14:16 - 14:19What can we predict or estimate about
-
14:19 - 14:28how much time and memory will be used to solve the problem using this algorithm?
-
14:28 - 14:33So here is one that given a vector computes the min and the max of the vector and it does it
-
14:33 - 14:34with two loops, right?
-
14:34 - 14:35One
-
14:34 - 14:40loop that goes through and checks each element to see if it's greater than the max it's seen
-
14:40 - 14:40so far
-
14:40 - 14:42and if so it updates the max.
-
14:42 - 14:46And similarly almost identical loop here at the bottom
-
14:46 - 14:50that then checks to see if any successive element is smaller than the min it's seen so far and it
-
14:50 - 14:53updates the min to that one. Okay.
-
14:53 - 14:55So a little bit of stuff that happens
-
14:55 - 14:59outside the loop, right? We init I two times. We init the min and the max,
-
14:59 - 15:02so we've got, like, four statements that happen outside.
-
15:02 - 15:04And then inside the loop, right? We've got a test and
-
15:04 - 15:08an assignment, a comparison, an increment,
-
15:08 - 15:10and we have two of those loops, right?
-
15:10 - 15:12And so we have a
-
15:12 - 15:13little bit of stuff outside the loops,
-
15:13 - 15:17about eight instructions worth that happens per every element. So we look at it once to
-
15:17 - 15:21see if it's greater than the max. We actually look at everything again, right? To see if it's the
-
15:21 - 15:22min.
-
15:22 - 15:24I'm gonna use this actually
-
15:24 - 15:28as written, right? You might think, well, I could make this better, right? I could make
-
15:28 - 15:29this better by
-
15:29 - 15:34doing these two things together, right? Inside that loop, right? So that while I'm looking at each element
-
15:34 - 15:38I can say, well, if it's the max or if it's the min, right?
-
15:38 - 15:42If I look at it just once I can actually kind of do both those comparisons inside
-
15:42 - 15:43the loop
-
15:43 - 15:46and what we're gonna see is that in terms of kind of this analysis
-
15:46 - 15:48that's really not gonna make any difference.
-
15:48 - 15:52That whether we do a loop over the vector once and then we go back and loop over it again or whether we do
-
15:52 - 15:56twice as much stuff to each element inside one
-
15:56 - 15:58loop, it's for the purposes of the analysis we're looking at
-
15:58 - 16:01it ends up coming down to the same things.
-
16:01 - 16:04Which is, yeah, there is something that depends on the size of the input directly.
-
16:04 - 16:07Which is to say that if it were
-
16:07 - 16:12it will increase linearly with the change in size.
-
16:12 - 16:16So I have these numbers, like, four statements for the Celsius to Fahrenheit. Three n plus
-
16:16 - 16:18four. Eight n plus four.
-
16:18 - 16:19
-
16:19 - 16:23These tell us a little bit about, well, will get extremes always take more time than average
-
16:23 - 16:24or C to F?
-
16:24 - 16:25
-
16:25 - 16:28That given the way those numbers look it looks like, well, if you plugged in the same value of n
-
16:28 - 16:30you'd have the same size vector.
-
16:30 - 16:32That it should take more time
-
16:32 - 16:35to compute get extremes versus compute the average.
-
16:35 - 16:39We're gonna discover that, actually, we're not gonna, actually, try to make that guarantee.
-
16:39 - 16:44That what we're ? we're not really so much interested in comparing two algorithms
-
16:44 - 16:46and their constant factors to decide
-
16:46 - 16:50which of these two that kind of look about the same might be a little bit faster. What we're gonna try and
-
16:50 - 16:55look at it is giving you kind of a rough idea of the class an algorithm fits in. What its growth
-
16:55 - 16:56term, the kind of
-
16:56 - 17:00prediction of its growth, is. In this case, both averaging get extremes in terms of
-
17:00 - 17:05growth, both have a leading term that depends on n with some other noise and a little bit
-
17:05 - 17:06of a constant thrown in the front.
-
17:06 - 17:09That means that both of them, right? We'd expect to grow linearly
-
17:09 - 17:14in a straight line based on the change in n. So if you doubled the size of the thing
-
17:14 - 17:19then average would take twice as long as it previously did. So should get extremes. But the
-
17:19 - 17:23data point for does average of a vector of 100 elements take the same amount of time
-
17:23 - 17:26as get extremes of 100 is not what we're looking at predicting,
-
17:26 - 17:29right? We're trying to just tell you things about average against itself
-
17:29 - 17:31over different ranges of inputs.
-
17:31 - 17:36So if we know that average takes two milliseconds for a 1,000 elements. If we double the
-
17:36 - 17:40size of that we expect it to take twice as long, four milliseconds. If we increase it by a factor
-
17:40 - 17:45of ten we expect to increase the time by a factor of ten, right? So it should go up to 20 milliseconds. Get
-
17:45 - 17:45extremes
-
17:45 - 17:48might take more or less for a particular element.
-
17:48 - 17:50We expect it probably will take more
-
17:50 - 17:53because it does a little bit more work per element, but
-
17:54 - 18:01we really want to time it to be confident about that rather than making any estimation here. So in
-
18:01 - 18:04terms of what's called big O notation
-
18:04 - 18:05we're gonna see that we're going to
-
18:06 - 18:10kind of take those statement counts and we're gonna summarize them.
-
18:11 - 18:15That we can do a little bit of niggling to figure out what those numbers are, but what we're
-
18:15 - 18:16gonna then
-
18:16 - 18:22do is take the leading term, the largest term with the largest coeff-- value of the input number,
-
18:22 - 18:24in this case n.
-
18:24 - 18:26Ignore any smaller terms
-
18:26 - 18:28and then drop all the constants,
-
18:28 - 18:31all the coefficients, right? If you know it takes n over 2,
-
18:31 - 18:32it's just gonna be n.
-
18:32 - 18:34If you know that it takes
-
18:34 - 18:37ten statements, then it's gonna just be constant if there's no term of n in there. If it
-
18:37 - 18:43takes 10n then it's still n. 10n and 2n and 25n and 1/15n
-
18:43 - 18:44all end up just being n.
-
18:44 - 18:47So we might have this idea that we've estimated the time
-
18:47 - 18:51using n and having those constants in. Well, when it comes time to describe it in terms of big
-
18:51 - 18:52O,
-
18:52 - 18:54we focus on
-
18:54 - 18:55what's the ?
-
18:55 - 18:57oh, that,
-
18:57 - 19:00the subscript got lost on that. And that just makes no sense as it is right there. I will fix it
-
19:00 - 19:01in a second.
-
19:01 - 19:05So the 3n plus 5, right? The leading term is n, the coefficient 3
-
19:05 - 19:07gets dropped. It's just linear.
-
19:07 - 19:09We expect that as we change the input
-
19:09 - 19:13the time will change linearly with that change.
-
19:13 - 19:15The 10n minus
-
19:15 - 19:162,
-
19:16 - 19:20same class. O of n. Even though it didn't have kind of the same constants coming into it, right? It has the
-
19:20 - 19:23same growth pattern that they both are lines.
-
19:23 - 19:26The slope of them might be slightly different, but
-
19:26 - 19:28in terms of big O we're being
-
19:28 - 19:31pretty loose about what fits into this class.
-
19:31 - 19:341/2n squared minus
-
19:34 - 19:36n, the leading term here is the n squared.
-
19:36 - 19:40The n, subtract it off. When n is small n squared and n are actually kind of in range
-
19:40 - 19:41of each other,
-
19:41 - 19:45but then what we're looking at is in the limit. As n gets very, very large, n gets
-
19:45 - 19:49to be 1,000, n gets to be -- n squared is a much bigger number, right? When
-
19:49 - 19:52n is a million
-
19:52 - 19:56than n itself was, and so as we get to these larger points kind of out into the limit this
-
19:56 - 20:00term is the only one that matters and this is just a little bit of the noise attached to it.
-
20:00 - 20:03So that's why we're gonna summarize down to that.
-
20:03 - 20:07What this one is supposed to say, and it actually is incorrect here, it just should be two to the n and n
-
20:07 - 20:12to the third power, which summarizes to two to the n. Two to the n grows much, much more
-
20:12 - 20:17rapidly than n cubed does and so as n gets to be a very large number. Even a fairly small number,
-
20:17 - 20:20you know, two to the tenth, right? Is
-
20:20 - 20:23all ready 1,000. Two to the 20th is a million,
-
20:23 - 20:24which is
-
20:24 - 20:28much bigger than these guys over here. So as we get to the long run it
-
20:28 - 20:31must be much bigger. I'm gonna put
-
20:31 - 20:38a note to myself though to fix that before I put that on the web. It should be O of two to the n? Yeah. It should
-
20:38 - 20:41be O of two to the n. So that should be two to the
-
20:41 - 20:43n, n to the third, and my two
-
20:43 - 20:46to the n there. My subscripts got blasted out of that.
-
20:46 - 20:47And so we're trying to
-
20:47 - 20:51describe this growth curve, like, in the limit, right? We're avoiding the details when they don't matter and they don't
-
20:51 - 20:53matter when n gets big enough, right?
-
20:53 - 20:55That only the leading term,
-
20:55 - 20:59right? Is predicting without this other stuff kind of just ignoring it.
-
20:59 - 21:01So this is very sloppy math,
-
21:01 - 21:05right? So those of you who are more trained in the, sort of, mathematical sciences may find this
-
21:05 - 21:09a little bit alarming. Which is just how kind of fast and loose we're gonna be. Throwing all these terms left and
-
21:09 - 21:13right just kind of summarizing down to, okay, here's this leading term and how it relates to n.
-
21:13 - 21:14Everything else, right?
-
21:14 - 21:19Totally uninteresting. We could have a function, for example, that did 1,000 conversions of Celsius to
-
21:19 - 21:20Fahrenheit,
-
21:20 - 21:24but if every time you call it, it does 1,000 conversions, that means no matter
-
21:24 - 21:28what input you give it, the time doesn't change. That's considered O of one or a constant
-
21:28 - 21:31time. Similarly, right? For doing average. If we did an average that somehow
-
21:32 - 21:36operated over the vector one time and another one that did it ten times or 20 times
-
21:36 - 21:39looked at each element 20 times, right?
-
21:39 - 21:42They would still both be O of n. Whatever time they took,
-
21:42 - 21:47right? On a vector of a certain length we double that length. We'd expect to double the time.
-
21:47 - 21:50More formally, right? In terms of what the math really means
-
21:50 - 21:55is that we say that O of F of n, so O, if it's big O of some function,
-
21:55 - 22:00it describes an upper bound on the time required. Meaning that
-
22:00 - 22:02for sufficiently large
-
22:02 - 22:03values of n
-
22:03 - 22:06and some constant C that we get to pick that
-
22:06 - 22:10that would bound the curves. So if you imagine what the real curve looks like, it grows and
-
22:10 - 22:13maybe it flattens out or maybe it goes up very sharply.
-
22:13 - 22:17That C times F of n gives you kind of an upper bound on that. A curve that stays above it
-
22:17 - 22:20-- past some point of n, in the limit, right?
-
22:20 - 22:22Will dominate it from there to infinity.
-
22:22 - 22:26So describing kind of where it's gonna hit at that upper bound gives us some measure
-
22:26 - 22:32of what's going on. Okay.
-
22:32 - 22:34So we can use this to predict times.
-
22:34 - 22:37That's probably what's most valuable to us about it is knowing that
-
22:37 - 22:41I have an n-- a linear algorithm. So an O of n algorithm we'll call linear.
-
22:41 - 22:46An n squared algorithm I'll call quadratic. N to the third I'll call cubic, right?
-
22:46 - 22:50That's gonna tell us that those curves, right? You know what a line looks like. Well, you know what a parabola
-
22:50 - 22:54looks like coming out of an n squared where it's growing much more sharply and early kind of
-
22:54 - 22:57making the decision. So
-
22:57 - 22:58it might be, right? That
-
22:58 - 23:02if I know that it takes three seconds to do something for 5,000 elements then I have a linear
-
23:02 - 23:03algorithm.
-
23:03 - 23:07That 10,000 elements, right? Should take twice as long. 20,000 should take another,
-
23:07 - 23:10a factor of two on that. So 12 seconds worth. That
-
23:10 - 23:12I may have an n squared algorithm.
-
23:12 - 23:17So one that I expect to actually perform more poorly in the long run, right? This n squared
-
23:17 - 23:21versus n tells you that it's gonna more sharply grow. That
-
23:21 - 23:25for some values of n ? at 5,000 here, say ? it is the case that the n squared algorithm
-
23:25 - 23:26actually outperforms
-
23:26 - 23:31it in a small case. That's not uncommon actually.
-
23:31 - 23:32But, right?
-
23:32 - 23:36As it grows, right? As I get to larger and larger values of n
-
23:36 - 23:40the fact that it is going by factor of four, right? The square of
-
23:40 - 23:40the doubling
-
23:40 - 23:45as opposed to linear means it's gonna quickly take off. And so I take my 5,000 elements that
-
23:45 - 23:47took two and a half seconds.
-
23:47 - 23:50I put an input that's twice as large into it.
-
23:50 - 23:52It's gonna take a factor of four,
-
23:52 - 23:53right? From there.
-
23:53 - 23:56And if I go from 10,000 to 20,000 another factor of four is gonna bring that up
-
23:56 - 23:57to
-
23:57 - 23:5840 seconds or so
-
23:58 - 24:02compared to my more modestly growing linear algorithm there.
-
24:02 - 24:07So if I was facing some task, right? Where I had an option between a linear algorithm and a
-
24:07 - 24:09quadratic algorithm
-
24:09 - 24:12it's telling me that in the long run the quadratic algorithm
-
24:12 - 24:15for sufficiently large inputs, right?
-
24:15 - 24:18Is going to bog down in a way that the linear one
-
24:18 - 24:26will be our kind of strong performer in the larger cases.
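The scaling arithmetic in this passage can be sketched as a tiny helper; predictSeconds is a made-up name for illustration, not anything from the course libraries:

```cpp
#include <cmath>

// Predict the running time at a new input size from one measured data
// point, assuming the algorithm is O(n^k): time scales by (newN/oldN)^k.
// (predictSeconds is a hypothetical helper for illustration only.)
double predictSeconds(double measuredSeconds, double oldN, double newN, double k) {
    return measuredSeconds * std::pow(newN / oldN, k);
}
// Linear (k = 1): 3 seconds at 5,000 elements -> 6 at 10,000, 12 at 20,000.
// Quadratic (k = 2): 2.5 seconds at 5,000 -> 10 at 10,000, 40 at 20,000.
```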
-
24:26 - 24:27So some algorithms
-
24:27 - 24:30actually have a variable
-
24:30 - 24:32run time expectation
-
24:32 - 24:34that you cannot say
-
24:34 - 24:35
-
24:35 - 24:39with all assuredness how much time it will take because actually depending on the input and
-
24:39 - 24:43the characteristics of the input it may do more or less work. So average looks at every
-
24:43 - 24:47element in the vector and sums them all up and does the division. It doesn't matter what elements
-
24:47 - 24:48are in there.
-
24:48 - 24:50It's very reliable in that sense.
-
24:50 - 24:54Something like search. So this is a linear search function that given a vector of names
-
24:54 - 24:56and a particular key
-
24:56 - 24:58just walks the vector looking for a match. If
-
24:58 - 25:02it finds it, it returns true. If it exhaustively searched the whole thing and didn't find it, it returns
-
25:02 - 25:04false. Okay.
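A sketch of the search just described (written against std::vector so it stands alone; the lecture's version would use the course's Vector class):

```cpp
#include <string>
#include <vector>

// Linear search: walk the vector looking for a match. Returns true on
// a match, false after exhaustively searching the whole thing.
// Best case O(1) (match in the first slot), worst case O(n) (no match).
bool findName(const std::vector<std::string>& names, const std::string& key) {
    for (size_t i = 0; i < names.size(); i++) {
        if (names[i] == key) {
            return true;    // found it, no more work to do
        }
    }
    return false;           // looked at every element, never found it
}
```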
-
25:04 - 25:06So we've got what looks like
-
25:06 - 25:08an O of n loop
-
25:08 - 25:09that we'll look at the things,
-
25:09 - 25:11but there are some cases, right?
-
25:11 - 25:15Where it just doesn't do very much work at all. For example, if it finds the key in the very
-
25:15 - 25:16first spot,
-
25:16 - 25:18right? If I'm looking for
-
25:18 - 25:22Bob and Bob is the first thing in there, then I immediately return true and I do no more work.
-
25:22 - 25:26And so it doesn't matter whether Bob was followed by a million more names or ten more names, right?
-
25:26 - 25:31That, in fact, it is a constant time operation to just access that first element and return it.
-
25:31 - 25:34Similarly for other things that are close to the front. It looks at a few of them and
-
25:34 - 25:35it's done.
-
25:35 - 25:39And so we would call those the best case of the algorithm. So we can divide this into things. It's like,
-
25:39 - 25:41well, what of the
-
25:41 - 25:45best case input, right? What can we expect out of the performance? Well, the best-case input would
-
25:45 - 25:46be
-
25:46 - 25:48it's found in the first few members.
-
25:48 - 25:49In which case
-
25:49 - 25:52it's an O of one algorithm for those situations.
-
25:52 - 25:56That's not often very interesting to say, well, here's a particular input that causes it to immediately
-
25:56 - 25:58be able to calculate something in return.
-
25:58 - 26:02Yeah. Not so interesting. So it's worth knowing that such a thing exists, but it turns out it's
-
26:02 - 26:03not
-
26:03 - 26:04
-
26:04 - 26:08likely to tell us a lot about the general performance of this algorithm.
-
26:08 - 26:11If it's in the middle or it's in the last slot, right? We're
-
26:11 - 26:15gonna be looking at a lot of the elements to decide that it's there or not.
-
26:15 - 26:18In the absolute worst case,
-
26:18 - 26:21right? So what's the input that causes it to do the most amount of work
-
26:21 - 26:23is the one where it doesn't find it at all. Where
-
26:23 - 26:28it looks at every single element, never finding the match, and then comes out and returns that false.
-
26:28 - 26:31And the worst case, in this case, is a fairly likely thing to happen, right?
-
26:31 - 26:34We're searching because we happen to believe it might not be there
-
26:34 - 26:35and that gives us this upper bound on
-
26:35 - 26:40how bad it gets. So in the worst case it looks at everything and it is definitely O of
-
26:40 - 26:43n. So we have this kind of constant best case,
-
26:43 - 26:44an O of n worst case,
-
26:44 - 26:47and then our average case is gonna be somewhere in the middle there.
-
26:47 - 26:52This actually is a little bit harder to predict or to compute precisely in most
-
26:52 - 26:57situations because you have to know things about, well, what are the likely range of inputs?
-
26:57 - 27:01So typically the definition is that it's averaged over all the possible inputs.
-
27:01 - 27:06Well, given the search it's pretty hard to say what are all the possible inputs for it? It's
-
27:06 - 27:10like you can make some assumptions, like, well, all in ? all permutations of the list
-
27:10 - 27:12are equally likely
-
27:12 - 27:17and the name is in the list about half the time, let's say. You can just ? you have
-
27:17 - 27:22to make up some parameters that describe what you believe to be the likely inputs
-
27:22 - 27:26and then you can say, well, if it were in the list, if it's equally likely to be
-
27:26 - 27:27in the front as in the back,
-
27:27 - 27:30then on average it's gonna look at n over two.
-
27:30 - 27:35It's just as likely to be in any of the slots. So, let's say, if you were gonna call it n times
-
27:35 - 27:36and
-
27:36 - 27:40the name you were looking for was going to be in each individual slot exactly one
-
27:40 - 27:41time, then that's the totally random,
-
27:41 - 27:43perfect permutation case.
-
27:43 - 27:46So then it would have looked at n over two of them. Sometimes it looked at one, sometimes it looked
-
27:46 - 27:48at n minus one, sometimes as it looked at n
-
27:48 - 27:52over two, sometimes n over two plus one, and whatnot. That over all of them it looked at
-
27:52 - 27:53about half.
-
27:53 - 27:57And then if there was another set of calls to search for it, right? Where it never found it, it would be looking
-
27:57 - 27:58at
-
27:58 - 27:59n, right? And so
-
27:59 - 28:02you can say, well, sometimes we look at n over two, sometimes we look at n.
-
28:02 - 28:04On average we're looking at about three-quarters of them, right?
-
28:04 - 28:07In big O, since we get to be a bit sloppy here,
-
28:07 - 28:10we can say, well, this all just boils down to being linear. That O
-
28:10 - 28:14of n still describes the growth in the average case,
-
28:14 - 28:16
-
28:16 - 28:20which is the same number we got for the worst case. So this is a little
-
28:20 - 28:23bit tricky, but this is actually the one that's probably the most interesting, right? Which is just in
-
28:23 - 28:28normal practice, how is this thing gonna behave?
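The rough average-case arithmetic being described, under the stated assumptions (found about half the time, equally likely in any slot):

```latex
\text{found:}\ \approx \tfrac{n}{2} \text{ elements examined}, \qquad \text{not found:}\ n \text{ elements examined}
```
```latex
\text{average} \approx \tfrac{1}{2}\left(\tfrac{n}{2} + n\right) = \tfrac{3n}{4} = O(n)
```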
-
28:28 - 28:31So the last little thing I need to
-
28:31 - 28:34complete before we can go on and start applying this to do something interesting, is to talk
-
28:34 - 28:39a little bit about how we do recursive algorithms because they're a little bit trickier than the standard iteration
-
28:39 - 28:40and counting.
-
28:40 - 28:44So the counting statements are kind of saying, oh, I've got this loop followed by this loop and this thing
-
28:44 - 28:45happening, right?
-
28:45 - 28:49Will get you through the simple cases. Well, what do we do when we have a recursive algorithm, right? Well,
-
28:49 - 28:53we're trying to compute the time spent in solving something
-
28:53 - 28:57that, in effect, is making a call to the same algorithm and so we're gonna likely have
-
28:57 - 29:00some recursive definition we need to work through.
-
29:00 - 29:02So this is the factorial function.
-
29:02 - 29:06If n equals zero, return one, otherwise return n times the factorial n minus one.
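The factorial function as stated, in code (a minimal sketch):

```cpp
// Recursive factorial. Statement counts give the recurrence
//   T(0) = 1            (the test and return),
//   T(n) = 1 + T(n - 1) (constant work plus the recursive call),
// which solves to T(n) = n + 1, i.e. O(n).
long factorial(int n) {
    if (n == 0) return 1;            // base case: constant work
    return n * factorial(n - 1);     // one multiply per level of recursion
}
```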
-
29:06 - 29:08We're interested in doing kind of the statement counts or kind
-
29:08 - 29:11of summarizing the time for an input
-
29:11 - 29:13whose value is n.
-
29:13 - 29:17And so what we write down is what's called a recurrence relation
-
29:17 - 29:20that largely matches the code in terms of saying
-
29:20 - 29:21how much work is done
-
29:21 - 29:25in the base case? In the different cases that are being handled? Typically there is a base case
-
29:25 - 29:29or two where you say, well, if it's in these cases, right? We do these easy things.
-
29:29 - 29:31So if n is exactly zero
-
29:31 - 29:36then we do O of one worth of work, right? We do a little bit of constant work here. We
-
29:36 - 29:38do a comparison and a return.
-
29:38 - 29:42In the otherwise case when it is not zero,
-
29:42 - 29:46then we do a little bit of work. In this case, a return, a multiply, a little bit of constant work
-
29:46 - 29:48plus whatever time it takes to compute
-
29:48 - 29:51the factorial of n minus one.
-
29:51 - 29:54So we have T of n defined in terms of
-
29:54 - 29:58T of n of something else, right? Which is exactly what we'd expect in a recursive function.
-
29:58 - 30:00This is called a recurrence relation, so that
-
30:00 - 30:03solving for T of
-
30:03 - 30:05n means working with
-
30:05 - 30:09a side that refers back to that same T, but on a smaller version of the problem
-
30:09 - 30:15in n over two or n minus one or some other variation of that recursive call.
-
30:15 - 30:19So let me show you how we make that go into closed form.
-
30:19 - 30:23The idea ? and actually I'm just gonna do this on the board because I think it's easier if I just write what's
-
30:23 - 30:26going on and you can watch me develop it.
-
30:26 - 30:28So I have this recurrence relation.
-
30:28 - 30:29T of n
-
30:29 - 30:31equals one if
-
30:31 - 30:33n equals zero
-
30:33 - 30:39and then it's one plus T of n minus one otherwise. So I'm trying to
-
30:39 - 30:41solve for T of n
-
30:41 - 30:45into a closed form. So I'm gonna start with
-
30:45 - 30:50it's non-recur ? the non-base case form. So the recursive step. And then what I'm gonna do
-
30:50 - 30:54is I'm actually just gonna apply repeated substitution
-
30:54 - 30:56to take this term and expand it out.
-
30:56 - 31:00So I know it's one plus whatever it takes to do T of n minus one.
-
31:00 - 31:02So if I go back over here and I say, well, T of
-
31:02 - 31:03n minus one
-
31:03 - 31:04must be
-
31:04 - 31:06
-
31:06 - 31:08one if n minus one equals zero or it's
-
31:08 - 31:11one plus T of n minus two. So
-
31:11 - 31:13kind
-
31:13 - 31:17of plugging it into the original formula and getting the expansion one layer in.
-
31:17 - 31:19So I can say, well, this is
-
31:19 - 31:20one plus
-
31:20 - 31:23one plus T of n minus two. And if I apply
-
31:23 - 31:25that again,
-
31:25 - 31:27right? I should get another term of
-
31:27 - 31:35
-
31:35 - 31:44one. So after I have done this I times
-
31:44 - 31:46then I will have a bunch of ones added up together
-
31:46 - 31:49and I will have subtracted an I from T
-
31:49 - 31:51
-
31:51 - 31:52down to some term. So I'm just gonna ? each of
-
31:52 - 31:54these represents kind of like
-
31:54 - 31:57this is a little bit of work from the n, which calls the n minus one, which calls the n minus two.
-
31:57 - 32:01So each of the calls kind of in the stack frame has a little bit of a
-
32:01 - 32:03contribution to add to the whole thing
-
32:03 - 32:04and then what we're looking at is
-
32:04 - 32:08how many times do we have to do that expansion and substitution, right?
-
32:08 - 32:10Before we hit this base case, right?
-
32:10 - 32:12Where T of n exactly equals one.
-
32:12 - 32:15So we're looking for the case where n
-
32:15 - 32:18actually it's n equals zero. So
-
32:18 - 32:20where n
-
32:20 - 32:23so I want to do this until n minus I equals zero.
-
32:23 - 32:27Okay. So I need to have done this I times where I is n.
-
32:27 - 32:32So if I say I set I equal to
-
32:32 - 32:34n, then I'll have one plus one plus one
-
32:34 - 32:37n times,
-
32:37 - 32:39and then I have plus
-
32:39 - 32:41T of the n minus n over here,
-
32:41 - 32:43which is my T subzero.
-
32:43 - 32:47T subzero, right? Immediately plugs into that base case and says, well, there's just one more thing
-
32:47 - 32:48to do when you get to that
-
32:48 - 32:52and so what I basically have here is n plus one.
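The board work just shown, written out:

```latex
T(n) = 1 + T(n-1) = 1 + 1 + T(n-2) = \cdots = \underbrace{1 + 1 + \cdots + 1}_{i \text{ times}} + T(n-i)
```
```latex
\text{Set } i = n:\quad T(n) = n + T(0) = n + 1 = O(n)
```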
-
32:52 - 33:00So n multiplications plus one little tack on for the base case, which in terms of big O
-
33:00 - 33:03is just linear. A little bit
-
33:03 - 33:05of extra work in the noise there, but
-
33:05 - 33:10that means, kind of as you'd expect, right? That factorial for a particular input,
-
33:10 - 33:11right? Is linear
-
33:11 - 33:17in the number you ask to compute the factorial. The factorial of ten takes ten
-
33:17 - 33:20multiplications, right? The factorial of 20 takes 20 multiplications.
-
33:20 - 33:24So if you change the size of your input, right? You double it; it should take twice as long. However much
-
33:24 - 33:28time it cost you to compute the factorial of ten is gonna take twice as much time to compute
-
33:28 - 33:32the factorial of 20. Okay. That kind of
-
33:32 - 33:36makes sense, but it's good to know kind of how I can do this math,
-
33:36 - 33:38right? To work this out, right? So this idea of
-
33:38 - 33:45taking the term, repeatedly substituting and expanding, generalizing my pattern, and say, well, after I substitutions
-
33:45 - 33:46worth where am I at?
-
33:46 - 33:48And these correspond and kind of thinking about
-
33:48 - 33:53the recursion tree. What calls are made and then how deep does the recursion
-
33:53 - 33:55go before I hit the base case
-
33:55 - 33:58and that tells me how to
-
33:58 - 34:08stop that expanding and then substitute back in for the base case to compute my total result. I'm gonna do another one, so
-
34:08 - 34:10if you want to just
-
34:10 - 34:12watch and we'll do
-
34:12 - 34:13the math for this together, too.
-
34:13 - 34:17This is the Towers of Hanoi example
-
34:17 - 34:18that moves the
-
34:18 - 34:20tower away of n minus one off,
-
34:20 - 34:25moves the single disk on the bottom and then moves that tower back on.
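A sketch of the routine being analyzed; moveSingleDisk here is a stand-in that just counts and prints moves, not the graphical version from class:

```cpp
#include <iostream>
#include <string>

static long gMoveCount = 0;   // total single-disk moves, to check 2^n - 1

// Stand-in for the real single-disk move; counts and reports it.
void moveSingleDisk(const std::string& from, const std::string& to) {
    gMoveCount++;
    std::cout << from << " -> " << to << std::endl;
}

// Move a tower of height n from peg 'from' to peg 'to' via 'temp'.
// Recurrence: T(0) = 1 (just the test), T(n) = 1 + 2 T(n - 1),
// which solves to 2^(n+1) - 1, i.e. O(2^n).
void moveTower(int n, const std::string& from, const std::string& to,
               const std::string& temp) {
    if (n == 0) return;                  // zero-height tower: no work
    moveTower(n - 1, from, temp, to);    // move the n-1 tower out of the way
    moveSingleDisk(from, to);            // move the bottommost disk
    moveTower(n - 1, temp, to, from);    // move the n-1 tower back on top
}
```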
-
34:25 - 34:37And so the recurrence that we're working with
-
34:37 - 34:42is one when n equals zero. So when n equals zero, right? We have a zero height tower to move and we
-
34:42 - 34:46actually do no work in the function, right? We just do that test and return, so if n equals zero
-
34:46 - 34:47there's no work to be done.
-
34:47 - 34:50So that's the easy case for us, right? Is
-
34:50 - 34:53when it's zero
-
34:53 - 34:54do no work.
-
34:54 - 34:55Otherwise,
-
34:55 - 34:57right? We move the bottom most disk.
-
34:57 - 35:01So we do a little testing and moving of that disk. We'll call that the constant amount of work that's in the function
-
35:01 - 35:02itself.
-
35:02 - 35:04And it makes two recursive calls
-
35:04 - 35:05each of a tower
-
35:05 - 35:10of height n minus one.
-
35:10 - 35:12So otherwise, right?
-
35:12 - 35:15Two calls, which gives a little clue to what the tree looks like. It'll branch twice
-
35:15 - 35:20at each stop and then it's one closer to that base case gives us a sense that the recursion
-
35:20 - 35:21depth
-
35:21 - 35:24is likely to be linear here.
-
35:24 - 35:26So let me go through the process of making this work.
-
35:26 - 35:30I've got T of
-
35:30 - 35:33n equals one plus two T of n minus one. So then
-
35:33 - 35:35I
-
35:35 - 35:37take T of n minus one
-
35:37 - 35:39and I plug it in over here
-
35:39 - 35:42to get one plus two
-
35:42 - 35:44T of n minus two.
-
35:44 - 35:47This whole thing is in a -
-
35:47 - 35:51multiplied by two though because I have two of those, right? From the original call
-
35:51 - 35:53which then itself made two. So, in fact, if I multiply through,
-
35:53 - 35:55right? I've got one plus
-
35:55 - 35:59two plus four
-
35:59 - 36:01T of n minus two. If
-
36:01 - 36:03I apply my substitution again, one plus two plus
-
36:03 - 36:06four times T of
-
36:06 - 36:09n minus two, where T of n minus two is one plus two T
-
36:09 - 36:12of n minus three,
-
36:12 - 36:13and then let the
-
36:13 - 36:21multiplication go through again. One plus two plus four plus eight T of n
-
36:21 - 36:22minus three.
-
36:22 - 36:26And so each expansion of this, right? Is causing the number of towers
-
36:26 - 36:31that are being moved around to go up by a factor of two. So each time I do this,
-
36:31 - 36:32right? I went from
-
36:32 - 36:36two towers to move to four towers to eight towers, but those towers each got one shorter
-
36:36 - 36:41in the process. So kind of telling us a little bit about what the recursion tree looks like, right?
-
36:41 - 36:42Is there is a
-
36:42 - 36:48branching factor of two all the way down and that
-
36:48 - 36:50the depth of this thing
-
36:50 - 36:51is gonna bottom out linearly
-
36:51 - 36:57because this was a tower of ten, then nine, eight, seven, and so on down to the bottom.
-
36:57 - 36:59So if I imagined this happened I times,
-
36:59 - 37:03so to generalize my pattern. I've got one plus two
-
37:03 - 37:06plus four plus two
-
37:06 - 37:09to the I. So after I've done this
-
37:09 - 37:12that number of times, right? Actually it's two to the I minus one plus two to
-
37:12 - 37:17the I times T of n minus I.
-
37:17 - 37:22So I have subtracted I off the heights of the tower.
-
37:22 - 37:26I have gone up by a factor of two each time I did that and then I have these sums in the front,
-
37:26 - 37:30which represent kind of the single disk that got moved in there. One plus two plus four
-
37:30 - 37:32plus eight plus so on
-
37:32 - 37:33down to there.
-
37:33 - 37:37And so the place I want to get to, right? Is where n equals zero.
-
37:37 - 37:39So I actually want to set
-
37:39 - 37:44n, I equal to n here. I wrote that backwards, let's say I equals n. I plug that in I've
-
37:44 - 37:49got one plus two plus four plus all the powers of two
-
37:49 - 37:52to the n minus one plus
-
37:52 - 37:54two to the n times T of n minus
-
37:54 - 37:55n, which is T of zero,
-
37:55 - 37:58which is just one.
-
37:58 - 38:05So what I've got here
-
38:05 - 38:11is following along.
-
38:11 - 38:13Is T of n is one plus two plus four
-
38:13 - 38:14plus two to the n, right?
-
38:14 - 38:18So two
-
38:18 - 38:19to the n
-
38:19 - 38:20minus
-
38:20 - 38:23one plus two to the n times one.
-
38:23 - 38:25So I've got the geometric sum here.
-
38:25 - 38:26You may or may not
-
38:26 - 38:30all ready know how to solve this one, but I'm just gonna go ahead and
-
38:30 - 38:34solve it in front of you to remind you of how the process works. I've got the powers of two.
-
38:34 - 38:36One plus two plus up to two to the n.
-
38:36 - 38:39So that's the term I'm looking for. I want to call this A
-
38:39 - 38:40just to mark it.
-
38:40 - 38:45And then what I'm gonna compute for you is what two times A is
-
38:45 - 38:50and I'm gonna write it underneath. So if I multiply this by two I have one times two, which is two.
-
38:50 - 38:52Two times two, which is four,
-
38:50 - 38:53right? Four times
-
38:53 - 38:55two, which is eight.
-
38:55 - 38:56And so on.
-
38:56 - 38:59And so I should get
-
38:59 - 39:02basically the same sequence of terms, but shifted over by one.
-
39:02 - 39:04I don't have the factor of one
-
39:04 - 39:06and I do have an additional factor of two to the
-
39:06 - 39:11n times another power of two at the end. So I've got the whole term kind of shifted
-
39:11 - 39:15over by one. That's my two A so that's the same thing I'm looking for, but doubled.
-
39:15 - 39:17And then I'm gonna basically take this thing
-
39:17 - 39:19and I'm gonna
-
39:19 - 39:23add negative A to two A. So subtracting A off of two A
-
39:23 - 39:24so that all these terms
-
39:24 - 39:28will go away. If I have this minus this, this minus this all the way across, right?
-
39:28 - 39:30The terms I'll be left with
-
39:30 - 39:32is two to the n plus one
-
39:32 - 39:35minus one is
-
39:35 - 39:37A subtracted from two A so the
-
39:37 - 39:38T of
-
39:38 - 39:41n that I'm looking for
-
39:41 - 39:43is two to the n plus one minus one.
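The geometric-sum trick on the board, written out:

```latex
A = 1 + 2 + 4 + \cdots + 2^{n}
```
```latex
2A = 2 + 4 + 8 + \cdots + 2^{n} + 2^{n+1}
```
```latex
2A - A = 2^{n+1} - 1 \quad\Rightarrow\quad T(n) = 2^{n+1} - 1 = O(2^{n})
```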
-
39:43 - 39:45That's my term there. And
-
39:45 - 39:51if I summarize in my big O way,
-
39:51 - 39:56ignoring the constants, throwing away these little parts of the terms, right? That are in the noise. That
-
39:56 - 39:58what we really have here is something that grows,
-
39:58 - 40:00right? Exponentially
-
40:00 - 40:04by about a factor of two for each additional
-
40:04 - 40:05
-
40:05 - 40:07disk added to the tower.
-
40:07 - 40:10So however much time it took to move a tower of height ten,
-
40:10 - 40:15right? A tower of height 11 we expect to take twice as long. So not growing linearly at all, right?
-
40:15 - 40:18Much, much, much sharply growing, right? What exponential
-
40:18 - 40:19growth looks like.
-
40:19 - 40:22Two to the n, right? Even for small values of n.
-
40:22 - 40:27Ten, 20, right? Is up into the millions already in terms of disks that need to be
-
40:27 - 40:30moved. So
-
40:30 - 40:33this sort of strategy, right? Is good to know. When you're looking at this recursive things, right?
-
40:33 - 40:37Kind of having this analysis. It says, okay, well, where did I start from?
-
40:37 - 40:41What's the time? And then just being pretty mechanical. Sometimes you can get to something here that
-
40:41 - 40:45requires a little bit of algebra to work out, but the most common patterns, right?
-
40:45 - 40:48You'll start to recognize over and again, right?
-
40:48 - 40:51And this one, right? The fact that it's branching by a factor of two
-
40:51 - 41:03telling you that at the kind of base case, right? You're gonna expect to see a two to the n expansion.
-
41:03 - 41:05So just to give you
-
41:05 - 41:09a little bit of some idea that you can use big O to predict things
-
41:09 - 41:13based on getting some run time performance for some small values of n
-
41:13 - 41:16and then what you'd expect to see for larger values.
-
41:16 - 41:17So I have
-
41:17 - 41:19three different algorithms here.
-
41:19 - 41:20One that was
-
41:20 - 41:24just logarithmic in terms of n, so it should divide the input in half and kind of
-
41:24 - 41:27work on it. Something like binary search actually fits that profile.
-
41:27 - 41:31Something that's linear. Something that's n log n and something that's n squared.
-
41:31 - 41:35Then for different values of n I've plugged in actually the first ones I ran and then actually
-
41:35 - 41:39I - some of the other ones I had to estimate because they were taking way too long to finish.
-
41:39 - 41:43So that for an input of size ten, all of them are imperceptible. These are in terms of seconds
-
41:43 - 41:44here.
-
41:44 - 41:48Taking fractions of seconds. This is on my machine that is running at a couple gigahertz, so it's
-
41:48 - 41:50got about a million instructions per second.
-
41:50 - 41:53You up that by a factor of ten, right?
-
41:53 - 41:56And they're still kind of in the subsecond range, but you can see that from the
-
41:56 - 41:59n squared algorithm
-
41:59 - 42:00took a bigger jump,
-
42:00 - 42:04let's say, than the linear algorithm did, which went up by a factor of ten almost
-
42:04 - 42:05exactly.
-
42:05 - 42:07Going up by another factor of ten
-
42:07 - 42:08and,
-
42:08 - 42:13again, sort of climbing, right? 10,000, 100,000, a million, a trillion. You
-
42:13 - 42:15get to something that's like 100,000, right?
-
42:15 - 42:20And an algorithm that's linear, right? Is still taking a fraction of a second. But something
-
42:20 - 42:24that was quadratic now has climbed up to take several hours.
-
42:24 - 42:28So even for inputs that don't seem particularly huge, right? The difference between having an
-
42:28 - 42:33algorithm that's gonna operate in linear time versus quadratic is gonna be felt very profoundly. And
-
42:33 - 42:37as you move to these larger numbers you have things that you just can't do,
-
42:37 - 42:40right? You cannot run an n squared algorithm on a trillion pieces of data
-
42:40 - 42:42in your lifetime.
-
42:42 - 42:43
-
42:43 - 42:48It's just too much work for what the input is. Clinton? Yeah. How did it go down for a
-
42:48 - 42:51billion from
-
42:51 - 42:53like a million?
-
42:53 - 42:55From - Oh, you know what,
-
42:55 - 42:59that just must be wrong. You're totally right. That should definitely be over some. Yeah.
-
42:59 - 43:02Well log n should really be - I think that
-
43:02 - 43:05when I copied and pasted from these terminal window. So, of course, I must have just made a mistake when I
-
43:05 - 43:08was moving something across, but you're totally right.
-
43:08 - 43:09This should be going up,
-
43:09 - 43:13right? Ever so slightly, right? This algorithm's a very, very slowly growing logarithmic function,
-
43:13 - 43:16right? Almost a flat line, but that definitely should be up a little bit, right? From
-
43:16 - 43:18where it is.
-
43:18 - 43:19When you get to a trillion, right?
-
43:19 - 43:24Even the linear algorithm is starting to actually take some noticeable time, right?
-
43:24 - 43:28But things that are logarithmic still running very, very slowly. So this is an example of
-
43:28 - 43:32binary search. Binary search operating on a million or trillion items is still just
-
43:32 - 43:33in a
-
43:33 - 43:35flash, right? Telling you whether it found it or not.
-
43:35 - 43:37Doing a linear search
-
43:37 - 43:38on a trillion or billion
-
43:38 - 43:40is
-
43:40 - 43:41several minutes' worth of
-
43:41 - 43:45my old computer's
-
43:45 - 43:48time. And so just another way to look at them, right? Is to think about what those things look
-
43:48 - 43:51like in terms of graphed on the Cartesian plane
-
43:51 - 43:55that a constant algorithm is just a flat line. It takes the same amount of time no matter how big the input is.
-
43:55 - 43:56It's
-
43:56 - 44:00pretty small values of n down here, so it just shows the early stages, right? But logarithmic, right?
-
44:00 - 44:03Almost a flat line itself. A little bit above
-
44:03 - 44:06constant, but very, very slowly growing.
-
44:06 - 44:07The
-
44:07 - 44:09linear scale, right? Showing the lines and then the
-
44:09 - 44:13n squared and two to the n showing you that they're kind of heading to the hills very
-
44:13 - 44:15quickly here. So even for small values of n
-
44:15 - 44:16
-
44:16 - 44:18they are reaching into the
-
44:18 - 44:23high regions and will continue to do so as
-
44:23 - 44:31that will be more pronounced, right? As you move into the higher and higher values of n. All right. I get to tell
-
44:31 - 44:34you a little bit about sorting before
-
44:34 - 44:36we go away. Because some of them I'm just setting the stage for, like,
-
44:36 - 44:40okay, how can we use this to do something interesting? Knowing about how to compare and then contrast.
-
44:40 - 44:41That it tells you something.
-
44:41 - 44:44Let's try to solve a problem, right? That
-
44:44 - 44:47lends itself to different approaches that have different algorithmic
-
44:47 - 44:50results to them that are worth thinking about.
-
44:50 - 44:54So sorting turns out to be one of the best problems to study for this because it turns out sorting
-
44:54 - 44:55is
-
44:55 - 44:58one of the things that computers do a lot of.
-
44:58 - 45:02That it's very, very common that you need to keep data in sorted order. It makes it easier to
-
45:02 - 45:06search. It makes it easier to find the minimum, the maximum, and find duplicates. All these sort of things, right?
-
45:06 - 45:09Like by keeping it in sorted order it tends to be more convenient to access that way.
-
45:09 - 45:11It also makes a bunch of other things,
-
45:11 - 45:14other properties about the data, easy to do.
-
45:14 - 45:18If you want to find the top ten percent, right? You can just pick them off, right? If they've already
-
45:18 - 45:19
-
45:19 - 45:20been sorted. You want to
-
45:20 - 45:23group them into buckets for histograming, right? All these things actually are
-
45:23 - 45:24enabled by having them
-
45:24 - 45:28already be in sorted order. So it turns out that a lot of times that data that you get
-
45:28 - 45:31from other sources tends to first need to be put in sorted order before you start doing
-
45:31 - 45:33stuff on it. Okay.
-
45:33 - 45:36Well, there's lots of different ways you can sort it.
-
45:36 - 45:40There are simple ways. There are complicated ways. There are fancy ways. There are ways that
-
45:40 - 45:42are dumb,
-
45:42 - 45:44
-
45:44 - 45:46but easy to write. Ways that are smart, but hard
-
45:46 - 45:48to write. And everything in between.
-
45:48 - 45:52So we're gonna be looking at it in terms of sorting vectors because that's probably the most common data structure
-
45:52 - 45:53that needs the sorting.
-
45:53 - 45:57But you can also imagine taking the same kind of algorithms and applying them to different kinds of data
-
45:57 - 46:00structures you have. Like sorting a linked list.
-
46:00 - 46:04Okay. So I'm gonna show you a sorting algorithm.
-
46:04 - 46:06It's probably the simplest and the easiest to write.
-
46:06 - 46:10If somebody - if you were stuck on a desert island without a textbook, but you happen to have
-
46:10 - 46:11a compiler
-
46:11 - 46:15and you needed to write a sorting algorithm to get your way off the island. It
-
46:15 - 46:18comes up all the time.
-
46:18 - 46:20This is probably the algorithm you're gonna use.
-
46:20 - 46:23It's called selection sort.
-
46:23 - 46:24And the idea behind selection sort
-
46:24 - 46:26is that it's going to select
-
46:26 - 46:28the smallest element
-
46:28 - 46:30and put it in the front.
-
46:30 - 46:33So if I have a big stack of test papers
-
46:33 - 46:36and I want to sort them in order of score,
-
46:36 - 46:40then I'm gonna go through there and find somebody who got the absolute lowest score. It's sad, but
-
46:40 - 46:42true. Somebody had to be there.
-
46:42 - 46:45So I kind of work my through it and maybe I hold my finger on the one that I've seen that looks
-
46:45 - 46:49smallest so far. So this one's a 40, oh,
-
46:49 - 46:50well, this one's a 38. Okay. Oh, look there's a 35. Oh,
-
46:50 - 46:54look, there's a 22. Nobody gets these scores in my exams, I'm just kidding.
-
46:54 - 46:56And then finally get down to the bottom. You say, okay,
-
46:56 - 47:00that 22, that was the smallest. I have a hold of it, right? And
-
47:00 - 47:01I'm gonna basically take that
-
47:01 - 47:03and bring it to the front.
-
47:03 - 47:05
-
47:05 - 47:07And in terms of managing a vector
-
47:07 - 47:11that actually there's a slight efficiency to be had by instead of kind of pulling it out of the
-
47:11 - 47:15stack and sliding everything down I'm actually just gonna swap it with the one that's in the
-
47:15 - 47:18very front. So whoever was the current first one is gonna get booted
-
47:18 - 47:22to pull in this one and I'm gonna put them back where this other one came from.
-
47:22 - 47:26So I have moved the very smallest to the top of the vector.
-
47:26 - 47:28Then I just do the same thing again,
-
47:28 - 47:32but now excluding that smallest one. So I've already seen that one and kind of set it aside. I start looking
-
47:32 - 47:35at the remaining n minus one and I do the same thing again.
-
47:35 - 47:38Find the smallest of what remains, kind of just walking through,
-
47:38 - 47:39going to find it,
-
47:39 - 47:44and swap it down into the second slot, and so on.
-
47:44 - 47:50A little bit of code.
-
47:50 - 47:55Doing it kind of the way I described it, right? Of tracking the minimum index, right? Looking from
-
47:55 - 47:59here to the end, so this for loop in the middle here starts at the current position and only considers
-
47:59 - 48:01from here
-
48:01 - 48:02to the very end of the
-
48:02 - 48:05vector and then finds
-
48:05 - 48:09any element which is smaller than the one I'm kind of currently holding my finger on
-
48:09 - 48:13and then when I'm done after that inner loop has executed I will know what the min index is
-
48:13 - 48:15and then there's a swap function here
-
48:15 - 48:17that just exchanges those two things.
-
48:17 - 48:19The i slot now gets replaced with the one
-
48:19 - 48:22at min index and,
-
48:22 - 48:25again, exchanging the contents of the array. I just do that
-
48:25 - 48:28n minus one times and I will have done it all. I'd like to show
-
48:28 - 48:32you an animation, but I guess I'll just wait for that until Wednesday. We will
-
48:32 - 48:36watch it doing its thing in progress and you can kind of be thinking about for Wednesday what other
-
48:36 - 48:37ways you might be able to sort
-
48:37 - 48:42data than just this strategy.
- Title:
- Lecture 14 | Programming Abstractions (Stanford)
- Description:
-
Lecture 14 by Julie Zelenski for the Programming Abstractions Course (CS106B) in the Stanford Computer Science Department.
Julie starts off with algorithm analysis and big-O notation, then introduces sorting. She begins with a brief overview of what algorithm analysis is and how to use it. Later, she goes through recursive algorithms and their uses.
Complete Playlist for the Course:
http://www.youtube.com/view_play_list?p=FE6E58F856038C69
CS 106B Course Website:
http://cs106b.stanford.edu
Stanford Center for Professional Development:
http://scpd.stanford.edu/
Stanford University:
http://www.stanford.edu/
Stanford University Channel on YouTube:
http://www.youtube.com/stanforduniversity/
- Video Language:
- English
- Duration:
- 49:33