This presentation is delivered by the Stanford Center for Professional Development.

Good afternoon. Apparently, Winter Quarter is over and it's now Spring. I don't know what happened to the weather, but I'm pretty happy about it. I hope you are too. Let's see, where are we at? We are picking back up with the text, looking at Chapter 7. Chapter 7 is gonna be the theme for pretty much the whole week: talking about algorithms and different ways of analyzing them, formalizing their performance, and then talking about sorting as an example problem to discuss in terms of algorithms and options for solving that problem.

Okay. So let's talk about algorithms. Algorithms are one of the most interesting things to think about from a CS perspective, one of the most intellectually interesting areas, because often we're trying to solve a particular problem. We want to put this data into sorted order. We want to count these scores and place them into a histogram. We want to search a maze to find a path from the start to the goal. For any of those tasks, right?
There may be multiple ways we could go about solving that problem that approach it from different angles: using different kinds of data structures, solving it forward versus solving it in reverse. Is it easier to get from the goal back to the start? In the end the path is invertible, so maybe it doesn't matter which end we start from. Would it be better to use an array for this or a vector? Would a set help us out? Could we use a map? Could I use iteration? Could I use recursion? Lots of different ways we could do this. Often the decision-making will be about, well, do I need to get the perfect best answer? Am I willing to take some approximation, some estimation of a good answer that gives me a rough count of something, which might be faster than doing a real precise count?

So what we're gonna be looking at is taking a particular problem (sorting is gonna be the one we spend the most time on), thinking about some of the different ways we can go about putting data in sorted order, and then talking about what the strengths and weaknesses of those algorithms are compared to one another. Do they differ in how much time it will take to sort a particular data set? Does it matter if the data is almost sorted? Does it actually affect the performance of the algorithm?
What happens if the data set is very large? How much memory is gonna be used? As we do this, probably the most important questions are: how does it run? How much time does it take? How much memory does it use? How accurate is its answer? But also, given that brainpower is often in short supply, it's worth thinking about whether the code is easy to develop, to write, to get correct, and to debug. It may be that a simpler algorithm that doesn't run as fast, but that's really easy to get working, is actually the one you need to solve this particular problem, and you save the fancy thing for some day when you really need that extra speed boost.

So I'm gonna do a little brainstorming exercise with you, just to get you thinking about how there are lots of different ways to solve what seems like the same problem. All right. I'd like to know how many students are sitting in this room right now. All right. I look around. Michelle, what's the easiest and most reliable way to just get a count, an accurate count, of everybody in this room? Start by identifying what kind of [inaudible]. Yeah. Just have a strategy. Some people will use the front row.
And I see one, two, three, four, five, and then I go back and I just work my way through the room all the way to the end, right? If I actually remember my grade-school counting, then I should have come up with a number that matched the number of people in the room, and that tells me, okay, there are 100 people here, let's say. So that will definitely work. That's the easiest, most likely approach you'd think of: you need to count something, well, then count them, right?

Now, I'm gonna talk about ways we could do it that maybe aren't so obvious, but that might have different properties in terms of the trade-offs. Let's say the goal was to count all of the people in Stanford Stadium, right? So I've got a whole bunch of people sitting there. Counting them in turn, walking around the stadium, doesn't scale that well. However long it took me to work my way through the rows in this room, when you multiply it by the size of Stanford Stadium it's gonna take a long time, and as people get up and go to the bathroom and move around and whatnot, I might lose track of where I'm at, and there are all sorts of complications. It doesn't really work well in the large. So here's another option. Anyone else want to give me another way of thinking about this? Yeah?
Multiply yourself into more Julies. Because one Julie's not really enough, right? So, recursion. Can recursion help us out here? Can I do some delegating, some subcounting, to other people? So in the case of this room, it might be that I pick somebody to count the left side, somebody to count the right side, and somebody to count the middle. And maybe they themselves decide to use a little delegation in their work and say, well, how about I get some volunteers on each row? That sort of idea would work extremely well in Stanford Stadium as well, because you just divide it up even further. You say, well, here's this row, this row, that row, and you have all of these delegates working in parallel to count the stadium, right? Getting recursion to solve that same problem.

What if I was willing to tolerate a little bit of inaccuracy? What if I wanted an estimate of the people that were in this room and I was willing to accept a little bit of sampling or estimation error? Maybe you could take, like, a little area of seats, count how many people are in those seats, and then multiply by how many of those areas are in the room? Yeah. So it's like this capacity. Somewhere there's a sign, right?
On one of the doors here that will say, oh, this room seats 180 people. So I know that there are 180 seats without doing any counting. So I take a section: say ten seats, or 18 seats for that matter. I take 18 seats and I count how many of those 18 are occupied, and let's say half of the 18 that I looked at were occupied. Then I can say, well, the room's about half full, right? And then I say, okay, 180 seats, half of them occupied, means 90 people are here. Now, that's not guaranteed to be accurate. If I happen to pick a particularly dense or sparsely populated section, I'd be getting some sampling error based on that. But that would also work in Stanford Stadium, right? We know how many seats are in Stanford Stadium. You pick a section and you count it. You count 100 seats' worth to get a percentage of how full it is and then just multiply it out to see what you get.

There's a variation on that that's maybe not the first thing that would occur to you, which is that the idea of counting some subsection is kind of interesting as a way to divide the problem, and then to say, well, in the small, can I do some counting that will then scale up? So, for example, I could take everybody in this room and say, okay, think of a number between one and ten. Everybody just think of a number independently on your own. Okay.
If you were thinking of the number four, raise your hand. I got one, two, three, four, five, six, seven. Seven people. And I said, okay, well, if I figure that everybody was just as likely to pick any number between one and ten as four, then those seven people represent 10 percent of the population. So maybe there were 70 of you. Again, that's totally subject to randomness and sampling error. If I happened to pick a number that's very unpopular with my crowd, or very popular, I get an artificially skewed estimate. But it does tell you a little bit about the nature of solving this problem: how much error am I willing to tolerate, how much time am I willing to invest, how big is the problem I'm trying to solve? What kinds of things can I do that might, in the end, give me some estimate or an accurate count, depending on what I'm willing to invest in the time spent?

So, a random thought for you. Here's another one. I can take the Axess class list, right? And I can take the first ten people and call out their names and find out if they're here, and based on that I would get an estimate of, well, what percentage of my students actually come? So if my class list says that there are 220 students, which it does, because where are they?
And if I take the first ten students, call them out, and discover three of them are here, I can say, oh, about 30 percent of my 220 students come: 66 of you. And then I would also have the advantage of knowing who was a slacker, and I could write it down in my book. No, no, no. I'm sure you guys are all at home watching in your bunny slippers.

All right. So let's talk about this in terms of computer things. The situation often is that you have a problem to solve, right? You need to do this thing. You need to solve this maze, or compute this histogram, or find the mode score in your class, or something like that, and you have these different options about how you might proceed: which data structure you use, how you're gonna approach it, and stuff like that. Certainly one way to do it, and not a very practical way, would be to say, well, I'll just go implement the five different ways I could do this. For example, when I was writing your maze assignment, that's what I did. I wrote, like, ten different maze solvers until I decided which one I was gonna give you. That's not very practical, right? If your boss said to you, you know, you need to solve task A, and your response was, I'll solve it ten different ways and then I'll come back to you with the best one, that might take more time than you've got.
You would have to write it, you'd debug it, you'd test it, and then you could time it using your stopwatch to find out how well it did on these data sets. Well, there are some issues with that. First of all, you have to go do it, which is kind of a hassle. And then also it's subject to variations: what computer were you running on? What OS were you running? What other tasks were running? Whatever results you saw may not be completely reliable.

What we're gonna be more interested in is doing it in a more formal, abstract way, which is just analyzing the algorithm from a pseudocode standpoint. Knowing what the code does and the steps that it takes, what can we predict about how that algorithm will behave? In what situations will it do well? In what situations will it do poorly? When will it get the accurate answer or an estimate? How much time and space can we predict it will use based on the size of its input? That would tell us, based on just some description of the algorithm, which one might be the best one that's gonna work out in practice. Then we can go off and really implement the one we've chosen. So this is actually working more in the mathematical sense, less in the stopwatch sense, but it helps us to analyze things before we've gone through all the process of writing the code.
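For concreteness, the stopwatch-style measurement just described might look like the following minimal sketch. The function name and the workload are made up for illustration (std::chrono stands in for a physical stopwatch; this is not code from the lecture):

```cpp
#include <chrono>
#include <numeric>
#include <vector>

// A minimal "stopwatch" sketch: run the work under test and report
// elapsed milliseconds. The numbers vary with machine, OS, and system
// load, which is exactly the unreliability discussed above.
double timeSumMillis(const std::vector<int>& data, long long& result) {
    auto start = std::chrono::steady_clock::now();
    result = std::accumulate(data.begin(), data.end(), 0LL);  // workload under test
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing the same call twice, even on one machine, can give noticeably different numbers, which is why the lecture moves on to analyzing the algorithm itself.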
So what I'm gonna do with you today is a little bit of analysis of some code we didn't write, to talk about how it behaves, and then we're gonna go on and talk about the sorting problem and some of the options for that.

So this one is a function that converts a temperature. You give it a temperature in Celsius and it converts it to the Fahrenheit equivalent by doing the multiplication and addition. The basic underpinning of looking at it formally is to realize that we're gonna count the statements that are executed, assuming, in this very gross overgeneralization, that each action you take costs the same amount. That doing the multiply and the divide and the addition and the return, all those things, are, like, one unit each. In this case, I'm gonna call it a penny: it took a penny to do each of those operations. That's not really accurate, to be fair; some operations are more expensive at the low level than others, but we're gonna be pretty rough about our estimate here in this first step. So there's a multiply, there's a divide, there's an add, there's a return, and so I have, like, four statements' worth of stuff that goes into asking a temperature to be converted. The other thing you'll note is: does it matter if the temperature is high or low?
If we ask it to convert a Celsius of zero, a Celsius of 100, a Celsius of 50, it does the same amount of work no matter what. So for whatever inputs you give it, this function will take pretty much the same amount of time. That's good to know; there are certain functions that will be like that. We would call this a constant-time function: no matter what you ask it to convert, it takes about the same amount of time. That doesn't mean it takes no time, but it is a reliable performer across different inputs.

So let's look at this guy. This is one that takes in a vector, sums all the values in the vector, and then divides that sum by the total number of elements to compute its average. Okay. So, again, using this idea that everything you do costs you a penny: how much is it gonna cost you to make a call to the average function? Well, this is a little tricky because, in fact, there's a loop in here that's executed a variable number of times depending on the size of the input. So first let's look at the things that are outside the loop. Outside the loop we do things like initialize the sum, we initialize i before we enter the loop, we do this division here at the bottom, and this return.
There are about four things we do outside of getting into and iterating in the loop. So that's four things we pay no matter what. Then we get into the loop body, and each iteration through the loop body does a test of i against the size to make sure we're in range, does this addition into the sum, and then increments i. So every iteration has three statements that happen for every element in the vector: zero, one, two, three, and whatnot. So what we get here is a little bit of constant work that's done no matter what, and then another term that varies with the size of the input. If the vector has ten elements, we'll do 30 steps' worth of stuff inside the loop. If it has 100 elements, we do 300. So in this case, for different inputs, we would expect the average to take a different amount of time.

Question: if n, the vector size, is empty, do we still initialize i? We still initialize i. So that was actually counted in here, right? The init of i, that was in there. We actually have one more test, though: we do a test and then we don't enter the loop body. So there's one little piece; maybe we could say that there's actually like three n plus five.
There's one additional test that doesn't enter the loop body. We're gonna see that we're gonna be a little bit fast and loose about some of the things that are in the noise in the end, and look more at the big leading term, but you are right: there is one more test than there are iterations of the loop, that last test that has to fail.

The idea being that we have three statements per element, with a little bit of extra stuff tacked on. So what happens if we double the size of the vector? Say we compute the average of a vector that has 100 elements and it took us two seconds. If we put in a vector that's 200 elements long, we expect that it should about double. If it took two seconds to do the half-sized vector, then for a vector that's twice as long we expect it's gonna take about four seconds. And that prediction is actually at the heart of what we're looking at today: the idea that if we know something about how an algorithm performs relative to its input, and we were to change the size of that input, to make the maze much bigger or the vector much smaller, what can we predict or estimate about how much time and memory will be used to solve the problem using this algorithm?
So here is one that, given a vector, computes the min and the max of the vector, and it does it with two loops. One loop goes through and checks each element to see if it's greater than the max it's seen so far, and if so it updates the max. And an almost identical loop here at the bottom then checks to see if any successive element is smaller than the min it's seen so far, and if so updates the min to that one. Okay. So there's a little bit of stuff that happens outside the loops: we init i two times, we init the min and the max, so we've got, like, four statements that happen outside. And then inside each loop we've got a test and an assignment, a comparison, an increment, and we have two of those loops. So we have a little bit of stuff outside the loops, and about eight instructions' worth that happen per element. We look at each element once to see if it's greater than the max, and then we actually look at everything again to see if it's the min.

I'm gonna use this as written, right? You might think, well, I could make this better: I could make this better by doing these two things together inside that loop, right?
So that while I'm looking at each element I can say, well, is it the max or is it the min, right? If I look at it just once I can do both those comparisons inside the loop. And what we're gonna see is that in terms of this analysis, that's really not gonna make any difference. Whether we loop over the vector once and then go back and loop over it again, or whether we do twice as much stuff to each element inside one loop, for the purposes of the analysis we're doing, it comes down to the same thing. Which is: yeah, there is something that depends on the size of the input directly, which is to say it will increase linearly with the change in size.

So I have these numbers: four statements for the Celsius-to-Fahrenheit, three n plus four for average, eight n plus four for getExtremes. Do these tell us, say, that getExtremes will always take more time than average or cToF? Given the way those numbers look, it looks like, well, if you plugged in the same value of n, the same size vector, it should take more time to compute getExtremes than to compute the average. We're gonna discover that, actually, we're not gonna try to make that guarantee. That's because
we're not really so much interested in comparing two algorithms and their constant factors to decide which of two that look about the same might be a little bit faster. What we're gonna try to do is give you a rough idea of the class an algorithm fits in: what its growth term, the prediction of its growth, is. In this case, both average and getExtremes, in terms of growth, have a leading term that depends on n, with some other noise and a little bit of a constant thrown in front. That means both of them we'd expect to grow linearly, in a straight line, based on the change in n. So if you doubled the size of the input, then average would take twice as long as it previously did. So would getExtremes. But the question of whether average on a vector of 100 elements takes the same amount of time as getExtremes on 100 elements is not what we're looking at predicting. We're trying to tell you things about average against itself over different ranges of inputs. So if we know that average takes two milliseconds for 1,000 elements, and we double the size, we expect it to take twice as long: four milliseconds. If we increase the size by a factor of ten, we expect the time to increase by a factor of ten, so it should go up to 20 milliseconds. GetExtremes might take more or less for a particular input.
We expect it probably will take more, because it does a little bit more work per element, but we'd really want to time it to be confident about that, rather than making any estimation here.

So, in terms of what's called big-O notation, we're gonna take those statement counts and summarize them. We can do a little bit of niggling to figure out what those numbers are, but what we're gonna do then is take the leading term, the largest term in the input size, in this case n, ignore any smaller terms, and then drop all the constants, all the coefficients. If you know it takes n over 2, it's just gonna be n. If you know that it takes ten statements, then it's just constant, since there's no term of n in there. If it takes 10n, then it's still n. 10n and 2n and 25n and 1/15 n all end up just being n. So we might have this idea that we've estimated the time using n with those constants in; well, when it comes time to describe it in terms of big O, we focus on... oh, that, the superscript got lost on that.
And that just makes no sense as it is right there. I will fix it 00:19:00.000 --> 00:19:01.000 in a second. 00:19:01.000 --> 00:19:05.000 So the 3n plus 5, right? The leading term is n, the coefficient 3 00:19:05.000 --> 00:19:07.000 gets dropped. It's just linear. 00:19:07.000 --> 00:19:09.000 We expect that as we change the input 00:19:09.000 --> 00:19:13.000 the time will change linearly with that change. 00:19:13.000 --> 00:19:15.000 The 10n minus 00:19:15.000 --> 00:19:16.000 2, 00:19:16.000 --> 00:19:20.000 same class. O of n. Even though it didn't have kind of the same constants coming into it, right? It has the 00:19:20.000 --> 00:19:23.000 same growth pattern, that they both are lines. 00:19:23.000 --> 00:19:26.000 The slope of them might be slightly different, but 00:19:26.000 --> 00:19:28.000 in terms of big O we're being 00:19:28.000 --> 00:19:31.000 pretty loose about what fits into this class. 00:19:31.000 --> 00:19:34.000 1/2 n squared minus 00:19:34.000 --> 00:19:36.000 n, the leading term here is the n squared. 00:19:36.000 --> 00:19:40.000 The n gets subtracted off. When n is small, n squared and n are actually kind of in range 00:19:40.000 --> 00:19:41.000 of each other, 00:19:41.000 --> 00:19:45.000 but then what we're looking at is in the limit. As n gets very, very large, n gets 00:19:45.000 --> 00:19:49.000 to be 1,000, n gets to be a million, n squared is a much bigger number, right? When 00:19:49.000 --> 00:19:52.000 n is a million, 00:19:52.000 --> 00:19:56.000 n squared is vastly bigger than n itself was, and so as we get to these larger points kind of out into the limit this 00:19:56.000 --> 00:20:00.000 term is the only one that matters and this is just a little bit of the noise attached to it. 00:20:00.000 --> 00:20:03.000 So that's why we're gonna summarize down to that.
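To make the statement-counting concrete, here is a minimal sketch, not the lecture's exact code: the function names are assumed and std::vector stands in for the course's Vector class. The two functions have different statement counts per element, but after dropping the smaller terms and coefficients both summarize to O(n).

```cpp
#include <algorithm>
#include <vector>

// Counting roughly: constant setup + n loop iterations + constant finish.
// average does ~n + 2 statements; getExtremes does ~2n + 2. Both are O(n):
// double the vector's length and you expect to double the running time.
double average(const std::vector<double>& v) {
    double sum = 0;                        // constant work
    for (double x : v) {                   // n iterations
        sum += x;
    }
    return sum / v.size();                 // constant work
}

void getExtremes(const std::vector<double>& v, double& min, double& max) {
    min = max = v[0];                      // constant work
    for (double x : v) {                   // n iterations, a bit more per element
        min = std::min(min, x);
        max = std::max(max, x);
    }
}
```

The coefficient in front of n differs, which is exactly the detail big O throws away.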
00:20:03.000 --> 00:20:07.000 What this one is supposed to say, and it actually is incorrect here, is just two to the n and n 00:20:07.000 --> 00:20:12.000 to the third power, which summarizes to two to the n. Two to the n grows much, much more 00:20:12.000 --> 00:20:17.000 rapidly than n cubed does as n gets to be a very large number. Even for a fairly small number, 00:20:17.000 --> 00:20:20.000 you know, two to the tenth, right? Is 00:20:20.000 --> 00:20:23.000 already 1,000. Two to the 20th is a million, 00:20:23.000 --> 00:20:24.000 which is 00:20:24.000 --> 00:20:28.000 much bigger than these guys over here. So as we get to the long run it 00:20:28.000 --> 00:20:31.000 must be much bigger. I'm gonna put 00:20:31.000 --> 00:20:38.000 a note to myself though to fix that before I put that on the web. It should be O of two to the n? Yeah. It should 00:20:38.000 --> 00:20:41.000 be O of two to the n. So that should be two to the 00:20:41.000 --> 00:20:43.000 n, n to the third, and my two 00:20:43.000 --> 00:20:46.000 to the n there. My superscripts got blasted out of that. 00:20:46.000 --> 00:20:47.000 And so we're trying to 00:20:47.000 --> 00:20:51.000 describe this growth curve, like, in the limit, right? We're avoiding the details when they don't matter and they don't 00:20:51.000 --> 00:20:53.000 matter when n gets big enough, right? 00:20:53.000 --> 00:20:55.000 That only the leading term, 00:20:55.000 --> 00:20:59.000 right? Is predicting, and this other stuff we're kind of just ignoring. 00:20:59.000 --> 00:21:01.000 So this is very sloppy math, 00:21:01.000 --> 00:21:05.000 right? So those of you who are more trained in the, sort of, mathematical sciences may find this 00:21:05.000 --> 00:21:09.000 a little bit alarming. Which is just how kind of fast and loose we're gonna be. We throw all these terms left and 00:21:09.000 --> 00:21:13.000 right, just kind of summarizing down to, okay, here's this leading term and how it relates to n.
00:21:13.000 --> 00:21:14.000 Everything else, right? 00:21:14.000 --> 00:21:19.000 Totally uninteresting. We could have a function, for example, that did 1,000 conversions of Celsius to 00:21:19.000 --> 00:21:20.000 Fahrenheit, 00:21:20.000 --> 00:21:24.000 but if every time you call it, it does 1,000 conversions, that means no matter 00:21:24.000 --> 00:21:28.000 what input you give to it, it doesn't change. That's considered O of one, or constant 00:21:28.000 --> 00:21:31.000 time. Similarly, right? For doing average. If we did an average that somehow 00:21:31.000 --> 00:21:32.000 00:21:32.000 --> 00:21:36.000 operated over the vector one time and another one that did it ten times or 20 times, 00:21:36.000 --> 00:21:39.000 looked at each element 20 times, right? 00:21:39.000 --> 00:21:42.000 They would still both be O of n. Whatever time they took, 00:21:42.000 --> 00:21:47.000 right? On a vector of a certain length, if we double that length we'd expect to double the time. 00:21:47.000 --> 00:21:50.000 More formally, right? In terms of what the math really means, 00:21:50.000 --> 00:21:55.000 we say that O of F of n, so big O of some function, 00:21:55.000 --> 00:22:00.000 describes an upper bound on the time required. Meaning that 00:22:00.000 --> 00:22:02.000 for sufficiently large 00:22:02.000 --> 00:22:03.000 values of n 00:22:03.000 --> 00:22:06.000 and some constant C that we get to pick, 00:22:06.000 --> 00:22:10.000 that would bound the curves. So if you imagine what the real curve looks like, it grows and 00:22:10.000 --> 00:22:13.000 maybe it flattens out or maybe it goes up very sharply. 00:22:13.000 --> 00:22:17.000 C times F of n gives you kind of some upper bound of that. A curve that stays above it 00:22:17.000 --> 00:22:20.000 past some point of n and, in the limit, right? 00:22:20.000 --> 00:22:22.000 Will dominate it from there to infinity.
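Written out symbolically, the definition being paraphrased here is the usual one:

```latex
% T grows no faster than f, up to a constant factor, past some cutoff n_0
T(n) = O(f(n)) \iff
\exists\, C > 0,\ \exists\, n_0 \ \text{such that}\
T(n) \le C \cdot f(n) \quad \text{for all } n \ge n_0
```

We get to pick both the constant C and the cutoff n_0, which is exactly why coefficients and lower-order terms can be dropped.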
00:22:22.000 --> 00:22:26.000 So describing kind of where it's gonna hit at that upper bound gives us some measure 00:22:26.000 --> 00:22:32.000 of what's going on. Okay. 00:22:32.000 --> 00:22:34.000 So we can use this to predict times. 00:22:34.000 --> 00:22:37.000 That's probably what's most valuable to us about it is knowing that 00:22:37.000 --> 00:22:41.000 I have an n, a linear algorithm. So an O of n algorithm we'll call linear. 00:22:41.000 --> 00:22:46.000 An n squared algorithm I'll call quadratic. N to the third I'll call cubic, right? 00:22:46.000 --> 00:22:50.000 That's gonna tell us what those curves look like, right? You know what a line looks like. Well, you know what a parabola 00:22:50.000 --> 00:22:54.000 looks like coming out of an n squared, where it's growing much more sharply, and early on kind of 00:22:54.000 --> 00:22:57.000 making the decision. So 00:22:57.000 --> 00:22:58.000 it might be, right? That 00:22:58.000 --> 00:23:02.000 if I know that it takes three seconds to do something for 5,000 elements then I have a linear 00:23:02.000 --> 00:23:03.000 algorithm. 00:23:03.000 --> 00:23:07.000 Then 10,000 elements, right? Should take twice as long. 20,000 should take another 00:23:07.000 --> 00:23:10.000 factor of two on that. So 12 seconds worth. Or 00:23:10.000 --> 00:23:12.000 I may have an n squared algorithm. 00:23:12.000 --> 00:23:17.000 So one that I expect to actually perform more poorly in the long run, right? This n squared 00:23:17.000 --> 00:23:21.000 versus n tells you that it's gonna more sharply grow. 00:23:21.000 --> 00:23:25.000 For some small values of n, say 5,000, it may be the case that the n squared algorithm 00:23:25.000 --> 00:23:26.000 actually outperforms 00:23:26.000 --> 00:23:31.000 the linear one in a small case. That's not uncommon actually. 00:23:31.000 --> 00:23:32.000 But, right? 00:23:32.000 --> 00:23:36.000 As it grows, right?
As I get to larger and larger values of n 00:23:36.000 --> 00:23:40.000 the fact that it is going up by a factor of four, right? The square of 00:23:40.000 --> 00:23:40.000 the doubling 00:23:40.000 --> 00:23:45.000 as opposed to linear means it's gonna quickly take off. And so I take my 5,000 elements that 00:23:45.000 --> 00:23:47.000 took two and a half seconds. 00:23:47.000 --> 00:23:50.000 I put an input that's twice as large into it. 00:23:50.000 --> 00:23:52.000 It's gonna take a factor of four, 00:23:52.000 --> 00:23:53.000 right? From there. 00:23:53.000 --> 00:23:56.000 And if I go from 10,000 to 20,000 another factor of four is gonna bring that up 00:23:56.000 --> 00:23:57.000 to 00:23:57.000 --> 00:23:58.000 40 seconds or so 00:23:58.000 --> 00:24:02.000 compared to my more modestly growing linear algorithm there. 00:24:02.000 --> 00:24:07.000 So if I was facing some task, right? Where I had an option between a linear algorithm and a 00:24:07.000 --> 00:24:09.000 quadratic algorithm, 00:24:09.000 --> 00:24:12.000 it's telling me that in the long run the quadratic algorithm, 00:24:12.000 --> 00:24:15.000 for sufficiently large inputs, right? 00:24:15.000 --> 00:24:18.000 Is going to bog down in a way that the linear one, 00:24:18.000 --> 00:24:26.000 which will be our kind of strong performer in the larger cases, won't. 00:24:26.000 --> 00:24:27.000 So some algorithms 00:24:27.000 --> 00:24:30.000 actually have a variable 00:24:30.000 --> 00:24:32.000 run time expectation, 00:24:32.000 --> 00:24:34.000 that you cannot say 00:24:34.000 --> 00:24:35.000 00:24:35.000 --> 00:24:39.000 with all assuredness how much time it will take, because depending on the input and 00:24:39.000 --> 00:24:43.000 the characteristics of the input it may do more or less work. So average looks at every 00:24:43.000 --> 00:24:47.000 element in the vector and sums them all up and does the division. It doesn't matter what elements 00:24:47.000 --> 00:24:48.000 are in there.
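The scaling arithmetic being done out loud can be sketched as two tiny helpers. This is a hypothetical illustration, not course code: given a measured time at one input size, it predicts the time at another size under a linear or a quadratic growth assumption.

```cpp
// Linear: time scales with n. Quadratic: time scales with n squared.
// Real measurements would wobble around these predictions; this just
// encodes the growth rules discussed above.
double predictLinear(double knownTime, double knownN, double newN) {
    return knownTime * (newN / knownN);
}

double predictQuadratic(double knownTime, double knownN, double newN) {
    double ratio = newN / knownN;
    return knownTime * ratio * ratio;
}
```

With the lecture's numbers: a linear algorithm at 3 seconds for 5,000 elements predicts 12 seconds for 20,000, while a quadratic one at 2.5 seconds for 5,000 predicts 10 seconds for 10,000 and 40 seconds for 20,000.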
00:24:48.000 --> 00:24:50.000 It's very reliable in that sense. 00:24:50.000 --> 00:24:54.000 Something like search. So this is a linear search function that given a vector of names 00:24:54.000 --> 00:24:56.000 and a particular key 00:24:56.000 --> 00:24:58.000 just walks the vector looking for a match. If 00:24:58.000 --> 00:25:02.000 it finds it, it returns true. If it exhaustively searched the whole thing and didn't find it, it returns 00:25:02.000 --> 00:25:04.000 false. Okay. 00:25:04.000 --> 00:25:06.000 So we've got what looks like 00:25:06.000 --> 00:25:08.000 an O of n loop 00:25:08.000 --> 00:25:09.000 that will look at the things, 00:25:09.000 --> 00:25:11.000 but there are some cases, right? 00:25:11.000 --> 00:25:15.000 Where it just doesn't do very much work at all. For example, if it finds the key in the very 00:25:15.000 --> 00:25:16.000 first spot, 00:25:16.000 --> 00:25:18.000 right? If I'm looking for 00:25:18.000 --> 00:25:22.000 Bob and Bob is the first thing in there, then I immediately return true and I do no more work. 00:25:22.000 --> 00:25:26.000 And so it doesn't matter whether Bob was followed by a million more names or ten more names, right? 00:25:26.000 --> 00:25:31.000 That, in fact, it is a constant time operation to just access that first element and return it. 00:25:31.000 --> 00:25:34.000 Similarly for other things that are close to the front. It looks at a few of them and 00:25:34.000 --> 00:25:35.000 it's done. 00:25:35.000 --> 00:25:39.000 And so we would call those the best case of the algorithm. So we can divide this into things. It's like, 00:25:39.000 --> 00:25:41.000 well, for the 00:25:41.000 --> 00:25:45.000 best case input, right? What can we expect out of the performance? Well, the best-case input would 00:25:45.000 --> 00:25:46.000 be 00:25:46.000 --> 00:25:48.000 it's found in the first few members.
00:25:48.000 --> 00:25:49.000 In which case 00:25:49.000 --> 00:25:52.000 it's an O of one algorithm for those situations. 00:25:52.000 --> 00:25:56.000 That's not often very interesting to say, well, here's a particular input that causes it to immediately 00:25:56.000 --> 00:25:58.000 be able to calculate something in return. 00:25:58.000 --> 00:26:02.000 Yeah. Not so interesting. So it's worth knowing that such a thing exists, but it turns out it's 00:26:02.000 --> 00:26:03.000 not 00:26:03.000 --> 00:26:04.000 00:26:04.000 --> 00:26:08.000 likely to tell us a lot about the general performance of this algorithm. 00:26:08.000 --> 00:26:11.000 If it's in the middle or it's in the last slot, right? We're 00:26:11.000 --> 00:26:15.000 gonna be looking at a lot of the elements to decide that it's there or not. 00:26:15.000 --> 00:26:18.000 In the absolute worst case, 00:26:18.000 --> 00:26:21.000 right? So what's the input that causes it to do the most amount of work 00:26:21.000 --> 00:26:23.000 is the one where it doesn't find it at all. Where 00:26:23.000 --> 00:26:28.000 it looks at every single element, never finding the match, and then comes out and returns that false. 00:26:28.000 --> 00:26:31.000 And the worst case, in this case, is a fairly likely thing to happen, right? 00:26:31.000 --> 00:26:34.000 We're searching because we happen to believe it might not be there 00:26:34.000 --> 00:26:35.000 and that gives us this upper bound on 00:26:35.000 --> 00:26:40.000 how bad it gets. So in the worst case it looks at everything and it is definitely O of 00:26:40.000 --> 00:26:43.000 n. So we have this kind of constant best case, 00:26:43.000 --> 00:26:44.000 an O of n worst case, 00:26:44.000 --> 00:26:47.000 and then our average case is gonna be somewhere in the middle there. 
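The search function being described looks roughly like this. A sketch with an assumed name and signature, using std::vector and std::string in place of the course's Vector class:

```cpp
#include <string>
#include <vector>

// Linear search. Best case: the key sits in slot 0 and we return after
// one comparison, O(1). Worst case: the key is absent, every element is
// examined, O(n). The average case lands in between.
bool contains(const std::vector<std::string>& names, const std::string& key) {
    for (size_t i = 0; i < names.size(); i++) {
        if (names[i] == key) {
            return true;               // found it: stop immediately
        }
    }
    return false;                      // exhaustive search came up empty
}
```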
00:26:47.000 --> 00:26:52.000 This actually is a little bit harder to predict or to compute precisely in most 00:26:52.000 --> 00:26:57.000 situations because you have to know things about, well, what are the likely range of inputs? 00:26:57.000 --> 00:27:01.000 So typically the definition is that it's averaged over all the possible inputs. 00:27:01.000 --> 00:27:06.000 Well, given the search it's pretty hard to say what are all the possible inputs for it? It's 00:27:06.000 --> 00:27:10.000 like you can make some assumptions, like, well, all permutations of the list 00:27:10.000 --> 00:27:12.000 are equally likely 00:27:12.000 --> 00:27:17.000 and the name is in the list about half the time, let's say. You just have 00:27:17.000 --> 00:27:22.000 to make up some parameters that describe what you believe to be the likely inputs 00:27:22.000 --> 00:27:26.000 and then you can say, well, if it were in the list, if it's equally likely to be 00:27:26.000 --> 00:27:27.000 in the front as in the back, 00:27:27.000 --> 00:27:30.000 then on average it's gonna look at n over two. 00:27:30.000 --> 00:27:35.000 It's just as likely to be in any of the slots. So, let's say, if you were gonna call it n times 00:27:35.000 --> 00:27:36.000 and 00:27:36.000 --> 00:27:40.000 the name you were looking for was going to be in each individual slot exactly one 00:27:40.000 --> 00:27:41.000 time, the totally random, 00:27:41.000 --> 00:27:43.000 perfect permutation case, 00:27:43.000 --> 00:27:46.000 then it would have looked at n over two of them. Sometimes it looked at one, sometimes it looked 00:27:46.000 --> 00:27:48.000 at n minus one, sometimes it looked at n 00:27:48.000 --> 00:27:52.000 over two, sometimes n over two plus one, and whatnot. That over all of them it looked at 00:27:52.000 --> 00:27:53.000 about half. 00:27:53.000 --> 00:27:57.000 And then if there was another set of calls to search for it, right?
Where it never found it, it would be looking 00:27:57.000 --> 00:27:58.000 at 00:27:58.000 --> 00:27:59.000 n, right? And so 00:27:59.000 --> 00:28:02.000 you can say, well, sometimes we look at n over two, sometimes we look at n. 00:28:02.000 --> 00:28:04.000 On average we're looking at about three-quarters of them, right? 00:28:04.000 --> 00:28:07.000 In big O, since we get to be a bit sloppy here, 00:28:07.000 --> 00:28:10.000 we can say, well, this all just boils down to being linear. That O 00:28:10.000 --> 00:28:14.000 of n still describes the growth in the average case, 00:28:14.000 --> 00:28:16.000 00:28:16.000 --> 00:28:20.000 which is the same number we got for the worst case. So this is a little 00:28:20.000 --> 00:28:23.000 bit tricky, but this is actually the one that's probably the most interesting, right? Which is just in 00:28:23.000 --> 00:28:28.000 normal practice, how is this thing gonna behave? 00:28:28.000 --> 00:28:31.000 So the last little thing I need to 00:28:31.000 --> 00:28:34.000 complete before we can go on and start applying this to do something interesting, is to talk 00:28:34.000 --> 00:28:39.000 a little bit about how we do recursive algorithms because they're a little bit trickier than the standard iteration 00:28:39.000 --> 00:28:40.000 and counting. 00:28:40.000 --> 00:28:44.000 So the counting statements are kind of saying, oh, I've got this loop followed by this loop and this thing 00:28:44.000 --> 00:28:45.000 happening, right? 00:28:45.000 --> 00:28:49.000 That will get you through the simple cases. Well, what do we do when we have a recursive algorithm, right? Well, 00:28:49.000 --> 00:28:53.000 we're trying to compute the time spent in solving something 00:28:53.000 --> 00:28:57.000 that, in effect, is making a call to the same algorithm and so we're gonna likely have 00:28:57.000 --> 00:29:00.000 some recursive definition we need to work through. 00:29:00.000 --> 00:29:02.000 So this is the factorial function.
00:29:02.000 --> 00:29:06.000 If n equals zero, return one; otherwise return n times the factorial of n minus one. 00:29:06.000 --> 00:29:08.000 We're interested in doing kind of the statement counts, or kind 00:29:08.000 --> 00:29:11.000 of summarizing the time, for an input 00:29:11.000 --> 00:29:13.000 whose value is n. 00:29:13.000 --> 00:29:17.000 And so what we write down is what's called a recurrence relation 00:29:17.000 --> 00:29:20.000 that largely matches the code in terms of saying 00:29:20.000 --> 00:29:21.000 how much work is done 00:29:21.000 --> 00:29:25.000 in the base case? In the different cases that are being handled? Typically there is a base case 00:29:25.000 --> 00:29:29.000 or two where you say, well, if it's in these cases, right? We do these easy things. 00:29:29.000 --> 00:29:31.000 So if n is exactly zero 00:29:31.000 --> 00:29:36.000 then we do O of one worth of work, right? We do a little bit of constant work here. We 00:29:36.000 --> 00:29:38.000 do a comparison and a return. 00:29:38.000 --> 00:29:42.000 In the otherwise case, when it is not zero, 00:29:42.000 --> 00:29:46.000 then we do a little bit of work. In this case, a return, a multiply, a little bit of constant work, 00:29:46.000 --> 00:29:48.000 plus whatever time it takes to compute 00:29:48.000 --> 00:29:51.000 the factorial of n minus one. 00:29:51.000 --> 00:29:54.000 So we have T of n defined in terms of 00:29:54.000 --> 00:29:58.000 T of something else, right? Which is exactly what we'd expect in a recursive function. 00:29:58.000 --> 00:30:00.000 This is called a recurrence relation, so that 00:30:00.000 --> 00:30:03.000 solving for T of 00:30:03.000 --> 00:30:05.000 n means working with 00:30:05.000 --> 00:30:09.000 a side that refers back to that same T, but on a smaller version of the problem, 00:30:09.000 --> 00:30:15.000 n over two or n minus one, some other variation of that recursive call.
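The function being analyzed, annotated with the recurrence just described:

```cpp
// Recurrence for the running time:
//   T(0) = 1            -- the test and the return: constant work
//   T(n) = 1 + T(n-1)   -- constant work plus one smaller recursive call
// which unrolls to T(n) = n + 1, i.e. O(n).
long factorial(int n) {
    if (n == 0) {
        return 1;                     // base case: O(1)
    }
    return n * factorial(n - 1);      // one multiply + T(n-1)
}
```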
00:30:15.000 --> 00:30:19.000 So let me show you how we make that go into closed form. 00:30:19.000 --> 00:30:23.000 The idea, and actually I'm just gonna do this on the board because I think it's easier if I just write what's 00:30:23.000 --> 00:30:26.000 going on and you can watch me develop it. 00:30:26.000 --> 00:30:28.000 So I have this recurrence relation. 00:30:28.000 --> 00:30:29.000 T of n 00:30:29.000 --> 00:30:31.000 equals one if 00:30:31.000 --> 00:30:33.000 n equals zero 00:30:33.000 --> 00:30:39.000 and then it's one plus T of n minus one otherwise. So I'm trying to 00:30:39.000 --> 00:30:41.000 solve for T of n 00:30:41.000 --> 00:30:45.000 into a closed form. So I'm gonna start with 00:30:45.000 --> 00:30:50.000 its non-recursive, the non-base case form. So the recursive step. And then what I'm gonna do 00:30:50.000 --> 00:30:54.000 is I'm actually just gonna apply repeated substitution 00:30:54.000 --> 00:30:56.000 to take this term and expand it out. 00:30:56.000 --> 00:31:00.000 So I know it's one plus whatever it takes to do T of n minus one. 00:31:00.000 --> 00:31:02.000 So if I go back over here and I say, well, T of 00:31:02.000 --> 00:31:03.000 n minus one 00:31:03.000 --> 00:31:04.000 must be 00:31:04.000 --> 00:31:06.000 00:31:06.000 --> 00:31:08.000 one if n minus one equals zero, or it's 00:31:08.000 --> 00:31:11.000 one plus T of n minus two. So 00:31:11.000 --> 00:31:13.000 kind 00:31:13.000 --> 00:31:17.000 of plugging it into the original formula and getting the expansion one layer in. 00:31:17.000 --> 00:31:19.000 So I can say, well, this is 00:31:19.000 --> 00:31:20.000 one plus 00:31:20.000 --> 00:31:23.000 one plus T of n minus two. And if I apply 00:31:23.000 --> 00:31:25.000 that again, 00:31:25.000 --> 00:31:27.000 right? I should get another term of 00:31:27.000 --> 00:31:35.000 00:31:35.000 --> 00:31:44.000 one.
So after I have done this i times 00:31:44.000 --> 00:31:46.000 then I will have a bunch of ones added up together 00:31:46.000 --> 00:31:49.000 and I will have subtracted an i from the n in T, 00:31:49.000 --> 00:31:51.000 00:31:51.000 --> 00:31:52.000 down to some term. So I'm just gonna ? each of 00:31:52.000 --> 00:31:54.000 these represents kind of like 00:31:54.000 --> 00:31:57.000 this is a little bit of work from the n, which calls the n minus one, which calls the n minus two. 00:31:57.000 --> 00:32:01.000 So each of the calls kind of in the stack frame has a little bit of a 00:32:01.000 --> 00:32:03.000 contribution to add to the whole thing 00:32:03.000 --> 00:32:04.000 and then what we're looking at is 00:32:04.000 --> 00:32:08.000 how many times do we have to do that expansion and substitution, right? 00:32:08.000 --> 00:32:10.000 Before we hit this base case, right? 00:32:10.000 --> 00:32:12.000 Where T of n exactly equals one. 00:32:12.000 --> 00:32:15.000 So we're looking for the case where, 00:32:15.000 --> 00:32:18.000 actually, it's n equals zero. So 00:32:18.000 --> 00:32:20.000 00:32:20.000 --> 00:32:23.000 I want to do this until n minus i equals zero. 00:32:23.000 --> 00:32:27.000 Okay. So I need to have done this i times where i is n. 00:32:27.000 --> 00:32:32.000 So if I say I set i equal to 00:32:32.000 --> 00:32:34.000 n, then I'll have one plus one plus one, 00:32:34.000 --> 00:32:37.000 n times, 00:32:37.000 --> 00:32:39.000 and then I have plus 00:32:39.000 --> 00:32:41.000 T of n minus n over here, 00:32:41.000 --> 00:32:43.000 which is my T sub zero. 00:32:43.000 --> 00:32:47.000 T sub zero, right? Immediately plugs into that base case and says, well, there's just one more thing 00:32:47.000 --> 00:32:48.000 to do when you get to that, 00:32:48.000 --> 00:32:52.000 and so what I basically have here is n plus one.
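The board work just described, written out as a derivation:

```latex
T(n) = 1 + T(n-1)
     = 1 + 1 + T(n-2)
     = \underbrace{1 + 1 + \cdots + 1}_{i\ \text{ones}} + T(n-i)
\qquad \text{set } i = n:\quad
T(n) = n + T(0) = n + 1 = O(n)
```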
00:32:52.000 --> 00:33:00.000 So n multiplications plus one little tack-on for the base case, which in terms of big O 00:33:00.000 --> 00:33:03.000 is just linear. A little bit 00:33:03.000 --> 00:33:05.000 of extra work in the noise there, but 00:33:05.000 --> 00:33:10.000 that means, kind of as seems predictable, right? That factorial of a particular input, 00:33:10.000 --> 00:33:11.000 right? Is linear 00:33:11.000 --> 00:33:17.000 in the number you ask to compute the factorial of. The factorial of ten takes ten 00:33:17.000 --> 00:33:20.000 multiplications, right? The factorial of 20 takes 20 multiplications. 00:33:20.000 --> 00:33:24.000 So if you change the size of your input, right? You double it; it should take twice as long. However much 00:33:24.000 --> 00:33:28.000 time it cost you to compute the factorial of ten, it's gonna take twice as much time to compute 00:33:28.000 --> 00:33:32.000 the factorial of 20. Okay. That kind of 00:33:32.000 --> 00:33:36.000 makes sense, but it's good to know kind of how I can do this math, 00:33:36.000 --> 00:33:38.000 right? To work this out, right? So this idea of 00:33:38.000 --> 00:33:45.000 taking the term, repeatedly substituting and expanding, generalizing my pattern, and saying, well, after i substitutions' 00:33:45.000 --> 00:33:46.000 worth, where am I at? 00:33:46.000 --> 00:33:48.000 And these correspond to kind of thinking about 00:33:48.000 --> 00:33:53.000 the recursion tree. What calls are made and then how deep does the recursion 00:33:53.000 --> 00:33:55.000 go before I hit the base case, 00:33:55.000 --> 00:33:58.000 and that tells me how to 00:33:58.000 --> 00:34:08.000 stop that expanding and then substitute back in for the base case to compute my total result. I'm gonna do another one, so 00:34:08.000 --> 00:34:10.000 if you want to just 00:34:10.000 --> 00:34:12.000 watch and we'll do 00:34:12.000 --> 00:34:13.000 the math for this together, too.
00:34:13.000 --> 00:34:17.000 This is the Towers of Hanoi example 00:34:17.000 --> 00:34:18.000 that moves the 00:34:18.000 --> 00:34:20.000 tower of n minus one off, 00:34:20.000 --> 00:34:25.000 moves the single disk on the bottom, and then moves that tower back on. 00:34:25.000 --> 00:34:37.000 And so the recurrence that we're working with 00:34:37.000 --> 00:34:42.000 is one when n equals zero. So when n equals zero, right? We have a zero-height tower to move and we 00:34:42.000 --> 00:34:46.000 actually do no work in the function, right? We just do that test and return, so if n equals zero 00:34:46.000 --> 00:34:47.000 there's no work to be done. 00:34:47.000 --> 00:34:50.000 So that's the easy case for us, right? Is 00:34:50.000 --> 00:34:53.000 when it's zero 00:34:53.000 --> 00:34:54.000 do no work. 00:34:54.000 --> 00:34:55.000 Otherwise, 00:34:55.000 --> 00:34:57.000 right? We move the bottom-most disk. 00:34:57.000 --> 00:35:01.000 So we do a little testing and moving of that disk. We'll call that the constant amount of work that's in the function 00:35:01.000 --> 00:35:02.000 itself. 00:35:02.000 --> 00:35:04.000 And it makes two recursive calls, 00:35:04.000 --> 00:35:05.000 each of a tower 00:35:05.000 --> 00:35:10.000 of height n minus one. 00:35:10.000 --> 00:35:12.000 So otherwise, right? 00:35:12.000 --> 00:35:15.000 Two calls, which gives a little clue to what the tree looks like. It'll branch twice 00:35:15.000 --> 00:35:20.000 at each stop, and then it's one closer to that base case, which gives us a sense that the recursion 00:35:20.000 --> 00:35:21.000 depth 00:35:21.000 --> 00:35:24.000 is likely to be linear here. 00:35:24.000 --> 00:35:26.000 So let me go through the process of making this work. 00:35:26.000 --> 00:35:30.000 I've got T of 00:35:30.000 --> 00:35:33.000 n equals one plus two T of n minus one.
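The procedure under analysis looks roughly like this. A sketch, with names assumed: the lecture's version animates graphical disks, so here each move is appended to a log instead.

```cpp
#include <string>
#include <vector>

// Constant work per call (the test plus one disk move) and two recursive
// calls on a tower one shorter, matching the recurrence:
//   T(0) = 1,   T(n) = 1 + 2*T(n-1)
void moveTower(int n, char from, char to, char spare,
               std::vector<std::string>& log) {
    if (n == 0) return;                       // zero-height tower: no work
    moveTower(n - 1, from, spare, to, log);   // clear n-1 disks off the top
    log.push_back("disk " + std::to_string(n) + ": " +
                  from + " -> " + to);        // move the bottom disk
    moveTower(n - 1, spare, to, from, log);   // stack them back on top
}
```

Running it for a tower of height 3 logs 7 moves, which previews the exponential count about to be derived.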
So then 00:35:33.000 --> 00:35:35.000 I 00:35:35.000 --> 00:35:37.000 take T of n minus one 00:35:37.000 --> 00:35:39.000 and I plug it in over here 00:35:39.000 --> 00:35:42.000 to get one plus two 00:35:42.000 --> 00:35:44.000 T of n minus two. 00:35:44.000 --> 00:35:47.000 This whole thing is 00:35:47.000 --> 00:35:51.000 multiplied by two though, because I have two of those, right? From the original call, 00:35:51.000 --> 00:35:53.000 which then itself made two. So, in fact, if I multiply through, 00:35:53.000 --> 00:35:55.000 right? I've got one plus 00:35:55.000 --> 00:35:59.000 two plus four 00:35:59.000 --> 00:36:01.000 T of n minus two. If 00:36:01.000 --> 00:36:03.000 I apply my substitution again, one plus two plus 00:36:03.000 --> 00:36:06.000 four times T of 00:36:06.000 --> 00:36:09.000 n minus two, where T of n minus two is one plus two T 00:36:09.000 --> 00:36:12.000 of n minus three, 00:36:12.000 --> 00:36:13.000 and then let the 00:36:13.000 --> 00:36:21.000 multiplication go through again. One plus two plus four plus eight T of n 00:36:21.000 --> 00:36:22.000 minus three. 00:36:22.000 --> 00:36:26.000 And so each expansion of this, right? Is causing the number of towers 00:36:26.000 --> 00:36:31.000 that are being moved around to go up by a factor of two. So each time I do this, 00:36:31.000 --> 00:36:32.000 right? I went from 00:36:32.000 --> 00:36:36.000 two towers to move to four towers to eight towers, but those towers each got one shorter 00:36:36.000 --> 00:36:41.000 in the process. So kind of telling us a little bit about what the recursion tree looks like, right? 00:36:41.000 --> 00:36:42.000 Is there is a 00:36:42.000 --> 00:36:48.000 branching factor of two all the way down and that 00:36:48.000 --> 00:36:50.000 the depth of this thing 00:36:50.000 --> 00:36:51.000 is gonna bottom out linearly 00:36:51.000 --> 00:36:57.000 because the tower height goes ten, nine, eight, seven, and so on down to the bottom.
00:36:57.000 --> 00:36:59.000 So if I imagined this happened i times, 00:36:59.000 --> 00:37:03.000 so to generalize my pattern, I've got one plus two 00:37:03.000 --> 00:37:06.000 plus four plus up to two 00:37:06.000 --> 00:37:09.000 to the i. So after I've done this 00:37:09.000 --> 00:37:12.000 that number of times, right? Actually it's two to the i minus one, plus two to 00:37:12.000 --> 00:37:17.000 the i T of n minus i. 00:37:17.000 --> 00:37:22.000 So I have subtracted i off the height of the tower. 00:37:22.000 --> 00:37:26.000 I have gone up by a factor of two each time I did that and then I have these sums in the front, 00:37:26.000 --> 00:37:30.000 which represent kind of the single disks that got moved in there. One plus two plus four 00:37:30.000 --> 00:37:32.000 plus eight plus so on 00:37:32.000 --> 00:37:33.000 down to there. 00:37:33.000 --> 00:37:37.000 And so the place I want to get to, right? Is where n minus i equals zero. 00:37:37.000 --> 00:37:39.000 So I actually want to set 00:37:39.000 --> 00:37:44.000 i equal to n here. I wrote that backwards, let's say i equals n. If I plug that in I've 00:37:44.000 --> 00:37:49.000 got one plus two plus four plus all the powers of two up 00:37:49.000 --> 00:37:52.000 to two to the n minus one, plus 00:37:52.000 --> 00:37:54.000 two to the n times T of n minus 00:37:54.000 --> 00:37:55.000 n, which is T of zero, 00:37:55.000 --> 00:37:58.000 which is just one. 00:37:58.000 --> 00:38:05.000 So what I've got here, 00:38:05.000 --> 00:38:11.000 if you're following along, 00:38:11.000 --> 00:38:13.000 is T of n is one plus two plus four 00:38:13.000 --> 00:38:14.000 plus up to two to the n, right? 00:38:14.000 --> 00:38:18.000 So two 00:38:18.000 --> 00:38:19.000 to the n 00:38:19.000 --> 00:38:20.000 minus 00:38:20.000 --> 00:38:23.000 one, plus two to the n times one. 00:38:23.000 --> 00:38:25.000 So I've got a geometric sum here.
00:38:25.000 --> 00:38:26.000 You may or may not 00:38:26.000 --> 00:38:30.000 already know how to solve this one, but I'm just gonna go ahead and 00:38:30.000 --> 00:38:34.000 solve it in front of you to remind you of how the process works. I've got the powers of two, 00:38:34.000 --> 00:38:36.000 one plus two plus up to two to the n. 00:38:36.000 --> 00:38:39.000 So that's the term I'm looking for. I want to call this A 00:38:39.000 --> 00:38:40.000 just to mark it. 00:38:40.000 --> 00:38:45.000 And then what I'm gonna compute for you is what two times A is 00:38:45.000 --> 00:38:50.000 and I'm gonna write it underneath. So if I multiply this by two I have one times two, which is two. 00:38:50.000 --> 00:38:52.000 Two times two, which is four, 00:38:52.000 --> 00:38:53.000 right? Four times two, 00:38:53.000 --> 00:38:55.000 which is eight. 00:38:55.000 --> 00:38:56.000 And so on. 00:38:56.000 --> 00:38:59.000 And so I should get 00:38:59.000 --> 00:39:02.000 basically the same sequence of terms, but shifted over by one. 00:39:02.000 --> 00:39:04.000 I don't have the factor of one 00:39:04.000 --> 00:39:06.000 and I do have an additional factor of two to the 00:39:06.000 --> 00:39:11.000 n times another power of two at the end. So I've got the whole term kind of shifted 00:39:11.000 --> 00:39:15.000 over by one. That's my two A, so that's the same thing I'm looking for, but doubled. 00:39:15.000 --> 00:39:17.000 And then I'm gonna basically take this thing 00:39:17.000 --> 00:39:19.000 and I'm gonna 00:39:19.000 --> 00:39:23.000 add negative A to two A. So subtracting A off of two A 00:39:23.000 --> 00:39:24.000 so that all these terms 00:39:24.000 --> 00:39:28.000 will go away. If I have this minus this, this minus this all the way across, right?
00:39:28.000 --> 00:39:30.000 The term I'll be left with, 00:39:30.000 --> 00:39:32.000 two to the n plus one 00:39:32.000 --> 00:39:35.000 minus one, is 00:39:35.000 --> 00:39:37.000 A subtracted from two A. So the 00:39:37.000 --> 00:39:38.000 T of 00:39:38.000 --> 00:39:41.000 n that I'm looking for 00:39:41.000 --> 00:39:43.000 is two to the n plus one minus one. 00:39:43.000 --> 00:39:45.000 That's my term there. And 00:39:45.000 --> 00:39:51.000 if I summarize in my big O way, 00:39:51.000 --> 00:39:56.000 ignoring the constants, throwing away these little parts of the terms, right? That are in the noise. 00:39:56.000 --> 00:39:58.000 What we really have here is something that grows, 00:39:58.000 --> 00:40:00.000 right? Exponentially, 00:40:00.000 --> 00:40:04.000 by about a factor of two for each additional 00:40:04.000 --> 00:40:05.000 00:40:05.000 --> 00:40:07.000 disk added to the tower. 00:40:07.000 --> 00:40:10.000 So however much time it took to move a tower of height ten, 00:40:10.000 --> 00:40:15.000 right? A tower of height 11 we expect to take twice as long. So not growing linearly at all, right? 00:40:15.000 --> 00:40:18.000 Much, much more sharply growing, right? That's what exponential 00:40:18.000 --> 00:40:19.000 growth looks like. 00:40:19.000 --> 00:40:22.000 Two to the n, right? Even for small values of n, 00:40:22.000 --> 00:40:27.000 ten, 20, right? Is up into the millions already in terms of disks that need to be 00:40:27.000 --> 00:40:30.000 moved. So 00:40:30.000 --> 00:40:33.000 this sort of strategy, right? Is good to know. When you're looking at these recursive things, right? 00:40:33.000 --> 00:40:37.000 Kind of having this analysis. It says, okay, well, where did I start from? 00:40:37.000 --> 00:40:41.000 What's the time? And then just being pretty mechanical. Sometimes you can get to something here that 00:40:41.000 --> 00:40:45.000 requires a little bit of algebra to work out, but the most common patterns, right?
00:40:45.000 --> 00:40:48.000 You'll start to recognize over and again, right? 00:40:48.000 --> 00:40:51.000 And this one, right? The fact that it's branching by a factor of two 00:40:51.000 --> 00:41:03.000 is telling you that by the time you get down to the base case, right? You're gonna expect to see a two to the n expansion. 00:41:03.000 --> 00:41:05.000 So just to give you 00:41:05.000 --> 00:41:09.000 a little bit of an idea of how you can use big O to predict things, 00:41:09.000 --> 00:41:13.000 based on getting some run time performance for some small values of n 00:41:13.000 --> 00:41:16.000 and then what you'd expect to see for larger values. 00:41:16.000 --> 00:41:17.000 So I have 00:41:17.000 --> 00:41:19.000 four different algorithms here. 00:41:19.000 --> 00:41:20.000 One that was 00:41:20.000 --> 00:41:24.000 just logarithmic in terms of n, so it should divide the input in half and kind of 00:41:24.000 --> 00:41:27.000 work on it. Something like binary search actually fits that profile. 00:41:27.000 --> 00:41:31.000 Something that's linear. Something that's n log n and something that's n squared. 00:41:31.000 --> 00:41:35.000 Then for different values of n I've plugged in the times. The first ones I actually ran, and then 00:41:35.000 --> 00:41:39.000 some of the other ones I had to estimate because they were taking way too long to finish. 00:41:39.000 --> 00:41:43.000 So for an input of size ten, all of them are imperceptible. These are in terms of seconds 00:41:43.000 --> 00:41:44.000 here, 00:41:44.000 --> 00:41:48.000 taking fractions of seconds. This is on my machine that is running at a couple gigahertz, so it's 00:41:48.000 --> 00:41:50.000 got about a billion instructions per second. 00:41:50.000 --> 00:41:53.000 You up that by a factor of ten, right?
00:41:53.000 --> 00:41:56.000 And they're still kind of in the subsecond range, but you can see that the 00:41:56.000 --> 00:41:59.000 n squared algorithm 00:41:59.000 --> 00:42:00.000 took a bigger jump, 00:42:00.000 --> 00:42:04.000 let's say, than the linear algorithm did, which went up by a factor of ten almost 00:42:04.000 --> 00:42:05.000 exactly. 00:42:05.000 --> 00:42:07.000 Going up by another factor of ten 00:42:07.000 --> 00:42:08.000 and, 00:42:08.000 --> 00:42:13.000 again, sort of climbing, right? 10,000, 100,000, a million, and so on. You 00:42:13.000 --> 00:42:15.000 get to something that's like 100,000, right? 00:42:15.000 --> 00:42:20.000 And an algorithm that's linear, right? Is still taking a fraction of a second. But something 00:42:20.000 --> 00:42:24.000 that was quadratic now has climbed up to take several hours. 00:42:24.000 --> 00:42:28.000 So even for inputs that don't seem particularly huge, right? The difference between having an 00:42:28.000 --> 00:42:33.000 algorithm that's gonna operate in linear time versus quadratic is gonna be felt very profoundly. And 00:42:33.000 --> 00:42:37.000 as you move to these larger numbers you have things that you just can't do, 00:42:37.000 --> 00:42:40.000 right? You cannot run an n squared algorithm on a trillion pieces of data 00:42:40.000 --> 00:42:42.000 in your lifetime. 00:42:42.000 --> 00:42:43.000 00:42:43.000 --> 00:42:48.000 It's just too much work for what the input is. Clinton? Yeah. How did it go down for a 00:42:48.000 --> 00:42:51.000 billion from 00:42:51.000 --> 00:42:53.000 like a million? 00:42:53.000 --> 00:42:55.000 Oh, you know what, 00:42:55.000 --> 00:42:59.000 that just must be wrong. You're totally right. That should definitely be a little over that. Yeah. 00:42:59.000 --> 00:43:02.000 The log n one should really be - I think that 00:43:02.000 --> 00:43:05.000 when I copied and pasted from the terminal window.
So, of course, I must have just made a mistake when I 00:43:05.000 --> 00:43:08.000 was moving something across, but you're totally right. 00:43:08.000 --> 00:43:09.000 This should be going up, 00:43:09.000 --> 00:43:13.000 right? Ever so slightly, right? This algorithm's a very, very slowly growing logarithmic function, 00:43:13.000 --> 00:43:16.000 right? Almost a flat line, but that definitely should be up a little bit, right? From 00:43:16.000 --> 00:43:18.000 where it is. 00:43:18.000 --> 00:43:19.000 When you get to a trillion, right? 00:43:19.000 --> 00:43:24.000 Even the linear algorithm is starting to actually take some noticeable time, right? 00:43:24.000 --> 00:43:28.000 But things that are logarithmic are still barely taking any time at all. So this is an example of 00:43:28.000 --> 00:43:32.000 binary search. Binary search operating on a million or a trillion items is still just 00:43:32.000 --> 00:43:33.000 in a 00:43:33.000 --> 00:43:35.000 flash, right? Telling you whether it found it or not. 00:43:35.000 --> 00:43:37.000 Doing a linear search 00:43:37.000 --> 00:43:38.000 on a trillion or a billion 00:43:38.000 --> 00:43:40.000 is 00:43:40.000 --> 00:43:41.000 several minutes' worth of 00:43:41.000 --> 00:43:45.000 my old computer's 00:43:45.000 --> 00:43:48.000 time. And so just another way to look at them, right? Is to think about what those things look 00:43:48.000 --> 00:43:51.000 like graphed on the Cartesian plane. 00:43:51.000 --> 00:43:55.000 A constant algorithm is just a flat line. It takes the same amount of time no matter how big the input is. 00:43:55.000 --> 00:43:56.000 These are 00:43:56.000 --> 00:44:00.000 pretty small values of n down here, so it just shows the early stages, right? But logarithmic, right? 00:44:00.000 --> 00:44:03.000 Almost a flat line itself. A little bit above 00:44:03.000 --> 00:44:06.000 constant, but very, very slowly growing. 00:44:06.000 --> 00:44:07.000 The 00:44:07.000 --> 00:44:09.000 linear scale, right?
Showing a straight line, and then the 00:44:09.000 --> 00:44:13.000 n squared and two to the n showing you that they're kind of heading to the hills very 00:44:13.000 --> 00:44:15.000 quickly here. So even for small values of n 00:44:15.000 --> 00:44:16.000 00:44:16.000 --> 00:44:18.000 they are reaching into the 00:44:18.000 --> 00:44:23.000 high regions, and 00:44:23.000 --> 00:44:31.000 that will be even more pronounced, right? As you move into higher and higher values of n. All right. I get to tell 00:44:31.000 --> 00:44:34.000 you a little bit about sorting before 00:44:34.000 --> 00:44:36.000 we go away. Because so far I'm just setting the stage for, like, 00:44:36.000 --> 00:44:40.000 okay, how can we use this to do something interesting? Knowing how to compare and contrast algorithms 00:44:40.000 --> 00:44:41.000 tells you something. 00:44:41.000 --> 00:44:44.000 Let's try to solve a problem, right? That 00:44:44.000 --> 00:44:47.000 lends itself to different approaches that have different algorithmic 00:44:47.000 --> 00:44:50.000 results to them that are worth thinking about. 00:44:50.000 --> 00:44:54.000 So sorting turns out to be one of the best problems to study for this because it turns out sorting 00:44:54.000 --> 00:44:55.000 is 00:44:55.000 --> 00:44:58.000 one of the things that computers do a lot of. 00:44:58.000 --> 00:45:02.000 It's very, very common that you need to keep data in sorted order. It makes it easier to 00:45:02.000 --> 00:45:06.000 search. It makes it easier to find the minimum, the maximum, and find duplicates. All these sorts of things, right? 00:45:06.000 --> 00:45:09.000 By keeping it in sorted order it tends to be more convenient to access that way. 00:45:09.000 --> 00:45:11.000 It also makes a bunch of other things, 00:45:11.000 --> 00:45:14.000 other properties about the data, easy to get at. 00:45:14.000 --> 00:45:18.000 If you want to find the top ten percent, right?
You can just pick them off, right? If they've already 00:45:18.000 --> 00:45:19.000 00:45:19.000 --> 00:45:20.000 been sorted. You want to 00:45:20.000 --> 00:45:23.000 group them into buckets for histogramming, right? All these things actually are 00:45:23.000 --> 00:45:24.000 enabled by having them 00:45:24.000 --> 00:45:28.000 already be in sorted order. So it turns out that a lot of times the data that you get 00:45:28.000 --> 00:45:31.000 from other sources tends to first need to be put in sorted order before you start doing 00:45:31.000 --> 00:45:33.000 stuff on it. Okay. 00:45:33.000 --> 00:45:36.000 Well, there's lots of different ways you can sort it. 00:45:36.000 --> 00:45:40.000 There are simple ways. There are complicated ways. There are fancy ways. There are ways that 00:45:40.000 --> 00:45:42.000 are dumb, 00:45:42.000 --> 00:45:44.000 00:45:44.000 --> 00:45:46.000 but easy to write. Ways that are smart, but hard 00:45:46.000 --> 00:45:48.000 to write. And everything in between. 00:45:48.000 --> 00:45:52.000 So we're gonna be looking at it in terms of sorting vectors because that's probably the most common data structure 00:45:52.000 --> 00:45:53.000 that needs the sorting. 00:45:53.000 --> 00:45:57.000 But you can also imagine taking the same kinds of algorithms and applying them to different kinds of data 00:45:57.000 --> 00:46:00.000 structures you have, like sorting a linked list. 00:46:00.000 --> 00:46:04.000 Okay. So I'm gonna show you a sorting algorithm. 00:46:04.000 --> 00:46:06.000 It's probably the simplest and the easiest to write. 00:46:06.000 --> 00:46:10.000 If you were stuck on a desert island without a textbook, but you happened to have 00:46:10.000 --> 00:46:11.000 a compiler 00:46:11.000 --> 00:46:15.000 and you needed to write a sorting algorithm to get your way off the island. It 00:46:15.000 --> 00:46:18.000 comes up all the time. 00:46:18.000 --> 00:46:20.000 This is probably the algorithm you're gonna use.
00:46:20.000 --> 00:46:23.000 It's called selection sort. 00:46:23.000 --> 00:46:24.000 And the idea behind selection sort 00:46:24.000 --> 00:46:26.000 is that it's going to select 00:46:26.000 --> 00:46:28.000 the smallest element 00:46:28.000 --> 00:46:30.000 and put it in the front. 00:46:30.000 --> 00:46:33.000 So if I have a big stack of test papers 00:46:33.000 --> 00:46:36.000 and I want to sort them in order of score, 00:46:36.000 --> 00:46:40.000 then I'm gonna go through there and find somebody who got the absolute lowest score. It's sad, but 00:46:40.000 --> 00:46:42.000 true. Somebody had to be there. 00:46:42.000 --> 00:46:45.000 So I kind of work my way through it and maybe I hold my finger on the one that I've seen that looks 00:46:45.000 --> 00:46:49.000 smallest so far. So this one's a 40, oh, 00:46:49.000 --> 00:46:50.000 well, this one's a 38. Okay. Oh, look, there's a 35. Oh, 00:46:50.000 --> 00:46:54.000 look, there's a 22. Nobody gets these scores on my exams, I'm just kidding. 00:46:54.000 --> 00:46:56.000 And then finally I get down to the bottom. You say, okay, 00:46:56.000 --> 00:47:00.000 that 22, that was the smallest. I have a hold of it, right? And 00:47:00.000 --> 00:47:01.000 I'm gonna basically take that 00:47:01.000 --> 00:47:03.000 and bring it to the front. 00:47:03.000 --> 00:47:05.000 00:47:05.000 --> 00:47:07.000 And in terms of managing a vector, 00:47:07.000 --> 00:47:11.000 there's actually a slight efficiency to be had: instead of kind of pulling it out of the 00:47:11.000 --> 00:47:15.000 stack and sliding everything down, I'm actually just gonna swap it with the one that's in the 00:47:15.000 --> 00:47:18.000 very front. So whoever was the current first one is gonna get booted 00:47:18.000 --> 00:47:22.000 out to pull in this one, and I'm gonna put them back where this other one came from. 00:47:22.000 --> 00:47:26.000 So I have moved the very smallest to the front of the vector.
00:47:26.000 --> 00:47:28.000 Then I just do the same thing again, 00:47:28.000 --> 00:47:32.000 but now excluding that smallest one. So I've already seen that one and kind of set it aside. I start looking 00:47:32.000 --> 00:47:35.000 at the remaining n minus one and I do the same thing again. 00:47:35.000 --> 00:47:38.000 Find the smallest of what remains, kind of just walking through, 00:47:38.000 --> 00:47:39.000 going to find it, 00:47:39.000 --> 00:47:44.000 and swap it down into the second slot, and so on. 00:47:44.000 --> 00:47:50.000 A little bit of code. 00:47:50.000 --> 00:47:55.000 Doing it kind of the way I described it, right? Tracking the minimum index, right? Looking from 00:47:55.000 --> 00:47:59.000 here to the end, so this for loop in the middle here starts at the current position and only considers 00:47:59.000 --> 00:48:01.000 from here 00:48:01.000 --> 00:48:02.000 to the very end of the 00:48:02.000 --> 00:48:05.000 vector, and then finds 00:48:05.000 --> 00:48:09.000 any element which is smaller than the one I'm kind of currently holding my finger on, 00:48:09.000 --> 00:48:13.000 and then when I'm done, after that inner loop has executed, I will know what the min index is, 00:48:13.000 --> 00:48:15.000 and then there's a swap function here 00:48:15.000 --> 00:48:17.000 that just exchanges those two things. 00:48:17.000 --> 00:48:19.000 The I slot now gets replaced with the one 00:48:19.000 --> 00:48:22.000 at min index and, 00:48:22.000 --> 00:48:25.000 again, exchanging the contents of the array. I just do that 00:48:25.000 --> 00:48:28.000 n minus one times and I will have done it all. I'd like to show 00:48:28.000 --> 00:48:32.000 you an animation, but I guess I'll just wait for that until Wednesday.
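The code being described above can be sketched as follows. This is a reconstruction, not the lecture's exact code: the lecture uses the course's own Vector type and a swap helper, and std::vector and std::swap are substituted here.

```cpp
#include <utility>
#include <vector>

// Selection sort as described in lecture: each outer pass finds the index
// of the smallest remaining element, then swaps it into the current slot.
void selectionSort(std::vector<int>& v) {
    int n = v.size();
    for (int i = 0; i < n - 1; i++) {        // n - 1 passes in total
        int minIndex = i;                     // finger on smallest seen so far
        for (int j = i + 1; j < n; j++) {     // only consider from i to the end
            if (v[j] < v[minIndex]) {
                minIndex = j;
            }
        }
        std::swap(v[i], v[minIndex]);         // exchange into position i
    }
}
```

Using the exam scores from the example: sorting {40, 38, 35, 22} first swaps the 22 to the front, then repeats on the remaining three, leaving {22, 35, 38, 40}.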
We will 00:48:32.000 --> 00:48:36.000 watch it doing its kind of thing in progress, and you can kind of be thinking about for Wednesday what other 00:48:36.000 --> 00:48:37.000 ways you might be able to sort 00:48:37.000 --> 00:48:42.000 data than just this strategy.