0:00:00.000,0:00:08.000


0:00:08.000,0:00:11.000


0:00:11.000,0:00:15.000
This presentation is delivered by the Stanford Center for Professional Development.

0:00:15.000,0:00:24.000


0:00:24.000,0:00:26.000
Good afternoon.

0:00:26.000,0:00:29.000
You guys are talkative.

0:00:29.000,0:00:32.000
So we talked just a little bit about the vocabulary of recursion at the

0:00:32.000,0:00:36.000
end of Friday. It was very rushed for time, so I'm

0:00:36.000,0:00:40.000
just gonna repeat some of those words to get us thinking about what the

0:00:40.000,0:00:42.000
context for solving problems recursively looks like. And then we're gonna go along and

0:00:42.000,0:00:45.000
do a lot of examples [cuts out].

0:00:45.000,0:00:48.000
So the idea is that a recursive function is one that calls itself.

0:00:48.000,0:00:51.000
That's really all it means

0:00:51.000,0:00:54.000
at the most trivial level. It says that in the context of writing a

0:00:54.000,0:00:56.000
function binky, it's gonna make a call

0:00:56.000,0:01:00.000
or one or more calls to binky itself past

0:01:00.000,0:01:03.000
an argument as part of solving or doing some task.

0:01:03.000,0:01:04.000
The idea is that

0:01:04.000,0:01:08.000
we're using that because the problem itself has some self-similarity where the

0:01:08.000,0:01:11.000
answer I was seeking - so the idea of surveying the whole campus can actually be thought of as well if

0:01:11.000,0:01:14.000
I could get somebody to survey this part of campus and this part of campus,

0:01:14.000,0:01:17.000
somebody to survey all the freshmen, somebody to [cuts out]

0:01:17.000,0:01:18.000
and whatnot,

0:01:18.000,0:01:21.000
that those in a sense are - those surveys are just smaller instances of the same kind

0:01:21.000,0:01:24.000
of problem I was originally trying to solve. And

0:01:24.000,0:01:26.000
if I could recursively

0:01:26.000,0:01:30.000
delegate those things out, and then they themselves may in turn delegate even

0:01:30.000,0:01:32.000
further to smaller but same

0:01:32.000,0:01:36.000
structured problems to where we could eventually get to something so simple - in

0:01:36.000,0:01:40.000
that case of asking ten people for their input

0:01:40.000,0:01:44.000
that don't require any further decomposition - we will have worked our way to this base

0:01:44.000,0:01:48.000
case that then we can gather back up all the results and solve the whole thing.

0:01:48.000,0:01:51.000
This is gonna feel very mysterious at first,

0:01:51.000,0:01:53.000
and some of the examples I give you'll say I have

0:01:53.000,0:01:56.000
really easy alternatives other than recursion, so it's not gonna seem worth

0:01:56.000,0:01:58.000
the pain of trying to get your head around it.

0:01:58.000,0:02:03.000
But eventually, we're gonna work our way up to problems where recursion really is the right solution,

0:02:03.000,0:02:06.000
and there are no other alternatives that are obvious or simple

0:02:06.000,0:02:07.000
to do.

0:02:07.000,0:02:07.000
So the

0:02:07.000,0:02:11.000
idea throughout this week is actually just a lot of practice. Telling you

0:02:11.000,0:02:13.000
what the terms mean I think is not actually gonna help you understand it. I think what you

0:02:13.000,0:02:18.000
need to see is examples. So I'm gonna be doing four or five examples today, four or five

0:02:18.000,0:02:20.000
examples on Wednesday, and four or five examples on Friday

0:02:20.000,0:02:22.000
that each kind of

0:02:22.000,0:02:25.000
build on each other, kind of take the ideas

0:02:25.000,0:02:27.000
and get a little more sophisticated.

0:02:27.000,0:02:28.000
But

0:02:28.000,0:02:31.000
by the end of the week, I'm hoping that you're gonna start to see these patterns, and realize

0:02:31.000,0:02:35.000
that in some sense the recursive solutions tend to be more alike than different.

0:02:35.000,0:02:37.000
Once you have your head around how to solve

0:02:37.000,0:02:41.000
one type of problem, you may very well be able to take the exact same technique

0:02:41.000,0:02:43.000
and solve several other problems that

0:02:43.000,0:02:46.000
may sound different at first glance, but in the end, the recursive structure looks the

0:02:46.000,0:02:47.000
same.

0:02:47.000,0:02:54.000
So I'd say just hold the discomfort a little bit, and wait to see as

0:02:54.000,0:02:57.000
we keep working which example may be the one that kind of sticks out for you to

0:02:57.000,0:02:59.000
help you get it through.

0:02:59.000,0:03:02.000
So we're gonna start with things that fit in the category of

0:03:02.000,0:03:03.000
functional recursion.

0:03:03.000,0:03:05.000
The functional in this case just says that

0:03:05.000,0:03:09.000
you're writing functions that return some non-void thing, an integer, a string,

0:03:09.000,0:03:10.000
some

0:03:10.000,0:03:12.000
vector of results, or whatever that is,

0:03:12.000,0:03:14.000
and that's all it means to be a functional recursion. It's a recursive

0:03:14.000,0:03:18.000
function that has a result to it. And because of the recursive nature, it's gonna

0:03:18.000,0:03:22.000
say that the outer problem's result, the answer to the larger problem is gonna be

0:03:22.000,0:03:26.000
based on making calls, one or more calls to the function itself to get the answer

0:03:26.000,0:03:30.000
to a smaller problem, and then adding them, multiplying them, comparing them to decide

0:03:30.000,0:03:33.000
how to formulate the larger answer.

0:03:33.000,0:03:35.000
All recursive

0:03:35.000,0:03:35.000
code

0:03:35.000,0:03:38.000
follows the same decomposition

0:03:38.000,0:03:42.000
into two cases. Sometimes there's some subdivisions within there, but two general

0:03:42.000,0:03:43.000
cases.

0:03:43.000,0:03:45.000
The first thing - this base case.

0:03:45.000,0:03:48.000
That's something that the recursion eventually has to stop.

0:03:48.000,0:03:52.000
We keep saying take the task and break it down, make it a little smaller,

0:03:52.000,0:03:55.000
but at some point we really have to stop doing that. We can't

0:03:55.000,0:03:57.000
go on infinitely.

0:03:57.000,0:04:00.000
There has to be some base case, the simplest possible version of the problem that

0:04:00.000,0:04:05.000
you can directly solve. You don't need to make any further recursive calls. So the idea

0:04:05.000,0:04:10.000
is that it kinda bottoms out there, and then allows the recursion to kind of

0:04:10.000,0:04:10.000
unwind.

0:04:10.000,0:04:12.000
The recursive cases,

0:04:12.000,0:04:15.000
and there may be one or more of these, are the cases where it's not that simple, that the

0:04:15.000,0:04:20.000
answer isn't directly solvable, but if you had the answer to a smaller, simpler

0:04:20.000,0:04:20.000
version, you

0:04:20.000,0:04:24.000
would be able to assemble that answer you're looking for

0:04:24.000,0:04:27.000
using that information from the recursive call.

0:04:27.000,0:04:31.000
So that's the kind of structure that they all look like. If I'm

0:04:31.000,0:04:34.000
at the base case, then do the base case and return it.

0:04:34.000,0:04:37.000
Otherwise, make some recursive calls, and

0:04:37.000,0:04:43.000
use that to return a result from this iteration.

0:04:43.000,0:04:45.000
So let's first look at something that

0:04:45.000,0:04:48.000
- the first couple examples that I'm gonna show you actually are gonna be so

0:04:48.000,0:04:52.000
easy that in some sense they're almost gonna be a little bit counterproductive because they're gonna

0:04:52.000,0:04:54.000
teach you that recursion lets you do things that you already

0:04:54.000,0:04:57.000
know how to do. And then I'm gonna work my way up to ones that actually

0:04:57.000,0:04:59.000
get beyond that.

0:04:59.000,0:05:02.000
But let's look at first the raise to power.

0:05:02.000,0:05:05.000
C++ has no built-in exponentiation operator. There's nothing that raises

0:05:05.000,0:05:07.000
a base to a particular exponent

0:05:07.000,0:05:08.000
in the operator set.

0:05:08.000,0:05:12.000
So if you want it, you need to write it, or you can use - there's a math library called

0:05:12.000,0:05:15.000
pow for raise to power. We're gonna run our own version of it

0:05:15.000,0:05:17.000
because it's gonna give us some practice thing about this.

0:05:17.000,0:05:20.000
The first one I'm gonna show you is one that should feel very familiar and

0:05:20.000,0:05:24.000
very intuitive, which is using an iterative formulation. If I'm

0:05:24.000,0:05:27.000
trying to raise the base to the exponent, then that's really simply

0:05:27.000,0:05:30.000
multiplying base by itself exponent times.

0:05:30.000,0:05:31.000
So this one

0:05:31.000,0:05:32.000
uses a for loop

0:05:32.000,0:05:37.000
and does so. It starts the result at one, and for

0:05:37.000,0:05:40.000
iterations through the number of exponents keeps

0:05:40.000,0:05:46.000
multiplying to get there. So

0:05:46.000,0:05:48.000
that one's fine, and it will perfectly work.

0:05:48.000,0:05:51.000
I'm gonna show you this alternative way

0:05:51.000,0:05:54.000
that starts you thinking about what it means to divide a problem up in a

0:05:54.000,0:05:56.000
recursive strategy.

0:05:56.000,0:06:00.000
Base to the exponent - I wanna raise five to the tenth power. If

0:06:00.000,0:06:05.000
I had around some delegate, some

0:06:05.000,0:06:08.000
clone of myself that I could dispatch to solve

0:06:08.000,0:06:11.000
the slightly smaller problem of computing five to the ninth power,

0:06:11.000,0:06:15.000
then all I would need to do is take that answer and multiply it by one more five, and I'd get

0:06:15.000,0:06:17.000
five to the tenth.

0:06:17.000,0:06:19.000
Okay.

0:06:19.000,0:06:22.000
If I write code

0:06:22.000,0:06:25.000
that's based on that,

0:06:25.000,0:06:28.000
then I end up with something here - and I'm gonna

0:06:28.000,0:06:31.000
let these two things go through to show us

0:06:31.000,0:06:32.000
that

0:06:32.000,0:06:34.000
to compute the answer to five

0:06:34.000,0:06:36.000
to the tenth power,

0:06:36.000,0:06:38.000
what I really need is the answer to five

0:06:38.000,0:06:42.000
to the ninth power, which I do by making a recursive call to the same function I'm in

0:06:42.000,0:06:43.000
the middle of writing.

0:06:43.000,0:06:45.000
So this is raise that I'm

0:06:45.000,0:06:49.000
defining, and in the body of it, it makes a call to raise. That's what is the

0:06:49.000,0:06:51.000
mark of a recursive function here. I

0:06:51.000,0:06:55.000
pass slightly different arguments. In this case, one smaller exponent

0:06:55.000,0:06:57.000
which is getting a little bit closer

0:06:57.000,0:07:00.000
to that simplest possible case

0:07:00.000,0:07:02.000
that we will eventually terminate at where

0:07:02.000,0:07:07.000
we can say I don't need to further dispatch any delegates and any clones out there

0:07:07.000,0:07:08.000
to do the work, but if

0:07:08.000,0:07:11.000
the exponent I'm raising it to is zero, by

0:07:11.000,0:07:16.000
definition anything raised to the exponent of zero is one, I could just stop there.

0:07:16.000,0:07:19.000
So when computing five to the tenth, we're gonna

0:07:19.000,0:07:21.000
see some recursion at work. Let me

0:07:21.000,0:07:24.000
take this code into the compiler

0:07:24.000,0:07:27.000


0:07:27.000,0:07:31.000
so that we can see a little bit about how this actually works

0:07:31.000,0:07:32.000
in terms of that.

0:07:32.000,0:07:35.000
So that's exactly the code I have there, but

0:07:35.000,0:07:42.000
I can say

0:07:42.000,0:07:43.000
something that I

0:07:43.000,0:07:46.000
know the answer to. How about that?

0:07:46.000,0:07:49.000


0:07:49.000,0:07:51.000
First, we'll take a look at it

0:07:51.000,0:07:55.000
doing its work, so five times five times five should be 125.

0:07:55.000,0:07:57.000
So we can test a couple

0:07:57.000,0:07:58.000
little

0:07:58.000,0:08:01.000
values while we're at it. Two the sixth power, that should be

0:08:01.000,0:08:07.000
64, just to see a couple of the cases, just to feel good about what's going on.

0:08:07.000,0:08:10.000
And then raising say 23 to the zero power

0:08:10.000,0:08:11.000
should be one

0:08:11.000,0:08:14.000
as anything raised to the zero power should be.

0:08:14.000,0:08:16.000
So a little bit of spot testing

0:08:16.000,0:08:18.000
to feel good about what's going on.

0:08:18.000,0:08:22.000
Now I'm gonna go back to this idea like two to the sixth.

0:08:22.000,0:08:24.000
And I'm gonna set a breakpoint here.

0:08:24.000,0:08:25.000
Get my breakpoint out.

0:08:25.000,0:08:36.000
And I'm gonna run this guy in the debugger.

0:08:36.000,0:08:39.000
Takes a little bit longer to get the debugger up and running,

0:08:39.000,0:08:44.000
so I'll have to make a little padder up while we're going here.

0:08:44.000,0:08:47.000
And then it tells me right now it's breaking on raise, and

0:08:47.000,0:08:53.000
I can look around in the debugger. This is a - did not

0:08:53.000,0:08:56.000
pick up my compilation? I think it did not. I

0:08:56.000,0:08:59.000
must not have saved it right before I did it because it's actually got the base is 23 and the

0:08:59.000,0:09:00.000
exponent is zero. It turns

0:09:00.000,0:09:06.000
out I don't wanna see that case, so I'm gonna go back and

0:09:06.000,0:09:08.000
try again. I

0:09:08.000,0:09:16.000
wanna see it -

0:09:16.000,0:09:24.000
no, I did not.

0:09:24.000,0:09:28.000
And I'm just interested in knowing a little bit about the mechanics of what's gonna happen in a

0:09:28.000,0:09:29.000
recursive situation.

0:09:29.000,0:09:33.000
If I look at the first time that I hit my breakpoint,

0:09:33.000,0:09:35.000
then I'll see that there's a little bit of the beginnings of the

0:09:35.000,0:09:39.000
student main, some stuff behind it. There's a little bit of magic underneath your

0:09:39.000,0:09:43.000
stack that you don't really need to know about, but starting from main it went into raise,

0:09:43.000,0:09:46.000
and the arguments it has there is the base is two, the exponent is six.

0:09:46.000,0:09:50.000
If I continue from here, then you'll notice the stack frame got one deeper.

0:09:50.000,0:09:53.000
There's actually another indication of raise, and in fact, they're both active at the

0:09:53.000,0:09:54.000
same time.

0:09:54.000,0:09:57.000
The previous raise that was trying to compute two to the sixth

0:09:57.000,0:10:01.000
is kind of in stasis back there waiting for the answer to come back from

0:10:01.000,0:10:05.000
this version, which is looking to raise two to the fifth power. I continue again,

0:10:05.000,0:10:08.000
I get two to the fourth. I keep doing this. I'm gonna see these guys kinda stack up,

0:10:08.000,0:10:12.000
each one of those kind of waiting for the delegate or the clone

0:10:12.000,0:10:17.000
to come back with that answer, so that then it can do its further work

0:10:17.000,0:10:19.000
incorporating that result to compute the thing it needed to do.

0:10:19.000,0:10:22.000
I get down here to raising

0:10:22.000,0:10:24.000
two to the first power,

0:10:24.000,0:10:26.000
and then finally I get to two to the zero,

0:10:26.000,0:10:29.000
so now I've got these eight or so stacked frames,

0:10:29.000,0:10:30.000
six up there.

0:10:30.000,0:10:32.000
This one, if I step

0:10:32.000,0:10:34.000
from here,

0:10:34.000,0:10:37.000
it's gonna hit the base case of returning one,

0:10:37.000,0:10:39.000
and then

0:10:39.000,0:10:43.000
we will end up kind of working our way back out. So now, we are

0:10:43.000,0:10:46.000
at the end of the two to the one case

0:10:46.000,0:10:49.000
that's using the answer it got from the other one, multiplying it by two.

0:10:49.000,0:10:53.000
Now I'm at the two to the two case, and so each of them's kind of unfolding in the stack is

0:10:53.000,0:10:55.000
what's called unwinding. It's popping back off

0:10:55.000,0:10:57.000
the stack frames that are there

0:10:57.000,0:11:01.000
and kind of revisiting them, each passing up the information it got back,

0:11:01.000,0:11:04.000
and eventually telling us that

0:11:04.000,0:11:08.000
the answer was 64, so I will let that go.

0:11:08.000,0:11:15.000


0:11:15.000,0:11:17.000
So the idea that

0:11:17.000,0:11:21.000
all of those stack frames kind of exist at the same time - they're all being maintained

0:11:21.000,0:11:24.000
independently, so the idea that the compiler isn't confused by the idea that raise is

0:11:24.000,0:11:26.000
invoking raise which is invoking raise,

0:11:26.000,0:11:30.000
that each of the raise stack frames is distinct from the other ones, so the

0:11:30.000,0:11:33.000
indications are actually kept separate. So one had two to the sixth, the next one had two to

0:11:33.000,0:11:35.000
the fifth, and so on.

0:11:35.000,0:11:36.000
And then eventually

0:11:36.000,0:11:37.000
we

0:11:37.000,0:11:40.000
need to make some progress toward that base case, so that we can then

0:11:40.000,0:11:42.000
stop that recursion

0:11:42.000,0:11:43.000
and unwind.

0:11:43.000,0:11:47.000
Let me actually show you something while I'm here, which is one thing that's

0:11:47.000,0:11:48.000
a pretty common mistake

0:11:48.000,0:11:51.000
to make early on in a recursion is to somehow fail to make progress toward

0:11:51.000,0:11:53.000
that base case

0:11:53.000,0:11:57.000
or to

0:11:57.000,0:11:58.000
-

0:11:58.000,0:12:01.000
not all cases make it to the base case. For example, if

0:12:01.000,0:12:03.000
I did something where

0:12:03.000,0:12:06.000
I forgot to

0:12:06.000,0:12:07.000
subtract one [cuts out],

0:12:07.000,0:12:10.000
and I said oh yeah, I need to [cuts out]. In this case, I'm passing it exactly the

0:12:10.000,0:12:13.000
same arguments I got.

0:12:13.000,0:12:16.000
If I do this and I run this guy,

0:12:16.000,0:12:19.000
then what's gonna happen is it's gonna go two to the sixth, two to the sixth, two to the sixth,

0:12:19.000,0:12:20.000
and I'm gonna

0:12:20.000,0:12:21.000
let go of

0:12:21.000,0:12:23.000
this breakpoint here because

0:12:23.000,0:12:30.000
I don't really wanna watch it all happening.

0:12:30.000,0:12:31.000
And there it goes.

0:12:31.000,0:12:35.000
Loading 6,493 stack frames, zero percent completed, so

0:12:35.000,0:12:38.000
that's gonna take a while as you

0:12:38.000,0:12:41.000
can imaging. And usually, once you see this error message, it's your clue to say I

0:12:41.000,0:12:45.000
can cancel, I know what happened. The only reason I should've

0:12:45.000,0:12:46.000
gotten

0:12:46.000,0:12:50.000
6,500 stack frames loaded up is because I made a mistake that

0:12:50.000,0:12:53.000
caused the stack to totally overflow.

0:12:53.000,0:12:57.000
So the behavior in C++ or C is that when you have

0:12:57.000,0:13:01.000
so many of those stack frames, eventually the space that's been allocated or set

0:13:01.000,0:13:03.000
aside for the function stack will be exhausted.

0:13:03.000,0:13:05.000
It will use all the space it has,

0:13:05.000,0:13:09.000
and run up against a boundary, and typically report it in some way that

0:13:09.000,0:13:13.000
suggests - sometimes you'll see stack overflow, stack out of memory error.

0:13:13.000,0:13:14.000


0:13:14.000,0:13:15.000
In this case, it's showing you the

0:13:15.000,0:13:17.000
7,000 stack frames that

0:13:17.000,0:13:20.000
are behind you, and then if you were to examine them you would see they all have exactly the

0:13:20.000,0:13:24.000
same argument, so they weren't getting anywhere.

0:13:24.000,0:13:27.000
I'm not gonna actually let it do that because I'm

0:13:27.000,0:13:28.000
too impatient.

0:13:28.000,0:13:29.000
Let

0:13:29.000,0:13:30.000
me fix this code while I'm here.

0:13:30.000,0:13:33.000
But other things like even this code that actually looks correct for

0:13:33.000,0:13:39.000
some situations also has a subtle bug in it. Even if

0:13:39.000,0:13:40.000
I fixed this, which is

0:13:40.000,0:13:44.000
that, right now it assumed that the exponent is positive,

0:13:44.000,0:13:46.000
that it's some number

0:13:46.000,0:13:47.000


0:13:47.000,0:13:49.000
that I can subtract my way down to zero.

0:13:49.000,0:13:53.000
If I actually miscalled raise, and I gave it a negative exponent, it would

0:13:53.000,0:13:57.000
go into infinite recursion as well. If you started it

0:13:57.000,0:14:00.000
at ten to the negative first power, it would go negative first, negative second, negative

0:14:00.000,0:14:02.000
third.

0:14:02.000,0:14:05.000
6,500 stack frames later, we'd run out of space.

0:14:05.000,0:14:08.000
In this case, since we're only intending to handle those positive powers, we

0:14:08.000,0:14:12.000
could just put an error that was like if the exponent is less than zero then

0:14:12.000,0:14:14.000
raise an error and don't even

0:14:14.000,0:14:21.000
try to do anything with it. Okay. So let

0:14:21.000,0:14:25.000
me show you a slightly different way of doing this that's also recursive,

0:14:25.000,0:14:28.000
but that actually gets the answer a little bit more efficiently.

0:14:28.000,0:14:31.000
This is a different way of dividing it up, but still using a

0:14:31.000,0:14:36.000
recursive strategy which is that if I'm trying to compute five to the tenth power,

0:14:36.000,0:14:40.000
but if I have the answer not of five to ninth power, but instead I have the

0:14:40.000,0:14:42.000
answer of five to the fifth power,

0:14:42.000,0:14:44.000
and then I multiply that by itself,

0:14:44.000,0:14:48.000
I would get that five to the tenth power that I seek.

0:14:48.000,0:14:51.000
And then there's a little bit of a case though of what if the power I was trying to

0:14:51.000,0:14:54.000
get was odd, if I was trying to raise it to the eleventh power, I could compute the

0:14:54.000,0:14:58.000
half power which get me to the fifth - multiplied by the fifth, but then I need

0:14:58.000,0:14:59.000
one more

0:14:59.000,0:15:04.000
base multiplied in there to make up for that half. Okay.

0:15:04.000,0:15:05.000
And so I can write

0:15:05.000,0:15:08.000
another recursive formulation here. Same

0:15:08.000,0:15:11.000
sort of base case about

0:15:11.000,0:15:14.000
detecting when we've gotten down, but then in this case the recursive call we make is

0:15:14.000,0:15:17.000
to base of the exponent divided by two,

0:15:17.000,0:15:19.000
and then we use it with an

0:15:19.000,0:15:23.000
if else whether the exponent was even or odd, it decided whether to just square

0:15:23.000,0:15:25.000
that number or whether to square it and

0:15:25.000,0:15:28.000
tack in another power of the base while we're at it.

0:15:28.000,0:15:31.000
So this one is gonna be much more quick about getting down to it,

0:15:31.000,0:15:34.000
whereas the one we saw was gonna put one stack frame down for every

0:15:34.000,0:15:35.000


0:15:35.000,0:15:39.000
successive exponent power. So if you wanted to raise something to the 10th, or 20th,

0:15:39.000,0:15:41.000
or 30th power, then you were

0:15:41.000,0:15:43.000
giving yourself 10, 20, 30 stack frames.

0:15:43.000,0:15:46.000
Something like 30 stack frames is not really something to be worried

0:15:46.000,0:15:48.000
about, but if you were really trying to make this work on much larger

0:15:48.000,0:15:49.000
numbers,

0:15:49.000,0:15:52.000
which would require some other work because exponent

0:15:52.000,0:15:55.000
is a very rapidly growing operator and would overflow your integers quickly,

0:15:55.000,0:15:57.000
but this way

0:15:57.000,0:16:00.000
very quickly divides in half. So it goes from a power of 100, to a power of

0:16:00.000,0:16:05.000
50, to 25, to 12, to 6, to 3, to 2, to 1, so

0:16:05.000,0:16:08.000
that dividing in half is a much quicker way to work our way down to that base case

0:16:08.000,0:16:12.000
and get our answer back, and we're doing a lot fewer calculations

0:16:12.000,0:16:14.000
than all those multiplies,

0:16:14.000,0:16:16.000
one per

0:16:16.000,0:16:19.000
base. So just a little

0:16:19.000,0:16:21.000
diversion.

0:16:21.000,0:16:22.000
So let me tell you something,

0:16:22.000,0:16:23.000
just a little bit about

0:16:23.000,0:16:27.000
kind of style as it applies to recursion.

0:16:27.000,0:16:28.000
Recursion is

0:16:28.000,0:16:34.000
really best when you can express it in the most kind of direct, clear, and

0:16:34.000,0:16:36.000
simple code.

0:16:36.000,0:16:38.000
It's hard enough to get your head around a recursive formulation

0:16:38.000,0:16:42.000
without complicating it by having a bunch of extraneous parts where

0:16:42.000,0:16:43.000
you're

0:16:43.000,0:16:47.000
doing more work than necessary, or redundantly handling certain things. And

0:16:47.000,0:16:49.000
one thing that's actually very easy to fall in the trap of

0:16:49.000,0:16:52.000
is thinking about there's lots of other base cases you might be able to

0:16:52.000,0:16:56.000
easily handle, so why not just go ahead and call out for them,

0:16:56.000,0:16:57.000
test for

0:16:57.000,0:16:59.000
- you're at the base case. You're close to the base case.

0:16:59.000,0:17:02.000
Checking before you might make a recursive call, if you're gonna hit the base case when you

0:17:02.000,0:17:08.000
make that call, then why make the call. I'll just anticipate and get the answer it

0:17:08.000,0:17:09.000
would've returned anyway.

0:17:09.000,0:17:13.000
It can lead to code that looks a little bit like this before you're done. If the

0:17:13.000,0:17:16.000
exponent's zero, that's one. If the exponent's one, then I can just

0:17:16.000,0:17:19.000
return the base. If it's two, then I can just multiply the base by itself. If it's

0:17:19.000,0:17:20.000
three, I can start doing this.

0:17:20.000,0:17:22.000
Certainly, you

0:17:22.000,0:17:25.000
can follow this to the point of absurdity, and even for some of the simple

0:17:25.000,0:17:28.000
cases, it might seem like you're saving yourself a little bit of extra time. You're like

0:17:28.000,0:17:31.000
why go back around and let it make another recursive call. I could stop it right

0:17:31.000,0:17:33.000
here. It's easy to know that answer.

0:17:33.000,0:17:36.000
But as you do this, it complicates the code. It's a little harder to read. It's

0:17:36.000,0:17:38.000
a little harder to

0:17:38.000,0:17:39.000
debug.

0:17:39.000,0:17:43.000
Really, the expense of making that one extra call, two extra calls

0:17:43.000,0:17:46.000
is not the thing to be worried about.

0:17:46.000,0:17:50.000
What we really want is the cleanest code that expresses what we do, has the

0:17:50.000,0:17:53.000
simplest cases to test, and to examine, and to follow,

0:17:53.000,0:17:56.000
and not muck it up with things that in

0:17:56.000,0:17:57.000
effect

0:17:57.000,0:18:01.000
don't change the computational power of this but just introduce the opportunity for error.

0:18:01.000,0:18:02.000
Instead of a

0:18:02.000,0:18:05.000
multiply here, I accidentally put a plus.

0:18:05.000,0:18:07.000
It might be easy to overlook it

0:18:07.000,0:18:08.000
and not realize what I've done,

0:18:08.000,0:18:10.000
and then

0:18:10.000,0:18:14.000
end up computing the wrong answer when it gets to the base case of two. In fact, if

0:18:14.000,0:18:15.000
you do it this

0:18:15.000,0:18:17.000
way, most things would stop at three,

0:18:17.000,0:18:20.000
but these would suddenly become kind of their own special cases that were only

0:18:20.000,0:18:25.000
hit if you directly called them with the two. The recursive cases will all stop earlier. It

0:18:25.000,0:18:28.000
just complicates your testing pass now because not everything is going through

0:18:28.000,0:18:31.000
the same code.

0:18:31.000,0:18:34.000
So we call this arm's length recursion. I put a big X on that

0:18:34.000,0:18:35.000
looking ahead,

0:18:35.000,0:18:36.000
testing before you get there.

0:18:36.000,0:18:41.000
Just let the code fall through.

0:18:41.000,0:18:43.000
So

0:18:43.000,0:18:45.000
Dan, he's not here today,

0:18:45.000,0:18:48.000
but he talked to me at the end of Friday's class, and it made

0:18:48.000,0:18:49.000
me

0:18:49.000,0:18:53.000
wanna actually just give you a little bit of

0:18:53.000,0:18:54.000
insight into

0:18:54.000,0:18:57.000
recursion as it relates to

0:18:57.000,0:18:58.000


0:18:58.000,0:19:02.000
efficiency. Recursion by itself, just the idea of applying a recursive technique to solving a

0:19:02.000,0:19:06.000
problem, does not give you any guarantee that it's gonna be the best solution, the

0:19:06.000,0:19:09.000
most efficient solution, or the fact that it's gonna give you a very inefficient solution.

0:19:09.000,0:19:11.000
Sometimes people have kind of

0:19:11.000,0:19:11.000
a

0:19:11.000,0:19:15.000
bad rap on recursion. It's like recursion will definitely be inefficient.

0:19:15.000,0:19:17.000
That actually is not guaranteed.

0:19:17.000,0:19:21.000
Recursion often requires exactly the same resources as the iterative approach.

0:19:21.000,0:19:24.000
It takes the same amount of effort. Surveying the campus - if you're gonna

0:19:24.000,0:19:27.000
survey the 10,000 people on campus and get everybody's information back,

0:19:27.000,0:19:30.000
whether you're doing it divide and conquer, or whether you're sitting out there in White

0:19:30.000,0:19:30.000
Plaza

0:19:30.000,0:19:34.000
asking each person one by one, in the end 10,000 people got survey.

0:19:34.000,0:19:36.000
The recursion is not

0:19:36.000,0:19:39.000
part of what made that longer or shorter. It might actually depending on how you

0:19:39.000,0:19:42.000
have resources work out better. Like if you could get a bunch of people in parallel

0:19:42.000,0:19:46.000
surveying, you might end up completing the whole thing in less time, but it required

0:19:46.000,0:19:49.000
more people, and more clipboards, and more paper

0:19:49.000,0:19:52.000
while the process is ongoing than me standing there with one piece of paper and a

0:19:52.000,0:19:57.000
clipboard. But then again, it took lots of my time to do it.

0:19:57.000,0:20:00.000
So in many situations, it's really no better or no worse

0:20:00.000,0:20:02.000
than the alternative. It makes a little bit of some tradeoffs of where the time is

0:20:02.000,0:20:03.000
spent.

0:20:03.000,0:20:06.000
There are situations though where recursion can actually make something that was

0:20:06.000,0:20:09.000
efficient inefficient. There are situations where it can take something that was inefficient and

0:20:09.000,0:20:11.000
make it efficient. So

0:20:11.000,0:20:15.000
it's more of a case-by-case basis to decide whether recursion is the

0:20:15.000,0:20:18.000
right tool if efficiency is one of your primary concerns.

0:20:18.000,0:20:21.000
I will say that for problems with simple iterative solutions

0:20:21.000,0:20:24.000
that operate relatively efficiently, iteration

0:20:24.000,0:20:26.000
is probably the best way to solve it.

0:20:26.000,0:20:27.000
So like

0:20:27.000,0:20:31.000
the raise to power, yeah you could certainly do that

0:20:31.000,0:20:32.000
iteratively. There's not

0:20:32.000,0:20:35.000
some huge advantage that recursion is offering. I'm using them here because they're

0:20:35.000,0:20:37.000
simple enough to get our head around.

0:20:37.000,0:20:39.000
We're gonna work our way toward problems where

0:20:39.000,0:20:42.000
we're gonna find things that require recursion, which

0:20:42.000,0:20:45.000
is kind of one of the third points I put in. Why do we learn recursion? What's recursion gonna be good

0:20:45.000,0:20:48.000
for? First off, recursion is awesome.

0:20:48.000,0:20:52.000
There are some problems that you just can't solve using anything but recursion,

0:20:52.000,0:20:55.000
so that the alternatives like I'll just write it iteratively won't work.

0:20:55.000,0:20:59.000
You'll try, but you'll fail. They inherently have a recursive

0:20:59.000,0:20:59.000
structure to them

0:20:59.000,0:21:02.000
where recursion is the right tool for the job.

0:21:02.000,0:21:06.000
Often, it produces the most beautiful, direct, and clear elegant code.

0:21:06.000,0:21:08.000
The next assignment that will go out

0:21:08.000,0:21:11.000
has these problems that you do in recursion, and they're each about ten lines

0:21:11.000,0:21:13.000
long. Some of them are like five lines long.

0:21:13.000,0:21:18.000
They do incredible things in five lines because of the descriptive power of

0:21:18.000,0:21:19.000
describing it as

0:21:19.000,0:21:22.000
here's a base case, and here's a recursive case, and everything else just follows

0:21:22.000,0:21:23.000
from

0:21:23.000,0:21:26.000
applying this, and reducing the problem step by step.

0:21:26.000,0:21:28.000
So you will see things where

0:21:28.000,0:21:31.000
the recursive code is just beautiful, clean, elegant, easy to understand, easy to follow,

0:21:31.000,0:21:32.000
easy to test,

0:21:32.000,0:21:34.000
solves the recursive problem. And in

0:21:34.000,0:21:37.000
those cases, it really is a much better answer than trying to hack up some

0:21:37.000,0:21:39.000
other iterative form

0:21:39.000,0:21:41.000
that may in the end be

0:21:41.000,0:21:44.000
no more efficient. It may be even less efficient. So don't let efficiency be

0:21:44.000,0:21:46.000
kind of a big

0:21:46.000,0:21:52.000
fear of what recursion is for you. So I'm gonna do more examples.

0:21:52.000,0:21:54.000


0:21:54.000,0:21:56.000
I've got three more examples,

0:21:56.000,0:21:58.000
or four I think today, so

0:21:58.000,0:22:01.000
I will just keep

0:22:01.000,0:22:03.000
showing you different things, and hopefully

0:22:03.000,0:22:05.000
the patterns will start to kind of

0:22:05.000,0:22:07.000
come out of the mist for you.

0:22:07.000,0:22:10.000
A palindrome string is one that reads the same forwards and backwards

0:22:10.000,0:22:14.000
when reversed. So was it a car or a cat I saw, if you read that

0:22:14.000,0:22:16.000
backwards, it turns out

0:22:16.000,0:22:17.000
it

0:22:17.000,0:22:21.000
says the same thing. Go hang a salami, I'm a lasagna

0:22:21.000,0:22:22.000
hog. Also

0:22:22.000,0:22:26.000
handy when you need to have a bar bet over what the longest palindrome is

0:22:26.000,0:22:28.000
to have these things around.

0:22:28.000,0:22:31.000
There are certainly ways to do this iteratively. If you were given a string and you were interested to

0:22:31.000,0:22:33.000
know is it a palindrome,

0:22:33.000,0:22:36.000
you could kind of do this marching - you're looking outside and kind of marching your way

0:22:36.000,0:22:38.000
into the middle.

0:22:38.000,0:22:42.000
But we're gonna go ahead and let recursion kinda help us

0:22:42.000,0:22:45.000
deal with the subproblem of this,

0:22:45.000,0:22:47.000
and imagine that at the simplest possible form - you could say that

0:22:47.000,0:22:49.000
a palindrome

0:22:49.000,0:22:51.000
consists of

0:22:51.000,0:22:53.000
an interior palindrome,

0:22:53.000,0:22:56.000
and then the same letter tacked on to the front and the back.

0:22:56.000,0:22:58.000
So if you look at was it a car or a cat I saw,

0:22:58.000,0:23:01.000
there are two Ws there. It starts and it ends with a W,

0:23:01.000,0:23:04.000
so all palindromes must start and end with the same letter. Okay.

0:23:04.000,0:23:05.000
Let's check that,

0:23:05.000,0:23:11.000
and if that matches, then extract that middle and see if it's a palindrome.

0:23:11.000,0:23:13.000
So it feels like I didn't really do anything. It's like

0:23:13.000,0:23:17.000
all I did was match two letters, and then I said by the way

0:23:17.000,0:23:19.000
delegate this problem

0:23:19.000,0:23:23.000
back to myself, making a call to a function I haven't even finished writing

0:23:23.000,0:23:27.000
to examine the rest of the letters.

0:23:27.000,0:23:29.000
And then I need a base case.

0:23:29.000,0:23:31.000
I've got a recursive case, right?

0:23:31.000,0:23:34.000
Take off the outer two letters. Make sure they match.

0:23:34.000,0:23:37.000
Recur on the inside.

0:23:37.000,0:23:42.000
What is the simplest possible palindrome? One letter? One

0:23:42.000,0:23:44.000
letter. One letter makes a good palindrome. One letter is by

0:23:44.000,0:23:47.000
definition first and last letter are the same letter, so it matches. That's

0:23:47.000,0:23:53.000
a great base case. Is it the only base case? [Inaudible]. Two

0:23:53.000,0:23:56.000
letters is also kind of important, but there's actually an even

0:23:56.000,0:23:58.000
simpler form, right?

0:23:58.000,0:24:02.000
It's the empty string. So both the empty string and the single character string are by

0:24:02.000,0:24:05.000
definition the world's simplest palindromes. They meet the

0:24:05.000,0:24:08.000
requirements that they read the same forward and backwards. Empty string forwards

0:24:08.000,0:24:10.000
and backwards is trivially the

0:24:10.000,0:24:14.000
same, so that makes it even easier than doing the two-letter case.

0:24:14.000,0:24:17.000
So if I write this code to look like that where

0:24:17.000,0:24:19.000
if the length of the strength is

0:24:19.000,0:24:24.000
one or fewer, so that handles both the zero and the one case, then I return true.

0:24:24.000,0:24:25.000
Those are trivial

0:24:25.000,0:24:26.000


0:24:26.000,0:24:28.000
palindromes of the easiest

0:24:28.000,0:24:30.000
immediate detection.

0:24:30.000,0:24:31.000
Otherwise,

0:24:31.000,0:24:33.000
I've got a return here that says

0:24:33.000,0:24:35.000
if the first character

0:24:35.000,0:24:38.000
is the same as the last character,

0:24:38.000,0:24:39.000
and the

0:24:39.000,0:24:42.000
middle - so the substring starting at Position 1

0:24:42.000,0:24:47.000
that goes for length minus two characters that picks up the interior

0:24:47.000,0:24:50.000
discarding the first and last characters - if that's also a palindrome, then we've got a

0:24:50.000,0:24:51.000
palindrome.

0:24:51.000,0:24:54.000
So given the short circuiting nature of the and, if

0:24:54.000,0:24:56.000
it looks at the outer two characters and they don't match, it immediately just

0:24:56.000,0:24:58.000
stops right there and says false.

0:24:58.000,0:25:01.000
If they do match, then it goes on looking

0:25:01.000,0:25:03.000
at the next interior pair

0:25:03.000,0:25:07.000
which will stack up a recursive call looking at its two things, and eventually we

0:25:07.000,0:25:10.000
will either catch a pair that doesn't match,

0:25:10.000,0:25:14.000
and then this false will immediately return its way out, or it will keep

0:25:14.000,0:25:17.000
going all the way down to that base case, hit a true,

0:25:17.000,0:25:21.000
and know that we do have a full palindrome there.

0:25:21.000,0:25:24.000
So you could certainly write this with a for loop. Actually, writing

0:25:24.000,0:25:27.000
it with a for loop is almost a little bit trickier because you

0:25:27.000,0:25:30.000
have to keep track of which part of the string are you on, and what happens when you

0:25:30.000,0:25:34.000
get to the middle and things like that. In some sense, the recursive

0:25:34.000,0:25:35.000
form really kinda

0:25:35.000,0:25:38.000
sidesteps that by really thinking about it in a more holistic way

0:25:38.000,0:25:41.000
of like the outer letters plus an inner palindrome

0:25:41.000,0:25:43.000
gives me the answer I'm looking for.

0:25:43.000,0:25:46.000
So this idea of

0:25:46.000,0:25:49.000
taking a function you're in the middle of writing and making a call to it as

0:25:49.000,0:25:50.000
though it worked

0:25:50.000,0:25:53.000
is something that - they require this idea of the leap of faith.

0:25:53.000,0:25:55.000
You haven't even finished describing

0:25:55.000,0:25:58.000
how is palindrome operates,

0:25:58.000,0:26:01.000
and there you are making a call to it depending on it working in order to get

0:26:01.000,0:26:02.000
your

0:26:02.000,0:26:04.000
function working. It's a very kind of

0:26:04.000,0:26:06.000
wacky thing to get your head around.

0:26:06.000,0:26:08.000
It feels a little bit

0:26:08.000,0:26:09.000
mystical at first.

0:26:09.000,0:26:11.000


0:26:11.000,0:26:13.000
That feeling you feel a little bit

0:26:13.000,0:26:16.000
discombobulated about this is probably pretty normal,

0:26:16.000,0:26:17.000
but we're gonna

0:26:17.000,0:26:18.000
keep seeing examples,

0:26:18.000,0:26:21.000
and hope that

0:26:21.000,0:26:23.000
it starts to feel a little less unsettling.

0:26:23.000,0:26:26.000
Anybody wanna ask a question about this so far? Yeah?

0:26:26.000,0:26:28.000
So I guess create your

0:26:28.000,0:26:30.000
base case first, then test it? [Inaudible]. That's a great question.

0:26:30.000,0:26:35.000
So I would say typically that's a great strategy. Get a base

0:26:35.000,0:26:38.000
case and test against the base case, so the one character and the two character

0:26:38.000,0:26:39.000
strings.

0:26:39.000,0:26:42.000
And then imagine one layer out, things that will make one recursive

0:26:42.000,0:26:46.000
call only. So in this case for the palindrome, it's like what's a two-character

0:26:46.000,0:26:47.000
string?

0:26:47.000,0:26:50.000
One has AB. One has AA. So one that is a palindrome, one that isn't,

0:26:50.000,0:26:52.000
and watch it go through.

0:26:52.000,0:26:55.000
Then from there you have to almost take that leap and say

0:26:55.000,0:26:58.000
it worked for the base case. It worked for one out. And then you have to start

0:26:58.000,0:27:02.000
imagining if it worked for a string of length N, it'll work for one of N plus one,

0:27:02.000,0:27:03.000
and

0:27:03.000,0:27:07.000
that in some sense testing it exhaustively across all strings is

0:27:07.000,0:27:09.000
of course impossible, so you have to kind of

0:27:09.000,0:27:12.000
move to a sort of larger case where you can't just sit there and trace the whole

0:27:12.000,0:27:15.000
thing. You'll have to in some sense take a faith thing that says

0:27:15.000,0:27:18.000
if it could have computed whether the interior's a palindrome, then adding two

0:27:18.000,0:27:21.000
characters on the outside

0:27:21.000,0:27:21.000


0:27:21.000,0:27:24.000
and massaging that with that answer should produce the right thing. So some bigger

0:27:24.000,0:27:26.000
tests

0:27:26.000,0:27:28.000
that

0:27:28.000,0:27:30.000
verify that the recursive case

0:27:30.000,0:27:32.000
when exercised does the right thing,

0:27:32.000,0:27:34.000
then the pair together - All your code is going through

0:27:34.000,0:27:38.000
these same lines. The outer case going down to the recursive case, down

0:27:38.000,0:27:39.000
to that base case, and that's

0:27:39.000,0:27:43.000
one of the beauty of writing it recursively is that in some sense

0:27:43.000,0:27:46.000
once this piece of code works for some simple cases, the idea that setting it to

0:27:46.000,0:27:48.000
larger cases

0:27:48.000,0:27:51.000
is almost - I don't wanna

0:27:51.000,0:27:54.000
say guaranteed. That maybe makes it sound too easy, but in fact, if it

0:27:54.000,0:27:55.000
works for

0:27:55.000,0:27:58.000
cases of N and then N plus one, then it'll work for 100, and 101, and

0:27:58.000,0:28:02.000
6,000, and 6,001, and all the way through all the numbers by

0:28:02.000,0:28:06.000
induction. Question?

0:28:06.000,0:28:10.000
You have to remove all the [inaudible], all the spaces? Yeah. So definitely like the was it a car to match I should definitely

0:28:10.000,0:28:13.000
be taking my spaces out of there to make it right. You

0:28:13.000,0:28:21.000
are totally correct on that.

0:28:21.000,0:28:22.000
So let me show you

0:28:22.000,0:28:25.000
one where recursion is definitely

0:28:25.000,0:28:27.000
gonna buy us some real efficiency

0:28:27.000,0:28:30.000
and some real clarity in solving a

0:28:30.000,0:28:32.000
search problem. I've got a

0:28:32.000,0:28:35.000
vector. Let's say it's a vector of strings. It's a vector

0:28:35.000,0:28:37.000
of numbers. A vector of anything, it doesn't really matter.

0:28:37.000,0:28:40.000
And I wanna see if I can find a particular entry in it. So unlike the

0:28:40.000,0:28:44.000
set which can do a fast contains for you, in vector

0:28:44.000,0:28:46.000
if I haven't done anything special with it,

0:28:46.000,0:28:49.000
then there's no guarantee about where to find something. So if I wanna say

0:28:49.000,0:28:51.000


0:28:51.000,0:28:54.000
did somebody score 75 on the exam, then I'm gonna have

0:28:54.000,0:28:56.000
to just walk through the vector starting at Slot

0:28:56.000,0:28:59.000
0, walk my way to the end, and

0:28:59.000,0:29:02.000
compare to see if I find a 75. If I get to the end and I haven't found one, then

0:29:02.000,0:29:04.000
I can say no.

0:29:04.000,0:29:05.000
So that's what's called linear search. Linear

0:29:05.000,0:29:10.000
kind of implies this left to right sequential processing.

0:29:10.000,0:29:11.000
And linear search

0:29:11.000,0:29:15.000
has the property that as the size of the input grows, the amount of time

0:29:15.000,0:29:18.000
taken to search it grows in proportion.

0:29:18.000,0:29:20.000
So if you have a 1000 number array,

0:29:20.000,0:29:23.000
and you doubled its size, you would expect that doing a linear search on it should take

0:29:23.000,0:29:30.000
twice as long for an array that's twice as large.

0:29:30.000,0:29:34.000
The strategy we're gonna look at today is binary search

0:29:34.000,0:29:37.000
which is gonna try to avoid this looking in every one of those boxes to

0:29:37.000,0:29:38.000
find something.

0:29:38.000,0:29:42.000
It's gonna take a divide and conquer strategy, and it's gonna require a sorted

0:29:42.000,0:29:42.000
vector.

0:29:42.000,0:29:45.000
So in order to do an efficient lookup,

0:29:45.000,0:29:48.000
it helps if we've done some pre-rearrangement

0:29:48.000,0:29:49.000


0:29:49.000,0:29:51.000
of the data. In this case, putting it into sorted order

0:29:51.000,0:29:55.000
is gonna make it much easier for us to be able to find something in it because

0:29:55.000,0:29:55.000
we have

0:29:55.000,0:29:58.000
better information about where to look,

0:29:58.000,0:29:59.000
so much faster.

0:29:59.000,0:30:03.000
So we'll see that surely there was some cost to that,

0:30:03.000,0:30:05.000
but typically binary search is gonna be used when you have an array where you

0:30:05.000,0:30:07.000
don't do a lot of changes to it so

0:30:07.000,0:30:11.000
that putting it in a sorted order can be done once, and then from that point forward you can

0:30:11.000,0:30:15.000
search it many times, getting the benefit of the work you did to put it in

0:30:15.000,0:30:16.000
sorted order.

0:30:16.000,0:30:19.000
If you plan to sort it just to search it, then in some sense all the time you spent

0:30:19.000,0:30:20.000
sorting it

0:30:20.000,0:30:25.000
would kind of count against you and unlikely to come out ahead.

0:30:25.000,0:30:28.000
So the insight we're gonna use is

0:30:28.000,0:30:32.000
that if we have this in sorted order, and

0:30:32.000,0:30:34.000
we're trying to search the whole thing - we're looking for let's say the No. 75

0:30:34.000,0:30:36.000
-

0:30:36.000,0:30:38.000
is that if we just look at the middlemost element,

0:30:38.000,0:30:40.000
so we have this idea that we're looking at the whole vector right now from the

0:30:40.000,0:30:42.000
start to the end,

0:30:42.000,0:30:45.000
and I look at the middle element, and I say it's a 54. I can say

0:30:45.000,0:30:49.000
if 75 is in this vector because it's in sorted order,

0:30:49.000,0:30:51.000
it can't be

0:30:51.000,0:30:52.000
anywhere over here.

0:30:52.000,0:30:55.000
If 54's right there, everything to the left of 54

0:30:55.000,0:30:57.000
must be less than that,

0:30:57.000,0:30:59.000
and 75 wouldn't be over there.

0:30:59.000,0:31:00.000
So that means I can

0:31:00.000,0:31:03.000
just discard that half of the

0:31:03.000,0:31:06.000
vector from further consideration.

0:31:06.000,0:31:07.000
So now I have this half

0:31:07.000,0:31:10.000
to continue looking at which is the things that

0:31:10.000,0:31:13.000
are the right of 54 all the way to the end.

0:31:13.000,0:31:17.000
I use the same strategy again. I say if I'm searching a vector - so

0:31:17.000,0:31:19.000
last time I was searching a vector that had 25 elements. Now I've got one that's got

0:31:19.000,0:31:21.000
just 12.

0:31:21.000,0:31:24.000
Again, I use the same strategy. Look at the one in the middle.

0:31:24.000,0:31:26.000
I say oh, it's an 80,

0:31:26.000,0:31:29.000
and then I say well the number I'm looking for is 75.

0:31:29.000,0:31:30.000
It can't be

0:31:30.000,0:31:35.000
to the right of the 80. It must be to the left of it.

0:31:35.000,0:31:36.000
And then that lets me

0:31:36.000,0:31:40.000
get rid of another quarter of the vector. If I keep doing this, I get rid of

0:31:40.000,0:31:43.000
half, and then a quarter, and then an eighth, and

0:31:43.000,0:31:47.000
then a 16th. Very quickly, I will narrow in on the position where if 75 is in

0:31:47.000,0:31:53.000
this vector, it has to be. Or I'll be able to conclude it wasn't there at all. Then I kind of

0:31:53.000,0:31:55.000
work on my in, and

0:31:55.000,0:31:59.000
I found a 74 and a 76 over there, then I'm done.

0:31:59.000,0:32:02.000
That base case comes when I have

0:32:02.000,0:32:04.000
such a small little vector there

0:32:04.000,0:32:05.000
where

0:32:05.000,0:32:09.000
my bounds have crossed in such a way that I can say I never found what I

0:32:09.000,0:32:16.000
was looking for.

0:32:16.000,0:32:19.000
So let's walk through

0:32:19.000,0:32:21.000
this

0:32:21.000,0:32:22.000


0:32:22.000,0:32:25.000
bit of code that kind of puts into C++ the thing I just described here.

0:32:25.000,0:32:29.000
I've got a vector. In this case, I'm using a vector that's containing strings.

0:32:29.000,0:32:32.000
It could be ints. It could be anything. It doesn't really matter.

0:32:32.000,0:32:36.000
I've got a start and a stop, which I identify the sub-portion of the vector

0:32:36.000,0:32:40.000
that we're interested in. So the start is the first index to consider. The stop is the

0:32:40.000,0:32:41.000
last index to consider.

0:32:41.000,0:32:46.000
So the very first call to this will have start set to zero and stop set to the vector's

0:32:46.000,0:32:48.000
size minus one. I

0:32:48.000,0:32:51.000
compute the midpoint index

0:32:51.000,0:32:54.000
which is just the sum of the start and stop divided by two,

0:32:54.000,0:32:56.000
and then I compare

0:32:56.000,0:32:58.000
the key that I'm looking for to

0:32:58.000,0:33:02.000
the value at that index. I'm looking right in the middle. If it happens to match,

0:33:02.000,0:33:06.000
then I return that. The goal of binary search in this case is to return the index of a

0:33:06.000,0:33:08.000
matching element within the vector,

0:33:08.000,0:33:11.000
or to return this not found negative one constant

0:33:11.000,0:33:14.000
if it was unable to find any match anywhere.

0:33:14.000,0:33:16.000
So when we do find it

0:33:16.000,0:33:20.000
at whatever the level the recursion is, we can just immediately return that. We're done.

0:33:20.000,0:33:21.000
We found it. It's good.

0:33:21.000,0:33:24.000
Otherwise, we're gonna make this recursive call

0:33:24.000,0:33:28.000
that looks at either the left half or the right half based on if key is

0:33:28.000,0:33:28.000
less than

0:33:28.000,0:33:30.000
the value we found at the midpoint,

0:33:30.000,0:33:32.000
then the place we're searching is -

0:33:32.000,0:33:34.000
still has the same start position,

0:33:34.000,0:33:36.000
but is now capped

0:33:36.000,0:33:38.000
by the

0:33:38.000,0:33:40.000
element exactly to the left of the midpoint,

0:33:40.000,0:33:43.000
and then the right one,

0:33:43.000,0:33:45.000
the inversion of that,

0:33:45.000,0:33:47.000
one to the right of the midpoint

0:33:47.000,0:33:49.000
and the stop unchanged.

0:33:49.000,0:33:53.000
So taking off half of the elements under consideration at each stage,

0:33:53.000,0:33:54.000
eventually

0:33:54.000,0:33:58.000
I will get down to the simplest possible case. And the simplest possible case isn't that I have a one-element

0:33:58.000,0:34:02.000
vector and I found it or not. The really simple case is actually that I have zero

0:34:02.000,0:34:03.000
elements in my vector

0:34:03.000,0:34:04.000
that I just kept

0:34:04.000,0:34:06.000
moving in the

0:34:06.000,0:34:09.000
upper and lower bound point until they crossed,

0:34:09.000,0:34:12.000
which meant that I ran out of elements to check.

0:34:12.000,0:34:15.000
And that happens when the start index is greater than the stop index. So the start

0:34:15.000,0:34:19.000
and the stop if they're equal to each other mean that you have a one element vector left

0:34:19.000,0:34:20.000
to search,

0:34:20.000,0:34:22.000
but if you - For

0:34:22.000,0:34:24.000
example, if you got to that case where you have that one element vector left to search, you'll look

0:34:24.000,0:34:28.000
at that one, and if it matches, you'll be done. Otherwise, you'll end up

0:34:28.000,0:34:30.000
either decrementing the stop

0:34:30.000,0:34:33.000
to move past the start, or incrementing the start to move past the

0:34:33.000,0:34:34.000
stop.

0:34:34.000,0:34:36.000
And then that next iteration will hit this base case that said

0:34:36.000,0:34:40.000
I looked at the element in a one-element vector. It didn't work

0:34:40.000,0:34:40.000
out.

0:34:40.000,0:34:43.000
I can tell you for sure it's not found.

0:34:43.000,0:34:45.000
If it had been here, I would've seen it.

0:34:45.000,0:34:47.000
And this is looking at just one element

0:34:47.000,0:34:49.000
each

0:34:49.000,0:34:50.000
recursive call,

0:34:50.000,0:34:55.000
and the recursive calls in this case stack up to a depth

0:34:55.000,0:34:57.000
based on the logarithm of the size

0:34:57.000,0:34:59.000
to the power of two,

0:34:59.000,0:35:02.000
so that if you have 1000 elements,

0:35:02.000,0:35:06.000
you look at one, and now you have a 500-element collection to look

0:35:06.000,0:35:09.000
at again. You look at one, you have

0:35:09.000,0:35:12.000
a 250 element, 125, 60, 30,

0:35:12.000,0:35:12.000
15.

0:35:12.000,0:35:14.000
So at each stage

0:35:14.000,0:35:15.000
half of them

0:35:15.000,0:35:19.000
remain for the further call, and the number of times you can do that

0:35:19.000,0:35:22.000
for 1000 is the number of times you can divide 1000 by two, which is

0:35:22.000,0:35:24.000
the log based two of that,

0:35:24.000,0:35:26.000
which is roughly ten.

0:35:26.000,0:35:29.000
So if you were looking at 1000 number array, if it's in sorted order,

0:35:29.000,0:35:32.000
it takes you ten comparisons

0:35:32.000,0:35:34.000
to conclusively determine

0:35:34.000,0:35:38.000
where an element is if it's in there, or that it doesn't exist in the array at

0:35:38.000,0:35:39.000
all.

0:35:39.000,0:35:41.000
If you take that 1000 element array,

0:35:41.000,0:35:43.000
and you make it twice as big,

0:35:43.000,0:35:46.000
so now I have a 2000

0:35:46.000,0:35:50.000
number array, how much longer does it take? One more step.

0:35:50.000,0:35:51.000
One more step.

0:35:51.000,0:35:55.000
Just one, right? You look at one, and you have a 1000 number array, so however long it took

0:35:55.000,0:35:56.000
you to do that 1000 number array,

0:35:56.000,0:36:00.000
it takes one additional comparison, kinda one stack frame on top of that

0:36:00.000,0:36:01.000
to get its way down.

0:36:01.000,0:36:04.000
So this means actually this is super efficient.

0:36:04.000,0:36:06.000
That you can search so for example

0:36:06.000,0:36:09.000
a million is roughly two the 20th power.

0:36:09.000,0:36:12.000
So you have a million entry

0:36:12.000,0:36:16.000
collection to search, it will take you 20 comparisons to say for sure

0:36:16.000,0:36:17.000
it's here or not,

0:36:17.000,0:36:18.000
and here's where I found it,

0:36:18.000,0:36:20.000
just 20.

0:36:20.000,0:36:22.000
You go up to two million, it takes

0:36:22.000,0:36:24.000
21. The

0:36:24.000,0:36:27.000
very slow growing function, that logarithm function, so that tells you

0:36:27.000,0:36:28.000


0:36:28.000,0:36:31.000
that this is gonna be a very efficient way of searching a sorted array. A

0:36:31.000,0:36:34.000
category

0:36:34.000,0:36:37.000
thing called the divide and conquer that

0:36:37.000,0:36:39.000
take a problem,

0:36:39.000,0:36:42.000
divide it typically in half, but sometimes in thirds or some other

0:36:42.000,0:36:45.000
way, and then kind of - in this case

0:36:45.000,0:36:49.000
it's particularly handy that we can throw away some part of the problem, so we

0:36:49.000,0:36:56.000
divide and focus on just one part to solve the problem.

0:36:56.000,0:36:58.000
All right. So this is the first one

0:36:58.000,0:37:02.000
that's gonna start to

0:37:02.000,0:37:04.000
really inspire you for how recursion can help you solve problems

0:37:04.000,0:37:05.000
that you might

0:37:05.000,0:37:08.000
have no idea how to approach any other way

0:37:08.000,0:37:10.000
than using a recursive formulation.

0:37:10.000,0:37:13.000
So this is an exercise that comes out of the reader in Chapter 4,

0:37:13.000,0:37:16.000
and the

0:37:16.000,0:37:18.000
context of it is you have N things,

0:37:18.000,0:37:22.000
so maybe it's N people in a dorm, 60 people in a dorm,

0:37:22.000,0:37:26.000
and you would like to choose K of them. Let's K a real number, four

0:37:26.000,0:37:29.000
- four people to go together to Flicks.

0:37:29.000,0:37:32.000
So of your 60 dorm mates, how many different ways could you

0:37:32.000,0:37:33.000
pick a subset

0:37:33.000,0:37:35.000
of size four

0:37:35.000,0:37:38.000
that doesn't repeat any of the others? So you can pick the

0:37:38.000,0:37:41.000
two people from the first floor, one person from the middle floor, one person

0:37:41.000,0:37:44.000
from the top floor, but then you can kind of shuffle it up. What if you took all the people from the

0:37:44.000,0:37:45.000
first floor, or

0:37:45.000,0:37:48.000
these people from that room, and whatnot? You can imagine there's a lot of kind of

0:37:48.000,0:37:50.000
machinations

0:37:50.000,0:37:53.000
that could be present here,

0:37:53.000,0:37:57.000
and counting them, it's not quite obvious

0:37:57.000,0:37:59.000
unless you kinda go back to start working on your real math

0:37:59.000,0:38:01.000
skills

0:38:01.000,0:38:05.000
how it is that you can write a formula for this.

0:38:05.000,0:38:07.000
So what I'm gonna give you is

0:38:07.000,0:38:11.000
a recursive way of thinking about this problem.

0:38:11.000,0:38:12.000
So I drew

0:38:12.000,0:38:15.000
a set of the things we're looking at?

0:38:15.000,0:38:18.000
So there are N things that we're trying to choose K out

0:38:18.000,0:38:21.000
of. So right now, I've got

0:38:21.000,0:38:23.000
12 or so

0:38:23.000,0:38:24.000
people, or

0:38:24.000,0:38:26.000
items, whatever it is.

0:38:26.000,0:38:29.000
What I'm gonna do is I'm gonna imagine just designating one

0:38:29.000,0:38:30.000
totally at random.

0:38:30.000,0:38:33.000
So pick Bob.

0:38:33.000,0:38:35.000
Bob is one of the people in the dorm. I'm gonna kind of separate him from

0:38:35.000,0:38:39.000
everybody else mentally in my mind, and draw this line, and kinda mark him with a red flag

0:38:39.000,0:38:41.000
that says that's Bob.

0:38:41.000,0:38:45.000
So Bob might go to Flicks or might not go to Flicks. Some of the subsets for

0:38:45.000,0:38:52.000
going to Flicks will include Bob, and some will not.

0:38:52.000,0:38:53.000
Okay.

0:38:53.000,0:38:57.000
So what I'd like to think about is how many different subsets can I make that will

0:38:57.000,0:38:58.000
include Bob,

0:38:58.000,0:39:01.000
and how many different subsets can I make that don't include Bob. And

0:39:01.000,0:39:05.000
if I added those together, then that should be the total number of subsets I can make

0:39:05.000,0:39:08.000
from this collection.

0:39:08.000,0:39:11.000
Okay, so the subsets that include Bob -

0:39:11.000,0:39:15.000
so once I've committed to Bob being in the set, and let's say I'm trying to pick four

0:39:15.000,0:39:16.000
members out of here,

0:39:16.000,0:39:18.000
then I have Bob

0:39:18.000,0:39:22.000
and I have to figure out how many ways can I pick three people to accompany Bob to

0:39:22.000,0:39:24.000
the Flicks.

0:39:24.000,0:39:28.000
So I'm picking from a slightly smaller population. The population went down by one,

0:39:28.000,0:39:31.000
and the number I'm seeking went down by one,

0:39:31.000,0:39:34.000
and that tells me all the ways I can pick three people to go with Bob.

0:39:34.000,0:39:38.000
The ones that don't include Bob, and Bob's just out of the running,

0:39:38.000,0:39:42.000
I look at the remaining population which is still one smaller, everybody but Bob, and I

0:39:42.000,0:39:47.000
look for the ways I can still pick four people from there.

0:39:47.000,0:39:51.000
So what I have here is trying to compute C of NK

0:39:51.000,0:39:56.000
is trying to compute C of N minus one K minus one and add it to C of N minus one K.

0:39:56.000,0:39:58.000
So this is

0:39:58.000,0:39:59.000
picking

0:39:59.000,0:40:02.000
friends to accompany Bob. This is picking people without Bob.

0:40:02.000,0:40:03.000
Add those together,

0:40:03.000,0:40:08.000
and I will have the total number of ways I can pick K things out of N.

0:40:08.000,0:40:13.000
So we're very much relying on this recursive idea of if I had the answer

0:40:13.000,0:40:16.000
- I don't feel smart enough to know the answer directly,

0:40:16.000,0:40:20.000
but if I could defer it to someone who was actually willing to do more

0:40:20.000,0:40:23.000
scrutiny into this thing, if you could tell me how many groups of three you can

0:40:23.000,0:40:28.000
join with Bob, and how many groups of four you can pick without Bob, I can tell you

0:40:28.000,0:40:30.000
what the total answer is.

0:40:30.000,0:40:33.000
The simplest possible base case

0:40:33.000,0:40:38.000
we're gonna work our way down to are when there are just no choices remaining at all.

0:40:38.000,0:40:40.000
So if you look back at my

0:40:40.000,0:40:43.000
things that are here, in

0:40:43.000,0:40:46.000
both cases the population is getting smaller,

0:40:46.000,0:40:50.000
and in one of the recursive calls, the number of things I'm trying to

0:40:50.000,0:40:52.000
pick is also getting smaller.

0:40:52.000,0:40:55.000
So they're both making progress toward kind of shrinking those down to where

0:40:55.000,0:40:56.000
there's kind of

0:40:56.000,0:40:59.000
more and more constraints on what I have to choose.

0:40:59.000,0:41:02.000
For example, on this one

0:41:02.000,0:41:05.000
as I keep shrinking the population by one

0:41:05.000,0:41:07.000
and trying to get the same number of people,

0:41:07.000,0:41:10.000
eventually I'll be trying to pick three people out of three,

0:41:10.000,0:41:14.000
where I'm trying to pick of K of K remaining. Well, there's only one way to do

0:41:14.000,0:41:17.000
that which is to take everyone.

0:41:17.000,0:41:18.000
On this one, I'll

0:41:18.000,0:41:20.000
eventually keep picking people,

0:41:20.000,0:41:23.000
so the K is always less than the N in this case. The

0:41:23.000,0:41:26.000
K will eventually kinda bottom out, or I'll say I've

0:41:26.000,0:41:30.000
already picked all the people. I've already picked four people. I need to pick zero more. And

0:41:30.000,0:41:34.000
at that point, that's also very easy, right? Picking zero out of a population, there's

0:41:34.000,0:41:38.000
only one way to do that.

0:41:38.000,0:41:41.000
So what we end up with is

0:41:41.000,0:41:45.000
a very simple base case of if K equals zero - so I'm not trying to choose anymore.

0:41:45.000,0:41:49.000
I've already committed all the slots. Or if K is equal to N

0:41:49.000,0:41:50.000
where I've

0:41:50.000,0:41:52.000
discarded a whole bunch of people, and now I'm down to where I'm

0:41:52.000,0:41:56.000
facing I've gotta get four, and I've got four left. Well, there's only one way to do those things,

0:41:56.000,0:41:59.000
and that's to take everybody or to take nobody.

0:41:59.000,0:42:00.000
And then otherwise,

0:42:00.000,0:42:02.000
I make those two recursive calls

0:42:02.000,0:42:05.000
with Bob, without Bob, add them together

0:42:05.000,0:42:13.000
to get my whole result.

0:42:13.000,0:42:14.000
That's wacky. I'm gonna

0:42:14.000,0:42:18.000
read you something,

0:42:18.000,0:42:22.000
and then we'll call it a day. I

0:42:22.000,0:42:31.000
brought a book with me. I stole this from my children.