0:00:00.000,0:00:08.000 0:00:08.000,0:00:11.000 0:00:11.000,0:00:15.000 This presentation is delivered by the Stanford Center for Professional Development. 0:00:15.000,0:00:24.000 0:00:24.000,0:00:26.000 Good afternoon. 0:00:26.000,0:00:29.000 You guys are talkative. 0:00:29.000,0:00:32.000 So we talked just a little bit about the vocabulary of recursion at the 0:00:32.000,0:00:36.000 end of Friday. It was very rushed for time, so I'm 0:00:36.000,0:00:40.000 just gonna repeat some of those words to get us thinking about what the 0:00:40.000,0:00:42.000 context for solving problems recursively looks like. And then we're gonna go along and 0:00:42.000,0:00:45.000 do a lot of examples [cuts out]. 0:00:45.000,0:00:48.000 So the idea is that a recursive function is one that calls itself. 0:00:48.000,0:00:51.000 That's really all it means 0:00:51.000,0:00:54.000 at the most trivial level. It says that in the context of writing a 0:00:54.000,0:00:56.000 function binky, it's gonna make a call 0:00:56.000,0:01:00.000 or one or more calls to binky itself past 0:01:00.000,0:01:03.000 an argument as part of solving or doing some task. 0:01:03.000,0:01:04.000 The idea is that 0:01:04.000,0:01:08.000 we're using that because the problem itself has some self-similarity where the 0:01:08.000,0:01:11.000 answer I was seeking - so the idea of surveying the whole campus can actually be thought of as well if 0:01:11.000,0:01:14.000 I could get somebody to survey this part of campus and this part of campus, 0:01:14.000,0:01:17.000 somebody to survey all the freshmen, somebody to [cuts out] 0:01:17.000,0:01:18.000 and whatnot, 0:01:18.000,0:01:21.000 that those in a sense are - those surveys are just smaller instances of the same kind 0:01:21.000,0:01:24.000 of problem I was originally trying to solve. And 0:01:24.000,0:01:26.000 if I could recursively 0:01:26.000,0:01:30.000 delegate those things out, and then they themselves may in turn delegate even 0:01:30.000,0:01:32.000 further to smaller but same 0:01:32.000,0:01:36.000 structured problems to where we could eventually get to something so simple - in 0:01:36.000,0:01:40.000 that case of asking ten people for their input 0:01:40.000,0:01:44.000 that don't require any further decomposition - we will have worked our way to this base 0:01:44.000,0:01:48.000 case that then we can gather back up all the results and solve the whole thing. 0:01:48.000,0:01:51.000 This is gonna feel very mysterious at first, 0:01:51.000,0:01:53.000 and some of the examples I give you'll say I have 0:01:53.000,0:01:56.000 really easy alternatives other than recursion, so it's not gonna seem worth 0:01:56.000,0:01:58.000 the pain of trying to get your head around it. 0:01:58.000,0:02:03.000 But eventually, we're gonna work our way up to problems where recursion really is the right solution, 0:02:03.000,0:02:06.000 and there are no other alternatives that are obvious or simple 0:02:06.000,0:02:07.000 to do. 0:02:07.000,0:02:07.000 So the 0:02:07.000,0:02:11.000 idea throughout this week is actually just a lot of practice. Telling you 0:02:11.000,0:02:13.000 what the terms mean I think is not actually gonna help you understand it. I think what you 0:02:13.000,0:02:18.000 need to see is examples. So I'm gonna be doing four or five examples today, four or five 0:02:18.000,0:02:20.000 examples on Wednesday, and four or five examples on Friday 0:02:20.000,0:02:22.000 that each kind of 0:02:22.000,0:02:25.000 build on each other, kind of take the ideas 0:02:25.000,0:02:27.000 and get a little more sophisticated. 0:02:27.000,0:02:28.000 But 0:02:28.000,0:02:31.000 by the end of the week, I'm hoping that you're gonna start to see these patterns, and realize 0:02:31.000,0:02:35.000 that in some sense the recursive solutions tend to be more alike than different. 0:02:35.000,0:02:37.000 Once you have your head around how to solve 0:02:37.000,0:02:41.000 one type of problem, you may very well be able to take the exact same technique 0:02:41.000,0:02:43.000 and solve several other problems that 0:02:43.000,0:02:46.000 may sound different at first glance, but in the end, the recursive structure looks the 0:02:46.000,0:02:47.000 same. 0:02:47.000,0:02:54.000 So I'd say just hold the discomfort a little bit, and wait to see as 0:02:54.000,0:02:57.000 we keep working which example may be the one that kind of sticks out for you to 0:02:57.000,0:02:59.000 help you get it through. 0:02:59.000,0:03:02.000 So we're gonna start with things that fit in the category of 0:03:02.000,0:03:03.000 functional recursion. 0:03:03.000,0:03:05.000 The functional in this case just says that 0:03:05.000,0:03:09.000 you're writing functions that return some non-void thing, an integer, a string, 0:03:09.000,0:03:10.000 some 0:03:10.000,0:03:12.000 vector of results, or whatever that is, 0:03:12.000,0:03:14.000 and that's all it means to be a functional recursion. It's a recursive 0:03:14.000,0:03:18.000 function that has a result to it. And because of the recursive nature, it's gonna 0:03:18.000,0:03:22.000 say that the outer problem's result, the answer to the larger problem is gonna be 0:03:22.000,0:03:26.000 based on making calls, one or more calls to the function itself to get the answer 0:03:26.000,0:03:30.000 to a smaller problem, and then adding them, multiplying them, comparing them to decide 0:03:30.000,0:03:33.000 how to formulate the larger answer. 0:03:33.000,0:03:35.000 All recursive 0:03:35.000,0:03:35.000 code 0:03:35.000,0:03:38.000 follows the same decomposition 0:03:38.000,0:03:42.000 into two cases. Sometimes there's some subdivisions within there, but two general 0:03:42.000,0:03:43.000 cases. 0:03:43.000,0:03:45.000 The first thing - this base case. 0:03:45.000,0:03:48.000 That's something that the recursion eventually has to stop. 0:03:48.000,0:03:52.000 We keep saying take the task and break it down, make it a little smaller, 0:03:52.000,0:03:55.000 but at some point we really have to stop doing that. We can't 0:03:55.000,0:03:57.000 go on infinitely. 0:03:57.000,0:04:00.000 There has to be some base case, the simplest possible version of the problem that 0:04:00.000,0:04:05.000 you can directly solve. You don't need to make any further recursive calls. So the idea 0:04:05.000,0:04:10.000 is that it kinda bottoms out there, and then allows the recursion to kind of 0:04:10.000,0:04:10.000 unwind. 0:04:10.000,0:04:12.000 The recursive cases, 0:04:12.000,0:04:15.000 and there may be one or more of these, are the cases where it's not that simple, that the 0:04:15.000,0:04:20.000 answer isn't directly solvable, but if you had the answer to a smaller, simpler 0:04:20.000,0:04:20.000 version, you 0:04:20.000,0:04:24.000 would be able to assemble that answer you're looking for 0:04:24.000,0:04:27.000 using that information from the recursive call. 0:04:27.000,0:04:31.000 So that's the kind of structure that they all look like. If I'm 0:04:31.000,0:04:34.000 at the base case, then do the base case and return it. 0:04:34.000,0:04:37.000 Otherwise, make some recursive calls, and 0:04:37.000,0:04:43.000 use that to return a result from this iteration. 0:04:43.000,0:04:45.000 So let's first look at something that 0:04:45.000,0:04:48.000 - the first couple examples that I'm gonna show you actually are gonna be so 0:04:48.000,0:04:52.000 easy that in some sense they're almost gonna be a little bit counterproductive because they're gonna 0:04:52.000,0:04:54.000 teach you that recursion lets you do things that you already 0:04:54.000,0:04:57.000 know how to do. And then I'm gonna work my way up to ones that actually 0:04:57.000,0:04:59.000 get beyond that. 0:04:59.000,0:05:02.000 But let's look at first the raise to power. 0:05:02.000,0:05:05.000 C++ has no built-in exponentiation operator. There's nothing that raises 0:05:05.000,0:05:07.000 a base to a particular exponent 0:05:07.000,0:05:08.000 in the operator set. 0:05:08.000,0:05:12.000 So if you want it, you need to write it, or you can use - there's a math library called 0:05:12.000,0:05:15.000 pow for raise to power. We're gonna run our own version of it 0:05:15.000,0:05:17.000 because it's gonna give us some practice thing about this. 0:05:17.000,0:05:20.000 The first one I'm gonna show you is one that should feel very familiar and 0:05:20.000,0:05:24.000 very intuitive, which is using an iterative formulation. If I'm 0:05:24.000,0:05:27.000 trying to raise the base to the exponent, then that's really simply 0:05:27.000,0:05:30.000 multiplying base by itself exponent times. 0:05:30.000,0:05:31.000 So this one 0:05:31.000,0:05:32.000 uses a for loop 0:05:32.000,0:05:37.000 and does so. It starts the result at one, and for 0:05:37.000,0:05:40.000 iterations through the number of exponents keeps 0:05:40.000,0:05:46.000 multiplying to get there. So 0:05:46.000,0:05:48.000 that one's fine, and it will perfectly work. 0:05:48.000,0:05:51.000 I'm gonna show you this alternative way 0:05:51.000,0:05:54.000 that starts you thinking about what it means to divide a problem up in a 0:05:54.000,0:05:56.000 recursive strategy. 0:05:56.000,0:06:00.000 Base to the exponent - I wanna raise five to the tenth power. If 0:06:00.000,0:06:05.000 I had around some delegate, some 0:06:05.000,0:06:08.000 clone of myself that I could dispatch to solve 0:06:08.000,0:06:11.000 the slightly smaller problem of computing five to the ninth power, 0:06:11.000,0:06:15.000 then all I would need to do is take that answer and multiply it by one more five, and I'd get 0:06:15.000,0:06:17.000 five to the tenth. 0:06:17.000,0:06:19.000 Okay. 0:06:19.000,0:06:22.000 If I write code 0:06:22.000,0:06:25.000 that's based on that, 0:06:25.000,0:06:28.000 then I end up with something here - and I'm gonna 0:06:28.000,0:06:31.000 let these two things go through to show us 0:06:31.000,0:06:32.000 that 0:06:32.000,0:06:34.000 to compute the answer to five 0:06:34.000,0:06:36.000 to the tenth power, 0:06:36.000,0:06:38.000 what I really need is the answer to five 0:06:38.000,0:06:42.000 to the ninth power, which I do by making a recursive call to the same function I'm in 0:06:42.000,0:06:43.000 the middle of writing. 0:06:43.000,0:06:45.000 So this is raise that I'm 0:06:45.000,0:06:49.000 defining, and in the body of it, it makes a call to raise. That's what is the 0:06:49.000,0:06:51.000 mark of a recursive function here. I 0:06:51.000,0:06:55.000 pass slightly different arguments. In this case, one smaller exponent 0:06:55.000,0:06:57.000 which is getting a little bit closer 0:06:57.000,0:07:00.000 to that simplest possible case 0:07:00.000,0:07:02.000 that we will eventually terminate at where 0:07:02.000,0:07:07.000 we can say I don't need to further dispatch any delegates and any clones out there 0:07:07.000,0:07:08.000 to do the work, but if 0:07:08.000,0:07:11.000 the exponent I'm raising it to is zero, by 0:07:11.000,0:07:16.000 definition anything raised to the exponent of zero is one, I could just stop there. 0:07:16.000,0:07:19.000 So when computing five to the tenth, we're gonna 0:07:19.000,0:07:21.000 see some recursion at work. Let me 0:07:21.000,0:07:24.000 take this code into the compiler 0:07:24.000,0:07:27.000 0:07:27.000,0:07:31.000 so that we can see a little bit about how this actually works 0:07:31.000,0:07:32.000 in terms of that. 0:07:32.000,0:07:35.000 So that's exactly the code I have there, but 0:07:35.000,0:07:42.000 I can say 0:07:42.000,0:07:43.000 something that I 0:07:43.000,0:07:46.000 know the answer to. How about that? 0:07:46.000,0:07:49.000 0:07:49.000,0:07:51.000 First, we'll take a look at it 0:07:51.000,0:07:55.000 doing its work, so five times five times five should be 125. 0:07:55.000,0:07:57.000 So we can test a couple 0:07:57.000,0:07:58.000 little 0:07:58.000,0:08:01.000 values while we're at it. Two the sixth power, that should be 0:08:01.000,0:08:07.000 64, just to see a couple of the cases, just to feel good about what's going on. 0:08:07.000,0:08:10.000 And then raising say 23 to the zero power 0:08:10.000,0:08:11.000 should be one 0:08:11.000,0:08:14.000 as anything raised to the zero power should be. 0:08:14.000,0:08:16.000 So a little bit of spot testing 0:08:16.000,0:08:18.000 to feel good about what's going on. 0:08:18.000,0:08:22.000 Now I'm gonna go back to this idea like two to the sixth. 0:08:22.000,0:08:24.000 And I'm gonna set a breakpoint here. 0:08:24.000,0:08:25.000 Get my breakpoint out. 0:08:25.000,0:08:36.000 And I'm gonna run this guy in the debugger. 0:08:36.000,0:08:39.000 Takes a little bit longer to get the debugger up and running, 0:08:39.000,0:08:44.000 so I'll have to make a little padder up while we're going here. 0:08:44.000,0:08:47.000 And then it tells me right now it's breaking on raise, and 0:08:47.000,0:08:53.000 I can look around in the debugger. This is a - did not 0:08:53.000,0:08:56.000 pick up my compilation? I think it did not. I 0:08:56.000,0:08:59.000 must not have saved it right before I did it because it's actually got the base is 23 and the 0:08:59.000,0:09:00.000 exponent is zero. It turns 0:09:00.000,0:09:06.000 out I don't wanna see that case, so I'm gonna go back and 0:09:06.000,0:09:08.000 try again. I 0:09:08.000,0:09:16.000 wanna see it - 0:09:16.000,0:09:24.000 no, I did not. 0:09:24.000,0:09:28.000 And I'm just interested in knowing a little bit about the mechanics of what's gonna happen in a 0:09:28.000,0:09:29.000 recursive situation. 0:09:29.000,0:09:33.000 If I look at the first time that I hit my breakpoint, 0:09:33.000,0:09:35.000 then I'll see that there's a little bit of the beginnings of the 0:09:35.000,0:09:39.000 student main, some stuff behind it. There's a little bit of magic underneath your 0:09:39.000,0:09:43.000 stack that you don't really need to know about, but starting from main it went into raise, 0:09:43.000,0:09:46.000 and the arguments it has there is the base is two, the exponent is six. 0:09:46.000,0:09:50.000 If I continue from here, then you'll notice the stack frame got one deeper. 0:09:50.000,0:09:53.000 There's actually another indication of raise, and in fact, they're both active at the 0:09:53.000,0:09:54.000 same time. 0:09:54.000,0:09:57.000 The previous raise that was trying to compute two to the sixth 0:09:57.000,0:10:01.000 is kind of in stasis back there waiting for the answer to come back from 0:10:01.000,0:10:05.000 this version, which is looking to raise two to the fifth power. I continue again, 0:10:05.000,0:10:08.000 I get two to the fourth. I keep doing this. I'm gonna see these guys kinda stack up, 0:10:08.000,0:10:12.000 each one of those kind of waiting for the delegate or the clone 0:10:12.000,0:10:17.000 to come back with that answer, so that then it can do its further work 0:10:17.000,0:10:19.000 incorporating that result to compute the thing it needed to do. 0:10:19.000,0:10:22.000 I get down here to raising 0:10:22.000,0:10:24.000 two to the first power, 0:10:24.000,0:10:26.000 and then finally I get to two to the zero, 0:10:26.000,0:10:29.000 so now I've got these eight or so stacked frames, 0:10:29.000,0:10:30.000 six up there. 0:10:30.000,0:10:32.000 This one, if I step 0:10:32.000,0:10:34.000 from here, 0:10:34.000,0:10:37.000 it's gonna hit the base case of returning one, 0:10:37.000,0:10:39.000 and then 0:10:39.000,0:10:43.000 we will end up kind of working our way back out. So now, we are 0:10:43.000,0:10:46.000 at the end of the two to the one case 0:10:46.000,0:10:49.000 that's using the answer it got from the other one, multiplying it by two. 0:10:49.000,0:10:53.000 Now I'm at the two to the two case, and so each of them's kind of unfolding in the stack is 0:10:53.000,0:10:55.000 what's called unwinding. It's popping back off 0:10:55.000,0:10:57.000 the stack frames that are there 0:10:57.000,0:11:01.000 and kind of revisiting them, each passing up the information it got back, 0:11:01.000,0:11:04.000 and eventually telling us that 0:11:04.000,0:11:08.000 the answer was 64, so I will let that go. 0:11:08.000,0:11:15.000 0:11:15.000,0:11:17.000 So the idea that 0:11:17.000,0:11:21.000 all of those stack frames kind of exist at the same time - they're all being maintained 0:11:21.000,0:11:24.000 independently, so the idea that the compiler isn't confused by the idea that raise is 0:11:24.000,0:11:26.000 invoking raise which is invoking raise, 0:11:26.000,0:11:30.000 that each of the raise stack frames is distinct from the other ones, so the 0:11:30.000,0:11:33.000 indications are actually kept separate. So one had two to the sixth, the next one had two to 0:11:33.000,0:11:35.000 the fifth, and so on. 0:11:35.000,0:11:36.000 And then eventually 0:11:36.000,0:11:37.000 we 0:11:37.000,0:11:40.000 need to make some progress toward that base case, so that we can then 0:11:40.000,0:11:42.000 stop that recursion 0:11:42.000,0:11:43.000 and unwind. 0:11:43.000,0:11:47.000 Let me actually show you something while I'm here, which is one thing that's 0:11:47.000,0:11:48.000 a pretty common mistake 0:11:48.000,0:11:51.000 to make early on in a recursion is to somehow fail to make progress toward 0:11:51.000,0:11:53.000 that base case 0:11:53.000,0:11:57.000 or to 0:11:57.000,0:11:58.000 - 0:11:58.000,0:12:01.000 not all cases make it to the base case. For example, if 0:12:01.000,0:12:03.000 I did something where 0:12:03.000,0:12:06.000 I forgot to 0:12:06.000,0:12:07.000 subtract one [cuts out], 0:12:07.000,0:12:10.000 and I said oh yeah, I need to [cuts out]. In this case, I'm passing it exactly the 0:12:10.000,0:12:13.000 same arguments I got. 0:12:13.000,0:12:16.000 If I do this and I run this guy, 0:12:16.000,0:12:19.000 then what's gonna happen is it's gonna go two to the sixth, two to the sixth, two to the sixth, 0:12:19.000,0:12:20.000 and I'm gonna 0:12:20.000,0:12:21.000 let go of 0:12:21.000,0:12:23.000 this breakpoint here because 0:12:23.000,0:12:30.000 I don't really wanna watch it all happening. 0:12:30.000,0:12:31.000 And there it goes. 0:12:31.000,0:12:35.000 Loading 6,493 stack frames, zero percent completed, so 0:12:35.000,0:12:38.000 that's gonna take a while as you 0:12:38.000,0:12:41.000 can imaging. And usually, once you see this error message, it's your clue to say I 0:12:41.000,0:12:45.000 can cancel, I know what happened. The only reason I should've 0:12:45.000,0:12:46.000 gotten 0:12:46.000,0:12:50.000 6,500 stack frames loaded up is because I made a mistake that 0:12:50.000,0:12:53.000 caused the stack to totally overflow. 0:12:53.000,0:12:57.000 So the behavior in C++ or C is that when you have 0:12:57.000,0:13:01.000 so many of those stack frames, eventually the space that's been allocated or set 0:13:01.000,0:13:03.000 aside for the function stack will be exhausted. 0:13:03.000,0:13:05.000 It will use all the space it has, 0:13:05.000,0:13:09.000 and run up against a boundary, and typically report it in some way that 0:13:09.000,0:13:13.000 suggests - sometimes you'll see stack overflow, stack out of memory error. 0:13:13.000,0:13:14.000 0:13:14.000,0:13:15.000 In this case, it's showing you the 0:13:15.000,0:13:17.000 7,000 stack frames that 0:13:17.000,0:13:20.000 are behind you, and then if you were to examine them you would see they all have exactly the 0:13:20.000,0:13:24.000 same argument, so they weren't getting anywhere. 0:13:24.000,0:13:27.000 I'm not gonna actually let it do that because I'm 0:13:27.000,0:13:28.000 too impatient. 0:13:28.000,0:13:29.000 Let 0:13:29.000,0:13:30.000 me fix this code while I'm here. 0:13:30.000,0:13:33.000 But other things like even this code that actually looks correct for 0:13:33.000,0:13:39.000 some situations also has a subtle bug in it. Even if 0:13:39.000,0:13:40.000 I fixed this, which is 0:13:40.000,0:13:44.000 that, right now it assumed that the exponent is positive, 0:13:44.000,0:13:46.000 that it's some number 0:13:46.000,0:13:47.000 0:13:47.000,0:13:49.000 that I can subtract my way down to zero. 0:13:49.000,0:13:53.000 If I actually miscalled raise, and I gave it a negative exponent, it would 0:13:53.000,0:13:57.000 go into infinite recursion as well. If you started it 0:13:57.000,0:14:00.000 at ten to the negative first power, it would go negative first, negative second, negative 0:14:00.000,0:14:02.000 third. 0:14:02.000,0:14:05.000 6,500 stack frames later, we'd run out of space. 0:14:05.000,0:14:08.000 In this case, since we're only intending to handle those positive powers, we 0:14:08.000,0:14:12.000 could just put an error that was like if the exponent is less than zero then 0:14:12.000,0:14:14.000 raise an error and don't even 0:14:14.000,0:14:21.000 try to do anything with it. Okay. So let 0:14:21.000,0:14:25.000 me show you a slightly different way of doing this that's also recursive, 0:14:25.000,0:14:28.000 but that actually gets the answer a little bit more efficiently. 0:14:28.000,0:14:31.000 This is a different way of dividing it up, but still using a 0:14:31.000,0:14:36.000 recursive strategy which is that if I'm trying to compute five to the tenth power, 0:14:36.000,0:14:40.000 but if I have the answer not of five to ninth power, but instead I have the 0:14:40.000,0:14:42.000 answer of five to the fifth power, 0:14:42.000,0:14:44.000 and then I multiply that by itself, 0:14:44.000,0:14:48.000 I would get that five to the tenth power that I seek. 0:14:48.000,0:14:51.000 And then there's a little bit of a case though of what if the power I was trying to 0:14:51.000,0:14:54.000 get was odd, if I was trying to raise it to the eleventh power, I could compute the 0:14:54.000,0:14:58.000 half power which get me to the fifth - multiplied by the fifth, but then I need 0:14:58.000,0:14:59.000 one more 0:14:59.000,0:15:04.000 base multiplied in there to make up for that half. Okay. 0:15:04.000,0:15:05.000 And so I can write 0:15:05.000,0:15:08.000 another recursive formulation here. Same 0:15:08.000,0:15:11.000 sort of base case about 0:15:11.000,0:15:14.000 detecting when we've gotten down, but then in this case the recursive call we make is 0:15:14.000,0:15:17.000 to base of the exponent divided by two, 0:15:17.000,0:15:19.000 and then we use it with an 0:15:19.000,0:15:23.000 if else whether the exponent was even or odd, it decided whether to just square 0:15:23.000,0:15:25.000 that number or whether to square it and 0:15:25.000,0:15:28.000 tack in another power of the base while we're at it. 0:15:28.000,0:15:31.000 So this one is gonna be much more quick about getting down to it, 0:15:31.000,0:15:34.000 whereas the one we saw was gonna put one stack frame down for every 0:15:34.000,0:15:35.000 0:15:35.000,0:15:39.000 successive exponent power. So if you wanted to raise something to the 10th, or 20th, 0:15:39.000,0:15:41.000 or 30th power, then you were 0:15:41.000,0:15:43.000 giving yourself 10, 20, 30 stack frames. 0:15:43.000,0:15:46.000 Something like 30 stack frames is not really something to be worried 0:15:46.000,0:15:48.000 about, but if you were really trying to make this work on much larger 0:15:48.000,0:15:49.000 numbers, 0:15:49.000,0:15:52.000 which would require some other work because exponent 0:15:52.000,0:15:55.000 is a very rapidly growing operator and would overflow your integers quickly, 0:15:55.000,0:15:57.000 but this way 0:15:57.000,0:16:00.000 very quickly divides in half. So it goes from a power of 100, to a power of 0:16:00.000,0:16:05.000 50, to 25, to 12, to 6, to 3, to 2, to 1, so 0:16:05.000,0:16:08.000 that dividing in half is a much quicker way to work our way down to that base case 0:16:08.000,0:16:12.000 and get our answer back, and we're doing a lot fewer calculations 0:16:12.000,0:16:14.000 than all those multiplies, 0:16:14.000,0:16:16.000 one per 0:16:16.000,0:16:19.000 base. So just a little 0:16:19.000,0:16:21.000 diversion. 0:16:21.000,0:16:22.000 So let me tell you something, 0:16:22.000,0:16:23.000 just a little bit about 0:16:23.000,0:16:27.000 kind of style as it applies to recursion. 0:16:27.000,0:16:28.000 Recursion is 0:16:28.000,0:16:34.000 really best when you can express it in the most kind of direct, clear, and 0:16:34.000,0:16:36.000 simple code. 0:16:36.000,0:16:38.000 It's hard enough to get your head around a recursive formulation 0:16:38.000,0:16:42.000 without complicating it by having a bunch of extraneous parts where 0:16:42.000,0:16:43.000 you're 0:16:43.000,0:16:47.000 doing more work than necessary, or redundantly handling certain things. And 0:16:47.000,0:16:49.000 one thing that's actually very easy to fall in the trap of 0:16:49.000,0:16:52.000 is thinking about there's lots of other base cases you might be able to 0:16:52.000,0:16:56.000 easily handle, so why not just go ahead and call out for them, 0:16:56.000,0:16:57.000 test for 0:16:57.000,0:16:59.000 - you're at the base case. You're close to the base case. 0:16:59.000,0:17:02.000 Checking before you might make a recursive call, if you're gonna hit the base case when you 0:17:02.000,0:17:08.000 make that call, then why make the call. I'll just anticipate and get the answer it 0:17:08.000,0:17:09.000 would've returned anyway. 0:17:09.000,0:17:13.000 It can lead to code that looks a little bit like this before you're done. If the 0:17:13.000,0:17:16.000 exponent's zero, that's one. If the exponent's one, then I can just 0:17:16.000,0:17:19.000 return the base. If it's two, then I can just multiply the base by itself. If it's 0:17:19.000,0:17:20.000 three, I can start doing this. 0:17:20.000,0:17:22.000 Certainly, you 0:17:22.000,0:17:25.000 can follow this to the point of absurdity, and even for some of the simple 0:17:25.000,0:17:28.000 cases, it might seem like you're saving yourself a little bit of extra time. You're like 0:17:28.000,0:17:31.000 why go back around and let it make another recursive call. I could stop it right 0:17:31.000,0:17:33.000 here. It's easy to know that answer. 0:17:33.000,0:17:36.000 But as you do this, it complicates the code. It's a little harder to read. It's 0:17:36.000,0:17:38.000 a little harder to 0:17:38.000,0:17:39.000 debug. 0:17:39.000,0:17:43.000 Really, the expense of making that one extra call, two extra calls 0:17:43.000,0:17:46.000 is not the thing to be worried about. 0:17:46.000,0:17:50.000 What we really want is the cleanest code that expresses what we do, has the 0:17:50.000,0:17:53.000 simplest cases to test, and to examine, and to follow, 0:17:53.000,0:17:56.000 and not muck it up with things that in 0:17:56.000,0:17:57.000 effect 0:17:57.000,0:18:01.000 don't change the computational power of this but just introduce the opportunity for error. 0:18:01.000,0:18:02.000 Instead of a 0:18:02.000,0:18:05.000 multiply here, I accidentally put a plus. 0:18:05.000,0:18:07.000 It might be easy to overlook it 0:18:07.000,0:18:08.000 and not realize what I've done, 0:18:08.000,0:18:10.000 and then 0:18:10.000,0:18:14.000 end up computing the wrong answer when it gets to the base case of two. In fact, if 0:18:14.000,0:18:15.000 you do it this 0:18:15.000,0:18:17.000 way, most things would stop at three, 0:18:17.000,0:18:20.000 but these would suddenly become kind of their own special cases that were only 0:18:20.000,0:18:25.000 hit if you directly called them with the two. The recursive cases will all stop earlier. It 0:18:25.000,0:18:28.000 just complicates your testing pass now because not everything is going through 0:18:28.000,0:18:31.000 the same code. 0:18:31.000,0:18:34.000 So we call this arm's length recursion. I put a big X on that 0:18:34.000,0:18:35.000 looking ahead, 0:18:35.000,0:18:36.000 testing before you get there. 0:18:36.000,0:18:41.000 Just let the code fall through. 0:18:41.000,0:18:43.000 So 0:18:43.000,0:18:45.000 Dan, he's not here today, 0:18:45.000,0:18:48.000 but he talked to me at the end of Friday's class, and it made 0:18:48.000,0:18:49.000 me 0:18:49.000,0:18:53.000 wanna actually just give you a little bit of 0:18:53.000,0:18:54.000 insight into 0:18:54.000,0:18:57.000 recursion as it relates to 0:18:57.000,0:18:58.000 0:18:58.000,0:19:02.000 efficiency. Recursion by itself, just the idea of applying a recursive technique to solving a 0:19:02.000,0:19:06.000 problem, does not give you any guarantee that it's gonna be the best solution, the 0:19:06.000,0:19:09.000 most efficient solution, or the fact that it's gonna give you a very inefficient solution. 0:19:09.000,0:19:11.000 Sometimes people have kind of 0:19:11.000,0:19:11.000 a 0:19:11.000,0:19:15.000 bad rap on recursion. It's like recursion will definitely be inefficient. 0:19:15.000,0:19:17.000 That actually is not guaranteed. 0:19:17.000,0:19:21.000 Recursion often requires exactly the same resources as the iterative approach. 0:19:21.000,0:19:24.000 It takes the same amount of effort. Surveying the campus - if you're gonna 0:19:24.000,0:19:27.000 survey the 10,000 people on campus and get everybody's information back, 0:19:27.000,0:19:30.000 whether you're doing it divide and conquer, or whether you're sitting out there in White 0:19:30.000,0:19:30.000 Plaza 0:19:30.000,0:19:34.000 asking each person one by one, in the end 10,000 people got survey. 0:19:34.000,0:19:36.000 The recursion is not 0:19:36.000,0:19:39.000 part of what made that longer or shorter. It might actually depending on how you 0:19:39.000,0:19:42.000 have resources work out better. Like if you could get a bunch of people in parallel 0:19:42.000,0:19:46.000 surveying, you might end up completing the whole thing in less time, but it required 0:19:46.000,0:19:49.000 more people, and more clipboards, and more paper 0:19:49.000,0:19:52.000 while the process is ongoing than me standing there with one piece of paper and a 0:19:52.000,0:19:57.000 clipboard. But then again, it took lots of my time to do it. 0:19:57.000,0:20:00.000 So in many situations, it's really no better or no worse 0:20:00.000,0:20:02.000 than the alternative. It makes a little bit of some tradeoffs of where the time is 0:20:02.000,0:20:03.000 spent. 0:20:03.000,0:20:06.000 There are situations though where recursion can actually make something that was 0:20:06.000,0:20:09.000 efficient inefficient. There are situations where it can take something that was inefficient and 0:20:09.000,0:20:11.000 make it efficient. So 0:20:11.000,0:20:15.000 it's more of a case-by-case basis to decide whether recursion is the 0:20:15.000,0:20:18.000 right tool if efficiency is one of your primary concerns. 0:20:18.000,0:20:21.000 I will say that for problems with simple iterative solutions 0:20:21.000,0:20:24.000 that operate relatively efficiently, iteration 0:20:24.000,0:20:26.000 is probably the best way to solve it. 0:20:26.000,0:20:27.000 So like 0:20:27.000,0:20:31.000 the raise to power, yeah you could certainly do that 0:20:31.000,0:20:32.000 iteratively. There's not 0:20:32.000,0:20:35.000 some huge advantage that recursion is offering. I'm using them here because they're 0:20:35.000,0:20:37.000 simple enough to get our head around. 0:20:37.000,0:20:39.000 We're gonna work our way toward problems where 0:20:39.000,0:20:42.000 we're gonna find things that require recursion, which 0:20:42.000,0:20:45.000 is kind of one of the third points I put in. Why do we learn recursion? What's recursion gonna be good 0:20:45.000,0:20:48.000 for? First off, recursion is awesome. 0:20:48.000,0:20:52.000 There are some problems that you just can't solve using anything but recursion, 0:20:52.000,0:20:55.000 so that the alternatives like I'll just write it iteratively won't work. 0:20:55.000,0:20:59.000 You'll try, but you'll fail. They inherently have a recursive 0:20:59.000,0:20:59.000 structure to them 0:20:59.000,0:21:02.000 where recursion is the right tool for the job. 0:21:02.000,0:21:06.000 Often, it produces the most beautiful, direct, and clear elegant code. 0:21:06.000,0:21:08.000 The next assignment that will go out 0:21:08.000,0:21:11.000 has these problems that you do in recursion, and they're each about ten lines 0:21:11.000,0:21:13.000 long. Some of them are like five lines long. 0:21:13.000,0:21:18.000 They do incredible things in five lines because of the descriptive power of 0:21:18.000,0:21:19.000 describing it as 0:21:19.000,0:21:22.000 here's a base case, and here's a recursive case, and everything else just follows 0:21:22.000,0:21:23.000 from 0:21:23.000,0:21:26.000 applying this, and reducing the problem step by step. 0:21:26.000,0:21:28.000 So you will see things where 0:21:28.000,0:21:31.000 the recursive code is just beautiful, clean, elegant, easy to understand, easy to follow, 0:21:31.000,0:21:32.000 easy to test, 0:21:32.000,0:21:34.000 solves the recursive problem. And in 0:21:34.000,0:21:37.000 those cases, it really is a much better answer than trying to hack up some 0:21:37.000,0:21:39.000 other iterative form 0:21:39.000,0:21:41.000 that may in the end be 0:21:41.000,0:21:44.000 no more efficient. It may be even less efficient. So don't let efficiency be 0:21:44.000,0:21:46.000 kind of a big 0:21:46.000,0:21:52.000 fear of what recursion is for you. So I'm gonna do more examples. 0:21:52.000,0:21:54.000 0:21:54.000,0:21:56.000 I've got three more examples, 0:21:56.000,0:21:58.000 or four I think today, so 0:21:58.000,0:22:01.000 I will just keep 0:22:01.000,0:22:03.000 showing you different things, and hopefully 0:22:03.000,0:22:05.000 the patterns will start to kind of 0:22:05.000,0:22:07.000 come out of the mist for you. 0:22:07.000,0:22:10.000 A palindrome string is one that reads the same forwards and backwards 0:22:10.000,0:22:14.000 when reversed. So was it a car or a cat I saw, if you read that 0:22:14.000,0:22:16.000 backwards, it turns out 0:22:16.000,0:22:17.000 it 0:22:17.000,0:22:21.000 says the same thing. Go hang a salami, I'm a lasagna 0:22:21.000,0:22:22.000 hog. Also 0:22:22.000,0:22:26.000 handy when you need to have a bar bet over what the longest palindrome is 0:22:26.000,0:22:28.000 to have these things around. 0:22:28.000,0:22:31.000 There are certainly ways to do this iteratively. If you were given a string and you were interested to 0:22:31.000,0:22:33.000 know is it a palindrome, 0:22:33.000,0:22:36.000 you could kind of do this marching - you're looking outside and kind of marching your way 0:22:36.000,0:22:38.000 into the middle. 0:22:38.000,0:22:42.000 But we're gonna go ahead and let recursion kinda help us 0:22:42.000,0:22:45.000 deal with the subproblem of this, 0:22:45.000,0:22:47.000 and imagine that at the simplest possible form - you could say that 0:22:47.000,0:22:49.000 a palindrome 0:22:49.000,0:22:51.000 consists of 0:22:51.000,0:22:53.000 an interior palindrome, 0:22:53.000,0:22:56.000 and then the same letter tacked on to the front and the back. 0:22:56.000,0:22:58.000 So if you look at was it a car or a cat I saw, 0:22:58.000,0:23:01.000 there are two Ws there. It starts and it ends with a W, 0:23:01.000,0:23:04.000 so all palindromes must start and end with the same letter. Okay. 0:23:04.000,0:23:05.000 Let's check that, 0:23:05.000,0:23:11.000 and if that matches, then extract that middle and see if it's a palindrome. 0:23:11.000,0:23:13.000 So it feels like I didn't really do anything. It's like 0:23:13.000,0:23:17.000 all I did was match two letters, and then I said by the way 0:23:17.000,0:23:19.000 delegate this problem 0:23:19.000,0:23:23.000 back to myself, making a call to a function I haven't even finished writing 0:23:23.000,0:23:27.000 to examine the rest of the letters. 0:23:27.000,0:23:29.000 And then I need a base case. 0:23:29.000,0:23:31.000 I've got a recursive case, right? 0:23:31.000,0:23:34.000 Take off the outer two letters. Make sure they match. 0:23:34.000,0:23:37.000 Recur on the inside. 0:23:37.000,0:23:42.000 What is the simplest possible palindrome? One letter? One 0:23:42.000,0:23:44.000 letter. One letter makes a good palindrome. One letter is by 0:23:44.000,0:23:47.000 definition first and last letter are the same letter, so it matches. That's 0:23:47.000,0:23:53.000 a great base case. Is it the only base case? [Inaudible]. Two 0:23:53.000,0:23:56.000 letters is also kind of important, but there's actually an even 0:23:56.000,0:23:58.000 simpler form, right? 0:23:58.000,0:24:02.000 It's the empty string. So both the empty string and the single character string are by 0:24:02.000,0:24:05.000 definition the world's simplest palindromes. They meet the 0:24:05.000,0:24:08.000 requirements that they read the same forward and backwards. Empty string forwards 0:24:08.000,0:24:10.000 and backwards is trivially the 0:24:10.000,0:24:14.000 same, so that makes it even easier than doing the two-letter case. 0:24:14.000,0:24:17.000 So if I write this code to look like that where 0:24:17.000,0:24:19.000 if the length of the strength is 0:24:19.000,0:24:24.000 one or fewer, so that handles both the zero and the one case, then I return true. 0:24:24.000,0:24:25.000 Those are trivial 0:24:25.000,0:24:26.000 0:24:26.000,0:24:28.000 palindromes of the easiest 0:24:28.000,0:24:30.000 immediate detection. 0:24:30.000,0:24:31.000 Otherwise, 0:24:31.000,0:24:33.000 I've got a return here that says 0:24:33.000,0:24:35.000 if the first character 0:24:35.000,0:24:38.000 is the same as the last character, 0:24:38.000,0:24:39.000 and the 0:24:39.000,0:24:42.000 middle - so the substring starting at Position 1 0:24:42.000,0:24:47.000 that goes for length minus two characters that picks up the interior 0:24:47.000,0:24:50.000 discarding the first and last characters - if that's also a palindrome, then we've got a 0:24:50.000,0:24:51.000 palindrome. 0:24:51.000,0:24:54.000 So given the short circuiting nature of the and, if 0:24:54.000,0:24:56.000 it looks at the outer two characters and they don't match, it immediately just 0:24:56.000,0:24:58.000 stops right there and says false. 0:24:58.000,0:25:01.000 If they do match, then it goes on looking 0:25:01.000,0:25:03.000 at the next interior pair 0:25:03.000,0:25:07.000 which will stack up a recursive call looking at its two things, and eventually we 0:25:07.000,0:25:10.000 will either catch a pair that doesn't match, 0:25:10.000,0:25:14.000 and then this false will immediately return its way out, or it will keep 0:25:14.000,0:25:17.000 going all the way down to that base case, hit a true, 0:25:17.000,0:25:21.000 and know that we do have a full palindrome there. 0:25:21.000,0:25:24.000 So you could certainly write this with a for loop. Actually, writing 0:25:24.000,0:25:27.000 it with a for loop is almost a little bit trickier because you 0:25:27.000,0:25:30.000 have to keep track of which part of the string are you on, and what happens when you 0:25:30.000,0:25:34.000 get to the middle and things like that. In some sense, the recursive 0:25:34.000,0:25:35.000 form really kinda 0:25:35.000,0:25:38.000 sidesteps that by really thinking about it in a more holistic way 0:25:38.000,0:25:41.000 of like the outer letters plus an inner palindrome 0:25:41.000,0:25:43.000 gives me the answer I'm looking for. 0:25:43.000,0:25:46.000 So this idea of 0:25:46.000,0:25:49.000 taking a function you're in the middle of writing and making a call to it as 0:25:49.000,0:25:50.000 though it worked 0:25:50.000,0:25:53.000 is something that - they require this idea of the leap of faith. 0:25:53.000,0:25:55.000 You haven't even finished describing 0:25:55.000,0:25:58.000 how is palindrome operates, 0:25:58.000,0:26:01.000 and there you are making a call to it depending on it working in order to get 0:26:01.000,0:26:02.000 your 0:26:02.000,0:26:04.000 function working. It's a very kind of 0:26:04.000,0:26:06.000 wacky thing to get your head around. 0:26:06.000,0:26:08.000 It feels a little bit 0:26:08.000,0:26:09.000 mystical at first. 0:26:09.000,0:26:11.000 0:26:11.000,0:26:13.000 That feeling you feel a little bit 0:26:13.000,0:26:16.000 discombobulated about this is probably pretty normal, 0:26:16.000,0:26:17.000 but we're gonna 0:26:17.000,0:26:18.000 keep seeing examples, 0:26:18.000,0:26:21.000 and hope that 0:26:21.000,0:26:23.000 it starts to feel a little less unsettling. 0:26:23.000,0:26:26.000 Anybody wanna ask a question about this so far? Yeah? 0:26:26.000,0:26:28.000 So I guess create your 0:26:28.000,0:26:30.000 base case first, then test it? [Inaudible]. That's a great question. 0:26:30.000,0:26:35.000 So I would say typically that's a great strategy. Get a base 0:26:35.000,0:26:38.000 case and test against the base case, so the one character and the two character 0:26:38.000,0:26:39.000 strings. 0:26:39.000,0:26:42.000 And then imagine one layer out, things that will make one recursive 0:26:42.000,0:26:46.000 call only. So in this case for the palindrome, it's like what's a two-character 0:26:46.000,0:26:47.000 string? 0:26:47.000,0:26:50.000 One has AB. One has AA. So one that is a palindrome, one that isn't, 0:26:50.000,0:26:52.000 and watch it go through. 0:26:52.000,0:26:55.000 Then from there you have to almost take that leap and say 0:26:55.000,0:26:58.000 it worked for the base case. It worked for one out. And then you have to start 0:26:58.000,0:27:02.000 imagining if it worked for a string of length N, it'll work for one of N plus one, 0:27:02.000,0:27:03.000 and 0:27:03.000,0:27:07.000 that in some sense testing it exhaustively across all strings is 0:27:07.000,0:27:09.000 of course impossible, so you have to kind of 0:27:09.000,0:27:12.000 move to a sort of larger case where you can't just sit there and trace the whole 0:27:12.000,0:27:15.000 thing. You'll have to in some sense take a faith thing that says 0:27:15.000,0:27:18.000 if it could have computed whether the interior's a palindrome, then adding two 0:27:18.000,0:27:21.000 characters on the outside 0:27:21.000,0:27:21.000 0:27:21.000,0:27:24.000 and massaging that with that answer should produce the right thing. So some bigger 0:27:24.000,0:27:26.000 tests 0:27:26.000,0:27:28.000 that 0:27:28.000,0:27:30.000 verify that the recursive case 0:27:30.000,0:27:32.000 when exercised does the right thing, 0:27:32.000,0:27:34.000 then the pair together - All your code is going through 0:27:34.000,0:27:38.000 these same lines. The outer case going down to the recursive case, down 0:27:38.000,0:27:39.000 to that base case, and that's 0:27:39.000,0:27:43.000 one of the beauty of writing it recursively is that in some sense 0:27:43.000,0:27:46.000 once this piece of code works for some simple cases, the idea that setting it to 0:27:46.000,0:27:48.000 larger cases 0:27:48.000,0:27:51.000 is almost - I don't wanna 0:27:51.000,0:27:54.000 say guaranteed. That maybe makes it sound too easy, but in fact, if it 0:27:54.000,0:27:55.000 works for 0:27:55.000,0:27:58.000 cases of N and then N plus one, then it'll work for 100, and 101, and 0:27:58.000,0:28:02.000 6,000, and 6,001, and all the way through all the numbers by 0:28:02.000,0:28:06.000 induction. Question? 0:28:06.000,0:28:10.000 You have to remove all the [inaudible], all the spaces? Yeah. So definitely like the was it a car to match I should definitely 0:28:10.000,0:28:13.000 be taking my spaces out of there to make it right. You 0:28:13.000,0:28:21.000 are totally correct on that. 0:28:21.000,0:28:22.000 So let me show you 0:28:22.000,0:28:25.000 one where recursion is definitely 0:28:25.000,0:28:27.000 gonna buy us some real efficiency 0:28:27.000,0:28:30.000 and some real clarity in solving a 0:28:30.000,0:28:32.000 search problem. I've got a 0:28:32.000,0:28:35.000 vector. Let's say it's a vector of strings. It's a vector 0:28:35.000,0:28:37.000 of numbers. A vector of anything, it doesn't really matter. 0:28:37.000,0:28:40.000 And I wanna see if I can find a particular entry in it. So unlike the 0:28:40.000,0:28:44.000 set which can do a fast contains for you, in vector 0:28:44.000,0:28:46.000 if I haven't done anything special with it, 0:28:46.000,0:28:49.000 then there's no guarantee about where to find something. So if I wanna say 0:28:49.000,0:28:51.000 0:28:51.000,0:28:54.000 did somebody score 75 on the exam, then I'm gonna have 0:28:54.000,0:28:56.000 to just walk through the vector starting at Slot 0:28:56.000,0:28:59.000 0, walk my way to the end, and 0:28:59.000,0:29:02.000 compare to see if I find a 75. If I get to the end and I haven't found one, then 0:29:02.000,0:29:04.000 I can say no. 0:29:04.000,0:29:05.000 So that's what's called linear search. Linear 0:29:05.000,0:29:10.000 kind of implies this left to right sequential processing. 0:29:10.000,0:29:11.000 And linear search 0:29:11.000,0:29:15.000 has the property that as the size of the input grows, the amount of time 0:29:15.000,0:29:18.000 taken to search it grows in proportion. 0:29:18.000,0:29:20.000 So if you have a 1000 number array, 0:29:20.000,0:29:23.000 and you doubled its size, you would expect that doing a linear search on it should take 0:29:23.000,0:29:30.000 twice as long for an array that's twice as large. 0:29:30.000,0:29:34.000 The strategy we're gonna look at today is binary search 0:29:34.000,0:29:37.000 which is gonna try to avoid this looking in every one of those boxes to 0:29:37.000,0:29:38.000 find something. 0:29:38.000,0:29:42.000 It's gonna take a divide and conquer strategy, and it's gonna require a sorted 0:29:42.000,0:29:42.000 vector. 0:29:42.000,0:29:45.000 So in order to do an efficient lookup, 0:29:45.000,0:29:48.000 it helps if we've done some pre-rearrangement 0:29:48.000,0:29:49.000 0:29:49.000,0:29:51.000 of the data. In this case, putting it into sorted order 0:29:51.000,0:29:55.000 is gonna make it much easier for us to be able to find something in it because 0:29:55.000,0:29:55.000 we have 0:29:55.000,0:29:58.000 better information about where to look, 0:29:58.000,0:29:59.000 so much faster. 0:29:59.000,0:30:03.000 So we'll see that surely there was some cost to that, 0:30:03.000,0:30:05.000 but typically binary search is gonna be used when you have an array where you 0:30:05.000,0:30:07.000 don't do a lot of changes to it so 0:30:07.000,0:30:11.000 that putting it in a sorted order can be done once, and then from that point forward you can 0:30:11.000,0:30:15.000 search it many times, getting the benefit of the work you did to put it in 0:30:15.000,0:30:16.000 sorted order. 0:30:16.000,0:30:19.000 If you plan to sort it just to search it, then in some sense all the time you spent 0:30:19.000,0:30:20.000 sorting it 0:30:20.000,0:30:25.000 would kind of count against you and unlikely to come out ahead. 0:30:25.000,0:30:28.000 So the insight we're gonna use is 0:30:28.000,0:30:32.000 that if we have this in sorted order, and 0:30:32.000,0:30:34.000 we're trying to search the whole thing - we're looking for let's say the No. 75 0:30:34.000,0:30:36.000 - 0:30:36.000,0:30:38.000 is that if we just look at the middlemost element, 0:30:38.000,0:30:40.000 so we have this idea that we're looking at the whole vector right now from the 0:30:40.000,0:30:42.000 start to the end, 0:30:42.000,0:30:45.000 and I look at the middle element, and I say it's a 54. I can say 0:30:45.000,0:30:49.000 if 75 is in this vector because it's in sorted order, 0:30:49.000,0:30:51.000 it can't be 0:30:51.000,0:30:52.000 anywhere over here. 0:30:52.000,0:30:55.000 If 54's right there, everything to the left of 54 0:30:55.000,0:30:57.000 must be less than that, 0:30:57.000,0:30:59.000 and 75 wouldn't be over there. 0:30:59.000,0:31:00.000 So that means I can 0:31:00.000,0:31:03.000 just discard that half of the 0:31:03.000,0:31:06.000 vector from further consideration. 0:31:06.000,0:31:07.000 So now I have this half 0:31:07.000,0:31:10.000 to continue looking at which is the things that 0:31:10.000,0:31:13.000 are the right of 54 all the way to the end. 0:31:13.000,0:31:17.000 I use the same strategy again. I say if I'm searching a vector - so 0:31:17.000,0:31:19.000 last time I was searching a vector that had 25 elements. Now I've got one that's got 0:31:19.000,0:31:21.000 just 12. 0:31:21.000,0:31:24.000 Again, I use the same strategy. Look at the one in the middle. 0:31:24.000,0:31:26.000 I say oh, it's an 80, 0:31:26.000,0:31:29.000 and then I say well the number I'm looking for is 75. 0:31:29.000,0:31:30.000 It can't be 0:31:30.000,0:31:35.000 to the right of the 80. It must be to the left of it. 0:31:35.000,0:31:36.000 And then that lets me 0:31:36.000,0:31:40.000 get rid of another quarter of the vector. If I keep doing this, I get rid of 0:31:40.000,0:31:43.000 half, and then a quarter, and then an eighth, and 0:31:43.000,0:31:47.000 then a 16th. Very quickly, I will narrow in on the position where if 75 is in 0:31:47.000,0:31:53.000 this vector, it has to be. Or I'll be able to conclude it wasn't there at all. Then I kind of 0:31:53.000,0:31:55.000 work on my in, and 0:31:55.000,0:31:59.000 I found a 74 and a 76 over there, then I'm done. 0:31:59.000,0:32:02.000 That base case comes when I have 0:32:02.000,0:32:04.000 such a small little vector there 0:32:04.000,0:32:05.000 where 0:32:05.000,0:32:09.000 my bounds have crossed in such a way that I can say I never found what I 0:32:09.000,0:32:16.000 was looking for. 0:32:16.000,0:32:19.000 So let's walk through 0:32:19.000,0:32:21.000 this 0:32:21.000,0:32:22.000 0:32:22.000,0:32:25.000 bit of code that kind of puts into C++ the thing I just described here. 0:32:25.000,0:32:29.000 I've got a vector. In this case, I'm using a vector that's containing strings. 0:32:29.000,0:32:32.000 It could be ints. It could be anything. It doesn't really matter. 0:32:32.000,0:32:36.000 I've got a start and a stop, which I identify the sub-portion of the vector 0:32:36.000,0:32:40.000 that we're interested in. So the start is the first index to consider. The stop is the 0:32:40.000,0:32:41.000 last index to consider. 0:32:41.000,0:32:46.000 So the very first call to this will have start set to zero and stop set to the vector's 0:32:46.000,0:32:48.000 size minus one. I 0:32:48.000,0:32:51.000 compute the midpoint index 0:32:51.000,0:32:54.000 which is just the sum of the start and stop divided by two, 0:32:54.000,0:32:56.000 and then I compare 0:32:56.000,0:32:58.000 the key that I'm looking for to 0:32:58.000,0:33:02.000 the value at that index. I'm looking right in the middle. If it happens to match, 0:33:02.000,0:33:06.000 then I return that. The goal of binary search in this case is to return the index of a 0:33:06.000,0:33:08.000 matching element within the vector, 0:33:08.000,0:33:11.000 or to return this not found negative one constant 0:33:11.000,0:33:14.000 if it was unable to find any match anywhere. 0:33:14.000,0:33:16.000 So when we do find it 0:33:16.000,0:33:20.000 at whatever the level the recursion is, we can just immediately return that. We're done. 0:33:20.000,0:33:21.000 We found it. It's good. 0:33:21.000,0:33:24.000 Otherwise, we're gonna make this recursive call 0:33:24.000,0:33:28.000 that looks at either the left half or the right half based on if key is 0:33:28.000,0:33:28.000 less than 0:33:28.000,0:33:30.000 the value we found at the midpoint, 0:33:30.000,0:33:32.000 then the place we're searching is - 0:33:32.000,0:33:34.000 still has the same start position, 0:33:34.000,0:33:36.000 but is now capped 0:33:36.000,0:33:38.000 by the 0:33:38.000,0:33:40.000 element exactly to the left of the midpoint, 0:33:40.000,0:33:43.000 and then the right one, 0:33:43.000,0:33:45.000 the inversion of that, 0:33:45.000,0:33:47.000 one to the right of the midpoint 0:33:47.000,0:33:49.000 and the stop unchanged. 0:33:49.000,0:33:53.000 So taking off half of the elements under consideration at each stage, 0:33:53.000,0:33:54.000 eventually 0:33:54.000,0:33:58.000 I will get down to the simplest possible case. And the simplest possible case isn't that I have a one-element 0:33:58.000,0:34:02.000 vector and I found it or not. The really simple case is actually that I have zero 0:34:02.000,0:34:03.000 elements in my vector 0:34:03.000,0:34:04.000 that I just kept 0:34:04.000,0:34:06.000 moving in the 0:34:06.000,0:34:09.000 upper and lower bound point until they crossed, 0:34:09.000,0:34:12.000 which meant that I ran out of elements to check. 0:34:12.000,0:34:15.000 And that happens when the start index is greater than the stop index. So the start 0:34:15.000,0:34:19.000 and the stop if they're equal to each other mean that you have a one element vector left 0:34:19.000,0:34:20.000 to search, 0:34:20.000,0:34:22.000 but if you - For 0:34:22.000,0:34:24.000 example, if you got to that case where you have that one element vector left to search, you'll look 0:34:24.000,0:34:28.000 at that one, and if it matches, you'll be done. Otherwise, you'll end up 0:34:28.000,0:34:30.000 either decrementing the stop 0:34:30.000,0:34:33.000 to move past the start, or incrementing the start to move past the 0:34:33.000,0:34:34.000 stop. 0:34:34.000,0:34:36.000 And then that next iteration will hit this base case that said 0:34:36.000,0:34:40.000 I looked at the element in a one-element vector. It didn't work 0:34:40.000,0:34:40.000 out. 0:34:40.000,0:34:43.000 I can tell you for sure it's not found. 0:34:43.000,0:34:45.000 If it had been here, I would've seen it. 0:34:45.000,0:34:47.000 And this is looking at just one element 0:34:47.000,0:34:49.000 each 0:34:49.000,0:34:50.000 recursive call, 0:34:50.000,0:34:55.000 and the recursive calls in this case stack up to a depth 0:34:55.000,0:34:57.000 based on the logarithm of the size 0:34:57.000,0:34:59.000 to the power of two, 0:34:59.000,0:35:02.000 so that if you have 1000 elements, 0:35:02.000,0:35:06.000 you look at one, and now you have a 500-element collection to look 0:35:06.000,0:35:09.000 at again. You look at one, you have 0:35:09.000,0:35:12.000 a 250 element, 125, 60, 30, 0:35:12.000,0:35:12.000 15. 0:35:12.000,0:35:14.000 So at each stage 0:35:14.000,0:35:15.000 half of them 0:35:15.000,0:35:19.000 remain for the further call, and the number of times you can do that 0:35:19.000,0:35:22.000 for 1000 is the number of times you can divide 1000 by two, which is 0:35:22.000,0:35:24.000 the log based two of that, 0:35:24.000,0:35:26.000 which is roughly ten. 0:35:26.000,0:35:29.000 So if you were looking at 1000 number array, if it's in sorted order, 0:35:29.000,0:35:32.000 it takes you ten comparisons 0:35:32.000,0:35:34.000 to conclusively determine 0:35:34.000,0:35:38.000 where an element is if it's in there, or that it doesn't exist in the array at 0:35:38.000,0:35:39.000 all. 0:35:39.000,0:35:41.000 If you take that 1000 element array, 0:35:41.000,0:35:43.000 and you make it twice as big, 0:35:43.000,0:35:46.000 so now I have a 2000 0:35:46.000,0:35:50.000 number array, how much longer does it take? One more step. 0:35:50.000,0:35:51.000 One more step. 0:35:51.000,0:35:55.000 Just one, right? You look at one, and you have a 1000 number array, so however long it took 0:35:55.000,0:35:56.000 you to do that 1000 number array, 0:35:56.000,0:36:00.000 it takes one additional comparison, kinda one stack frame on top of that 0:36:00.000,0:36:01.000 to get its way down. 0:36:01.000,0:36:04.000 So this means actually this is super efficient. 0:36:04.000,0:36:06.000 That you can search so for example 0:36:06.000,0:36:09.000 a million is roughly two the 20th power. 0:36:09.000,0:36:12.000 So you have a million entry 0:36:12.000,0:36:16.000 collection to search, it will take you 20 comparisons to say for sure 0:36:16.000,0:36:17.000 it's here or not, 0:36:17.000,0:36:18.000 and here's where I found it, 0:36:18.000,0:36:20.000 just 20. 0:36:20.000,0:36:22.000 You go up to two million, it takes 0:36:22.000,0:36:24.000 21. The 0:36:24.000,0:36:27.000 very slow growing function, that logarithm function, so that tells you 0:36:27.000,0:36:28.000 0:36:28.000,0:36:31.000 that this is gonna be a very efficient way of searching a sorted array. A 0:36:31.000,0:36:34.000 category 0:36:34.000,0:36:37.000 thing called the divide and conquer that 0:36:37.000,0:36:39.000 take a problem, 0:36:39.000,0:36:42.000 divide it typically in half, but sometimes in thirds or some other 0:36:42.000,0:36:45.000 way, and then kind of - in this case 0:36:45.000,0:36:49.000 it's particularly handy that we can throw away some part of the problem, so we 0:36:49.000,0:36:56.000 divide and focus on just one part to solve the problem. 0:36:56.000,0:36:58.000 All right. So this is the first one 0:36:58.000,0:37:02.000 that's gonna start to 0:37:02.000,0:37:04.000 really inspire you for how recursion can help you solve problems 0:37:04.000,0:37:05.000 that you might 0:37:05.000,0:37:08.000 have no idea how to approach any other way 0:37:08.000,0:37:10.000 than using a recursive formulation. 0:37:10.000,0:37:13.000 So this is an exercise that comes out of the reader in Chapter 4, 0:37:13.000,0:37:16.000 and the 0:37:16.000,0:37:18.000 context of it is you have N things, 0:37:18.000,0:37:22.000 so maybe it's N people in a dorm, 60 people in a dorm, 0:37:22.000,0:37:26.000 and you would like to choose K of them. Let's K a real number, four 0:37:26.000,0:37:29.000 - four people to go together to Flicks. 0:37:29.000,0:37:32.000 So of your 60 dorm mates, how many different ways could you 0:37:32.000,0:37:33.000 pick a subset 0:37:33.000,0:37:35.000 of size four 0:37:35.000,0:37:38.000 that doesn't repeat any of the others? So you can pick the 0:37:38.000,0:37:41.000 two people from the first floor, one person from the middle floor, one person 0:37:41.000,0:37:44.000 from the top floor, but then you can kind of shuffle it up. What if you took all the people from the 0:37:44.000,0:37:45.000 first floor, or 0:37:45.000,0:37:48.000 these people from that room, and whatnot? You can imagine there's a lot of kind of 0:37:48.000,0:37:50.000 machinations 0:37:50.000,0:37:53.000 that could be present here, 0:37:53.000,0:37:57.000 and counting them, it's not quite obvious 0:37:57.000,0:37:59.000 unless you kinda go back to start working on your real math 0:37:59.000,0:38:01.000 skills 0:38:01.000,0:38:05.000 how it is that you can write a formula for this. 0:38:05.000,0:38:07.000 So what I'm gonna give you is 0:38:07.000,0:38:11.000 a recursive way of thinking about this problem. 0:38:11.000,0:38:12.000 So I drew 0:38:12.000,0:38:15.000 a set of the things we're looking at? 0:38:15.000,0:38:18.000 So there are N things that we're trying to choose K out 0:38:18.000,0:38:21.000 of. So right now, I've got 0:38:21.000,0:38:23.000 12 or so 0:38:23.000,0:38:24.000 people, or 0:38:24.000,0:38:26.000 items, whatever it is. 0:38:26.000,0:38:29.000 What I'm gonna do is I'm gonna imagine just designating one 0:38:29.000,0:38:30.000 totally at random. 0:38:30.000,0:38:33.000 So pick Bob. 0:38:33.000,0:38:35.000 Bob is one of the people in the dorm. I'm gonna kind of separate him from 0:38:35.000,0:38:39.000 everybody else mentally in my mind, and draw this line, and kinda mark him with a red flag 0:38:39.000,0:38:41.000 that says that's Bob. 0:38:41.000,0:38:45.000 So Bob might go to Flicks or might not go to Flicks. Some of the subsets for 0:38:45.000,0:38:52.000 going to Flicks will include Bob, and some will not. 0:38:52.000,0:38:53.000 Okay. 0:38:53.000,0:38:57.000 So what I'd like to think about is how many different subsets can I make that will 0:38:57.000,0:38:58.000 include Bob, 0:38:58.000,0:39:01.000 and how many different subsets can I make that don't include Bob. And 0:39:01.000,0:39:05.000 if I added those together, then that should be the total number of subsets I can make 0:39:05.000,0:39:08.000 from this collection. 0:39:08.000,0:39:11.000 Okay, so the subsets that include Bob - 0:39:11.000,0:39:15.000 so once I've committed to Bob being in the set, and let's say I'm trying to pick four 0:39:15.000,0:39:16.000 members out of here, 0:39:16.000,0:39:18.000 then I have Bob 0:39:18.000,0:39:22.000 and I have to figure out how many ways can I pick three people to accompany Bob to 0:39:22.000,0:39:24.000 the Flicks. 0:39:24.000,0:39:28.000 So I'm picking from a slightly smaller population. The population went down by one, 0:39:28.000,0:39:31.000 and the number I'm seeking went down by one, 0:39:31.000,0:39:34.000 and that tells me all the ways I can pick three people to go with Bob. 0:39:34.000,0:39:38.000 The ones that don't include Bob, and Bob's just out of the running, 0:39:38.000,0:39:42.000 I look at the remaining population which is still one smaller, everybody but Bob, and I 0:39:42.000,0:39:47.000 look for the ways I can still pick four people from there. 0:39:47.000,0:39:51.000 So what I have here is trying to compute C of NK 0:39:51.000,0:39:56.000 is trying to compute C of N minus one K minus one and add it to C of N minus one K. 0:39:56.000,0:39:58.000 So this is 0:39:58.000,0:39:59.000 picking 0:39:59.000,0:40:02.000 friends to accompany Bob. This is picking people without Bob. 0:40:02.000,0:40:03.000 Add those together, 0:40:03.000,0:40:08.000 and I will have the total number of ways I can pick K things out of N. 0:40:08.000,0:40:13.000 So we're very much relying on this recursive idea of if I had the answer 0:40:13.000,0:40:16.000 - I don't feel smart enough to know the answer directly, 0:40:16.000,0:40:20.000 but if I could defer it to someone who was actually willing to do more 0:40:20.000,0:40:23.000 scrutiny into this thing, if you could tell me how many groups of three you can 0:40:23.000,0:40:28.000 join with Bob, and how many groups of four you can pick without Bob, I can tell you 0:40:28.000,0:40:30.000 what the total answer is. 0:40:30.000,0:40:33.000 The simplest possible base case 0:40:33.000,0:40:38.000 we're gonna work our way down to are when there are just no choices remaining at all. 0:40:38.000,0:40:40.000 So if you look back at my 0:40:40.000,0:40:43.000 things that are here, in 0:40:43.000,0:40:46.000 both cases the population is getting smaller, 0:40:46.000,0:40:50.000 and in one of the recursive calls, the number of things I'm trying to 0:40:50.000,0:40:52.000 pick is also getting smaller. 0:40:52.000,0:40:55.000 So they're both making progress toward kind of shrinking those down to where 0:40:55.000,0:40:56.000 there's kind of 0:40:56.000,0:40:59.000 more and more constraints on what I have to choose. 0:40:59.000,0:41:02.000 For example, on this one 0:41:02.000,0:41:05.000 as I keep shrinking the population by one 0:41:05.000,0:41:07.000 and trying to get the same number of people, 0:41:07.000,0:41:10.000 eventually I'll be trying to pick three people out of three, 0:41:10.000,0:41:14.000 where I'm trying to pick of K of K remaining. Well, there's only one way to do 0:41:14.000,0:41:17.000 that which is to take everyone. 0:41:17.000,0:41:18.000 On this one, I'll 0:41:18.000,0:41:20.000 eventually keep picking people, 0:41:20.000,0:41:23.000 so the K is always less than the N in this case. The 0:41:23.000,0:41:26.000 K will eventually kinda bottom out, or I'll say I've 0:41:26.000,0:41:30.000 already picked all the people. I've already picked four people. I need to pick zero more. And 0:41:30.000,0:41:34.000 at that point, that's also very easy, right? Picking zero out of a population, there's 0:41:34.000,0:41:38.000 only one way to do that. 0:41:38.000,0:41:41.000 So what we end up with is 0:41:41.000,0:41:45.000 a very simple base case of if K equals zero - so I'm not trying to choose anymore. 0:41:45.000,0:41:49.000 I've already committed all the slots. Or if K is equal to N 0:41:49.000,0:41:50.000 where I've 0:41:50.000,0:41:52.000 discarded a whole bunch of people, and now I'm down to where I'm 0:41:52.000,0:41:56.000 facing I've gotta get four, and I've got four left. Well, there's only one way to do those things, 0:41:56.000,0:41:59.000 and that's to take everybody or to take nobody. 0:41:59.000,0:42:00.000 And then otherwise, 0:42:00.000,0:42:02.000 I make those two recursive calls 0:42:02.000,0:42:05.000 with Bob, without Bob, add them together 0:42:05.000,0:42:13.000 to get my whole result. 0:42:13.000,0:42:14.000 That's wacky. I'm gonna 0:42:14.000,0:42:18.000 read you something, 0:42:18.000,0:42:22.000 and then we'll call it a day. I 0:42:22.000,0:42:31.000 brought a book with me. I stole this from my children.