WEBVTT 00:00:00.000 --> 00:00:10.000 00:00:10.000 --> 00:00:11.000 00:00:11.000 --> 00:00:14.000 This presentation is delivered by the Stanford Center for Professional 00:00:14.000 --> 00:00:24.000 Development. 00:00:24.000 --> 00:00:26.000 Welcome to Friday. I keep telling you I'm gonna talk 00:00:26.000 --> 00:00:29.000 about linked list and then we don't ever get very far, but 00:00:29.000 --> 00:00:33.000 today really we are gonna talk about linked list and do some tracing and see some 00:00:33.000 --> 00:00:36.000 recursive work we can do with linked list and then 00:00:36.000 --> 00:00:40.000 I'll probably give you a little preview of the next topic, which is the introduction 00:00:40.000 --> 00:00:44.000 algorithm analysis in Big O toward the end. We won't get very far in that, but we will be talking 00:00:44.000 --> 00:00:47.000 about that all next week, which is Chapter 7 in its full glory. That 00:00:47.000 --> 00:00:48.000 will be 00:00:48.000 --> 00:00:51.000 pretty much everything we cover next week is coming out of Chapter 7, so it's all 00:00:51.000 --> 00:00:53.000 about algorithm and sorting and different 00:00:53.000 --> 00:00:54.000 algorithms. 00:00:54.000 --> 00:00:58.000 I will not be able to go - hang out with you today after class because I have an undergraduate 00:00:58.000 --> 00:00:59.000 counsel meeting. 00:00:59.000 --> 00:01:00.000 00:01:00.000 --> 00:01:03.000 But it's a nice day so hopefully you'll have some more 00:01:03.000 --> 00:01:06.000 fun outdoor thing to do anyway. 00:01:06.000 --> 00:01:09.000 We will meet again next Friday. So 00:01:09.000 --> 00:01:11.000 put that on your calendar if you are 00:01:11.000 --> 00:01:16.000 the type who keeps a calendar. Okay. 00:01:16.000 --> 00:01:20.000 So actually I'm gonna not even really work in the things. I'm gonna go back to my 00:01:20.000 --> 00:01:22.000 editing and talk about kind of where we were at 00:01:22.000 --> 00:01:25.000 at the end of 00:01:25.000 --> 00:01:27.000 Wednesday's lecture. Was 00:01:27.000 --> 00:01:29.000 we were doing a little bit of 00:01:29.000 --> 00:01:31.000 coding with the linked list itself. 00:01:31.000 --> 00:01:35.000 So having defined that recursive structure that has the information from 00:01:35.000 --> 00:01:37.000 one entry plus a pointer to the next 00:01:37.000 --> 00:01:40.000 and then so far the little pieces we have is something that we will print an individual 00:01:40.000 --> 00:01:41.000 entry, 00:01:41.000 --> 00:01:44.000 something that will create a new entry out in the heap, filling in with data 00:01:44.000 --> 00:01:47.000 that was read from the console typed in by the user, 00:01:47.000 --> 00:01:51.000 and then the last bit of code that we were looking at was this idea of building the 00:01:51.000 --> 00:01:52.000 list 00:01:52.000 --> 00:01:55.000 by getting a new entry. 00:01:55.000 --> 00:01:58.000 So a pointer coming back from there that points to a new heap allocated 00:01:58.000 --> 00:02:01.000 structure that has the name and the address of someone 00:02:01.000 --> 00:02:03.000 and then attaching it to the 00:02:03.000 --> 00:02:06.000 front part of the list. So we talked about why that was 00:02:06.000 --> 00:02:08.000 the easiest thing to do and I'm gonna 00:02:08.000 --> 00:02:11.000 just re-draw this picture again because we're gonna need it 00:02:11.000 --> 00:02:16.000 as we go through and do stuff. Here's the main stack frame. 00:02:16.000 --> 00:02:19.000 It's got this pointer list, which 00:02:19.000 --> 00:02:23.000 is expected to point to an entry out in the heap. It starts pointing at null. 00:02:23.000 --> 00:02:26.000 We have this local variable down here, new one 00:02:26.000 --> 00:02:29.000 that is also a pointer to an entry 00:02:29.000 --> 00:02:32.000 and based on the result from get new entry, which will either return null to say 00:02:32.000 --> 00:02:36.000 there's no more entries or a new heap allocated thing if it gives us this new 00:02:36.000 --> 00:02:38.000 entry that says 00:02:38.000 --> 00:02:43.000 here's Jason and his phone number. 00:02:43.000 --> 00:02:48.000 That the lines there - the new ones next equals list so 00:02:48.000 --> 00:02:52.000 new ones next field, right? Gets assigned the value of list, which affectively just copies 00:02:52.000 --> 00:02:54.000 a null on top of a null 00:02:54.000 --> 00:02:59.000 and then sets list to point to the same place that new 1 does. 00:02:59.000 --> 00:03:01.000 So on the first iteration, 00:03:01.000 --> 00:03:04.000 Jason is the new front most cell on the list, 00:03:04.000 --> 00:03:08.000 has a null terminator in the next field, which says there's no further cells. So we 00:03:08.000 --> 00:03:10.000 have a singleton list of just one cell. 00:03:10.000 --> 00:03:13.000 Now the subsequent iteration 00:03:13.000 --> 00:03:17.000 we call get new entry returns us a new 00:03:17.000 --> 00:03:23.000 pointer to another cell out here on the list. This one's gonna be Joel. 00:03:23.000 --> 00:03:27.000 He starts off as his own singleton list and then here's the attach again. 00:03:27.000 --> 00:03:29.000 New ones 00:03:29.000 --> 00:03:32.000 next field gets the value of list, 00:03:32.000 --> 00:03:35.000 so that changes 00:03:35.000 --> 00:03:38.000 Joel's next to 00:03:38.000 --> 00:03:41.000 point to where list does now. So I have two different ways to get to Jason's cell 00:03:41.000 --> 00:03:45.000 now. Directly, it's the front most cell of the list still, but it's now following Joel. 00:03:45.000 --> 00:03:47.000 And then I update list 00:03:47.000 --> 00:03:49.000 to point to new one, 00:03:49.000 --> 00:03:54.000 so doing pointer assignment on this. Copying the pointers, making an alias, 00:03:54.000 --> 00:03:55.000 and now 00:03:55.000 --> 00:03:59.000 at the bottom of that loop, right? I now have list pointing to Joel, which points 00:03:59.000 --> 00:04:02.000 to Jason, which points to null. So my two entry list. 00:04:02.000 --> 00:04:05.000 Every subsequent cell that's created is 00:04:05.000 --> 00:04:06.000 prepended, right? 00:04:06.000 --> 00:04:08.000 Placed in the front most position of a list 00:04:08.000 --> 00:04:11.000 and then the remainder of the list trails behind it. 00:04:11.000 --> 00:04:13.000 So we should expect that if we 00:04:13.000 --> 00:04:16.000 put in the list A, B, C that if we were to go back and traverse it we'll 00:04:16.000 --> 00:04:18.000 discover it goes C, B, A. And 00:04:18.000 --> 00:04:22.000 so I was just about to show you that and then we ran out of time. So I'm gonna go ahead and 00:04:22.000 --> 00:04:22.000 finish that because I'm gonna 00:04:22.000 --> 00:04:31.000 write something that will print an entire linked list. 00:04:31.000 --> 00:04:33.000 It will take the list that we have and I'm gonna 00:04:33.000 --> 00:04:35.000 show you 00:04:35.000 --> 00:04:39.000 that. 00:04:39.000 --> 00:04:42.000 So the idea is to 00:04:42.000 --> 00:04:44.000 traverse a pointer 00:04:44.000 --> 00:04:47.000 down the list 00:04:47.000 --> 00:04:49.000 and print each one in 00:04:49.000 --> 00:04:51.000 turn. I'm gonna do that using a four loop. It 00:04:51.000 --> 00:04:55.000 may seem a little bit like an odd use of the four loop, but in fact what 00:04:55.000 --> 00:04:59.000 we're doing is really comparable to how you would do iteration down a link, an 00:04:59.000 --> 00:05:00.000 array, 00:05:00.000 --> 00:05:03.000 using kind of zero to n, you know, I++. 00:05:03.000 --> 00:05:06.000 We're doing the same thing, but instead of advancing an integer, which 00:05:06.000 --> 00:05:10.000 indexes into that, we're actually keeping track of a pointer, which moves down. 00:05:10.000 --> 00:05:14.000 So the initial state of that pointer as we assign it to be the value of the list, 00:05:14.000 --> 00:05:17.000 so we alias to the first cell of the list. 00:05:17.000 --> 00:05:20.000 While that pointer points this invalid cell not to null, 00:05:20.000 --> 00:05:23.000 so as long as there are still cells to process 00:05:23.000 --> 00:05:25.000 then we'll print to 00:05:25.000 --> 00:05:28.000 cell. Then we'll advance the kind of equivalent of the I++. 00:05:28.000 --> 00:05:30.000 Here's the Kerr equals Kerr next. 00:05:30.000 --> 00:05:34.000 So advancing - so starting from Kerr equals Joel, right? To Kerr equals Jason and then Kerr 00:05:34.000 --> 00:05:38.000 equals null, which will terminate that. So as long as our linked list is properly 00:05:38.000 --> 00:05:40.000 terminated, right? We should print all the cells 00:05:40.000 --> 00:05:41.000 from front to back 00:05:41.000 --> 00:05:43.000 using this one loop. 00:05:43.000 --> 00:05:47.000 And so if I change this down here to be print list instead of just print to the front 00:05:47.000 --> 00:05:48.000 most entry 00:05:48.000 --> 00:05:56.000 and then I run a little test on this. 00:05:56.000 --> 00:05:58.000 So I enter Jake 00:05:58.000 --> 00:06:00.000 and his phone number, 00:06:00.000 --> 00:06:02.000 and I enter 00:06:02.000 --> 00:06:04.000 Carl and his phone number, and then I 00:06:04.000 --> 00:06:06.000 enter Ilia and 00:06:06.000 --> 00:06:07.000 his phone number, and I 00:06:07.000 --> 00:06:10.000 see that's all I have. Then they come back out in the opposite order, 00:06:10.000 --> 00:06:13.000 right? That Ilia, Carl, Jake 00:06:13.000 --> 00:06:17.000 because of the way I was placing them in the front of the list, right? Kind of effectively reversed, right? Them 00:06:17.000 --> 00:06:22.000 from the order they were inserted. Do you have a question? Do people ever 00:06:22.000 --> 00:06:22.000 make blank lists so that you can traverse backwards 00:06:22.000 --> 00:06:25.000 through them? 00:06:25.000 --> 00:06:28.000 They certainly do. So the simplest form of the linked list, right? Has only these 00:06:28.000 --> 00:06:33.000 forward links, right? So each cell gets to the one that follows it in the list, 00:06:33.000 --> 00:06:36.000 but that is gonna create certain asymmetries in how you process that list. 00:06:36.000 --> 00:06:38.000 You're always gonna start at the front and move your way to the back. Well, what happens if you're 00:06:38.000 --> 00:06:40.000 at a cell and you'd like to back up, right? 00:06:40.000 --> 00:06:42.000 It doesn't have the information 00:06:42.000 --> 00:06:46.000 in the single chain to do that. So you can build lists that either 00:06:46.000 --> 00:06:49.000 link in the other direction or link in both directions, right? To where you 00:06:49.000 --> 00:06:53.000 can, from a particular cell, find who proceeded it and who followed it. It just makes 00:06:53.000 --> 00:06:56.000 more pointers, right? And more complication. We'll see reasons why that actually 00:06:56.000 --> 00:06:59.000 becomes valuable. At this stage it turns out that it would just be 00:06:59.000 --> 00:07:02.000 complication that we wouldn't really be taking advantage of, but 00:07:02.000 --> 00:07:05.000 in a week or two we're gonna get to some place where that turns out to be a very, very 00:07:05.000 --> 00:07:08.000 useful addition to the list because it turns out that we really will need to be 00:07:08.000 --> 00:07:11.000 able to kind of move easily in both directions and right now we're not too 00:07:11.000 --> 00:07:12.000 worried about 00:07:12.000 --> 00:07:18.000 any direction other than the front ways. 00:07:18.000 --> 00:07:21.000 So this idea of 00:07:21.000 --> 00:07:23.000 the four loop, right? Is 00:07:23.000 --> 00:07:27.000 just one of the ways I could have done this. I could do this with a Y loop, 00:07:27.000 --> 00:07:28.000 right? Other sort of forms of iteration. 00:07:28.000 --> 00:07:31.000 I could also 00:07:31.000 --> 00:07:35.000 capitalize on the recursive nature of the list, and this is partly why I've chosen to introduce 00:07:35.000 --> 00:07:37.000 this topic now, is because I really do want to stress that this idea that it's a 00:07:37.000 --> 00:07:39.000 recursive structure 00:07:39.000 --> 00:07:42.000 gives us a little bit of an insight into ways that operations on it 00:07:42.000 --> 00:07:45.000 might be able to take advantage of that recursive nature. 00:07:45.000 --> 00:07:48.000 So let me write this alternate print list, and I'll 00:07:48.000 --> 00:07:52.000 call it recursive print list, 00:07:52.000 --> 00:07:55.000 that is designed to use recursion to get the job done. 00:07:55.000 --> 00:08:00.000 So in terms of thinking about it recursively, you can think of, well, a linked list is 00:08:00.000 --> 00:08:03.000 the sequence of cells and iteration kind of thinks of the whole sequence at one 00:08:03.000 --> 00:08:04.000 time. 00:08:04.000 --> 00:08:08.000 Recursion tends to take the strategy of, well, I'm just gonna separate it into 00:08:08.000 --> 00:08:12.000 something small I can deal with right now and then some instance of the same 00:08:12.000 --> 00:08:14.000 problem, but in a slightly smaller simpler form. 00:08:14.000 --> 00:08:18.000 The easy way to do that with a linked list is imagine there's the front 00:08:18.000 --> 00:08:20.000 most cell. 00:08:20.000 --> 00:08:23.000 Which I could print by calling print entry, which prints a single entry. 00:08:23.000 --> 00:08:26.000 And then there is this recursive 00:08:26.000 --> 00:08:29.000 structure left behind, which is the remainder of the list and I can say, well, 00:08:29.000 --> 00:08:32.000 I can recursively print the list 00:08:32.000 --> 00:08:34.000 that follows. 00:08:34.000 --> 00:08:36.000 So print the front most cell, 00:08:36.000 --> 00:08:40.000 let recursion work its magic on what remains. 00:08:40.000 --> 00:08:43.000 Looks pretty good. Something really critical missing? 00:08:43.000 --> 00:08:45.000 Base case. Better have a base case here. 00:08:45.000 --> 00:08:49.000 And the simplest possible base case is list equals equals null. 00:08:49.000 --> 00:08:53.000 If list equals equals null then it won't have nothing to do, so I can just return or I 00:08:53.000 --> 00:08:55.000 can actually just do it like this. Say if list 00:08:55.000 --> 00:08:57.000 does not equal null, so it has some 00:08:57.000 --> 00:08:59.000 valid contents, 00:08:59.000 --> 00:09:00.000 something to look at, 00:09:00.000 --> 00:09:03.000 then we'll print the front most one and then let recursion take care of the 00:09:03.000 --> 00:09:05.000 rest. So 00:09:05.000 --> 00:09:11.000 I'll change my called print list to be a recursive print list. And I should 00:09:11.000 --> 00:09:12.000 see some [inaudible] results here. 00:09:12.000 --> 00:09:15.000 I say that 00:09:15.000 --> 00:09:17.000 Nathan is here, I 00:09:17.000 --> 00:09:20.000 say Jason is here, I 00:09:20.000 --> 00:09:21.000 Sara, 00:09:21.000 --> 00:09:23.000 Sara are you an H? I don't 00:09:23.000 --> 00:09:24.000 remember. 00:09:24.000 --> 00:09:26.000 Today it doesn't. 00:09:26.000 --> 00:09:30.000 Sara, Jason, Nathan coming out the other way. Okay. 00:09:30.000 --> 00:09:33.000 So I want to do something kind of funny with this recursive call. 00:09:33.000 --> 00:09:33.000 So it's 00:09:33.000 --> 00:09:37.000 not really in this case - both of these look equally simple to write. We're 00:09:37.000 --> 00:09:40.000 gonna start to look at some things where this actually might buy 00:09:40.000 --> 00:09:43.000 us something. For example, lets imagine that what I wanted to do was print the 00:09:43.000 --> 00:09:47.000 list in the other order. So I don't have the list 00:09:47.000 --> 00:09:48.000 that links backwards. 00:09:48.000 --> 00:09:50.000 So if I really did want to print them 00:09:50.000 --> 00:09:52.000 with the last most element first, 00:09:52.000 --> 00:09:56.000 the way that they were added at the console, 00:09:56.000 --> 00:09:58.000 and the way I'm building the list, right? Is putting them in the other order. So, well, could I just take 00:09:58.000 --> 00:10:00.000 the list and print it back to front? 00:10:00.000 --> 00:10:03.000 I mean, it's simple enough to do that on a vector, if you had one, to work from 00:10:03.000 --> 00:10:06.000 the way back. Could I do it with the linked list? 00:10:06.000 --> 00:10:09.000 In the iterate formulation it's gonna actually get pretty messy because I'm 00:10:09.000 --> 00:10:12.000 gonna have to walk my way all the way down to the end and then print the last one, and then I'm 00:10:12.000 --> 00:10:15.000 gonna have to walk my way down to the one that points to the last one and 00:10:15.000 --> 00:10:18.000 print it, and then I'm gonna have to walk my way down to the one that printed to the penultimate 00:10:18.000 --> 00:10:19.000 one and print it, 00:10:19.000 --> 00:10:21.000 but it does involve a lot of traversal and a lot of work. 00:10:21.000 --> 00:10:24.000 I can make the recursive print one do it 00:10:24.000 --> 00:10:28.000 with just one tiny change of the code. 00:10:28.000 --> 00:10:30.000 Take this line 00:10:30.000 --> 00:10:34.000 and put it there. So 00:10:34.000 --> 00:10:37.000 what I'm doing here is letting recursion do the work for me of 00:10:37.000 --> 00:10:42.000 saying, well, print everything that follows me in the list before you print this cell. So 00:10:42.000 --> 00:10:46.000 if the list is A, B, C it says when you get to the A node it says 00:10:46.000 --> 00:10:49.000 okay, please print the things that follow me before you get back to me. Well, recursion 00:10:49.000 --> 00:10:52.000 being applied to the B, C list says, well, B says, well, hold onto B, print everything 00:10:52.000 --> 00:10:56.000 that follows me before you print B, when C gets there it says, well, print everything that 00:10:56.000 --> 00:10:59.000 follows me, which turns out to be nothing, which causes it to come back and 00:10:59.000 --> 00:11:02.000 say, well, okay, I'm done with the part that followed C, why don't we print C, 00:11:02.000 --> 00:11:03.000 and then print B, and then print A. 00:11:03.000 --> 00:11:09.000 So if I do this and 00:11:09.000 --> 00:11:10.000 print it through I put 00:11:10.000 --> 00:11:12.000 in A, 00:11:12.000 --> 00:11:16.000 B, C. 00:11:16.000 --> 00:11:18.000 Then they get printed out in the order I inserted them. 00:11:18.000 --> 00:11:23.000 They happen to be stored internally as C, B, A, but then I just do a 00:11:23.000 --> 00:11:25.000 change-a-roo was printing 00:11:25.000 --> 00:11:29.000 to print from the back of the list to the front of the list in the other order. 00:11:29.000 --> 00:11:32.000 But just the magic of recursion, right? A very simple little change, right? In terms of 00:11:32.000 --> 00:11:34.000 doing the work before the recursive call 00:11:34.000 --> 00:11:36.000 versus after 00:11:36.000 --> 00:11:39.000 gave me exactly what I wanted without any real craziness 00:11:39.000 --> 00:11:42.000 inserted in the code. We'll do a couple 00:11:42.000 --> 00:11:44.000 of other little things with you 00:11:44.000 --> 00:11:46.000 and 00:11:46.000 --> 00:11:49.000 I'm gonna do them recursively just 00:11:49.000 --> 00:11:50.000 for practice. 00:11:50.000 --> 00:11:53.000 00:11:53.000 --> 00:11:57.000 If I wanted to count them, I wanted to know how many cells 00:11:57.000 --> 00:12:02.000 are in my list, then thinking about 00:12:02.000 --> 00:12:03.000 it recursively 00:12:03.000 --> 00:12:06.000 is a matter of saying, well, there's a cell in the front 00:12:06.000 --> 00:12:06.000 00:12:06.000 --> 00:12:10.000 and then there's some count of the cells that follow it. If I add those together, 00:12:10.000 --> 00:12:12.000 that tells me how long this list is. 00:12:12.000 --> 00:12:13.000 00:12:13.000 --> 00:12:16.000 Having a base case there that says when I get to the completely empty list where the list 00:12:16.000 --> 00:12:17.000 is null 00:12:17.000 --> 00:12:20.000 then there are no more cells to count that returns my zero. So it will work its way all the way down 00:12:20.000 --> 00:12:22.000 to the bottom, 00:12:22.000 --> 00:12:25.000 find that trailing null that sentinels the end, and then kind of count on its way 00:12:25.000 --> 00:12:28.000 back out. Okay, there was one before that and then one before that and add those 00:12:28.000 --> 00:12:29.000 all up 00:12:29.000 --> 00:12:33.000 to tell me how many entries are in my address book. So I can do 00:12:33.000 --> 00:12:36.000 that here. I can say what is the count in 00:12:36.000 --> 00:12:38.000 my 00:12:38.000 --> 00:12:43.000 address book? And maybe I'll stop printing it because I'm - and so 00:12:43.000 --> 00:12:46.000 I can test it on a couple of things. Well, what if I put 00:12:46.000 --> 00:12:50.000 an empty cell in so I have a null? If I get 00:12:50.000 --> 00:12:54.000 one cell in there, right? Then I should have one and I can even do a couple more. A, B, C. 00:12:54.000 --> 00:12:55.000 Got 00:12:55.000 --> 00:12:59.000 three cells. I'll 00:12:59.000 --> 00:13:00.000 do 00:13:00.000 --> 00:13:06.000 another little one while I'm here. 00:13:06.000 --> 00:13:08.000 Is that when I'm done with the linked list, 00:13:08.000 --> 00:13:11.000 all those heap allocated cells, right? 00:13:11.000 --> 00:13:14.000 Are just out there clogging up my heap. So when I'm done with this list 00:13:14.000 --> 00:13:16.000 the appropriate thing to do is to deallocate it. 00:13:16.000 --> 00:13:18.000 I will note, actually, just to be clear though, 00:13:18.000 --> 00:13:23.000 that any memory that's allocated by main and kind of in the process of working the program that 00:13:23.000 --> 00:13:27.000 when you actually exit the program it automatically deallocated. So 00:13:27.000 --> 00:13:29.000 when you are completed with the program and you're exiting 00:13:29.000 --> 00:13:33.000 you can go around and be tidy and clean stuff up, but, in fact, there's not a lot of point 00:13:33.000 --> 00:13:33.000 to it. 00:13:33.000 --> 00:13:36.000 The more important reason, right? To deallocation would be during the 00:13:36.000 --> 00:13:40.000 running of the program as you're playing games or monitoring things or doing the 00:13:40.000 --> 00:13:41.000 data. 00:13:41.000 --> 00:13:43.000 If you are not deallocating - if the program, especially if it's 00:13:43.000 --> 00:13:44.000 long 00:13:44.000 --> 00:13:45.000 running, 00:13:45.000 --> 00:13:49.000 will eventually have problems related to its kind of gargantuan memory size if it's 00:13:49.000 --> 00:13:52.000 not being careful about releasing memory. 00:13:52.000 --> 00:13:55.000 When you're done deleting it manually or just letting the system take it down as 00:13:55.000 --> 00:13:57.000 part of the process 00:13:57.000 --> 00:13:58.000 there's not much 00:13:58.000 --> 00:14:00.000 advantage to. 00:14:00.000 --> 00:14:03.000 So if I'm gonna deallocate a list, 00:14:03.000 --> 00:14:07.000 if the list does not equal null there's something deallocate. 00:14:07.000 --> 00:14:07.000 I'm 00:14:07.000 --> 00:14:11.000 gonna do this wrong first and then we'll talk about why 00:14:11.000 --> 00:14:15.000 it's not what I want to do. 00:14:15.000 --> 00:14:19.000 That's good exercise remembering things. So I think, okay, well, what I need to do 00:14:19.000 --> 00:14:22.000 is delete the front most cell 00:14:22.000 --> 00:14:26.000 and then deallocate all those cells that follow it. So 00:14:26.000 --> 00:14:29.000 using recursion, right? To kind of divide the problem up. 00:14:29.000 --> 00:14:32.000 It's a matter of deleting the front cell and then saying, okay, whatever else needs to 00:14:32.000 --> 00:14:35.000 be done needs to be done to everything that remains. List dot next gives me a pointer to that 00:14:35.000 --> 00:14:37.000 smaller sub list to work on. 00:14:37.000 --> 00:14:41.000 As written, this does try to make the right delete calls, 00:14:41.000 --> 00:14:44.000 but it does something really dangerous in terms of use of free memory. Anybody want 00:14:44.000 --> 00:14:46.000 to help me out with that? After it 00:14:46.000 --> 00:14:48.000 deletes the first 00:14:48.000 --> 00:14:50.000 element it doesn't know where to find the rest of it. 00:14:50.000 --> 00:14:51.000 That's exactly right. 00:14:51.000 --> 00:14:56.000 So this one said delete list. So if I think of it in terms of a picture here 00:14:56.000 --> 00:14:59.000 coming into the call 00:14:59.000 --> 00:15:01.000 list to some pointer 00:15:01.000 --> 00:15:05.000 to these nodes 00:15:05.000 --> 00:15:07.000 that are out here, A, B, and C, 00:15:07.000 --> 00:15:09.000 followed by null and I 00:15:09.000 --> 00:15:11.000 said delete list. 00:15:11.000 --> 00:15:14.000 So delete list says follow list to find that piece of memory out there in 00:15:14.000 --> 00:15:18.000 the heap and mark this cell 00:15:18.000 --> 00:15:19.000 as deleted, 00:15:19.000 --> 00:15:22.000 no longer in use, ready to be reclaimed. 00:15:22.000 --> 00:15:24.000 The very next line 00:15:24.000 --> 00:15:26.000 says deallocate list arrow next. 00:15:26.000 --> 00:15:30.000 And so that is actually using list, right? To point back into this piece of freed 00:15:30.000 --> 00:15:31.000 memory 00:15:31.000 --> 00:15:33.000 and try to read this number out of here. 00:15:33.000 --> 00:15:37.000 It may work in some situations. It may work long enough that you almost 00:15:37.000 --> 00:15:40.000 think it's actually correct and then some day cause some problem, 00:15:40.000 --> 00:15:45.000 right? Just in a different circumstances where this didn't succeed by accident. 00:15:45.000 --> 00:15:47.000 That I'm digging into 00:15:47.000 --> 00:15:51.000 that piece of freed memory and trying to access a field out of it, right? Which is 00:15:51.000 --> 00:15:54.000 not a reliable thing to do. C++ will let me do it, 00:15:54.000 --> 00:15:58.000 it doesn't complain about this. Ether it will compile time or run time in an obvious way, 00:15:58.000 --> 00:16:01.000 but it's just, sort of, a little bit of ticking time bomb 00:16:01.000 --> 00:16:04.000 to have that kind of code that's there. 00:16:04.000 --> 00:16:06.000 So there's two different ways I could fix this. 00:16:06.000 --> 00:16:08.000 One way would be to, before 00:16:08.000 --> 00:16:10.000 I do the delete 00:16:10.000 --> 00:16:14.000 is to just hold onto the pointer that I'm gonna need. 00:16:14.000 --> 00:16:21.000 So pull it out of the memory before we're done here. 00:16:21.000 --> 00:16:22.000 So I read it in there. 00:16:22.000 --> 00:16:26.000 So the piece of memory that's behind it is still good. 00:16:26.000 --> 00:16:29.000 The delete actually just deleted that individual cell, so whatever is previously 00:16:29.000 --> 00:16:32.000 allocated with new, which was one entry structure, 00:16:32.000 --> 00:16:34.000 was what was deleted by that call. 00:16:34.000 --> 00:16:36.000 But what I was 00:16:36.000 --> 00:16:38.000 needing here was to keep track of something in 00:16:38.000 --> 00:16:42.000 that piece of memory before, right? I obliterated it, 00:16:42.000 --> 00:16:44.000 so that I can make a further use of it. 00:16:44.000 --> 00:16:46.000 The other alternative to how to fix this 00:16:46.000 --> 00:16:48.000 would be to just merely 00:16:48.000 --> 00:16:53.000 rearrange these two lines. Same kind of fix I did up above where 00:16:53.000 --> 00:16:54.000 go ahead and delete 00:16:54.000 --> 00:16:58.000 everything that follows so when I'm doing a list like this it would say, well, 00:16:58.000 --> 00:17:02.000 delete the D and C and only after all of that has been cleaned up 00:17:02.000 --> 00:17:05.000 come back and delete the one on the front. So it would actually delete from the rear 00:17:05.000 --> 00:17:08.000 forward. Work its way all the way down to the bottom, delete the last cell, then the next to 00:17:08.000 --> 00:17:09.000 last, 00:17:09.000 --> 00:17:10.000 and work its way out. 00:17:10.000 --> 00:17:12.000 Which would also be 00:17:12.000 --> 00:17:13.000 a completely 00:17:13.000 --> 00:17:20.000 correct way to solve this problem. Okay. Any questions 00:17:20.000 --> 00:17:22.000 on that one so far? Let me go 00:17:22.000 --> 00:17:25.000 back over here and see what 00:17:25.000 --> 00:17:27.000 I want to do with you next. 00:17:27.000 --> 00:17:31.000 So I talked about this, talked about these. Okay. 00:17:31.000 --> 00:17:32.000 Now, 00:17:32.000 --> 00:17:34.000 I'm gonna do something that's actually 00:17:34.000 --> 00:17:36.000 really pretty tricky 00:17:36.000 --> 00:17:47.000 that I want you guys to watch with me. All right. 00:17:47.000 --> 00:17:51.000 So this is the same code that we had for building the address book. 00:17:51.000 --> 00:17:52.000 00:17:52.000 --> 00:17:54.000 The listhead, the new one, and so on. 00:17:54.000 --> 00:17:57.000 And what I did was, I just did a little bit of code factoring. Where I took the steps 00:17:57.000 --> 00:18:01.000 that were designed to splice that new cell onto the front of the list 00:18:01.000 --> 00:18:04.000 and I looped them into their own function called prepend 00:18:04.000 --> 00:18:08.000 that takes two pointers. The pointer to the new entry and the pointer to the current first 00:18:08.000 --> 00:18:11.000 cell of the list and it's designed to wire them 00:18:11.000 --> 00:18:12.000 up by putting it in the front. 00:18:12.000 --> 00:18:15.000 So it looks like the code that was here, just moved up here, and then the 00:18:15.000 --> 00:18:19.000 variable names change because the parameters have slightly different names. 00:18:19.000 --> 00:18:22.000 The code as I'm showing it right here is buggy. 00:18:22.000 --> 00:18:24.000 It's almost, but not quite, what we want 00:18:24.000 --> 00:18:28.000 and I want to do this very carefully. Kind of trace through what's happening here, so you 00:18:28.000 --> 00:18:29.000 can watch with me 00:18:29.000 --> 00:18:32.000 what's happening in this process. So 00:18:32.000 --> 00:18:37.000 the variable is actually called listhead and this - I guess let me go ahead and do this. 00:18:37.000 --> 00:18:40.000 Okay. So 00:18:40.000 --> 00:18:43.000 let's imagine that we have gone through a couple of alliterations and we've got a good linked 00:18:43.000 --> 00:18:43.000 list 00:18:43.000 --> 00:18:44.000 00:18:44.000 --> 00:18:46.000 kind of in place, so I have something to work with her. 00:18:46.000 --> 00:18:49.000 So let's imagine that listhead 00:18:49.000 --> 00:18:52.000 points to a 00:18:52.000 --> 00:18:53.000 cell A to B 00:18:53.000 --> 00:18:56.000 and then its got two cells, let's say, 00:18:56.000 --> 00:18:58.000 and then it's empty. 00:18:58.000 --> 00:19:00.000 Let's say I get a new one 00:19:00.000 --> 00:19:01.000 00:19:01.000 --> 00:19:06.000 and this ones gonna be a J, let's say. Okay. 00:19:06.000 --> 00:19:09.000 So that's the state, let's say, that I'm coming into here and hit the 00:19:09.000 --> 00:19:09.000 prepend 00:19:09.000 --> 00:19:13.000 and I'm ready to take this J cell and put it on the front so that I have J, A, B on 00:19:13.000 --> 00:19:16.000 my list. Okay. So let me 00:19:16.000 --> 00:19:20.000 do a little dotted line here that distinguishes my two stack 00:19:20.000 --> 00:19:25.000 rings. I'm gonna make this call to prepend 00:19:25.000 --> 00:19:31.000 and prepend has two variables, ENT and first, 00:19:31.000 --> 00:19:33.000 that were copied from 00:19:33.000 --> 00:19:37.000 the variables new one and listhead that came out of the build address book. Actually, 00:19:37.000 --> 00:19:40.000 it's not called main here. Let me 00:19:40.000 --> 00:19:43.000 call this build 00:19:43.000 --> 00:19:47.000 book. Okay. 00:19:47.000 --> 00:19:50.000 So I pass the value of new one as 00:19:50.000 --> 00:19:51.000 ENT. 00:19:51.000 --> 00:19:53.000 So what we get is a pointer 00:19:53.000 --> 00:19:55.000 to this same cell, 00:19:55.000 --> 00:19:57.000 a copy, and so this case, right? 00:19:57.000 --> 00:20:00.000 We copied the pointer, so whatever address was being held there is actually 00:20:00.000 --> 00:20:03.000 copied into here as this parameter. 00:20:03.000 --> 00:20:04.000 Similarly, 00:20:04.000 --> 00:20:07.000 listhead is copied into the second parameter, which is the pointer to the 00:20:07.000 --> 00:20:12.000 first cell. Okay. 00:20:12.000 --> 00:20:15.000 I want to do this in such a way that you can follow what's going on. So let's see 00:20:15.000 --> 00:20:16.000 if I can get this 00:20:16.000 --> 00:20:19.000 to work out. So it's pointing up there to first. Okay. 00:20:19.000 --> 00:20:21.000 So I've got 00:20:21.000 --> 00:20:24.000 two pointers to this cell A. 00:20:24.000 --> 00:20:26.000 One that came off the original listhead and one that came there from the copy in first. 00:20:26.000 --> 00:20:29.000 And then I've got two pointers to this J. 00:20:29.000 --> 00:20:32.000 One that came through the variable new one in the build an address book frame and 00:20:32.000 --> 00:20:35.000 one that was ENT in the prepend frame. Now, I'm gonna 00:20:35.000 --> 00:20:37.000 trace through the code of prepend. 00:20:37.000 --> 00:20:39.000 It says ent 00:20:39.000 --> 00:20:41.000 next field equals first. Okay. 00:20:41.000 --> 00:20:43.000 So ent's 00:20:43.000 --> 00:20:44.000 next field 00:20:44.000 --> 00:20:48.000 is right here. 00:20:48.000 --> 00:20:53.000 It gets the value of first. Well, first points to this cell. Okay. 00:20:53.000 --> 00:20:56.000 So we go ahead 00:20:56.000 --> 00:20:59.000 and make the next field of J point to that cell. 00:20:59.000 --> 00:21:03.000 So that looks pretty good. That first line worked out just fine. It changed 00:21:03.000 --> 00:21:05.000 the cell J 00:21:05.000 --> 00:21:08.000 to point to what was previously the front of the list. So it looks like now it's the 00:21:08.000 --> 00:21:10.000 first cell that's followed by those. 00:21:10.000 --> 00:21:13.000 And then the next line I need to do is I need to update the listhead 00:21:13.000 --> 00:21:16.000 to point to J. 00:21:16.000 --> 00:21:17.000 So it says first 00:21:17.000 --> 00:21:19.000 equals ENT. 00:21:19.000 --> 00:21:20.000 So first 00:21:20.000 --> 00:21:29.000 gets the value of ENT. Well this just does pointer assignment. 00:21:29.000 --> 00:21:31.000 00:21:31.000 --> 00:21:34.000 Sorry, I erased where it used to point to. 00:21:34.000 --> 00:21:36.000 And then I make it point to there, too. 00:21:36.000 --> 00:21:39.000 So at the bottom of prepend, 00:21:39.000 --> 00:21:43.000 if I were to use first as the pointer to the initial cell of the list it looks good. 00:21:43.000 --> 00:21:47.000 It points to J, which points to A, which points to B, which points to null. 00:21:47.000 --> 00:21:49.000 Everything looks fine 00:21:49.000 --> 00:21:50.000 here, 00:21:50.000 --> 00:21:51.000 right? At the end of prepend. 00:21:51.000 --> 00:21:53.000 But when I get back here 00:21:53.000 --> 00:21:55.000 to come back around on this loop; 00:21:55.000 --> 00:22:03.000 where is listhead pointing? It's pointing to A. 00:22:03.000 --> 00:22:09.000 It's pointing to A. Did anything happen to listhead in the process of this? No. 00:22:09.000 --> 00:22:12.000 That listhead points where it always pointed, which was whatever cell, right? Was 00:22:12.000 --> 00:22:14.000 previously pointed to. 00:22:14.000 --> 00:22:18.000 This attempt to pass it into prepend and have prepend update it didn't 00:22:18.000 --> 00:22:21.000 stick. 00:22:21.000 --> 00:22:22.000 Prepend 00:22:22.000 --> 00:22:24.000 has two pointers 00:22:24.000 --> 00:22:27.000 that are copies of these pointers. 00:22:27.000 --> 00:22:31.000 So like other variables, ents, and strings and 00:22:31.000 --> 00:22:32.000 vectors and anything else we have, 00:22:32.000 --> 00:22:36.000 if we just pass the variable itself into a function call 00:22:36.000 --> 00:22:41.000 then that pass by value mechanism kicks in, which causes there to be two new 00:22:41.000 --> 00:22:43.000 variables. In this case, the ENT and the first variables, 00:22:43.000 --> 00:22:47.000 that are copies of the two variables listhead and new one 00:22:47.000 --> 00:22:49.000 that were present in build address book. 00:22:49.000 --> 00:22:54.000 And the changes I make down here, if I really try to change ENT or change first, 00:22:54.000 --> 00:22:58.000 so I try to make ENT point somewhere new or make first point somewhere new, 00:22:58.000 --> 00:22:59.000 don't stick. 00:22:59.000 --> 00:23:03.000 They change the copies, not the originals. 00:23:03.000 --> 00:23:05.000 This is tricky. 00:23:05.000 --> 00:23:08.000 This is very tricky because it's entering that the first line 00:23:08.000 --> 00:23:10.000 was actually okay. 00:23:10.000 --> 00:23:14.000 That what the first line tried to do actually did have a persistent affect and 00:23:14.000 --> 00:23:15.000 the affect we wanted, 00:23:15.000 --> 00:23:18.000 which is it followed the ENT pointer 00:23:18.000 --> 00:23:21.000 to this shared location and changed its next field. So 00:23:21.000 --> 00:23:25.000 dereferencing that pointer and mucking around with the struct itself 00:23:25.000 --> 00:23:28.000 did have a persistent affect. It wasn't another copy of the entry struck. There 00:23:28.000 --> 00:23:33.000 really is just this one entry struck that new one and ENT are both pointing 00:23:33.000 --> 00:23:34.000 to. 00:23:34.000 --> 00:23:35.000 So both of them, right? 00:23:35.000 --> 00:23:39.000 Are viewing the same piece of heap memory. Changes made out there at the 00:23:39.000 --> 00:23:42.000 structure, right? Are being perceived from both ends, 00:23:42.000 --> 00:23:45.000 but the pointers themselves - the fact that I had two different pointers that 00:23:45.000 --> 00:23:48.000 pointed to the same place, I changed one of my pointers, 00:23:48.000 --> 00:23:51.000 the one whose name is first, to 00:23:51.000 --> 00:23:55.000 point somewhere new. I didn't change listhead by that action. That listhead and 00:23:55.000 --> 00:23:55.000 first 00:23:55.000 --> 00:23:59.000 were just two aliases of the same location, but they had no further 00:23:59.000 --> 00:24:04.000 relationship that is implied by that parameter passing there. How do 00:24:04.000 --> 00:24:07.000 you feel about that? Question? And 00:24:07.000 --> 00:24:09.000 first and ent 00:24:09.000 --> 00:24:11.000 are gonna disappear now, right, afterwards? No. First and ent would go away, 00:24:11.000 --> 00:24:15.000 but that just means their pointers would go. So when prepend goes away, 00:24:15.000 --> 00:24:18.000 right? This space gets deallocated, so it's 00:24:18.000 --> 00:24:20.000 like this 00:24:20.000 --> 00:24:23.000 pointer goes away 00:24:23.000 --> 00:24:26.000 and all this space down here goes 00:24:26.000 --> 00:24:30.000 away. Then what we have is list still pointing to A and B and then new one, 00:24:30.000 --> 00:24:33.000 right? Is pointing to this J, which points to A, but, in fact, right? 00:24:33.000 --> 00:24:36.000 Is effectively gonna be orphaned by the next come around of the loop when we 00:24:36.000 --> 00:24:37.000 reassign new one. 00:24:37.000 --> 00:24:39.000 So that J didn't get prepended. 00:24:39.000 --> 00:24:43.000 It got half of its attachment done. The outgoing attachment from J got done, 00:24:43.000 --> 00:24:45.000 but the incoming one 00:24:45.000 --> 00:24:48.000 didn't stick. 00:24:48.000 --> 00:24:51.000 This is pretty tricky to think about because it really requires really kind of 00:24:51.000 --> 00:24:53.000 carefully analyzing what the code is doing 00:24:53.000 --> 00:24:57.000 and not letting this idea of the pointer confuse you from what you all 00:24:57.000 --> 00:25:00.000 ready know about variables. If I pass - if you see a function call 00:25:00.000 --> 00:25:01.000 where I say 00:25:01.000 --> 00:25:03.000 binky 00:25:03.000 --> 00:25:05.000 of 00:25:05.000 --> 00:25:07.000 A and B, where A and B are integer values, 00:25:07.000 --> 00:25:10.000 if these are not by reference coming into binky then you know that when they 00:25:10.000 --> 00:25:13.000 come back A and B are what they were before. If A was 10 and B was 20, 00:25:13.000 --> 00:25:16.000 they're still that. That the only way that binky can have some persistent affect on 00:25:16.000 --> 00:25:20.000 the values of A and B would be that they were being passed by reference. 00:25:20.000 --> 00:25:22.000 If I change this piece of code, 00:25:22.000 --> 00:25:25.000 and it's just one tiny change I'm gonna 00:25:25.000 --> 00:25:31.000 make here, to add an ampersand on that first, 00:25:31.000 --> 00:25:38.000 then I'm gonna get what I wanted. So I want 00:25:38.000 --> 00:25:41.000 to draw my reference a little bit differently to remind myself about 00:25:41.000 --> 00:25:45.000 what it is. When I call prepend passing new one it got passed normally. New one 00:25:45.000 --> 00:25:49.000 is still just an ordinary pointer. So now I have two pointers 00:25:49.000 --> 00:25:52.000 where ENT and new one both point to that J. 00:25:52.000 --> 00:25:56.000 This one currently doesn't yet point up there. It's gonna in a minute, but I'll go ahead 00:25:56.000 --> 00:25:58.000 and put it back to its original state. 00:25:58.000 --> 00:25:59.000 Now, first 00:25:59.000 --> 00:26:03.000 is gonna be an alias for listhead, so that first 00:26:03.000 --> 00:26:09.000 is not going to be a copy of what listhead is. It's actually gonna be a 00:26:09.000 --> 00:26:11.000 reference back to here. 00:26:11.000 --> 00:26:15.000 And a reference looks a lot like a pointer in the way that I draw it, 00:26:15.000 --> 00:26:18.000 but the difference is gonna be I'm gonna shade or, sort of, cross hatch this box 00:26:18.000 --> 00:26:21.000 to remind myself that 00:26:21.000 --> 00:26:24.000 first is attached a listhead and there's nothing you can do to change it. 00:26:24.000 --> 00:26:27.000 That first becomes a synonym for listhead. 00:26:27.000 --> 00:26:31.000 That in the context of the prepend function, any access to first is really 00:26:31.000 --> 00:26:35.000 just an access to listhead. So trying to read from it or write to it, you know, do you 00:26:35.000 --> 00:26:37.000 reference it, it all goes back to the original 00:26:37.000 --> 00:26:40.000 that this is just standing in for. 00:26:40.000 --> 00:26:43.000 That there actually is a relationship that's permanent from the 00:26:43.000 --> 00:26:47.000 time of that call that means that yeah, there really is now new variable first. First is 00:26:47.000 --> 00:26:51.000 actually just another name for an existing variable, this one of listhead. 00:26:51.000 --> 00:26:55.000 So when I got through the ENT it gets the value - ENT's next gets the value of first. 00:26:55.000 --> 00:26:56.000 Well, 00:26:56.000 --> 00:26:57.000 first's value really is 00:26:57.000 --> 00:27:01.000 what listhead is. So that means it points up to A, 00:27:01.000 --> 00:27:02.000 so that part 00:27:02.000 --> 00:27:04.000 works as we were hoping before. 00:27:04.000 --> 00:27:08.000 Then when I make the change to first equals ENT, 00:27:08.000 --> 00:27:10.000 first is a synonym for listhead 00:27:10.000 --> 00:27:12.000 and that made listhead 00:27:12.000 --> 00:27:15.000 stop pointing to A and 00:27:15.000 --> 00:27:18.000 start pointing straight to J. 00:27:18.000 --> 00:27:22.000 And so this is still attached here. Let's do that. Let 00:27:22.000 --> 00:27:25.000 me see if I can just erase this part and make it a little clearer what's going on. 00:27:25.000 --> 00:27:28.000 So new one's pointing to J now, ENT 00:27:28.000 --> 00:27:31.000 is pointing to J, 00:27:31.000 --> 00:27:33.000 and listhead is pointing to J, 00:27:33.000 --> 00:27:34.000 and then this is still 00:27:34.000 --> 00:27:37.000 the reference pointing back to there. 00:27:37.000 --> 00:27:40.000 So now when prepend goes away, 00:27:40.000 --> 00:27:42.000 this extra pointer 00:27:42.000 --> 00:27:43.000 is removed, 00:27:43.000 --> 00:27:45.000 this reference is removed, 00:27:45.000 --> 00:27:47.000 all this space 00:27:47.000 --> 00:27:48.000 is reclaimed here in the stack, 00:27:48.000 --> 00:27:51.000 but listhead now really points to J, 00:27:51.000 --> 00:27:53.000 which points to A, which points to B as before. 00:27:53.000 --> 00:27:55.000 So we wired in two pointers, right? 00:27:55.000 --> 00:27:59.000 One going out of the new cell into what was previously the first cell and then the 00:27:59.000 --> 00:28:00.000 first cell 00:28:00.000 --> 00:28:04.000 pointer now points to the new one rather than the original first cell. 00:28:04.000 --> 00:28:07.000 It was all a matter of this one little ampersand 00:28:07.000 --> 00:28:09.000 that makes the difference, right? 00:28:09.000 --> 00:28:12.000 Without that, right? Then we are only operating on a copy. 00:28:12.000 --> 00:28:15.000 We're changing our local copy of a pointer to point somewhere new; 00:28:15.000 --> 00:28:17.000 having no permanent affect 00:28:17.000 --> 00:28:17.000 00:28:17.000 --> 00:28:19.000 on the actual state of it. 00:28:19.000 --> 00:28:22.000 So, actually, without that ampersand in there it turns out 00:28:22.000 --> 00:28:26.000 nothing ever gets added to the list. The listhead starts as null and stays null, 00:28:26.000 --> 00:28:29.000 but all these things will copy the value of listhead as a 00:28:29.000 --> 00:28:33.000 null, change it in the local context of the called function, but then listhead 00:28:33.000 --> 00:28:36.000 will stay null. So, in fact, if I just ran this a bunch of times and I threw in a bunch of 00:28:36.000 --> 00:28:37.000 things without the ampersand 00:28:37.000 --> 00:28:41.000 I would find that all my cells just get discarded. I'd end up with an 00:28:41.000 --> 00:28:42.000 empty list when I was done. All the orphaned out 00:28:42.000 --> 00:28:45.000 in the heap 00:28:45.000 --> 00:28:49.000 attached and not really recorded permanently. 00:28:49.000 --> 00:28:52.000 That's pretty tricky. Question? Why is the 00:28:52.000 --> 00:28:55.000 ampersand after this [inaudible]? 00:28:55.000 --> 00:28:56.000 Because it is the - the 00:28:56.000 --> 00:29:00.000 ampersand applies to the - think of it more as applying to the name and it's saying this 00:29:00.000 --> 00:29:04.000 name is a reference to something of that type. So on the left-hand side 00:29:04.000 --> 00:29:05.000 is the type. 00:29:05.000 --> 00:29:09.000 What type of thing is it? Is it a string of anti-vector events or whatever? And then the 00:29:09.000 --> 00:29:13.000 ampersand can go between the type and the name to say and it is a reference to an 00:29:13.000 --> 00:29:17.000 existing variable of that type as opposed to a copy of a variable of that type. So 00:29:17.000 --> 00:29:19.000 without the ampersand it's always a copy. 00:29:19.000 --> 00:29:21.000 With the ampersand, you're saying 00:29:21.000 --> 00:29:25.000 I'm introducing first as a synonym for an existing entry star variable somewhere 00:29:25.000 --> 00:29:28.000 else and that in the context of this function 00:29:28.000 --> 00:29:31.000 access to this named parameter really is reaching out 00:29:31.000 --> 00:29:35.000 to change a variable stored elsewhere. 00:29:35.000 --> 00:29:41.000 It's quite, quite tricky to kind of get your head around what's going on here. 00:29:41.000 --> 00:29:44.000 I'm gonna use this, actually, in running another piece of code, so I want to 00:29:44.000 --> 00:29:46.000 be sure that at this point everybody feels 00:29:46.000 --> 00:29:54.000 at least reasonably okay with what I just showed you. Question? What happens if we stop 00:29:54.000 --> 00:29:58.000 ampersand [inaudible]? Jordan, you want to invert them or something like that? Yes. It turns out that that won't work. 00:29:58.000 --> 00:30:02.000 Basically, it won't compile because in this case you can't take a pointer out to an 00:30:02.000 --> 00:30:04.000 entry reference. In fact, it just won't let you. So 00:30:04.000 --> 00:30:07.000 it doesn't really make sense is the truth. 00:30:07.000 --> 00:30:08.000 But think of it as like the - sometimes people 00:30:08.000 --> 00:30:12.000 actually will draw it with the ampersand attached even without the 00:30:12.000 --> 00:30:15.000 space onto the variable name. It's just a notation. A really good clue that this name, 00:30:15.000 --> 00:30:18.000 right? Is a reference to an existing variable and that its type is kind 00:30:18.000 --> 00:30:19.000 of 00:30:19.000 --> 00:30:21.000 over here. So type, 00:30:21.000 --> 00:30:25.000 name, and then name as a reference has that ampersand attached. Question? Is 00:30:25.000 --> 00:30:27.000 there a wrong side to passing both the pointers by 00:30:27.000 --> 00:30:29.000 reference? Nope, 00:30:29.000 --> 00:30:32.000 not really. If I pass it by reference, right? 00:30:32.000 --> 00:30:35.000 Then it - nothing would change about what really happens, right? It would still leach into 00:30:35.000 --> 00:30:39.000 the thing and copy it. There is - for variables that are large, sometimes we 00:30:39.000 --> 00:30:42.000 pass them by reference just to avoid the overhead of a copy. Since a pointer is pretty small, that 00:30:42.000 --> 00:30:45.000 making a copy versus taking a reference it turns out doesn't really save you anything and 00:30:45.000 --> 00:30:48.000 actually it makes a little bit more work 00:30:48.000 --> 00:30:52.000 to access it because it has to kind of go one level out to get it. But 00:30:52.000 --> 00:30:54.000 effectively no. In 00:30:54.000 --> 00:30:56.000 general though, I would say that 00:30:56.000 --> 00:30:58.000 passing things by reference, 00:30:58.000 --> 00:31:01.000 as a habit, is probably not one you want to assume for other kinds of variables. 00:31:01.000 --> 00:31:02.000 Just 00:31:02.000 --> 00:31:05.000 because that once you've passed it by reference you've opened up the access to where if it 00:31:05.000 --> 00:31:08.000 changes it it does persistent and if you didn't intend for that that could be kind of a mystical thing 00:31:08.000 --> 00:31:09.000 to try to track down. So I 00:31:09.000 --> 00:31:11.000 think you should be quite deliberate about - 00:31:11.000 --> 00:31:14.000 I plan on changing this and I want to be sure I see that persistent 00:31:14.000 --> 00:31:22.000 effect. That I need that pass by reference there. Okay. 00:31:22.000 --> 00:31:26.000 So let me go on to kind of solve a problem with the linked list that helps to motivate what a linked 00:31:26.000 --> 00:31:30.000 list is good for. Why would you use a linked list versus a 00:31:30.000 --> 00:31:35.000 vector or an array style access for something that was a collection? 00:31:35.000 --> 00:31:39.000 And so the built in array, which is what vector is built on, so they have the same 00:31:39.000 --> 00:31:40.000 characteristics in this respect. 00:31:40.000 --> 00:31:42.000 They store elements in contiguous memory. 00:31:42.000 --> 00:31:44.000 You allocate an array of ten elements. 00:31:44.000 --> 00:31:48.000 There's a block that is ten integers wide, let's say, 00:31:48.000 --> 00:31:52.000 where they were all sitting next to each other. In address 1,000 as one, 00:31:52.000 --> 00:31:54.000 1,004 is the next, 1,008 and so on. 00:31:54.000 --> 00:31:58.000 What that gives you is fast direct access by index. You can ask for the third 00:31:58.000 --> 00:32:02.000 member, the seventh member, the 28th member, the 6 millionth member 00:32:02.000 --> 00:32:03.000 depending on how big it is, right? 00:32:03.000 --> 00:32:05.000 By, sort of, directly computing 00:32:05.000 --> 00:32:07.000 where it will be in memory. You know where it starts and you know how big each 00:32:07.000 --> 00:32:11.000 element is. Then you can say here's where I can find the 10th element. 00:32:11.000 --> 00:32:15.000 That's an advantage, right? To the array vector style of arrangement. 00:32:15.000 --> 00:32:16.000 However, 00:32:16.000 --> 00:32:20.000 the fact that it's in this big contiguous block is actually very bulky. 00:32:20.000 --> 00:32:21.000 It means that 00:32:21.000 --> 00:32:26.000 if you were to want to put a new element in the front of an array or vector that 00:32:26.000 --> 00:32:27.000 has 1,000 elements 00:32:27.000 --> 00:32:30.000 you have to move over the 1,000 elements to make space. If 00:32:30.000 --> 00:32:33.000 it's starting to address 1,000 and you have them all laid out and you want to put a new 00:32:33.000 --> 00:32:37.000 one in the front, everybody's gotta get picked up and moved over a slot to make 00:32:37.000 --> 00:32:39.000 space. So imagine you're all 00:32:39.000 --> 00:32:40.000 00:32:40.000 --> 00:32:43.000 sitting across a lecture hall and I decided I wanted to put a new person there, everybody 00:32:43.000 --> 00:32:45.000 gets up and moves one chair to the left, 00:32:45.000 --> 00:32:47.000 but there's not a way to just stick a new chair 00:32:47.000 --> 00:32:49.000 on one end of it 00:32:49.000 --> 00:32:51.000 using the array style 00:32:51.000 --> 00:32:52.000 of layout. 00:32:52.000 --> 00:32:55.000 Same for removing, right? So any kind of access to where you want to take 00:32:55.000 --> 00:32:59.000 somebody out of the array and close over the gap or insert in the middle and the beginning and 00:32:59.000 --> 00:32:59.000 whatnot 00:32:59.000 --> 00:33:02.000 requires everybody else moving around in memory, 00:33:02.000 --> 00:33:06.000 which gets expensive. Especially for large things, right? That's a lot of work. 00:33:06.000 --> 00:33:09.000 It also can't easily grow and shrink. 00:33:09.000 --> 00:33:12.000 You allocate an array, your vector, to be a certain size 00:33:12.000 --> 00:33:13.000 00:33:13.000 --> 00:33:15.000 underneath it. That's what vector's doing on your behalf. 00:33:15.000 --> 00:33:19.000 If you have ten elements and now you need to put in ten more 00:33:19.000 --> 00:33:22.000 what you need to do is go allocate a piece of memory that's twice as big, copy over the 00:33:22.000 --> 00:33:23.000 ten you have, 00:33:23.000 --> 00:33:26.000 and now have bigger space. If you need to shrink it down you'll have to make a 00:33:26.000 --> 00:33:29.000 smaller piece, copy it down. There's not a way to take a piece of memory and just kind 00:33:29.000 --> 00:33:30.000 of 00:33:30.000 --> 00:33:33.000 in place in memory where it is kind of extended or shrink it 00:33:33.000 --> 00:33:35.000 in the C++ language. 00:33:35.000 --> 00:33:38.000 So that the process of growing and shrinking, right? Requires this extra 00:33:38.000 --> 00:33:41.000 effort. So behind the scenes that's what vector is doing. When you keep adding to a 00:33:41.000 --> 00:33:43.000 vector, eventually it will fill to capacity. It 00:33:43.000 --> 00:33:46.000 will internally make more space and copy things over and you're 00:33:46.000 --> 00:33:49.000 paying that cost behind the scenes 00:33:49.000 --> 00:33:54.000 when you're doing a lot of additions and removals from a vector. The 00:33:54.000 --> 00:33:57.000 linked list, right? Uses this wiring them together using pointers, 00:33:57.000 --> 00:33:59.000 so it has a lot of flexibility 00:33:59.000 --> 00:34:00.000 in it that the array does not. 00:34:00.000 --> 00:34:04.000 Each element individually allocated. So that means you can pick and choose when and 00:34:04.000 --> 00:34:07.000 where you're ready to take some more space on and when you don't. 00:34:07.000 --> 00:34:10.000 So when you're ready to add a new person to the vector there's no worry 00:34:10.000 --> 00:34:12.000 about if there's a million elements there and now you don't have any 00:34:12.000 --> 00:34:15.000 space you have to copy them. It's like you don't - you can leave the other million elements alone. 00:34:15.000 --> 00:34:18.000 You just take the heap, ask for one new element, 00:34:18.000 --> 00:34:20.000 and then splice it in. 00:34:20.000 --> 00:34:23.000 So the only allocation of the allocation concerns the individual element you're 00:34:23.000 --> 00:34:26.000 trying to do something with. You want to take one out of the middle; you just need to delete 00:34:26.000 --> 00:34:30.000 that one and close around the gap by wiring the pointers around it. 00:34:30.000 --> 00:34:33.000 So all the insert and remove actions, right? Are 00:34:33.000 --> 00:34:35.000 only a matter of wiring pointers. 00:34:35.000 --> 00:34:37.000 Wiring around something you're taking out, 00:34:37.000 --> 00:34:41.000 attaching one to the end or into the middle is splicing a pointer in, and is splicing the pointer 00:34:41.000 --> 00:34:45.000 out. So basically you typically have two pointers you need to reassign 00:34:45.000 --> 00:34:46.000 to do that kind of adjustment. 00:34:46.000 --> 00:34:49.000 If the element has ten members, it has a million members, right? Same amount of work to 00:34:49.000 --> 00:34:51.000 put a cell in there. 00:34:51.000 --> 00:34:52.000 00:34:52.000 --> 00:34:57.000 No dependency on how big it is, right? Causing those operations to bog down. 00:34:57.000 --> 00:35:00.000 The downside then is 00:35:00.000 --> 00:35:02.000 exactly the, sort of, opposite of where it came out in the array in the vector, 00:35:02.000 --> 00:35:06.000 which is you don't have direct access to the 10th element, to the 15th 00:35:06.000 --> 00:35:09.000 element, to the 260th element. If I want to see what that 260 of 00:35:09.000 --> 00:35:13.000 my linked list is I've got to start at the beginning and walk. 00:35:13.000 --> 00:35:17.000 I go next, next, next, next, next, next 260 times and then I will get 00:35:17.000 --> 00:35:19.000 to the 260th element. 00:35:19.000 --> 00:35:20.000 So I don't have 00:35:20.000 --> 00:35:22.000 that easy access to say 00:35:22.000 --> 00:35:25.000 right here at this spot because we don't know. They're all over the place 00:35:25.000 --> 00:35:28.000 in memory. Each linked list cell is allocated individually and the only way to get to 00:35:28.000 --> 00:35:29.000 them 00:35:29.000 --> 00:35:32.000 is to follow those links. 00:35:32.000 --> 00:35:35.000 Often that's not as bad a disadvantage as it sounds because 00:35:35.000 --> 00:35:36.000 typically, right? 00:35:36.000 --> 00:35:38.000 That you're doing things like walking down the array or the vector to look at each of 00:35:38.000 --> 00:35:41.000 the elements at the end you would also be walking down the linked list while you did it isn't 00:35:41.000 --> 00:35:42.000 really that bad. 00:35:42.000 --> 00:35:45.000 So in the case where you happen to be storing stuff in an index and you really want to 00:35:45.000 --> 00:35:46.000 reach back in there 00:35:46.000 --> 00:35:50.000 is where the linked list starts to be a bad call. Can you take it on, like, 00:35:50.000 --> 00:35:52.000 from the list and find something? 00:35:52.000 --> 00:35:55.000 Well, it can take - the thing is it - with a question with me like a long time. Like if I were 00:35:55.000 --> 00:35:59.000 looking through to see if there was an existence of a particular score in a 00:35:59.000 --> 00:36:02.000 vector of integers I have to look at them all from the beginning to the end. 00:36:02.000 --> 00:36:05.000 Now, the way I can access them is subzero, subone, subtwo, right? 00:36:05.000 --> 00:36:09.000 But if I'm walking down a linked list I'm doing the same sort of work. I have to look at each element 00:36:09.000 --> 00:36:12.000 but the way I get to them is by traversing these pointers. 00:36:12.000 --> 00:36:15.000 It might be that that's a little bit more expensive relative to the 00:36:15.000 --> 00:36:18.000 vector, but they're still gonna be about the same. If there's a million elements 00:36:18.000 --> 00:36:20.000 they're all gonna look at a million elements and whether it looked at them in contiguous memory 00:36:20.000 --> 00:36:24.000 or looked at them by jumping around it ends up being in the end kind 00:36:24.000 --> 00:36:27.000 of a very comparable amount of time spent doing those things. 00:36:27.000 --> 00:36:31.000 What would be tricky is if I knew I wanted to get to the 100th element. 00:36:31.000 --> 00:36:34.000 Like I wasn't interested in looking at the other 99 that preceded it. I really just wanted 00:36:34.000 --> 00:36:37.000 to go straight to the thing at slot 100. 00:36:37.000 --> 00:36:39.000 The array gives me that access immediately. 00:36:39.000 --> 00:36:43.000 Just doing a little calculation it says it's here. Go to this place in memory. 00:36:43.000 --> 00:36:47.000 And the linked list would require this walking of 100 steps to get there. 00:36:47.000 --> 00:36:49.000 And so if there are situations, right? Where one of these is 00:36:49.000 --> 00:36:53.000 just clearly the better answer because you know you're gonna do that operation a lot, right? You're 00:36:53.000 --> 00:36:54.000 gonna definitely prefer this. 00:36:54.000 --> 00:36:58.000 If you know you're gonna do a lot of inserting and rearranging within the list, 00:36:58.000 --> 00:37:00.000 the linked list may buy you 00:37:00.000 --> 00:37:03.000 the ease of flexibility of a kind of rewiring 00:37:03.000 --> 00:37:10.000 that does not require all this shuffling to move everything around. Question here? But if you were to 00:37:10.000 --> 00:37:14.000 know the size of the linked list? Typically a linked list won't limit the size unless you tell it, but that's actually true about an 00:37:14.000 --> 00:37:15.000 array too, right? Which is you 00:37:15.000 --> 00:37:18.000 - the vector happens to it on your behalf, but underneath it, right? That the 00:37:18.000 --> 00:37:22.000 array is tracking how many elements are in there, so would a linked list. So 00:37:22.000 --> 00:37:25.000 you could do a manual count when you needed, which would be expensive, or you could 00:37:25.000 --> 00:37:27.000 just track along with that 00:37:27.000 --> 00:37:29.000 outer most pointer. Well, how many elements have I put in there? 00:37:29.000 --> 00:37:31.000 And then update it as you add and remove. So 00:37:31.000 --> 00:37:34.000 you'd likely want to cache that if it was important to you to know that. If 00:37:34.000 --> 00:37:38.000 a lot of your operations depended on knowing that size you'd probably want to keep it stored 00:37:38.000 --> 00:37:41.000 somewhere. Over here? 00:37:41.000 --> 00:37:42.000 Why does 00:37:42.000 --> 00:37:45.000 the vector use arrays then instead of linked 00:37:45.000 --> 00:37:46.000 list? 00:37:46.000 --> 00:37:49.000 That is a great question. It's one that we actually will talk about kind of in about a week. 00:37:49.000 --> 00:37:52.000 It's like, well, why? So sometimes array - the vector is kind of taking the 00:37:52.000 --> 00:37:56.000 path of, well, it's just presenting you to the array in a slightly more convenient 00:37:56.000 --> 00:37:56.000 form. 00:37:56.000 --> 00:37:58.000 We will see situations where 00:37:58.000 --> 00:38:01.000 then the vector is just as bad as a choice as an array and then we'll 00:38:01.000 --> 00:38:05.000 see, well, why we might prefer to use a linked list and we will have - we will build linked 00:38:05.000 --> 00:38:06.000 lists instead basically. 00:38:06.000 --> 00:38:10.000 It has to make a choice and it turns out that for most usages 00:38:10.000 --> 00:38:11.000 the array 00:38:11.000 --> 00:38:15.000 is a pretty good match for a lot of ordinary tasks. The linked list happens to be a match 00:38:15.000 --> 00:38:18.000 for other tasks, right? And then we will build stuff out of that when 00:38:18.000 --> 00:38:20.000 that context comes up. Well, if you just, like, 00:38:20.000 --> 00:38:22.000 then you just, sort of, like 00:38:22.000 --> 00:38:25.000 in a vector class, like encapsulate all that and I - its final 00:38:25.000 --> 00:38:29.000 functionality would be the same. You certainly could. So you could totally build a vector class that 00:38:29.000 --> 00:38:31.000 actually was backed by a linked list and we will see how to do that, 00:38:31.000 --> 00:38:34.000 right? And then it would have different characteristics in terms of what 00:38:34.000 --> 00:38:36.000 operations were efficient and which ones were inefficient 00:38:36.000 --> 00:38:39.000 relative to do an array backed one. And 00:38:39.000 --> 00:38:41.000 over here was also. Yeah? 00:38:41.000 --> 00:38:43.000 Well, I, just to - for using linked 00:38:43.000 --> 00:38:44.000 lists, like, 00:38:44.000 --> 00:38:48.000 so first in, first out or first in, last out 00:38:48.000 --> 00:38:48.000 seems to make 00:38:48.000 --> 00:38:51.000 really good sense, but are there - am I 00:38:51.000 --> 00:38:52.000 like 00:38:52.000 --> 00:38:55.000 missing - You are way ahead of us, but, yeah, you're right on. 00:38:55.000 --> 00:38:57.000 The things like the stack and the Q, right? 00:38:57.000 --> 00:39:00.000 Seem like they're gonna be natural things that will work very, very well with 00:39:00.000 --> 00:39:01.000 a linked list. 00:39:01.000 --> 00:39:04.000 Things that required a lot of shuffling in the middle, right? It's a 00:39:04.000 --> 00:39:08.000 little unclear because you have to find the place in the middle, which is expensive, 00:39:08.000 --> 00:39:11.000 but then the insert is fast. The question is, well, which of these is actually gonna 00:39:11.000 --> 00:39:13.000 work out better? They both have 00:39:13.000 --> 00:39:16.000 some things they can do well and some things they can't do well. So it will have to have 00:39:16.000 --> 00:39:19.000 know maybe a little bit more about the context to be sure which ones seem 00:39:19.000 --> 00:39:22.000 like the best solution and maybe in the end it's just a coin toss, right? 00:39:22.000 --> 00:39:25.000 Some parts will be fast and some parts could be slow and anyway I can get the inverted form 00:39:25.000 --> 00:39:26.000 of that 00:39:26.000 --> 00:39:28.000 as an alternative that 00:39:28.000 --> 00:39:32.000 doesn't dominate in any clear sense. It's just like, well, they both have advantages 00:39:32.000 --> 00:39:35.000 and disadvantages. Question 00:39:35.000 --> 00:39:36.000 back here? Why 00:39:36.000 --> 00:39:37.000 was 00:39:37.000 --> 00:39:38.000 that infused 00:39:38.000 --> 00:39:41.000 to be based on a vector instead of a linked list? 00:39:41.000 --> 00:39:44.000 So they - I think your question is kind of like, yeah, they likely are built on linked lists. 00:39:44.000 --> 00:39:48.000 We're gonna see how we can do it both ways, right? We'll see how we can do it on array, we'll see how we can do it 00:39:48.000 --> 00:39:50.000 on a linked list, and then we'll talk about what those tradeoffs look like. Well, which one 00:39:50.000 --> 00:39:54.000 ends up being the better call 00:39:54.000 --> 00:39:57.000 when we're there? So that will be about a week. So don't worry too much about it now, 00:39:57.000 --> 00:40:00.000 but when we get there we'll definitely see kind of both variants and then we'll 00:40:00.000 --> 00:40:06.000 talk about them in great detail. Okay. 00:40:06.000 --> 00:40:10.000 So I'm gonna show you, just the last operation I want to show you on a linked list 00:40:10.000 --> 00:40:12.000 is doing an insert in sorted order and this 00:40:12.000 --> 00:40:15.000 is gonna require some pretty careful 00:40:15.000 --> 00:40:17.000 understanding of what's going on with the pointers and then we're also gonna 00:40:17.000 --> 00:40:20.000 see a recursive form of it that actually is a 00:40:20.000 --> 00:40:22.000 really clever 00:40:22.000 --> 00:40:25.000 and very elegant way to solve the same problem. So I'm gonna first do it 00:40:25.000 --> 00:40:25.000 iteratively 00:40:25.000 --> 00:40:28.000 just to kind of talk about. What's the mechanism for doing stuff iteratively on 00:40:28.000 --> 00:40:32.000 a linked list? It gets a little bit messy and so just be prepared for that. 00:40:32.000 --> 00:40:36.000 That I'm planning on taking my address book and building it up inserted order. So 00:40:36.000 --> 00:40:40.000 instead of doing a prepend where I just take each cell and put it on the front I'm actually assume 00:40:40.000 --> 00:40:42.000 that I'm gonna keep them all in order by name 00:40:42.000 --> 00:40:45.000 and whenever I get a new cell I want to put it in the right position relative to 00:40:45.000 --> 00:40:48.000 the other neighbors in the list that are all ready there. 00:40:48.000 --> 00:40:51.000 So this is one of those places where it seems like the linked list is actually gonna do very well. I have 00:40:51.000 --> 00:40:53.000 to search to find the spot. 00:40:53.000 --> 00:40:55.000 Okay? Well, you're gonna have to search to find the spot in a vector, too. 00:40:55.000 --> 00:40:59.000 Once I find the spot it should be very easy to just wire it in 00:40:59.000 --> 00:41:02.000 without shuffling and rearranging anything around. It should be a pointer 00:41:02.000 --> 00:41:04.000 in and a pointer out. 00:41:04.000 --> 00:41:06.000 So I'm gonna start this insert sorted. 00:41:06.000 --> 00:41:08.000 It's gonna take the 00:41:08.000 --> 00:41:11.000 head of the list, right? As a bi-reference parameter because there are 00:41:11.000 --> 00:41:14.000 situations, right? Where we're gonna be changing where the list points to when the cells in the 00:41:14.000 --> 00:41:15.000 first one 00:41:15.000 --> 00:41:18.000 and then the new cell that's coming in is the second parameter. 00:41:18.000 --> 00:41:21.000 So we're gonna search to find the place where it goes. 00:41:21.000 --> 00:41:24.000 So let's watch this go down it's 00:41:24.000 --> 00:41:28.000 loop. I do a four loop here that's like, well, Kerr starts this list while it's not null, 00:41:28.000 --> 00:41:29.000 advance by this, 00:41:29.000 --> 00:41:34.000 and if the cell that I'm trying to place, which in this case is the cell Rain, 00:41:34.000 --> 00:41:36.000 proceeds the one we're currently looking at, 00:41:36.000 --> 00:41:38.000 then this must be the place that goes in the list. 00:41:38.000 --> 00:41:41.000 So I look at Rain, I compare it to Cullaf 00:41:41.000 --> 00:41:44.000 and if Rain goes in front of Cullaf then this is where Rain should go in the list. 00:41:44.000 --> 00:41:47.000 Well, it doesn't actually proceed it, 00:41:47.000 --> 00:41:48.000 so I advance again. Then 00:41:48.000 --> 00:41:51.000 I look at Cullaf - Rain versus Lowery. 00:41:51.000 --> 00:41:54.000 Does Rain proceed Lowery? It does not. So I just keep going. So I'm gonna break out of 00:41:54.000 --> 00:41:56.000 the loop as soon as I get to 00:41:56.000 --> 00:41:59.000 the place I compare Rain to Matt. 00:41:59.000 --> 00:42:00.000 Doesn't come out. 00:42:00.000 --> 00:42:03.000 I compare Rain to Thomas. 00:42:03.000 --> 00:42:07.000 Now, this is the first time when new ones name came up as being less than 00:42:07.000 --> 00:42:08.000 Kerr's name. 00:42:08.000 --> 00:42:13.000 So that says, okay, well, Rain needs to go in front of Thomas. 00:42:13.000 --> 00:42:15.000 Okay. All is well and good 00:42:15.000 --> 00:42:18.000 almost. 00:42:18.000 --> 00:42:20.000 I have a pointer to Thomas. 00:42:20.000 --> 00:42:23.000 That's where my curve points to, right? 00:42:23.000 --> 00:42:29.000 And I want to put Rain kind of in the list right in front of Thomas. How do 00:42:29.000 --> 00:42:38.000 I get from Thomas to Matt? But it's 00:42:38.000 --> 00:42:41.000 not just enough to know Thomas, right? Because Thomas tells me like what's gonna follow 00:42:41.000 --> 00:42:42.000 Rain. 00:42:42.000 --> 00:42:44.000 So this is the cell where 00:42:44.000 --> 00:42:47.000 Rain's next is gonna point to where Thomas is because Rain is kind of gonna replace 00:42:47.000 --> 00:42:51.000 Thomas in the list and push Thomas one element down, 00:42:51.000 --> 00:42:55.000 but the situation is that I also need to know what proceeds Thomas. The 00:42:55.000 --> 00:42:58.000 cell that led into Thomas is the one whose next field needs to get reassigned 00:42:58.000 --> 00:43:00.000 to point to Rain. 00:43:00.000 --> 00:43:01.000 So the way the loop is going, right? 00:43:01.000 --> 00:43:04.000 It has to go kind of one too far 00:43:04.000 --> 00:43:07.000 to know that it was there because, like, if I look at Matt I can say, well, does Rain go 00:43:07.000 --> 00:43:10.000 after Matt? Yes, it does. Right? But that's not good enough to know that it goes 00:43:10.000 --> 00:43:13.000 right after Matt. The only way I can tell if it goes right after Matt is to say, 00:43:13.000 --> 00:43:17.000 well, if the cell that comes up next, right? Is behind Rain is that it goes in between 00:43:17.000 --> 00:43:21.000 these two. 00:43:21.000 --> 00:43:22.000 [Inaudible] Thomas and then 00:43:22.000 --> 00:43:24.000 [inaudible] Thomas? 00:43:24.000 --> 00:43:27.000 Yes. So at least I can do it by maybe kind of 00:43:27.000 --> 00:43:30.000 overwriting Thomas with Rain and then overwriting Rain with Thomas and kind of 00:43:30.000 --> 00:43:31.000 wiring them up. I'm gonna try to avoid that. 00:43:31.000 --> 00:43:34.000 I'm gonna actually say, well, let's just think about can we do it with 00:43:34.000 --> 00:43:37.000 leaving the information in the cells where it is? 00:43:37.000 --> 00:43:40.000 I don't know who else might point to Thomas, right? It would seem kind of weird if all of 00:43:40.000 --> 00:43:44.000 a sudden Thomas turned into Rain in some other way. That I don't know for sure 00:43:44.000 --> 00:43:47.000 who else is using this pointer. So lets assume that I can't do that 00:43:47.000 --> 00:43:50.000 variation of it, but I'd like to put Rain in between Matt and Thomas 00:43:50.000 --> 00:43:53.000 and the pointer I have is not quite good enough. 00:43:53.000 --> 00:43:56.000 The cells as is, right? Have this asymmetry. They know what follows or 00:43:56.000 --> 00:43:58.000 not. What precedes them. 00:43:58.000 --> 00:44:02.000 So, in fact, what I'm gonna change here is I'm going to run two pointers down 00:44:02.000 --> 00:44:03.000 the list, 00:44:03.000 --> 00:44:05.000 kind of in parallel, 00:44:05.000 --> 00:44:08.000 where they're always one step apart. 00:44:08.000 --> 00:44:11.000 Then I'm gonna track what was the one I last looked at 00:44:11.000 --> 00:44:13.000 and what is the one I'm currently looking at 00:44:13.000 --> 00:44:16.000 and I call this dragging a previous pointer 00:44:16.000 --> 00:44:19.000 or trailing a previous pointer behind. 00:44:19.000 --> 00:44:20.000 So 00:44:20.000 --> 00:44:24.000 each time I'm about to advance Kerr equals Kerr next, the thing I'll do right 00:44:24.000 --> 00:44:27.000 before it is to assign previous to be what Kerr is now. 00:44:27.000 --> 00:44:31.000 So then it'll advance to the next one, test against null, and keep going. So at 00:44:31.000 --> 00:44:33.000 any given state, right? 00:44:33.000 --> 00:44:36.000 Previous is exactly one behind where Kerr is 00:44:36.000 --> 00:44:39.000 and there's one special case, right? Which is that 00:44:39.000 --> 00:44:44.000 at the very first time through the loop, right? Kerr is set to be the list. Well, 00:44:44.000 --> 00:44:48.000 what's the previous of the head of the list? There actually is no previous, so actually 00:44:48.000 --> 00:44:50.000 initialize previous to be null. 00:44:50.000 --> 00:44:52.000 So the first time through, right? 00:44:52.000 --> 00:44:55.000 We've assigned previous to null and Kerr gets to go to the first one and then we 00:44:55.000 --> 00:44:57.000 say, well, is Cullaf - 00:44:57.000 --> 00:44:58.000 does 00:44:58.000 --> 00:45:00.000 Rain come before Cullaf? No. 00:45:00.000 --> 00:45:04.000 And so I'll advance both so move previous up to where Kerr is and then move 00:45:04.000 --> 00:45:06.000 Kerr up to the next one. 00:45:06.000 --> 00:45:08.000 So now they're pointing at Cullaf and Lowery together 00:45:08.000 --> 00:45:12.000 and, again, I'm comparing Lowery to Rain. Does Rain go in front of Lowery? No. So then I 00:45:12.000 --> 00:45:15.000 move up previous and move up Kerr. 00:45:15.000 --> 00:45:17.000 Does Rain go in front of Matt? 00:45:17.000 --> 00:45:19.000 No. I advance them both, right? 00:45:19.000 --> 00:45:21.000 And then I have 00:45:21.000 --> 00:45:22.000 00:45:22.000 --> 00:45:25.000 previous at the Matt entry, Kerr at the Thomas entry, 00:45:25.000 --> 00:45:28.000 and then Rain does go right in between those, right? 00:45:28.000 --> 00:45:32.000 We know that it didn't precede Matt because we all ready made it through the loop. 00:45:32.000 --> 00:45:37.000 We know it does precede Thomas and that tells us it goes exactly in between them, right? 00:45:37.000 --> 00:45:41.000 That Matt should lead into Rain and Rain should lead into Thomas and that will have 00:45:41.000 --> 00:45:44.000 spliced it in the list at the right position 00:45:44.000 --> 00:45:45.000 relative to the 00:45:45.000 --> 00:45:47.000 other sorted entries. So I 00:45:47.000 --> 00:45:55.000 give myself just a little note here. Well, what are the possible values for previous? 00:45:55.000 --> 00:46:00.000 If the cell is gonna go anywhere in the list other than the very front, 00:46:00.000 --> 00:46:05.000 so as long as Rain follows at least one cell on the list, it's not the 00:46:05.000 --> 00:46:05.000 alphabetically 00:46:05.000 --> 00:46:09.000 first cell of all the ones I've seen, then previous will be some valid cell 00:46:09.000 --> 00:46:12.000 and potentially Kerr could be null, 00:46:12.000 --> 00:46:17.000 but there is a non-null prev in all of those cases. 00:46:17.000 --> 00:46:20.000 There's also the possibility though that previous is null 00:46:20.000 --> 00:46:23.000 in the case where the new cell was the alphabetically first. So if that one actually began 00:46:23.000 --> 00:46:27.000 with an A, let's say it was Alena. 00:46:27.000 --> 00:46:30.000 Then the first test would be if Alena is less then Cullaf. It is. It would 00:46:30.000 --> 00:46:31.000 immediately break and 00:46:31.000 --> 00:46:35.000 so previous would be null, Kerr would point to the front most cell, 00:46:35.000 --> 00:46:38.000 and I've got a little bit of a special case there 00:46:38.000 --> 00:46:43.000 where inserting at the front of the list isn't quite the same thing as inserting 00:46:43.000 --> 00:46:44.000 further down. 00:46:44.000 --> 00:46:49.000 So let me look at the code that I added to this and I ran out of room to draw pictures, so we'll have to do a little 00:46:49.000 --> 00:46:51.000 bit on the board here. 00:46:51.000 --> 00:46:54.000 That there are two pointers we need to assign 00:46:54.000 --> 00:46:55.000 for that 00:46:55.000 --> 00:47:01.000 new cell that's coming in. One is it's outgoing one 00:47:01.000 --> 00:47:06.000 and then another is it's incoming one. And so 00:47:06.000 --> 00:47:11.000 I have this list and, 00:47:11.000 --> 00:47:16.000 let's say, this new cell is going here. So this is like K, L, M, 00:47:16.000 --> 00:47:17.000 T, R. 00:47:17.000 --> 00:47:22.000 That the first line splices in the outgoing pointer in all cases. It says 00:47:22.000 --> 00:47:24.000 that the new ones next field is Kerr. 00:47:24.000 --> 00:47:29.000 So Kerr is the first cell, right? That we found that follows the cell we're trying 00:47:29.000 --> 00:47:32.000 to insert. It might be that Kerr is null, 00:47:32.000 --> 00:47:33.000 but that's totally fine too. 00:47:33.000 --> 00:47:37.000 In this case it happened to point - this is where Kerr was and this is where previous 00:47:37.000 --> 00:47:39.000 was. 00:47:39.000 --> 00:47:41.000 That it will make 00:47:41.000 --> 00:47:44.000 R point to T, so that is the outgoing pointer for the new cell 00:47:44.000 --> 00:47:46.000 and now there are two cases here. 00:47:46.000 --> 00:47:51.000 The ordinary case is that there is some previous cell. In which case the previous 00:47:51.000 --> 00:47:54.000 cell's next field needs to get updated to point to the new one. 00:47:54.000 --> 00:47:58.000 So we use the pointer to this one. It's next field 00:47:58.000 --> 00:48:00.000 no longer points to T, 00:48:00.000 --> 00:48:05.000 but instead points to R. And so I inserted R in between M and T with that 00:48:05.000 --> 00:48:06.000 adjustment. 00:48:06.000 --> 00:48:09.000 In the case where the new cell, 00:48:09.000 --> 00:48:11.000 let's say, was the A, 00:48:11.000 --> 00:48:13.000 then Kerr would be here 00:48:13.000 --> 00:48:17.000 and previous would be null. 00:48:17.000 --> 00:48:20.000 I will set A as next field to the Kerr, 00:48:20.000 --> 00:48:23.000 which spliced the outgoing pointer of A 00:48:23.000 --> 00:48:25.000 to attach to the rest of the list. 00:48:25.000 --> 00:48:29.000 But here's the special case. There is no previous next field that's gonna point to A. 00:48:29.000 --> 00:48:32.000 A actually needs to get reassigned to be the front most cell of the list. So it 00:48:32.000 --> 00:48:36.000 looks like our prepend operation where I'm saying list equals new one 00:48:36.000 --> 00:48:40.000 and the list is some reference parameter 00:48:40.000 --> 00:48:42.000 to some list elsewhere 00:48:42.000 --> 00:48:46.000 that then gets permanently set to the new cell. So I would call insert sorted 00:48:46.000 --> 00:48:49.000 from, like, the build address book function 00:48:49.000 --> 00:48:52.000 to attach it to the very front in this case. 00:48:52.000 --> 00:48:56.000 This is very common in linked list code, right? Is to have there be a bit of a special 00:48:56.000 --> 00:49:00.000 case about either the first cell or the last cell or on the empty list, right? But 00:49:00.000 --> 00:49:03.000 that there is a large 00:49:03.000 --> 00:49:06.000 - all the cells in the interior kind of behave the same. They have a cell in 00:49:06.000 --> 00:49:10.000 front of them and a cell in back of them and that insert has certain properties, but 00:49:10.000 --> 00:49:13.000 there's a little bit of an oddness with that maybe that first cell is sometimes the 00:49:13.000 --> 00:49:16.000 very last cell that requires a little special case handling. 00:49:16.000 --> 00:49:20.000 In this case, right? Putting A in the front means somebody doesn't point 00:49:20.000 --> 00:49:22.000 into A. It's the listhead that points to A 00:49:22.000 --> 00:49:24.000 and so it's not somebody's next field 00:49:24.000 --> 00:49:30.000 that we're updating. It really is the listhead pointer itself. 00:49:30.000 --> 00:49:32.000 Questions about that one? 00:49:32.000 --> 00:49:35.000 So I need to show you the recursive one. I'm not gonna be able to talk about it very much, but 00:49:35.000 --> 00:49:38.000 I'm gonna encourage you to take a look at it. 00:49:38.000 --> 00:49:40.000 Which is the 00:49:40.000 --> 00:49:41.000 00:49:41.000 --> 00:49:46.000 same activity of inserting a cell 00:49:46.000 --> 00:49:48.000 into a list that's being kept in sorted order. 00:49:48.000 --> 00:49:53.000 Now, not using this iterative, not using this dragging of our previous pointer, 00:49:53.000 --> 00:49:58.000 but instead using a strong dense recursive powerful formulation here 00:49:58.000 --> 00:50:01.000 that basically has the idea that says, well, 00:50:01.000 --> 00:50:03.000 given a list that I'm inserting into, 00:50:03.000 --> 00:50:06.000 let's say it's the M through Z part of the list. 00:50:06.000 --> 00:50:11.000 I'm trying to insert R is I compare R to the front of the list and I say, 00:50:11.000 --> 00:50:14.000 well, does that become the new front of this list? That's what my base case is 00:50:14.000 --> 00:50:18.000 looking at. Is the list empty or does this one go in front of it? If so, prepend it. 00:50:18.000 --> 00:50:19.000 Otherwise 00:50:19.000 --> 00:50:22.000 just insert it in the remainder of this list. 00:50:22.000 --> 00:50:24.000 So if it doesn't insert 00:50:24.000 --> 00:50:28.000 at the front of M then we pass the N or the O whatever is following it and 00:50:28.000 --> 00:50:31.000 have it kind of recursively insert into the 00:50:31.000 --> 00:50:34.000 remainder of the list. Eventually it will hit the space case 00:50:34.000 --> 00:50:38.000 when it either gets to the end of the list or when there is 00:50:38.000 --> 00:50:40.000 the cell we're trying to insert 00:50:40.000 --> 00:50:43.000 belongs in front of these remaining cells. In which case, we do a prepend right 00:50:43.000 --> 00:50:44.000 there 00:50:44.000 --> 00:50:46.000 and stop the recursion. 00:50:46.000 --> 00:50:49.000 A dense little piece of code. One I'll 00:50:49.000 --> 00:50:53.000 encourage you to kind of trace a little bit on your own to kind of 00:50:53.000 --> 00:50:56.000 get an idea of how it's working, but it kind of shows in this case that the recursion is 00:50:56.000 --> 00:50:57.000 really buying you something 00:50:57.000 --> 00:51:00.000 by making it, I think, a much simpler and more direct way to 00:51:00.000 --> 00:51:02.000 express what you want to do and then kind of the 00:51:02.000 --> 00:51:06.000 more mechanical means of doing it iteratively. 00:51:06.000 --> 00:51:08.000 What we will talk about on Monday 00:51:08.000 --> 00:51:12.000 will be algorithms. We'll talk about algorithms, Chapter 7. Start talking about 00:51:12.000 --> 00:51:16.000 Big O and formal announcements algorithms and do a little bit of sorting. So I will see 00:51:16.000 --> 00:51:18.000 you on Monday 00:51:18.000 --> 00:51:23.000 and have a good weekend.