This presentation is delivered by the Stanford Center for Professional Development.

Hi, there. It's Wednesday. We are coming to the end of this fine quarter. So what I'm going to do today is, I brought a cup filled with candy. We'll start with that because that's going to be helpful. What I'd like to do is just tie up a couple bits of loose ends, some things that I wanted to touch on that have come and gone this quarter, that I wanted to pull together and reach a bit of closure on. Then hopefully, if we have time, we'll move on to what happens after this. If you liked 106b, what do you do? If you didn't like 106b, what do you do? What kind of options and future opportunities are there after this ends?

I'm going to talk a little about some of the things that [inaudible], but I do want to talk about the final just for a second, because it is kind of a last opportunity for you to show us your mastery of the material and get the grade that you've been working hard for all quarter. It does carry a certain amount of weight: 30 percent of the weight in the default system is placed on the final, and more of it can move there if you have a better showing on the final than you did on the midterm. We're happy to move some of that weight over, so it could actually account for even more of your total grade. So there's a lot riding on it, and you want to be sure you come ready to show us the things that you know and that you have comprehension and mastery over.

To note, though, it is going to look a little bit different from the midterm in terms of its emphasis. We are going to see a lot more of the nitty-gritty dynamic data structures: linked lists, trees, graphs, pointer-based structures, with the inherent complications in dealing with that kind of dynamic structure. So bringing your best game to this exam is certainly going to pay off here. The coding that you will see will look a lot like what's in the assignments. If you've already had a chance to look at the practice exam, you'll see that there are themes there that echo the things you did on the assignments.
We want to try to make sure we're connecting to the effort you've put in all along. There will also be some opportunity to think a little bit at a higher level about making and analyzing choices for design and implementation. I'm going to talk a little bit about that today, because I feel like at any given moment we're talking about how we might implement a stack or how we might implement a queue. Sometimes it helps to step back a level and think a little bit about the bigger picture of when to use a stack or a queue, and how to make decisions in the large about those tradeoffs that are intelligent and grounded. So I'll do a little bit of that today, and some of that has been built up all quarter long.

It is open book and open notes. You can bring all your printouts, your readers, sections, all those sorts of things. For some people, that translates to the idea that, well, I don't need to prepare because I'll have all my stuff there; if I need to know something in the exam, I'll just go look it up. That's certainly true for looking up details: you want to remember how this works, or you had a piece of code that you think was similar, and it could help you brainstorm what you're trying to do this time. But you're not likely to have time to learn new things in the exam. So if there's something that you really feel you have not gotten your head around, the investment up front, before you get into the exam, is where to get that understanding in place, rather than trying, in the exam, to be flipping through chapter ten trying to learn something. That may not really play out.

I highly recommend practicing on paper in an exam-like environment, really working the way you will be working in the exam, to give yourself the best, consistent prep for what's happening there.

Then some people have asked about extra problems, just wanting more of them to work on. I'll give you a couple places where you can look for things. One is that CS106X is being offered this same quarter; it's the honors version of our class, so they have some slightly notched-up problems. But their practice and real midterms are posted on their website, as well as their practice final. They have very similar coverage.
They did some assignments that were like ours, some that were a little bit different, but they're still good places to just grab actual exam problems with their solutions. The chapter exercises and section problems often are old exam problems, or eventually became exam problems, so they kind of come and go in terms of being used as examples or used as exam problems. They're definitely places to pick up extra mileage.

Okay. Any questions about the final? Okay. Let me talk about this slide just for a second, because I feel like we have touched on these themes again and again, but maybe it's good to have at least one moment of trying to pull this together and think a little bit about the closure and the big picture of all these things.

A lot of what we were doing in the beginning was giving you these abstractions, these cool things to do stuff with: a map, a stack, a queue, a vector, and problems to solve where those abstractions have a role to play. Then we spent a long time saying, okay, now that it's our job to make those abstractions work, what techniques can we use? What algorithms and data structures will help us manipulate and model those structures in efficient ways?

We certainly spent a lot of time talking about runtime performance, maybe so much that you might think it was the only thing that really mattered. How fast can we make something? O(N) is better than O(N squared); O(log N) is better than O(N), driving toward these holy grails of bringing the time down. It's certainly a very important factor. Often the difference between something being linear and something being quadratic is a matter of it being tractable at all for the problem at hand. You cannot sort a million numbers with a quadratic sort in anything like the same time; for large enough data sets, it'd be completely infeasible. So it is important to have that big picture, to be thinking about it and know about those tradeoffs, to understand issues of scale here.

But we also did some stuff on the homework with actual empirical time results, which is another way to bolster our understanding.
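To make that concrete, here's the shape a little timing harness can take in standard C++. This is my sketch, not code from the course; doubling the input size and watching how the time grows is often enough to tell linear, N log N, and quadratic apart.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <vector>

// Time how long it takes to sort n random ints, in milliseconds.
// A rough harness: a careful one would repeat trials and average.
static double timeSortMillis(int n) {
    std::vector<int> v(n);
    for (int& x : v) x = std::rand();
    auto start = std::chrono::steady_clock::now();
    std::sort(v.begin(), v.end());  // library sort, O(n log n)
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main() {
    // Per doubling of n, an n log n sort should take a bit more than
    // twice as long; a quadratic sort would take about four times.
    for (int n = 100000; n <= 1600000; n *= 2) {
        std::cout << "n = " << n << ": " << timeSortMillis(n) << " ms\n";
    }
    return 0;
}
```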
Big-O tells us these things, but what really happens in real life matches it only roughly. Those constant factors we threw away, and other artifacts that happen in the real world, often give us new insights about things and the challenges that we're facing.

We also need to think about things like the mix of expected operations. Having some operations be slow as a consequence of making other operations fast is often a tradeoff we're willing to make. On the editor buffer, we drove toward getting all the operations to be fast. In an ideal world, that's certainly the best place to be, but it's also a matter of at what cost, in terms of code complexity and memory used and effort put into it. It may be that we're willing to tolerate some operations being less efficient as long as the most commonly used ones are very streamlined.

We talked about worst cases and degenerate cases: knowing when we can expect our algorithms to degrade, and whether we want to do something about that, or choose a different algorithm, or provide some protection against those kinds of cases coming through.

To a lesser extent, we also talked about memory use and how much overhead there is. Once we went to a pointer-based linked list structure, we added a four-byte overhead for every element. We went to an eight-byte overhead once we got to binary trees, and maybe even a little bit more if we had balance factors included. Those things ride along with our data, increasing its size and memory footprint. So there's a certain amount of overhead built into the system.

There's also this notion of excess capacity. To what extent do we over-allocate our data structures early, planning for future growth? In some sense we save ourselves the trouble of enlarging on demand: we prepare a large allotment and then use it up as we need it. Then, when we do exceed that capacity, we have to do it again. So where do we make those tradeoffs about how much excess capacity to keep and when to do the enlarging? It has a lot to do with what our general memory constraints are.
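As an aside, here's a minimal sketch of that enlarge-on-demand, doubling strategy. It's not the class library's actual Vector code, just the idea:

```cpp
// A stripped-down grow-on-demand buffer of ints. Excess capacity is
// allocated up front so most adds are cheap; the expensive copy
// happens only on the rare occasions capacity runs out.
// (Copy control is omitted for brevity.)
struct IntBuffer {
    int* data = nullptr;
    int size = 0;      // slots in use
    int capacity = 0;  // slots allocated

    void add(int value) {
        if (size == capacity) {
            int newCapacity = (capacity == 0) ? 8 : capacity * 2;
            int* bigger = new int[newCapacity];
            for (int i = 0; i < size; i++) bigger[i] = data[i];
            delete[] data;
            data = bigger;
            capacity = newCapacity;
        }
        data[size++] = value;
    }

    ~IntBuffer() { delete[] data; }
};
```

Doubling, rather than growing by one, is what keeps the average cost of an add constant, at the price of carrying up to half the array as unused slack.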
I would say that memory has gotten to be close to free in recent processors and computer systems. You don't really think about 100 bytes here, 200 bytes there, 1,000 bytes there. Even megabytes you can talk about as though they're a drop in the bucket. So this is something that isn't given as much weight nowadays as it was ten years ago, when there were a lot more constraints. But with the move toward a lot more embedded device programming, looking at the iPhone or cell phones or things like that, which don't have that same liberty to be wanton about memory, it's come back to being a little more careful and keeping an eye on it.

There's also an issue here which trades off a lot with runtime, which is redundancy versus sharing. Having two different ways to get at the same piece of data will often provide you with quick access. For example, if you have a card catalogue for a library where you can look up by author and look up by title, you're probably maintaining two data structures: one that manipulates the books sorted by title, one that manipulates them sorted by author. They're both looking at the same books in the end. So maybe there's a set of pointers in one map over here, indexed by author, and another keyed by title over there. That means I'm repeating all my data. Hopefully I'm sharing it with a pointer, so I'm not actually repeating all the data; I have two, or potentially more, pointers to the same book, depending on the different ways you can get to it.

That gives me fast access. I want to know all the books written by Leo Lionni? I can find them easily, as well as finding all the books that have King in the title. I don't have to do searches on a data set that's been optimized for only one access. It's definitely a tradeoff: more space thrown at it to get fast access in different ways.
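Here's a hypothetical sketch of that dual-index idea (the names are mine, not from the lecture). The two maps repeat only the pointers; the book records themselves are shared:

```cpp
#include <map>
#include <string>
#include <vector>

struct Book {
    std::string title;
    std::string author;
};

// Two indexes into one shared set of records. Each map gives fast,
// sorted lookup on its own key; neither copies the Book data itself.
struct Catalog {
    std::map<std::string, std::vector<Book*>> byAuthor;
    std::map<std::string, std::vector<Book*>> byTitle;

    void add(Book* b) {
        byAuthor[b->author].push_back(b);  // shared pointer, not a copy
        byTitle[b->title].push_back(b);
    }
};
```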
Then this last one is one that I mentioned along the way, but I think it's easy to overlook: getting very excited about how fast you can make it, how small you can make it, how fancy it is, has a real downside in terms of how hard it is to write. Typically, when you go to these fancier strategies that are space-efficient and time-efficient, you have to pay for it somewhere. It isn't for free; the easy, obvious code wouldn't do it. It tends to be complicated code that may use bit operations, that probably uses more pointers, that has a density to it which means it's going to take you longer to get right. It's going to take you more time to debug. It's going to be more likely to have lurking bugs. It's going to be harder to maintain. If somebody comes along and wants to add a new operation, they open up your code, and they're frightened away. It might be that that code will never get moved forward. Everybody will just open it and close it immediately and say, whatever, we'll make it work the way it is, rather than trying to improve it or move it forward, because it actually is kind of scary to look at.

So think about what's the simplest thing that could work. There's a movement that I think is pretty neat to think about: the idea of the agile programming methodology. It's based on the idea of not planning to build the most super-stellar infrastructure for solving all the world's problems. Say you're about to write a chess program. Actually, something even simpler: you're going to write a checkers program. You say, I'm going to sit down, and I'm going to design the pieces and the board. Then you think, hey, checkers is just one embodiment of this. Why don't I try to make the uber-board that can be used for strategic conquest or for chess or Chinese checkers? You could have these hexagonal things. You start building this crazy infrastructure that could handle all of those cases equally well, even though all you really needed was checkers, a very simple 2D grid. You imagine the future needs way in advance of having them, and then you design something overly complicated. Then what happens is you get bogged down in it. It never happens, and the project dies. You lead a lonely life by yourself, eating oatmeal. I like oatmeal.

So one of the mottos is: design the simplest thing that could possibly work. I think that's a neat gift to yourself, to realize that the simplest thing that could possibly work, that meets the needs you have, that has a decent enough [inaudible] and memory constraint, meets your needs in its simplest form.
It's going to be the easiest thing to get working and running. If you later decide you want to go fancy, you can swap it out using the principles of abstraction and [inaudible]. It should be independent enough to replace when you decide you need that fancier thing.

I've got a couple of questions here, because I thought this would help us frame a little bit of thinking about these things. These are questions that I'm hoping you can answer for me. That's why I brought a cup full of candy. I'm prepared to offer bribery to people who help me answer these questions.

So let's talk about array versus vector. In C++, an array is what's behind the scenes of a vector, and the vector adds convenience on the outside of it. Early on, I had said: once I tell you about arrays, tuck that away in the back of your memory and just use vector instead, because everything array does, vector does nicely and cleanly, and it adds all sorts of features. Somebody give me a list of things that vector does much better than array.

Way in the back. Bounds checking. What else? Come on, a Blow Pop is on the line. I need to interrupt your - help us out. You've got bounds checking. What else? More? I want more. You've got nothing else for me? I don't even want a Blow Pop. You don't want a Blow Pop? All right, I'm going to give your Blow Pop to somebody else. Right here. I'm going to give the Blow Pop to you. Tell me what makes vector good. It has operations that shuffle elements for you. Yeah, it does the insert and delete, moving stuff down. Also, it does the growing. You keep adding, and it just grows. Arrays don't do that. You want an array to grow, you grow it: you take the pointer, you copy stuff over, you delete the old one, you rearrange things. You do all the work. So vector buys you, kind of, array with benefits.

It seems, then, like the naive takeaway would be: just use vector. Always use vector. Vector's always better than array. Is there ever a situation where array is still a pretty valid choice?
What is it that array has that vector maybe doesn't? Well, when you add things dynamically to the vector, it actually doubles the size instead of increasing it by one. If you know the exact size, and you know it's not going to change, you'd use an array. That's it exactly, one of its big advantages. Let's say you have a very small, fixed size planned for. The example on the practice exam was a calendar: you have 365 days in a year, and you're keeping track of the events on a particular day. You know how many days are in the year. There's never going to be any change in that. Well - leap year. But you know that, and you know it in advance. So why fuss with a data structure that does all this growing and shrinking and has some overhead associated with that when, in fact, you know there's exactly this number? That's all you need. In fact, it's almost a little more convenient to set up: you can just declare it that size, and it's ready to go. With the vector, you'd have to go through and add all the elements and so on.

So there are a few situations where you might just say: I don't need the advantages. It adds some overhead that I don't want to pay, and I'm perfectly happy making it a small fixed array. You do still lose bounds checking, but if you are being careful, maybe you're comfortable making that tradeoff. By losing bounds checking, you actually get a little bit of speed back: the bounds check does cost you a little on every vector access that an array would not pay. That's really in the noise for most programs, and I hesitate to even mention it lest you think it matters, but it is part of the thinking of a professional programmer. Sometimes it may be the thing it comes down to: making your operations' constant factors streamlined enough for the operation you need.
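A minimal sketch of that fixed-size case (the names here are mine):

```cpp
#include <string>
#include <vector>

// The size is known up front and never changes, so a plain array is
// ready to go at declaration: no growing, no add-one-at-a-time setup.
const int kDaysInYear = 366;  // sized to cover leap years too

struct Calendar {
    // events[d] holds the events scheduled for day d of the year
    std::vector<std::string> events[kDaysInYear];
};
```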
So I put up here stack and queue versus vector, given that stack and queue are really just vectors in disguise: vectors with fewer features, vectors that can't do as much. One of the implementations for stack and queue could easily be layered on top of a vector. Why is it good to take away things?

[Inaudible] tampering with the contents of [inaudible]. Yeah. So it's enforcing discipline on how you use it. Stacks are LIFO, queues are FIFO. The way things go in and the way they go out are very rigid. By using a stack or a queue, you are guaranteeing that you won't go in and accidentally put something at the head of the line or remove something from the middle of the stack, because you just don't have access to that. It's been tidied up and packaged into stack, which has push and pop, and queue, which has enqueue and dequeue. No other access is implied or given; it's totally encapsulated from you. So it allows you to maintain a discipline that the vector otherwise didn't strongly provide.

Question? [Inaudible]. Another thing you could do is optimize a stack or a queue to be very good at the only activities you're doing, which are pushing and popping, or enqueuing and dequeuing. I think that's worth a Smartie, don't you? I think it is. Exactly. So by restricting what you have available, it actually gives you choices in the implementation that wouldn't have worked out as well given the full array of operations. Stack and queue only adding from one end means that things like a linked list, which isn't a good choice for implementing the general case of vector, are perfectly good choices for the stack and queue case. So giving ourselves some freedom as the implementer, based on information about how it's being used, can pay off.

It also matters when you read code. If you see code that's using something called a stack, or using a queue, you immediately know how it works. You can say: here's this search that's stuffing a bunch of things into a stack; I know they come out last in, first out [inaudible]. If you see code using a vector, you have to verify where things get put into the vector and where they get pulled out. The code doesn't tell you the same story the way it does when you see a word with a strong meaning to it, like stack and queue.
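Here's a minimal sketch of that layering, not the course library's actual implementation, just the idea that a thinner interface enforces the discipline:

```cpp
#include <vector>

// A stack layered on a vector. By exposing only push/pop/top, the
// interface guarantees LIFO use; nobody can touch the middle.
template <typename T>
class Stack {
public:
    void push(const T& value) { elems.push_back(value); }
    void pop() { elems.pop_back(); }          // caller checks isEmpty first
    const T& top() const { return elems.back(); }
    bool isEmpty() const { return elems.empty(); }

private:
    std::vector<T> elems;  // hidden: no outside access
};
```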
Generally, there are these decisions [inaudible] to make about sorting things or not sorting things. We spent a long time talking about sorting, and how various algorithms approach the sorting problem, and what they come up with. I have a question here: what sorting algorithm is best? Is there an answer to that question? Why would I ask a question that has no answer? It depends on what you expect it to be giving you. It depends - that's a great answer. It depends on what you expect it to give you. How big is the input? What are its contents? Is it random? Is it sorted? Is it almost sorted? Is it reverse sorted? Is it drawn from a range of one to 1,000 values, or is it drawn from one to three values? A huge array that has just ones, twos, and threes in it might necessitate a different approach than one that ranges from one to max int.

So there is no answer to that; it depends. What do you know? If you don't know anything, give it to someone else. If you really don't know, there are certain things that perform well in the average case and don't have degenerate behaviors. For example, [inaudible] is a great sort for the general case: an N log N sort that doesn't use any extra space the way merge sort does, and doesn't have any degenerate cases. For an unknown set of inputs, it works pretty well. If you happen to know a little bit about your data, say it's almost sorted, then even Barack Obama's not-favorite, bubble sort, has a chance of being in the running, because the data's already sorted.

Is it worth sorting? Calculating the payoff - we had a problem on one of the section exercises once that I think is a really good one to think through and revisit. The idea is that many operations are much more efficiently implemented if the data's sorted. You can think of them off the top of your head. Oh, you want the minimum? If it's sorted, it's very easy. You want the maximum? Very easy if it's sorted. You want to find the median? Also very easy if it's sorted. You want to find duplicates? Once they're sorted, they're all lined up together. You want to find the mode, which is the most frequent element? Also easier if it's sorted, because the repeats are all grouped together in a run, so you can do a linear pass, counting as you go, to find the longest run.
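To make that last one concrete, here's a minimal sketch of mode-on-sorted-data (mine, not from the lecture):

```cpp
#include <algorithm>
#include <vector>

// Once sorted, duplicates sit in consecutive runs, so one linear pass
// that tracks the longest run finds the mode. Assumes v is non-empty.
int mode(std::vector<int> v) {        // taken by value: we sort a copy
    std::sort(v.begin(), v.end());    // the O(n log n) investment
    int best = v[0], bestCount = 1, count = 1;
    for (size_t i = 1; i < v.size(); i++) {
        count = (v[i] == v[i - 1]) ? count + 1 : 1;
        if (count > bestCount) {
            bestCount = count;
            best = v[i];
        }
    }
    return best;
}
```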
If you had two arrays and you wanted to merge them, or intersect them to find their common elements, that's much easier if they're sorted. All these things that are hard to do when the data's in random order have much more streamlined algorithms once you invest in sorting.

So the problem was: how do you know it's worth doing that? If you only plan on doing one merge, and you could do it in N squared, is it worth sorting, at N log N, so you can get an O(N) merge? It has a lot to do with how often you plan on doing the merge. How big is your data set? What are your constraints on time? It's interesting to think about that relationship. It isn't free, but potentially it's an investment: by sorting it, you can do a lot of fast searches later.
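To put rough numbers on that (my arithmetic, not the lecture's): intersecting two unsorted arrays of a million elements each by brute force is on the order of N squared, about 10^12 comparisons. Sorting each first costs roughly N log N, around 2 x 10^7 steps per array, and the merge itself is then one linear pass of a couple million more. That's a gap of four or five orders of magnitude, so at that scale even a single merge repays the sort many times over; for a few dozen elements, the quadratic approach is fine and simpler to write.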
This next one is interesting because the two actually solve a lot of the same problems, and they differ in one small way: set versus sorted vector. The set is backed by a balanced binary search tree, which is a sorted data structure internally; compare that to a sorted vector. If your goal were to keep track of all the students at Stanford, and you wanted to be able to look people up by name, then the contains operation of the set and a binary search on the sorted vector are doing the same kind of halving, one working through a tree and the other through a vector. So we can make the searching operations, like contains and other lookup-based things, logarithmic in both cases.

So, given that I've used them interchangeably, is there something one can do that the other can't, or some advantage or disadvantage assumed in that decision? You don't have duplicates in a set, but you can have duplicates in your sorted vector. Yeah - the set has another concern in how it operates: you can't put duplicates in. So if you have two students with the same name, in the set you all get coalesced down to one, and your transcripts all get merged, to your advantage or not.

What else does set buy you that the sorted vector doesn't? How hard is it to edit a sorted vector? You want to put something new in there? You're shuffling. You want to take something out? You're shuffling. The movement to the pointer-based structure, the binary search tree behind the set, is what gives us editability of the data structure: the convenient logarithmic time to add and remove elements, using the pointer wiring to do it, that the vector doesn't have.

But that tradeoff may not matter if your situation is that you just don't do a lot of edits, which is actually a very common situation. The sorted vector happens to be a very underutilized, underappreciated arrangement for data, because when the data isn't taking many edits, a linear-time edit isn't a painful consequence. You're doing a lot of searches, and you're getting the binary search right where you care about it, with very little overhead: a little bit of excess capacity, but none of the pointer-based overhead that comes with the set. The set adds on another eight bytes, potentially even a little more, of overhead on every element to give us that editability that maybe isn't that important to us.

Both the set and the sorted vector let you access the elements in order: the iterator of the set goes in sorted order, and walking the vector from zero to the end does the same. I can browse them in sorted order, and I can search them quickly because they're sorted. Where they differ is: what does it take to edit one? The one that invested more memory got the faster editing.
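Here's a sketch of the sorted-vector side using the standard library's binary search tools:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Contains on a sorted vector: the same logarithmic halving a balanced
// binary search tree gives you, just over indexes instead of pointers.
bool contains(const std::vector<std::string>& sorted, const std::string& key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}

// Editing is where the sorted vector pays: the insert keeps things
// sorted, but shuffles every later element over -- linear time.
void insertInOrder(std::vector<std::string>& sorted, const std::string& key) {
    sorted.insert(std::lower_bound(sorted.begin(), sorted.end(), key), key);
}
```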
The P-queue - this one is kind of interesting. The P-queue is also a sorted structure, but not quite: it's a much more weakly sorted structure. If you're using the binary heap, like we did on the P-queue assignment, it gives you access to the maximum value, and then a quick reshuffling [inaudible] the next, and so on. But it doesn't really give you the full range. For example, if you were looking for the minimum in a P-queue, where is it? It's somewhere in the bottom. It's in the bottom-most level, but where across the bottom? No knowledge, right? It could be at any of the nodes along the bottom of the heap.

So in fact it gives you access to a partial ordering of the data, in this case relative to the prioritization, but not the full form of sorting the way a sorted vector has. The P-queue doesn't give you things like finding duplicates or finding the minimum or computing the mode; those aren't actually easy operations, because you have the data in heap form. I can get to the max quickly. So if I cared about pulling things out in sorted order to browse them, I could stuff them into a P-queue and then pull them out, but what's interesting about that is that it requires destroying the P-queue. The P-queue isn't really a place you store things and plan on using them again. It's the place you stick things in order to get them out in an order whose only value is in the dequeuing of them: finding the most promising option to look at in a search, or the highest-priority activity to take on, or something like that.

If I wanted to look at all the students in my class in sorted order, well, I could stick them in a P-queue and pull them out, but it's sort of funny: I have to put them in just to pull them back out. It's not a good way to store them if that browsing operation is going to be repeated. If I put them into a sorted vector, I can just keep iterating over it whenever I need to see it again.
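For reference, here's a minimal sketch of an array-packed binary max-heap in the spirit of the P-queue assignment (ints for simplicity; mine, not the assignment's code):

```cpp
#include <utility>
#include <vector>

// A binary max-heap packed into a vector: the children of the node at
// index i live at 2i+1 and 2i+2, so the shape is pure index arithmetic.
class MaxHeap {
public:
    void enqueue(int value) {
        elems.push_back(value);
        // Bubble the new value up while it beats its parent.
        size_t i = elems.size() - 1;
        while (i > 0 && elems[(i - 1) / 2] < elems[i]) {
            std::swap(elems[i], elems[(i - 1) / 2]);
            i = (i - 1) / 2;
        }
    }

    // The max is always at the root (assumes the heap is non-empty).
    int peekMax() const { return elems[0]; }

    // Note what's missing: the heap property says nothing about where
    // the minimum is, beyond "somewhere among the leaves" -- that's
    // the weak, partial ordering in action.

private:
    std::vector<int> elems;
};
```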
A lot of this comes down to having a pretty good visualization of what going to a pointer-based data structure buys you, and what it costs you, relative to contiguous memory. That's one of the tensions underneath most of these choices: putting more memory into it by virtue of having a pointer wire, as opposed to a location-based arrangement. There's no information stored to get you from array[0] to array[2]; they're just contiguous in memory. So that saves space, because you only have to know where the block starts and access contiguously from there. A pointer means there are more traversals, there's more memory, there's more opportunity to err, but it gives you flexibility. It breaks down that contiguous block, which is kind of a monolith, into pieces that are more flexible in how you can rearrange them.

Thinking about these things is an important skill to come away from 106b with: there's never one right answer. You're investing to solve one problem, and there are going to be downsides. There's no obvious triumph-over-all, no solution out there that's fast and cheap and small and easy to write all at once.

So I put up here - I had to whittle this down to the five or six things that, at the end of 106b, I want you to feel like, that's what I got. That's the heart of the goal. The MVPs, the most valuable players of the 106b curriculum. You look back at it and ask: where did I start, where did I end up, and how did my knowledge grow? I'm hoping these are the things that stand out for you as real growth areas.

One is this idea of abstraction. That was stressed early on. Here's a stack, here's a queue, here's a set; they do things for you. We don't know how they work yet, but you can make cool things happen with them. So you solve maze problems and random writers and do neat things with dictionaries without knowing how they work. You're leveraging existing material without knowledge of its internals. It helps keep the complexity down, and you can solve much more interesting problems than you could without them.

Try to go back and imagine the Markov random writer problem without the map in play. You'd have to figure out, given this character, how to go look it up and find a place to store it, and then store the character in some other array that was growing on demand as you put stuff in it. You would never have completed that program. You would still be writing it today, with pointer errors and whatnot behind the scenes.
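For contrast, here's a hypothetical sketch of the heart of that table when you do have a map (the names are mine):

```cpp
#include <map>
#include <string>
#include <vector>

// The core of a Markov random-writer table: map each seed string to
// every character that followed it in the source text. The map does
// all the lookup, insertion, and growth you'd otherwise hand-build.
using MarkovModel = std::map<std::string, std::vector<char>>;

void recordFollower(MarkovModel& model, const std::string& seed, char next) {
    model[seed].push_back(next);  // entry springs into existence on first use
}
```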
Having those tools already there is like having the microwave and the food processor when you're about to cook, instead of doing all your work with a dull knife and a fire. You actually have some real power there.

Recursion was a big theme in the weeks that followed: how to solve problems that have this self-similar structure, looking at exhaustive traversals and choice problems that can be solved with backtracking. Having this idea of how to think about things [inaudible] using recursive problem solving, and how to model that process on the computer, was a very important thing that keeps coming back.

We looked at algorithm analysis, this sloppy math that computer scientists are so fond of, talking about scalability and growth, knowing the curves and the relationships, and how to look at algorithms. There's quite a bit more formalism to this than we really went into; if you go on in CS, you'll see there's a lot of rigor that can be applied in that area. We were mostly building an intuition for how to gauge algorithms and their growth.

Then the back side of the course was all about implementation strategies and tradeoffs. Now that we've benefited from those abstractions and solved some cool problems, what's it like to be the person building them? What are the techniques? What does it take to wrestle a pointer, a linked list, a tree, a heap, or a graph into doing something really cool for you? How does that inform your intuition about what to expect from it, and about the coding complexity inherent in achieving that result?

And hopefully, continuing a theme that 106a started you on, there's this appreciation for [inaudible]. In the end, getting code that works but is ugly just isn't, hopefully, what you're satisfied with.
You want to approach problem solving so as to produce beautiful, elegant, unified, clean, readable code: code that you would be happy to show other people, be proud of, maintain, and live with; the kind of work that the people you work with would appreciate being involved with.

Then there are things that are not so much the most valuable points, but that came along the way: the idea of pointers, and C++. C++ and its pointers - they're one and the same. We chose C++ as the language, so as a result we get to learn some C++, and because it's so heavily invested in pointers, we also have to get some practice with pointers to get our jobs done. I think of those as being off to the side. I don't think of them as the platform of things I worship; they're things we need to get the job done.

I did put a little note about pointers here, because there were quite a lot of comments on the mid-quarter evals that said: pointers, pointers still scare me; they make me crazy; I don't know enough about pointers. Here's a fact for you: you'll probably never feel like you know enough about pointers. Today, tomorrow, ten years from now, you'll still feel like there's some real subtlety and trickiness in there. They're an extremely powerful facility that gives you direct control over allocation and deallocation and all the wiring and sharing that's important to building these complicated data structures. But there are so many pitfalls. I'm sure you've run into most of these, if not all of them, at some point: you mishandle a pointer, fail to allocate or deallocate, and the way it gets reported, whether the compile-time or runtime error you see is helpful in sorting it out, is a real cause of long hours disappearing in the name of mastering this.
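A minimal tour of those classic pitfalls, with the bad lines commented out so the sketch still compiles (mine, not from the lecture):

```cpp
void pointerPitfalls() {
    int* p = new int(42);
    delete p;        // fine: allocation matched with deallocation
    // *p = 7;       // dangling pointer: that memory was already freed
    // delete p;     // double delete: undefined behavior

    int* q;          // uninitialized pointer
    // *q = 7;       // writing through garbage: a crash if you're lucky
    (void)q;         // (only to silence the unused-variable warning)

    int* a = new int[100];
    delete[] a;      // every new[] needs a delete[], or it leaks
}
```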
If you're still wary of pointers, you probably should be. Think of this as your first go-around; we're still at the journeyman stage of programming here. Although there were pointers in 106a, they were very hidden. This is the first time you really see them, are exposed to them, and have to get your head around them. If you go on to 107 and 108, you'll get more and more practice. It will become more solid as you keep working your way through it, but at this stage it's okay and commonplace to still feel a little bit - you see the star, and you do a double take. That's just where you should be. It's like in Chinese restaurants: they put the little red star by the food you want to be careful of. That's why they use the star for pointers.

I also wanted to zoom out farther, to a decade from now, when you're thinking back on the whole of your education and what you learned. Do you really want to remember how to type the keywords of the C++ generic template facility? No. If you code in C++ every day, you'll know it. If you don't, you can forget it; it's long gone. What I hope you'll carry away with you is a couple of concepts that broaden outside of CS. One is this idea of algorithmic thinking: how you approach something logically and stepwise, draw pictures, analyze it to understand the cases, and debug when things aren't working well. I think that's a skill that transcends the art of computer programming into other domains: any kind of logical thinking, problem solving, and analysis in that style.

I also hope that some of the experimental things we did, with the sort lab and the P-queue and some of the big-O work we tried to do in class together, help you do back-of-the-envelope calculations. This is a pet issue of mine. I think, in the era of computers and whatnot, it's really easy to get distanced from numbers and start thinking: I just punch numbers into formulas and one comes out; it must be the truth. You end up with no intuition as to whether that's the right number, whether it makes sense, whether it's grounded, and whether you're actually [inaudible] or so off because you made an entry error and didn't even notice.
00:34:17.000 --> 00:34:20.000 So becoming comfortable with this idea of using numbers to drive things, 00:34:20.000 --> 00:34:21.000 being able to 00:34:21.000 --> 00:34:26.000 take some time trials, do some predictions, do some more time trials, 00:34:26.000 --> 00:34:28.000 match those things up, do a little sketching of your numbers and see how the 00:34:28.000 --> 00:34:32.000 things are working out. So being comfortable having the math and the theory 00:34:32.000 --> 00:34:34.000 reinforce each other 00:34:34.000 --> 00:34:34.000 and 00:34:34.000 --> 00:34:36.000 not getting too 00:34:36.000 --> 00:34:39.000 distanced from the fact that those numbers do play out in 00:34:39.000 --> 00:34:42.000 the real world in ways that are interesting. Don't just let the numbers themselves 00:34:42.000 --> 00:34:44.000 tell the story. There really is more 00:34:44.000 --> 00:34:47.000 to connect it up with. 00:34:47.000 --> 00:34:49.000 Then there's this idea of tradeoffs. 00:34:49.000 --> 00:34:51.000 You may not remember 00:34:51.000 --> 00:34:54.000 how you can write a linked list or how the hash table does 00:34:54.000 --> 00:34:56.000 the things it does 00:34:56.000 --> 00:35:00.000 and how to get one working, but hopefully you will take away this idea that 00:35:00.000 --> 00:35:03.000 the answer to most questions begins with the phrase, it depends. 00:35:03.000 --> 00:35:05.000 If I say, is it better to use a hash table or a binary search tree, the answer 00:35:05.000 --> 00:35:09.000 will be, it depends. There's no, 00:35:09.000 --> 00:35:12.000 it's always better to use this sort, or, it's always 00:35:12.000 --> 00:35:14.000 wrong to use this approach. 00:35:14.000 --> 00:35:15.000 It's, it depends. Do 00:35:15.000 --> 00:35:18.000 you have the code written for that, and it works well, and you don't care how 00:35:18.000 --> 00:35:19.000 long it takes? 00:35:19.000 --> 00:35:23.000 Then bubble sort actually could be the right answer, 00:35:23.000 --> 00:35:25.000 Barack Obama notwithstanding. 00:35:25.000 --> 00:35:27.000 But knowing whether memory is 00:35:27.000 --> 00:35:32.000 at a premium, whether time is at a premium, what your mix of operations is, 00:35:32.000 --> 00:35:33.000 what 00:35:33.000 --> 00:35:34.000 language you're working with, what 00:35:34.000 --> 00:35:38.000 programming expertise you have, and what timeline you have for developing it all 00:35:38.000 --> 00:35:42.000 can play a role in deciding what choices to make and what strategy to pursue. 00:35:42.000 --> 00:35:45.000 So be very comfortable with this idea that it's about gathering information 00:35:45.000 --> 00:35:46.000 before you 00:35:46.000 --> 00:35:49.000 can commit to one strategy, as opposed to assuming it's always going to be 00:35:49.000 --> 00:35:52.000 this or that. 00:35:52.000 --> 00:35:54.000 So that's the 106b thing. 00:35:54.000 --> 00:35:57.000 I have a couple of slides here on 00:35:57.000 --> 00:35:59.000 things that happen after 106b. I'm 00:35:59.000 --> 00:36:01.000 just going to go ahead and show you. 00:36:01.000 --> 00:36:03.000 00:36:03.000 --> 00:36:06.000 This would be a great time to ask questions, in particular if there's anything you're 00:36:06.000 --> 00:36:07.000 curious about. 00:36:07.000 --> 00:36:10.000 Where do we go from here, and 00:36:10.000 --> 00:36:13.000 how do we make good decisions about things that follow? The 00:36:13.000 --> 00:36:17.000 most obvious follow-on courses are in CS. If you took 106b 00:36:17.000 --> 00:36:19.000 and you hated it, I'm very sorry.
00:36:19.000 --> 00:36:23.000 Hopefully 00:36:23.000 --> 00:36:25.000 the scarring 00:36:25.000 --> 00:36:26.000 will heal over and you'll lead a 00:36:26.000 --> 00:36:28.000 productive and happy life 00:36:28.000 --> 00:36:30.000 elsewhere. If you did like it and you think, 00:36:30.000 --> 00:36:33.000 I'm kind of interested in this, what do I do next to explore further? 00:36:33.000 --> 00:36:35.000 Where else can I go with this? 00:36:35.000 --> 00:36:38.000 I put up the intro 00:36:38.000 --> 00:36:40.000 courses with their numbers and names, and I'll talk a little bit 00:36:40.000 --> 00:36:44.000 about what each of these does to give you an idea of what opportunity it 00:36:44.000 --> 00:36:45.000 offers to you. 00:36:45.000 --> 00:36:49.000 The obvious follow-on is on the programming side. 106a and b 00:36:49.000 --> 00:36:51.000 are programming courses, building your programming mastery. 00:36:51.000 --> 00:36:54.000 107 follows in the same vein. We typically call those systems 00:36:54.000 --> 00:36:56.000 courses in CS nomenclature. 00:36:56.000 --> 00:36:59.000 That means practical, hands-on programming, getting more exposure 00:36:59.000 --> 00:37:00.000 to the system. 00:37:00.000 --> 00:37:05.000 So 107 builds on 106b's knowledge in a class called Programming Paradigms. 00:37:05.000 --> 00:37:08.000 Part of what it's trying to explore is different languages. 00:37:08.000 --> 00:37:11.000 So you actually look at the language C, kind of stepping back to old school. 00:37:11.000 --> 00:37:15.000 You look at the language Scheme, which is a functional language that 00:37:15.000 --> 00:37:17.000 was developed for AI purposes 00:37:17.000 --> 00:37:17.000 and 00:37:17.000 --> 00:37:21.000 has a long history coming from the math side of things. You'll probably look at some 00:37:21.000 --> 00:37:25.000 scripting and flexible languages like Python. We kind of mix them up a 00:37:25.000 --> 00:37:28.000 little bit, so it's not exactly the same things every time, but you're looking at other 00:37:28.000 --> 00:37:30.000 languages, other ways of doing things. 00:37:30.000 --> 00:37:33.000 The other component of 107 is also looking at 00:37:33.000 --> 00:37:35.000 some more low-level things. 00:37:35.000 --> 00:37:37.000 What happens when you 00:37:37.000 --> 00:37:41.000 pass your code off to the compiler? What is it doing? What kind of analysis does it do of your 00:37:41.000 --> 00:37:44.000 code? How does it actually generate machine code? What happens in the machine layer? 00:37:44.000 --> 00:37:47.000 Understanding some more of the performance implications 00:37:47.000 --> 00:37:50.000 and how things are expressed. Getting a better understanding of pointers actually 00:37:50.000 --> 00:37:53.000 comes up because there is a certain amount of low-level coding that's being 00:37:53.000 --> 00:37:54.000 explored in this class. 00:37:54.000 --> 00:37:58.000 So it builds on the programming skill as we move toward a Unix, 00:37:58.000 --> 00:38:00.000 Linux-based 00:38:00.000 --> 00:38:02.000 environment, so away from the Mac and Windows. 00:38:02.000 --> 00:38:03.000 00:38:03.000 --> 00:38:08.000 You just continue building larger programs with complicated 00:38:08.000 --> 00:38:10.000 features to grow your mastery. 00:38:10.000 --> 00:38:13.000 Internally, we call this class programming maturity, even though 00:38:13.000 --> 00:38:14.000 that's not its title.
00:38:14.000 --> 00:38:18.000 We see it that way in the sequence in terms of just growing you into 00:38:18.000 --> 00:38:19.000 a 00:38:19.000 --> 00:38:22.000 more versatile programmer, seeing different languages, different things, 00:38:22.000 --> 00:38:25.000 and getting more understanding of what happens under the hood. 00:38:25.000 --> 00:38:29.000 It is a five-unit class. It's typically offered fall and spring. Most 00:38:29.000 --> 00:38:30.000 people think it is 00:38:30.000 --> 00:38:33.000 about as much work as 106b. Some find it a 00:38:33.000 --> 00:38:37.000 little more or a little less in different parts of the quarter, depending on what your 00:38:37.000 --> 00:38:38.000 background is and 00:38:38.000 --> 00:38:39.000 00:38:39.000 --> 00:38:41.000 how 106b worked for you. 00:38:41.000 --> 00:38:44.000 On the whole, I think most people think of it as comparable. Would you guys 00:38:44.000 --> 00:38:45.000 say 107 is 00:38:45.000 --> 00:38:48.000 about as hard as 106b? 00:38:48.000 --> 00:38:51.000 Some ways, it's harder. Some ways, it's not. Maybe that's 00:38:51.000 --> 00:38:55.000 the deal. Then there's the 108 class, which follows on from 107, though that 00:38:55.000 --> 00:38:57.000 prerequisite is not a particularly strong one. It's a Java class. 00:38:57.000 --> 00:39:00.000 So it moves in the other direction. Rather than moving down in abstraction, it moves 00:39:00.000 --> 00:39:02.000 up in abstraction, 00:39:02.000 --> 00:39:05.000 looking at more large-scale design, modern programming technique, 00:39:05.000 --> 00:39:06.000 using 00:39:06.000 --> 00:39:10.000 object-oriented facilities, thinking about large-scale programming. You do a 00:39:10.000 --> 00:39:11.000 big team project at the 00:39:11.000 --> 00:39:13.000 end, a three-week, 00:39:13.000 --> 00:39:15.000 three-person, massive effort 00:39:15.000 --> 00:39:17.000 that is the first 00:39:17.000 --> 00:39:26.000 really big-scale thing that you'll be exposed to in the lower-division 00:39:26.000 --> 00:39:30.000 curriculum. [Inaudible]. Sometimes there's a 107l, which has some relationship to 00:39:30.000 --> 00:39:33.000 106l in the way that it's more hands-on, more 00:39:33.000 --> 00:39:34.000 C++. 00:39:34.000 --> 00:39:37.000 I don't know whether it will be offered this spring or not, 00:39:37.000 --> 00:39:38.000 but 00:39:38.000 --> 00:39:41.000 if you look in Axess, it will tell you at least whether we have it tentatively on 00:39:41.000 --> 00:39:42.000 the schedule. 00:39:42.000 --> 00:39:46.000 I think that comes and goes with the interest from the students and the 00:39:46.000 --> 00:39:51.000 availability of someone to teach 00:39:51.000 --> 00:39:52.000 it. 00:39:52.000 --> 00:39:55.000 [Inaudible] can we watch [inaudible]? 00:39:55.000 --> 00:39:58.000 Yeah, 107 and 108 are offered on TV, usually 00:39:58.000 --> 00:40:02.000 once a year. 107 is offered on TV this spring. The one in fall is 00:40:02.000 --> 00:40:03.000 not taped. 00:40:03.000 --> 00:40:06.000 It might be that the old lectures from spring are somewhere that 00:40:06.000 --> 00:40:09.000 you could dig them out, but I actually don't happen to know for sure. 00:40:09.000 --> 00:40:11.000 108 is on TV, I think, 00:40:11.000 --> 00:40:13.000 in the fall, not in the winter. So the fall lectures are probably still around for 00:40:13.000 --> 00:40:16.000 108, so you could take a look at them.
You 00:40:16.000 --> 00:40:19.000 could certainly look at their websites, and 00:40:19.000 --> 00:40:23.000 there's often a lot of old materials and handouts 00:40:23.000 --> 00:40:26.000 and stuff that gives you at least some idea of things that have been 00:40:26.000 --> 00:40:29.000 covered in the most recent offering of it. 00:40:29.000 --> 00:40:34.000 Certainly showing up in the first week is one way to get an 00:40:34.000 --> 00:40:36.000 idea of what you're in for. 00:40:36.000 --> 00:40:38.000 108 is a four-unit class. 00:40:38.000 --> 00:40:41.000 Because it's in Java, there are certain things about Java, 00:40:41.000 --> 00:40:44.000 the safety and 00:40:44.000 --> 00:40:48.000 the attentiveness of the compiler, that make the error handling a little 00:40:48.000 --> 00:40:52.000 bit easier. So I don't think most people think of 108 as quite as intense as 00:40:52.000 --> 00:40:55.000 107 or 106b, but that project at the end has quite a reputation 00:40:55.000 --> 00:40:58.000 as being a 00:40:58.000 --> 00:41:02.000 real time suck. So it has lots to do with staying 00:41:02.000 --> 00:41:04.000 scheduled and motivated and working through that team project at 00:41:04.000 --> 00:41:05.000 the end. 00:41:05.000 --> 00:41:09.000 I think that's where most people find the big challenge of 00:41:09.000 --> 00:41:12.000 108. Then on the math side, in the 00:41:12.000 --> 00:41:15.000 nomenclature for CS, we have this idea of theory, introducing you to more of 00:41:15.000 --> 00:41:16.000 00:41:16.000 --> 00:41:19.000 the formalism for big O and discrete structures and doing proofs and 00:41:19.000 --> 00:41:21.000 thinking about logic. 00:41:21.000 --> 00:41:25.000 So we have 103A and B and a 103X, which is a combined, 00:41:25.000 --> 00:41:28.000 accelerated form of A and B. These serve as math classes 00:41:28.000 --> 00:41:29.000 introducing 00:41:29.000 --> 00:41:33.000 the kind of mathematics that's important for computer scientists. 00:41:33.000 --> 00:41:34.000 So 00:41:34.000 --> 00:41:38.000 for people who are thinking about a CS major, that's another good course to get in 00:41:38.000 --> 00:41:39.000 early. Pretty 00:41:39.000 --> 00:41:43.000 much everything in the upper division of the CS major has 00:41:43.000 --> 00:41:47.000 these things layered in as prereqs. So these are the intro courses that serve as the 00:41:47.000 --> 00:41:50.000 gateway to prepping you to go on and then looking at things in a more topical 00:41:50.000 --> 00:41:51.000 way. 00:41:51.000 --> 00:41:53.000 Looking at networks or security, 00:41:53.000 --> 00:41:54.000 HCI or 00:41:54.000 --> 00:41:59.000 artificial intelligence layers on top of a grounding in theory and programming 00:41:59.000 --> 00:42:03.000 that came from the intro courses. So 00:42:03.000 --> 00:42:07.000 that said, we're in the 00:42:07.000 --> 00:42:08.000 midst of a really 00:42:08.000 --> 00:42:10.000 serious curriculum revision, 00:42:10.000 --> 00:42:13.000 which is 00:42:13.000 --> 00:42:15.000 on the table and being designed right now. 00:42:15.000 --> 00:42:18.000 It was voted on by our faculty and approved and has to go 00:42:18.000 --> 00:42:22.000 through the School of Engineering and get approval. Then there's going to be a little bit of 00:42:22.000 --> 00:42:23.000 jostling as everything gets worked out. 00:42:23.000 --> 00:42:27.000 But starting next year, you're going to see some changes in the 00:42:27.000 --> 00:42:29.000 department.
Some of these courses are going to 00:42:29.000 --> 00:42:33.000 get morphed, and 00:42:33.000 --> 00:42:36.000 they won't be exactly the way they are today. 00:42:36.000 --> 00:42:39.000 As a Stanford undergrad, you actually have the option of graduating under any set of 00:42:39.000 --> 00:42:42.000 requirements that are in play any time you're an undergrad. So you could graduate under this year's 00:42:42.000 --> 00:42:43.000 requirements, 00:42:43.000 --> 00:42:45.000 or you could wait. 00:42:45.000 --> 00:42:48.000 I think for most, the right answer's going to be to wait, because I think the new 00:42:48.000 --> 00:42:51.000 arrangement is very much more student-friendly. 00:42:51.000 --> 00:42:53.000 It has more flexibility 00:42:53.000 --> 00:42:57.000 in the program. If you look at the CS program as it is, it has a lot of pick 00:42:57.000 --> 00:43:00.000 one of two, pick one of three, pick one of two, 00:43:00.000 --> 00:43:04.000 where you're very constrained on what you can choose. The new program is going to be 00:43:04.000 --> 00:43:07.000 more track-based, where you pick an area of interest. You say, I'm really interested in 00:43:07.000 --> 00:43:08.000 graphics. You do a little 00:43:08.000 --> 00:43:12.000 depth concentration in that and then just have room for a 00:43:12.000 --> 00:43:15.000 very wide selection of electives to fill up the program, as opposed to 00:43:15.000 --> 00:43:19.000 a smorgasbord where we force you to pick through a bunch of different areas. 00:43:19.000 --> 00:43:24.000 So for most students, I think it gives you flexibility that actually gives you 00:43:24.000 --> 00:43:25.000 the options 00:43:25.000 --> 00:43:26.000 to 00:43:26.000 --> 00:43:30.000 pick the classes that are most interesting to you and get the material that you want. It's a growing 00:43:30.000 --> 00:43:32.000 recognition of the fact that CS 00:43:32.000 --> 00:43:35.000 has really broadened as a field. Some say that our major looks the same way 00:43:35.000 --> 00:43:37.000 it did ten years ago, 00:43:37.000 --> 00:43:40.000 a classic definition of, here are these 00:43:40.000 --> 00:43:42.000 seven things that all matter. The 00:43:42.000 --> 00:43:45.000 things aren't seven anymore. There are now 15 of them, 25 of them. If 00:43:45.000 --> 00:43:49.000 you look at the spectrum of things that computer science is now 00:43:49.000 --> 00:43:52.000 embracing, we can't say you have to take one of all these things without keeping you here 00:43:52.000 --> 00:43:53.000 for eight years. 00:43:53.000 --> 00:43:55.000 So we are 00:43:55.000 --> 00:43:58.000 allowing that our field's footprint has gotten so large that we need to 00:43:58.000 --> 00:44:02.000 let you pick and choose a little bit to get the slice that seems most 00:44:02.000 --> 00:44:03.000 appropriate for where you're headed. 00:44:03.000 --> 00:44:06.000 So you'll see more about that. 00:44:06.000 --> 00:44:10.000 If you are thinking about being a CS major 00:44:10.000 --> 00:44:13.000 and have not yet declared, 00:44:13.000 --> 00:44:17.000 I would advise 00:44:17.000 --> 00:44:22.000 that you add yourself to what's called the Considering CS list, 00:44:22.000 --> 00:44:26.000 and you can get to this from the CS course advisor's page, 00:44:26.000 --> 00:44:27.000 which, 00:44:27.000 --> 00:44:30.000 off the top of my head, I think 00:44:30.000 --> 00:44:32.000 might be 00:44:32.000 --> 00:44:36.000 CSadvising, but I'm not positive.
I'll put the link on 00:44:36.000 --> 00:44:39.000 our webpage when I remember what it is. 00:44:39.000 --> 00:44:42.000 It is a mailing list. So we have a mailing list of the declared undergrads, but we 00:44:42.000 --> 00:44:45.000 also have a list for people who are 00:44:45.000 --> 00:44:48.000 at least potentially interested in the major and just want to be kept up to date on 00:44:48.000 --> 00:44:49.000 things that are happening. 00:44:49.000 --> 00:44:52.000 There's going to be a lot of activity on this list in the 00:44:52.000 --> 00:44:55.000 upcoming months as we publish the information about the new 00:44:55.000 --> 00:44:56.000 curriculum. 00:44:56.000 --> 00:44:57.000 So 00:44:57.000 --> 00:45:00.000 you'll know what's coming down the pipe so you can make good decisions for 00:45:00.000 --> 00:45:01.000 preparing for 00:45:01.000 --> 00:45:04.000 what will be [inaudible], what the transition plan will be as we move into the new 00:45:04.000 --> 00:45:05.000 courses, 00:45:05.000 --> 00:45:06.000 and how to make sure you 00:45:06.000 --> 00:45:09.000 land 00:45:09.000 --> 00:45:13.000 in the right place when all is said 00:45:13.000 --> 00:45:14.000 and done. Then I put 00:45:14.000 --> 00:45:15.000 a little note about 00:45:15.000 --> 00:45:17.000 other majors that exist. 00:45:17.000 --> 00:45:19.000 CS is 00:45:19.000 --> 00:45:20.000 00:45:20.000 --> 00:45:22.000 the home base for people who are 00:45:22.000 --> 00:45:25.000 solving problems using computers. It spans a large variety of people 00:45:25.000 --> 00:45:27.000 doing a lot of different things. 00:45:27.000 --> 00:45:30.000 Then there are a couple of other majors that we participate in, in the interdisciplinary 00:45:30.000 --> 00:45:32.000 sense, 00:45:32.000 --> 00:45:34.000 where CS is part of a larger 00:45:34.000 --> 00:45:37.000 context for doing things. There's Computer Systems Engineering, which is 00:45:37.000 --> 00:45:40.000 between us and EE and is a little bit more about hardware. 00:45:40.000 --> 00:45:43.000 There's the Symbolic Systems program, which is 00:45:43.000 --> 00:45:47.000 run by Linguistics and includes cooperation from psychology, philosophy, communication, and 00:45:47.000 --> 00:45:51.000 CS, looking at artificial intelligence and modeling human 00:45:51.000 --> 00:45:52.000 thought and 00:45:52.000 --> 00:45:53.000 understanding 00:45:53.000 --> 00:45:55.000 00:45:55.000 --> 00:45:57.000 intelligence as expressed in that way. 00:45:57.000 --> 00:45:59.000 Then there's a Mathematical and Computational Science program that's housed 00:45:59.000 --> 00:46:03.000 in the Statistics department and is joint between Math, Computer Science, and 00:46:03.000 --> 00:46:06.000 MS&E. It's more of an applied math degree, 00:46:06.000 --> 00:46:08.000 taking mathematical thinking and 00:46:08.000 --> 00:46:09.000 00:46:09.000 --> 00:46:12.000 using computers and statistical tools to analyze and 00:46:12.000 --> 00:46:16.000 think about things. A lot of financial people end up 00:46:16.000 --> 00:46:20.000 being drawn to that particular one, or people who are doing risk analysis or 00:46:20.000 --> 00:46:21.000 that sort of 00:46:21.000 --> 00:46:24.000 combination of things. CS has a role to play in all of these because it's a 00:46:24.000 --> 00:46:27.000 great tool for solving these kinds of problems. So there are 00:46:27.000 --> 00:46:29.000 different ways to get at 00:46:29.000 --> 00:46:30.000 different pieces 00:46:30.000 --> 00:46:33.000 of the major.
00:46:33.000 --> 00:46:35.000 I've put up a couple of notes on things that 00:46:35.000 --> 00:46:38.000 I think are worth 00:46:38.000 --> 00:46:39.000 looking into 00:46:39.000 --> 00:46:40.000 in case they, 00:46:40.000 --> 00:46:43.000 at some point, are the right thing for you to look at. Section leading, 00:46:43.000 --> 00:46:45.000 so joining the 106 staff 00:46:45.000 --> 00:46:46.000 and shepherding the next generation 00:46:46.000 --> 00:46:49.000 of students through the program, 00:46:49.000 --> 00:46:52.000 is a great way to strengthen your own knowledge. There's nothing like 00:46:52.000 --> 00:46:54.000 trying to explain recursion to someone else who 00:46:54.000 --> 00:46:55.000 doesn't understand it to make sure you understand it yourself. 00:46:55.000 --> 00:46:56.000 It 00:46:56.000 --> 00:46:59.000 can really do a lot for 00:46:59.000 --> 00:47:01.000 your personal connection to the craft, 00:47:01.000 --> 00:47:04.000 as well as just being involved in a great community. The section leaders are a really 00:47:04.000 --> 00:47:05.000 wonderful, 00:47:05.000 --> 00:47:08.000 dedicated group of students who are 00:47:08.000 --> 00:47:08.000 00:47:08.000 --> 00:47:12.000 fun to be around and great people to make friends with. So I 00:47:12.000 --> 00:47:14.000 highly encourage you to take a look at that. 00:47:14.000 --> 00:47:16.000 We interview for that every quarter, 00:47:16.000 --> 00:47:17.000 so usually about week six, 00:47:17.000 --> 00:47:19.000 we put out 00:47:19.000 --> 00:47:21.000 applications and bring people in for interviews. We're always looking for new people 00:47:21.000 --> 00:47:24.000 to join on a quarterly basis. 00:47:24.000 --> 00:47:27.000 There are research opportunities all around the campus. One that the CS 00:47:27.000 --> 00:47:30.000 department sponsors in particular is their [inaudible] summer institute, 00:47:30.000 --> 00:47:33.000 matching professors with undergrads to do work in the summer for 00:47:33.000 --> 00:47:35.000 slave wages. 00:47:35.000 --> 00:47:38.000 But it's a great opportunity. You can get started on 00:47:38.000 --> 00:47:40.000 thinking about how research might be part of your future. 00:47:40.000 --> 00:47:43.000 It's one of the best preparations for thinking about grad school. 00:47:43.000 --> 00:47:46.000 We do have an honors program. 00:47:46.000 --> 00:47:50.000 Like most honors programs, it's a chance to stake out an independent project and 00:47:50.000 --> 00:47:50.000 show your mettle. 00:47:50.000 --> 00:47:53.000 We have a co-terminal master's program 00:47:53.000 --> 00:47:54.000 where you can 00:47:54.000 --> 00:47:57.000 get some of the depth 00:47:57.000 --> 00:47:58.000 before you leave this place. 00:47:58.000 --> 00:48:01.000 Not to mention, there are a million fabulous 00:48:01.000 --> 00:48:03.000 things to do at Stanford, getting involved with 00:48:03.000 --> 00:48:06.000 arts or nonprofit organizations or sports or student leadership 00:48:06.000 --> 00:48:07.000 or whatever. 00:48:07.000 --> 00:48:10.000 Maybe the most important memory I have from my own undergrad is that the 00:48:10.000 --> 00:48:13.000 trickiest part of it was learning what to say no to. 00:48:13.000 --> 00:48:14.000 Realizing that there are 00:48:14.000 --> 00:48:17.000 10,000 fabulous, worthwhile opportunities, 00:48:17.000 --> 00:48:21.000 and doing all of them only means sacrificing your sanity and 00:48:21.000 --> 00:48:23.000 the quality of the things you do.
00:48:23.000 --> 00:48:24.000 00:48:24.000 --> 00:48:29.000 So be careful and thoughtful about choosing the things you do and embracing them. 00:48:29.000 --> 00:48:32.000 Be okay with saying, I won't do this. Even though it's great to go overseas, and I 00:48:32.000 --> 00:48:35.000 would've liked that opportunity, it's not going to fit, and I'm okay with that. Don't 00:48:35.000 --> 00:48:38.000 beat yourself up over it and cause 00:48:38.000 --> 00:48:41.000 yourself too much grief trying to do too many things. 00:48:41.000 --> 00:48:44.000 I won't be here on Friday. Keith is going to come. He's going to teach you about C++ in the real 00:48:44.000 --> 00:48:45.000 world. 00:48:45.000 --> 00:48:47.000 I'm going to see you guys Friday at the final. 00:48:47.000 --> 00:48:57.000 I think that's all I need to say for today.