This presentation is delivered by the Stanford Center for Professional Development.

Hi, there. It's Wednesday. We are coming to the end of this fine quarter. So what I'm going to do today is, I brought a cup filled with candy. We'll start with that because that's going to be helpful. What I'd like to do is just tie up a couple bits of loose ends, some things that I wanted to touch on that have come and gone this quarter, that I wanted to pull together and reach a bit of closure on. Then hopefully, if we have time, we'll move on to what happens after this. If you liked 106b, what do you do? If you didn't like 106b, what do you do? What kind of options and future opportunities are there after this ends?

I'm going to talk a little about some of the things that [inaudible], but I do want to talk about the final just for a second, because it is kind of a last opportunity for you to show us your mastery of the material and get the grade that you've been working hard for all quarter. It does carry a certain amount of weight: 30 percent of the weight in the default system is placed on the final, and more of it can move there if you have a better showing on the final than you did on the midterm. We're happy to move some of that weight over, so it could actually account for even more of your total grade. So there's a lot riding on it, and you want to be sure you come ready to show us the things that you know and that you have comprehension and mastery over.

To note, though, it is going to look a little bit different from the midterm in terms of its emphasis. We are going to see a lot more of the nitty-gritty dynamic data structures: linked lists, trees, graphs, pointer-based structures, with the inherent complications in dealing with that kind of dynamic structure. So bringing your best game to this exam is certainly going to pay off here. The coding that you will see will look a lot like what's in the assignments. If you've already had a chance to look at the practice exam, you'll see that there are themes there that echo the things you did on the assignments.
We want to try to make sure we're connecting to the effort you've put in all along. There will also be some opportunity to think a little bit at a higher level about making and analyzing choices for design and implementation. I'm going to talk a little bit about that today, because I feel like at any given moment we're talking about how we might implement a stack or how we might implement a queue. Sometimes it helps to step back a level and think a little bit about the bigger picture of when to use a stack or a queue, and how to make decisions in the large about those tradeoffs that are intelligent and grounded. So I'll do a little bit of that today, and some of that has been built up all quarter long.

It is open book and open notes. You can bring all your printouts, your readers, sections, all those sorts of things. For some people, that translates to the idea that, well, I don't need to prepare because I'll have all my stuff there; if I need to know something in the exam, I'll just go look it up. That's certainly true for looking up details: you want to remember how this works, or you had a piece of code that you think was similar, and it could help you brainstorm what you're trying to do this time. But you're not likely to have time to learn new things in the exam. So if there's something that you really feel you have not gotten your head around, the investment up front, before you get into the exam, is where to get that understanding in place, rather than trying, in the exam, to be flipping through chapter ten trying to learn something. That may not really play out.

I highly recommend practicing on paper in an exam-like environment, really working the way you will be working in the exam, to give yourself the best, consistent prep for what's happening there.

Then some people have asked about extra problems, just wanting more of them to work on. I'll give you a couple places where you can look for things. One is that CS106X is being offered this same quarter; it's the honors version of our class, so they have some slightly notched-up problems. But their practice and real midterms are posted on their website, as well as their practice final. They have very similar coverage.
They did some assignments that were like ours, some that were a little bit different, but they're still good places to just grab actual exam problems with their solutions. The chapter exercises and section problems often are old exam problems, or eventually became exam problems, so they kind of come and go in terms of being used as examples or used as exam problems. They're definitely places to pick up extra mileage.

Okay. Any questions about the final? Okay. Let me talk about this slide just for a second, because I feel like we have touched on these themes again and again, but maybe it's good to have at least one moment of trying to pull this together and think a little bit about the closure and the big picture of all these things.

A lot of what we were doing in the beginning was giving you these abstractions, these cool things to do stuff with: a map, a stack, a queue, a vector, and problems to solve where those abstractions have a role to play. Then we spent a long time saying, okay, now that it's our job to make those abstractions work, what techniques can we use? What algorithms and data structures will help us manipulate and model those structures in efficient ways?

We certainly spent a lot of time talking about runtime performance, maybe so much that you might think it was the only thing that really mattered. How fast can we make something? O(N) is better than O(N squared); O(log N) is better than O(N), driving toward these holy grails of bringing the time down. It's certainly a very important factor. Often the difference between something being linear and something being quadratic is a matter of it being tractable at all for the problem at hand. You cannot sort a million numbers with a quadratic sort in anything like the same time; for large enough data sets, it'd be completely infeasible. So it is important to have that big picture, to be thinking about it and know about those tradeoffs, to understand issues of scale here.

But we also did some stuff on the homework with actual empirical time results, which is another way to bolster our understanding.
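To make that concrete, here's the shape a little timing harness can take in standard C++. This is my sketch, not code from the course; doubling the input size and watching how the time grows is often enough to tell linear, N log N, and quadratic apart.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <vector>

// Time how long it takes to sort n random ints, in milliseconds.
// A rough harness: a careful one would repeat trials and average.
static double timeSortMillis(int n) {
    std::vector<int> v(n);
    for (int& x : v) x = std::rand();
    auto start = std::chrono::steady_clock::now();
    std::sort(v.begin(), v.end());  // library sort, O(n log n)
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main() {
    // Per doubling of n, an n log n sort should take a bit more than
    // twice as long; a quadratic sort would take about four times.
    for (int n = 100000; n <= 1600000; n *= 2) {
        std::cout << "n = " << n << ": " << timeSortMillis(n) << " ms\n";
    }
    return 0;
}
```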
Big-O tells us these things, but what really happens in real life matches it only roughly. Those constant factors we threw away, and other artifacts that happen in the real world, often give us new insights about things and the challenges that we're facing.

We also need to think about things like the mix of expected operations. Having some operations be slow as a consequence of making other operations fast is often a tradeoff we're willing to make. On the editor buffer, we drove toward getting all the operations to be fast. In an ideal world, that's certainly the best place to be, but it's also a matter of at what cost, in terms of code complexity and memory used and effort put into it. It may be that we're willing to tolerate some operations being less efficient as long as the most commonly used ones are very streamlined.

We talked about worst cases and degenerate cases: knowing when we can expect our algorithms to degrade, and whether we want to do something about that, or choose a different algorithm, or provide some protection against those kinds of cases coming through.

To a lesser extent, we also talked about memory use and how much overhead there is. Once we went to a pointer-based linked list structure, we added a four-byte overhead for every element. We went to an eight-byte overhead once we got to binary trees, and maybe even a little bit more if we had balance factors included. Those things ride along with our data, increasing its size and memory footprint. So there's a certain amount of overhead built into the system.

There's also this notion of excess capacity. To what extent do we over-allocate our data structures early, planning for future growth? In some sense we save ourselves the trouble of enlarging on demand: we prepare a large allotment and then use it up as we need it. Then, when we do exceed that capacity, we have to do it again. So where do we make those tradeoffs about how much excess capacity to keep and when to do the enlarging? It has a lot to do with what our general memory constraints are.
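As an aside, here's a minimal sketch of that enlarge-on-demand, doubling strategy. It's not the class library's actual Vector code, just the idea:

```cpp
// A stripped-down grow-on-demand buffer of ints. Excess capacity is
// allocated up front so most adds are cheap; the expensive copy
// happens only on the rare occasions capacity runs out.
// (Copy control is omitted for brevity.)
struct IntBuffer {
    int* data = nullptr;
    int size = 0;      // slots in use
    int capacity = 0;  // slots allocated

    void add(int value) {
        if (size == capacity) {
            int newCapacity = (capacity == 0) ? 8 : capacity * 2;
            int* bigger = new int[newCapacity];
            for (int i = 0; i < size; i++) bigger[i] = data[i];
            delete[] data;
            data = bigger;
            capacity = newCapacity;
        }
        data[size++] = value;
    }

    ~IntBuffer() { delete[] data; }
};
```

Doubling, rather than growing by one, is what keeps the average cost of an add constant, at the price of carrying up to half the array as unused slack.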
I would say that memory has gotten to be close to free in recent processors and computer systems. You don't really think about 100 bytes here, 200 bytes there, 1,000 bytes there. Even megabytes you can talk about as though they're a drop in the bucket. So this is something that isn't given as much weight nowadays as it was ten years ago, when there were a lot more constraints. But with the move toward a lot more embedded device programming, looking at the iPhone or cell phones or things like that, which don't have that same liberty to be wanton about memory, it's come back to being a little more careful and keeping an eye on it.

There's also an issue here which trades off a lot with runtime, which is redundancy versus sharing. Having two different ways to get at the same piece of data will often provide you with quick access. For example, if you have a card catalogue for a library where you can look up by author and look up by title, you're probably maintaining two data structures: one that manipulates the books sorted by title, one that manipulates them sorted by author. They're both looking at the same books in the end. So maybe there's a set of pointers in one map over here, indexed by author, and another keyed by title over there. That means I'm repeating all my data. Hopefully I'm sharing it with a pointer, so I'm not actually repeating all the data; I have two, or potentially more, pointers to the same book, depending on the different ways you can get to it.

That gives me fast access. I want to know all the books written by Leo Lionni? I can find them easily, as well as finding all the books that have King in the title. I don't have to do searches on a data set that's been optimized for only one access. It's definitely a tradeoff: more space thrown at it to get fast access in different ways.
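Here's a hypothetical sketch of that dual-index idea (the names are mine, not from the lecture). The two maps repeat only the pointers; the book records themselves are shared:

```cpp
#include <map>
#include <string>
#include <vector>

struct Book {
    std::string title;
    std::string author;
};

// Two indexes into one shared set of records. Each map gives fast,
// sorted lookup on its own key; neither copies the Book data itself.
struct Catalog {
    std::map<std::string, std::vector<Book*>> byAuthor;
    std::map<std::string, std::vector<Book*>> byTitle;

    void add(Book* b) {
        byAuthor[b->author].push_back(b);  // shared pointer, not a copy
        byTitle[b->title].push_back(b);
    }
};
```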
Then this last one is one that I mentioned along the way, but I think it's easy to overlook: getting very excited about how fast you can make it, how small you can make it, how fancy it is, has a real downside in terms of how hard it is to write. Typically, when you go to these fancier strategies that are space-efficient and time-efficient, you have to pay for it somewhere. It isn't for free; the easy, obvious code wouldn't do it. It tends to be complicated code that may use bit operations, that probably uses more pointers, that has a density to it which means it's going to take you longer to get right. It's going to take you more time to debug. It's going to be more likely to have lurking bugs. It's going to be harder to maintain. If somebody comes along and wants to add a new operation, they open up your code, and they're frightened away. It might be that that code will never get moved forward. Everybody will just open it and close it immediately and say, whatever, we'll make it work the way it is, rather than trying to improve it or move it forward, because it actually is kind of scary to look at.

So think about what's the simplest thing that could work. There's a movement that I think is pretty neat to think about: the idea of the agile programming methodology. It's based on the idea of not planning to build the most super-stellar infrastructure for solving all the world's problems. Say you're about to write a chess program. Actually, something even simpler: you're going to write a checkers program. You say, I'm going to sit down, and I'm going to design the pieces and the board. Then you think, hey, checkers is just one embodiment of this. Why don't I try to make the uber-board that can be used for strategic conquest or for chess or Chinese checkers? You could have these hexagonal things. You start building this crazy infrastructure that could handle all of those cases equally well, even though all you really needed was checkers, a very simple 2D grid. You imagine the future needs way in advance of having them, and then you design something overly complicated. Then what happens is you get bogged down in it. It never happens, and the project dies. You lead a lonely life by yourself, eating oatmeal. I like oatmeal.

So one of the mottos is: design the simplest thing that could possibly work. I think that's a neat gift to yourself, to realize that the simplest thing that could possibly work, that meets the needs you have, that has a decent enough [inaudible] and memory constraint, meets your needs in its simplest form.
It's going to be the easiest thing to get working and running. If you later decide you want to go fancy, you can swap it out using the principles of abstraction and [inaudible]. It should be independent enough to replace when you decide you need that fancier thing.

I've got a couple of questions here, because I thought this would help us frame a little bit of thinking about these things. These are questions that I'm hoping you can answer for me. That's why I brought a cup full of candy. I'm prepared to offer bribery to people who help me answer these questions.

So let's talk about array versus vector. In C++, an array is what's behind the scenes of a vector, and the vector adds convenience on the outside of it. Early on, I had said: once I tell you about arrays, tuck that away in the back of your memory and just use vector instead, because everything array does, vector does nicely and cleanly, and it adds all sorts of features. Somebody give me a list of things that vector does much better than array.

Way in the back. Bounds checking. What else? Come on, a Blow Pop is on the line. I need to interrupt your - help us out. You've got bounds checking. What else? More? I want more. You've got nothing else for me? I don't even want a Blow Pop. You don't want a Blow Pop? All right, I'm going to give your Blow Pop to somebody else. Right here. I'm going to give the Blow Pop to you. Tell me what makes vector good. It has operations that shuffle elements for you. Yeah, it does the insert and delete, moving stuff down. Also, it does the growing. You keep adding, and it just grows. Arrays don't do that. You want an array to grow, you grow it: you take the pointer, you copy stuff over, you delete the old one, you rearrange things. You do all the work. So vector buys you, kind of, array with benefits.

It seems, then, like the naive takeaway would be: just use vector. Always use vector. Vector's always better than array. Is there ever a situation where array is still a pretty valid choice?
What is it that array has that vector maybe doesn't? Well, when you add things dynamically to the vector, it actually doubles the size instead of increasing it by one. If you know the exact size, and you know it's not going to change, you'd use an array. That's it exactly, one of its big advantages. Let's say you have a very small, fixed size planned for. The example on the practice exam was a calendar: you have 365 days in a year, and you're keeping track of the events on a particular day. You know how many days are in the year. There's never going to be any change in that. Well - leap year. But you know that, and you know it in advance. So why fuss with a data structure that does all this growing and shrinking and has some overhead associated with that when, in fact, you know there's exactly this number? That's all you need. In fact, it's almost a little more convenient to set up: you can just declare it that size, and it's ready to go. With the vector, you'd have to go through and add all the elements and so on.

So there are a few situations where you might just say: I don't need the advantages. It adds some overhead that I don't want to pay, and I'm perfectly happy making it a small fixed array. You do still lose bounds checking, but if you are being careful, maybe you're comfortable making that tradeoff. By losing bounds checking, you actually get a little bit of speed back: the bounds check does cost you a little on every vector access that an array would not pay. That's really in the noise for most programs, and I hesitate to even mention it lest you think it matters, but it is part of the thinking of a professional programmer. Sometimes it may be the thing it comes down to: making your operations' constant factors streamlined enough for the operation you need.
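A minimal sketch of that fixed-size case (the names here are mine):

```cpp
#include <string>
#include <vector>

// The size is known up front and never changes, so a plain array is
// ready to go at declaration: no growing, no add-one-at-a-time setup.
const int kDaysInYear = 366;  // sized to cover leap years too

struct Calendar {
    // events[d] holds the events scheduled for day d of the year
    std::vector<std::string> events[kDaysInYear];
};
```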
So I put up here stack and queue versus vector, given that stack and queue are really just vectors in disguise: vectors with fewer features, vectors that can't do as much. One of the implementations for stack and queue could easily be layered on top of a vector. Why is it good to take away things?

[Inaudible] tampering with the contents of [inaudible]. Yeah. So it's enforcing discipline on how you use it. Stacks are LIFO, queues are FIFO. The way things go in and the way they go out are very rigid. By using a stack or a queue, you are guaranteeing that you won't go in and accidentally put something at the head of the line or remove something from the middle of the stack, because you just don't have access to that. It's been tidied up and packaged into stack, which has push and pop, and queue, which has enqueue and dequeue. No other access is implied or given; it's totally encapsulated from you. So it allows you to maintain a discipline that the vector otherwise didn't strongly provide.

Question? [Inaudible]. Another thing you could do is optimize a stack or a queue to be very good at the only activities you're doing, which are pushing and popping, or enqueuing and dequeuing. I think that's worth a Smartie, don't you? I think it is. Exactly. So by restricting what you have available, it actually gives you choices in the implementation that wouldn't have worked out as well given the full array of operations. Stack and queue only adding from one end means that things like a linked list, which isn't a good choice for implementing the general case of vector, are perfectly good choices for the stack and queue case. So giving ourselves some freedom as the implementer, based on information about how it's being used, can pay off.

It also matters when you read code. If you see code that's using something called a stack, or using a queue, you immediately know how it works. You can say: here's this search that's stuffing a bunch of things into a stack; I know they come out last in, first out [inaudible]. If you see code using a vector, you have to verify where things get put into the vector and where they get pulled out. The code doesn't tell you the same story the way it does when you see a word with a strong meaning to it, like stack and queue.
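Here's a minimal sketch of that layering, not the course library's actual implementation, just the idea that a thinner interface enforces the discipline:

```cpp
#include <vector>

// A stack layered on a vector. By exposing only push/pop/top, the
// interface guarantees LIFO use; nobody can touch the middle.
template <typename T>
class Stack {
public:
    void push(const T& value) { elems.push_back(value); }
    void pop() { elems.pop_back(); }          // caller checks isEmpty first
    const T& top() const { return elems.back(); }
    bool isEmpty() const { return elems.empty(); }

private:
    std::vector<T> elems;  // hidden: no outside access
};
```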
Generally, there are these decisions [inaudible] to make about sorting things or not sorting things. We spent a long time talking about sorting, and how various algorithms approach the sorting problem, and what they come up with. I have a question here: what sorting algorithm is best? Is there an answer to that question? Why would I ask a question that has no answer? It depends on what you expect it to be giving you. It depends - that's a great answer. It depends on what you expect it to give you. How big is the input? What are its contents? Is it random? Is it sorted? Is it almost sorted? Is it reverse sorted? Is it drawn from a range of one to 1,000 values, or is it drawn from one to three values? A huge array that has just ones, twos, and threes in it might necessitate a different approach than one that ranges from one to max int.

So there is no answer to that; it depends. What do you know? If you don't know anything, give it to someone else. If you really don't know, there are certain things that perform well in the average case and don't have degenerate behaviors. For example, [inaudible] is a great sort for the general case: an N log N sort that doesn't use any extra space the way merge sort does, and doesn't have any degenerate cases. For an unknown set of inputs, it works pretty well. If you happen to know a little bit about your data, say it's almost sorted, then even Barack Obama's not-favorite, bubble sort, has a chance of being in the running, because the data's already sorted.

Is it worth sorting? Calculating the payoff - we had a problem on one of the section exercises once that I think is a really good one to think through and revisit. The idea is that many operations are much more efficiently implemented if the data's sorted. You can think of them off the top of your head. Oh, you want the minimum? If it's sorted, it's very easy. You want the maximum? Very easy if it's sorted. You want to find the median? Also very easy if it's sorted. You want to find duplicates? Once they're sorted, they're all lined up together. You want to find the mode, which is the most frequent element? Also easier if it's sorted, because the repeats are all grouped together in a run, so you can do a linear pass, counting as you go, to find the longest run.
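To make that last one concrete, here's a minimal sketch of mode-on-sorted-data (mine, not from the lecture):

```cpp
#include <algorithm>
#include <vector>

// Once sorted, duplicates sit in consecutive runs, so one linear pass
// that tracks the longest run finds the mode. Assumes v is non-empty.
int mode(std::vector<int> v) {        // taken by value: we sort a copy
    std::sort(v.begin(), v.end());    // the O(n log n) investment
    int best = v[0], bestCount = 1, count = 1;
    for (size_t i = 1; i < v.size(); i++) {
        count = (v[i] == v[i - 1]) ? count + 1 : 1;
        if (count > bestCount) {
            bestCount = count;
            best = v[i];
        }
    }
    return best;
}
```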
If you had two arrays and you wanted to merge them, or intersect them to find their common elements, that's much easier if they're sorted. All these things that are hard to do when the data's in random order have much more streamlined algorithms once you invest in sorting.

So the problem was: how do you know it's worth doing that? If you only plan on doing one merge, and you could do it in N squared, is it worth sorting, at N log N, so you can get an O(N) merge? It has a lot to do with how often you plan on doing the merge. How big is your data set? What are your constraints on time? It's interesting to think about that relationship. It isn't free, but potentially it's an investment: by sorting it, you can do a lot of fast searches later.
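To put rough numbers on that (my arithmetic, not the lecture's): intersecting two unsorted arrays of a million elements each by brute force is on the order of N squared, about 10^12 comparisons. Sorting each first costs roughly N log N, around 2 x 10^7 steps per array, and the merge itself is then one linear pass of a couple million more. That's a gap of four or five orders of magnitude, so at that scale even a single merge repays the sort many times over; for a few dozen elements, the quadratic approach is fine and simpler to write.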
This next one is interesting because the two actually solve a lot of the same problems, and they differ in one small way: set versus sorted vector. The set is backed by a balanced binary search tree, which is a sorted data structure internally; compare that to a sorted vector. If your goal were to keep track of all the students at Stanford, and you wanted to be able to look people up by name, then the contains operation of the set and a binary search on the sorted vector are doing the same kind of halving, one working through a tree and the other through a vector. So we can make the searching operations, like contains and other lookup-based things, logarithmic in both cases.

So, given that I've used them interchangeably, is there something one can do that the other can't, or some advantage or disadvantage assumed in that decision? You don't have duplicates in a set, but you can have duplicates in your sorted vector. Yeah - the set has another concern in how it operates: you can't put duplicates in. So if you have two students with the same name, in the set you all get coalesced down to one, and your transcripts all get merged, to your advantage or not.

What else does set buy you that the sorted vector doesn't? How hard is it to edit a sorted vector? You want to put something new in there? You're shuffling. You want to take something out? You're shuffling. The movement to the pointer-based structure, the binary search tree behind the set, is what gives us editability of the data structure: the convenient logarithmic time to add and remove elements, using the pointer wiring to do it, that the vector doesn't have.

But that tradeoff may not matter if your situation is that you just don't do a lot of edits, which is actually a very common situation. The sorted vector happens to be a very underutilized, underappreciated arrangement for data, because when the data isn't taking many edits, a linear-time edit isn't a painful consequence. You're doing a lot of searches, and you're getting the binary search right where you care about it, with very little overhead: a little bit of excess capacity, but none of the pointer-based overhead that comes with the set. The set adds on another eight bytes, potentially even a little more, of overhead on every element to give us that editability that maybe isn't that important to us.

Both the set and the sorted vector let you access the elements in order: the iterator of the set goes in sorted order, and walking the vector from zero to the end does the same. I can browse them in sorted order, and I can search them quickly because they're sorted. Where they differ is: what does it take to edit one? The one that invested more memory got the faster editing.
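Here's a sketch of the sorted-vector side using the standard library's binary search tools:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Contains on a sorted vector: the same logarithmic halving a balanced
// binary search tree gives you, just over indexes instead of pointers.
bool contains(const std::vector<std::string>& sorted, const std::string& key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}

// Editing is where the sorted vector pays: the insert keeps things
// sorted, but shuffles every later element over -- linear time.
void insertInOrder(std::vector<std::string>& sorted, const std::string& key) {
    sorted.insert(std::lower_bound(sorted.begin(), sorted.end(), key), key);
}
```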
The P-queue - this one is kind of interesting. The P-queue is also a sorted structure, but not quite: it's a much more weakly sorted structure. If you're using the binary heap, like we did on the P-queue assignment, it gives you access to the maximum value, and then a quick reshuffling [inaudible] the next, and so on. But it doesn't really give you the full range. For example, if you were looking for the minimum in a P-queue, where is it? It's somewhere in the bottom. It's in the bottom-most level, but where across the bottom? No knowledge, right? It could be at any of the nodes along the bottom of the heap.

So in fact it gives you access to a partial ordering of the data, in this case relative to the prioritization, but not the full form of sorting the way a sorted vector has. The P-queue doesn't give you things like finding duplicates or finding the minimum or computing the mode; those aren't actually easy operations, because you have the data in heap form. I can get to the max quickly. So if I cared about pulling things out in sorted order to browse them, I could stuff them into a P-queue and then pull them out, but what's interesting about that is that it requires destroying the P-queue. The P-queue isn't really a place you store things and plan on using them again. It's the place you stick things in order to get them out in an order whose only value is in the dequeuing of them: finding the most promising option to look at in a search, or the highest-priority activity to take on, or something like that.

If I wanted to look at all the students in my class in sorted order, well, I could stick them in a P-queue and pull them out, but it's sort of funny: I have to put them in just to pull them back out. It's not a good way to store them if that browsing operation is going to be repeated. If I put them into a sorted vector, I can just keep iterating over it whenever I need to see it again.
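For reference, here's a minimal sketch of an array-packed binary max-heap in the spirit of the P-queue assignment (ints for simplicity; mine, not the assignment's code):

```cpp
#include <utility>
#include <vector>

// A binary max-heap packed into a vector: the children of the node at
// index i live at 2i+1 and 2i+2, so the shape is pure index arithmetic.
class MaxHeap {
public:
    void enqueue(int value) {
        elems.push_back(value);
        // Bubble the new value up while it beats its parent.
        size_t i = elems.size() - 1;
        while (i > 0 && elems[(i - 1) / 2] < elems[i]) {
            std::swap(elems[i], elems[(i - 1) / 2]);
            i = (i - 1) / 2;
        }
    }

    // The max is always at the root (assumes the heap is non-empty).
    int peekMax() const { return elems[0]; }

    // Note what's missing: the heap property says nothing about where
    // the minimum is, beyond "somewhere among the leaves" -- that's
    // the weak, partial ordering in action.

private:
    std::vector<int> elems;
};
```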
A lot of this comes down to having a pretty good visualization of what going to a pointer-based data structure buys you, and what it costs you, relative to contiguous memory. That's one of the tensions underneath most of these choices: putting more memory into it by virtue of having a pointer wire, as opposed to a location-based arrangement. There's no information stored to get you from array[0] to array[2]; they're just contiguous in memory. So that saves space, because you only have to know where the block starts and access contiguously from there. A pointer means there are more traversals, there's more memory, there's more opportunity to err, but it gives you flexibility. It breaks down that contiguous block, which is kind of a monolith, into pieces that are more flexible in how you can rearrange them.

Thinking about these things is an important skill to come away from 106b with: there's never one right answer. You're investing to solve one problem, and there are going to be downsides. There's no obvious triumph-over-all, no solution out there that's fast and cheap and small and easy to write all at once.

So I put up here - I had to whittle this down to the five or six things that, at the end of 106b, I want you to feel like, that's what I got. That's the heart of the goal. The MVPs, the most valuable players of the 106b curriculum. You look back at it and ask: where did I start, where did I end up, and how did my knowledge grow? I'm hoping these are the things that stand out for you as real growth areas.

One is this idea of abstraction. That was stressed early on. Here's a stack, here's a queue, here's a set; they do things for you. We don't know how they work yet, but you can make cool things happen with them. So you solve maze problems and random writers and do neat things with dictionaries without knowing how they work. You're leveraging existing material without knowledge of its internals. It helps keep the complexity down, and you can solve much more interesting problems than you could without them.

Try to go back and imagine the Markov random writer problem without the map in play. You'd have to figure out, given this character, how to go look it up and find a place to store it, and then store the character in some other array that was growing on demand as you put stuff in it. You would never have completed that program. You would still be writing it today, with pointer errors and whatnot behind the scenes.
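For contrast, here's a hypothetical sketch of the heart of that table when you do have a map (the names are mine):

```cpp
#include <map>
#include <string>
#include <vector>

// The core of a Markov random-writer table: map each seed string to
// every character that followed it in the source text. The map does
// all the lookup, insertion, and growth you'd otherwise hand-build.
using MarkovModel = std::map<std::string, std::vector<char>>;

void recordFollower(MarkovModel& model, const std::string& seed, char next) {
    model[seed].push_back(next);  // entry springs into existence on first use
}
```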
Having those tools already there is like having the microwave and the food processor when you're about to cook, instead of doing all your work with a dull knife and a fire. You actually have some real power there.

Recursion was a big theme in the weeks that followed: how to solve problems that have this self-similar structure, looking at exhaustive traversals and choice problems that can be solved with backtracking. Having this idea of how to think about things [inaudible] using recursive problem solving, and how to model that process on the computer, was a very important thing that keeps coming back.

We looked at algorithm analysis, this sloppy math that computer scientists are so fond of, talking about scalability and growth, knowing the curves and the relationships, and how to look at algorithms. There's quite a bit more formalism to this than we really went into; if you go on in CS, you'll see there's a lot of rigor that can be applied in that area. We were mostly building an intuition for how to gauge algorithms and their growth.

Then the back side of the course was all about implementation strategies and tradeoffs. Now that we've benefited from those abstractions and solved some cool problems, what's it like to be the person building them? What are the techniques? What does it take to wrestle a pointer, a linked list, a tree, a heap, or a graph into doing something really cool for you? How does that inform your intuition about what to expect from it, and about the coding complexity inherent in achieving that result?

And hopefully, continuing a theme that 106a started you on, there's this appreciation for [inaudible]. In the end, getting code that works but is ugly just isn't, hopefully, what you're satisfied with.
You want to approach problem solving so as to produce beautiful, elegant, unified, clean, readable code: code that you would be happy to show other people, be proud of, maintain, and live with; the kind of work that the people you work with would appreciate being involved with.

Then there are things that are not so much the most valuable points, but that came along the way: the idea of pointers, and C++. C++ and its pointers - they're one and the same. We chose C++ as the language, so as a result we get to learn some C++, and because it's so heavily invested in pointers, we also have to get some practice with pointers to get our jobs done. I think of those as being off to the side. I don't think of them as the platform of things I worship; they're things we need to get the job done.

I did put a little note about pointers here, because there were quite a lot of comments on the mid-quarter evals that said: pointers, pointers still scare me; they make me crazy; I don't know enough about pointers. Here's a fact for you: you'll probably never feel like you know enough about pointers. Today, tomorrow, ten years from now, you'll still feel like there's some real subtlety and trickiness in there. They're an extremely powerful facility that gives you direct control over allocation and deallocation and all the wiring and sharing that's important to building these complicated data structures. But there are so many pitfalls. I'm sure you've run into most of these, if not all of them, at some point: you mishandle a pointer, fail to allocate or deallocate, and the way it gets reported, whether the compile-time or runtime error you see is helpful in sorting it out, is a real cause of long hours disappearing in the name of mastering this.
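A minimal tour of those classic pitfalls, with the bad lines commented out so the sketch still compiles (mine, not from the lecture):

```cpp
void pointerPitfalls() {
    int* p = new int(42);
    delete p;        // fine: allocation matched with deallocation
    // *p = 7;       // dangling pointer: that memory was already freed
    // delete p;     // double delete: undefined behavior

    int* q;          // uninitialized pointer
    // *q = 7;       // writing through garbage: a crash if you're lucky
    (void)q;         // (only to silence the unused-variable warning)

    int* a = new int[100];
    delete[] a;      // every new[] needs a delete[], or it leaks
}
```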
If you're still wary of pointers, you probably should be. Think of this as your first go-around; we're still at the journeyman stage of programming here. Although there were pointers in 106a, they were very hidden. This is the first time you really see them, are exposed to them, and have to get your head around them. If you go on to 107 and 108, you'll get more and more practice. It will become more solid as you keep working your way through it, but at this stage it's okay and commonplace to still feel a little bit - you see the star, and you do a double take. That's just where you should be. It's like in Chinese restaurants: they put the little red star by the food you want to be careful of. That's why they use the star for pointers.

I also wanted to zoom out farther, to a decade from now, when you're thinking back on the whole of your education and what you learned. Do you really want to remember how to type the keywords of the C++ generic template facility? No. If you code in C++ every day, you'll know it. If you don't, you can forget it; it's long gone. What I hope you'll carry away with you is a couple of concepts that broaden outside of CS. One is this idea of algorithmic thinking: how you approach something logically and stepwise, draw pictures, analyze it to understand the cases, and debug when things aren't working well. I think that's a skill that transcends the art of computer programming into other domains: any kind of logical thinking, problem solving, and analysis in that style.

I also hope that some of the experimental things we did, with the sort lab and the P-queue and some of the big-O work we tried to do in class together, help you do back-of-the-envelope calculations. This is a pet issue of mine. I think, in the era of computers and whatnot, it's really easy to get distanced from numbers and start thinking: I just punch numbers into formulas and one comes out; it must be the truth. You end up with no intuition as to whether that's the right number, whether it makes sense, whether it's grounded, and whether you're actually [inaudible] or so off because you made an entry error and didn't even notice.
00:34:17.000 --> 00:34:20.000 So becoming comfortable with this idea of using numbers to drive things, 00:34:20.000 --> 00:34:21.000 being able to 00:34:21.000 --> 00:34:26.000 take some time trials, do some predictions, do some more time trials, 00:34:26.000 --> 00:34:28.000 match those things up, do a little sketching of your numbers and see how the 00:34:28.000 --> 00:34:32.000 things are working out. So being comfortable having the math and the theory 00:34:32.000 --> 00:34:34.000 reinforce each other 00:34:34.000 --> 00:34:34.000 and 00:34:34.000 --> 00:34:36.000 not getting too 00:34:36.000 --> 00:34:39.000 distanced from the fact that those numbers do play out in 00:34:39.000 --> 00:34:42.000 the real world in ways that are interesting. Don't just let the numbers themselves 00:34:42.000 --> 00:34:44.000 tell the story. There really is more 00:34:44.000 --> 00:34:47.000 to connect it up with. 00:34:47.000 --> 00:34:49.000 Then there's this idea of tradeoffs. 00:34:49.000 --> 00:34:51.000 You may not remember 00:34:51.000 --> 00:34:54.000 how you can write a linked list or how the hash table does 00:34:54.000 --> 00:34:56.000 the things it does 00:34:56.000 --> 00:35:00.000 and how to get one working, but hopefully you will take away this idea that 00:35:00.000 --> 00:35:03.000 the answer to most questions begins with the phrase, it depends. 00:35:03.000 --> 00:35:05.000 If I say, is it better to use a hash table or a binary search tree, the answer 00:35:05.000 --> 00:35:09.000 will be, it depends. There's no, 00:35:09.000 --> 00:35:12.000 it's always better to use this sort, or, it's always 00:35:12.000 --> 00:35:14.000 wrong to use this approach. 00:35:14.000 --> 00:35:15.000 It's, it depends. Do 00:35:15.000 --> 00:35:18.000 you have the code written for that, and it works well, and you don't care how 00:35:18.000 --> 00:35:19.000 long it takes? 00:35:19.000 --> 00:35:23.000 Then bubble sort actually could be the right answer, 00:35:23.000 --> 00:35:25.000 Barack Obama notwithstanding. 00:35:25.000 --> 00:35:27.000 But knowing whether memory is 00:35:27.000 --> 00:35:32.000 at a premium, whether time is at a premium, what your mix of operations is, 00:35:32.000 --> 00:35:33.000 what 00:35:33.000 --> 00:35:34.000 language you're working with, what 00:35:34.000 --> 00:35:38.000 programming expertise you have, and what timeline you have for developing it all 00:35:38.000 --> 00:35:42.000 can play a role in deciding what choices to make and what strategy to pursue. 00:35:42.000 --> 00:35:45.000 So be very comfortable with this idea that it's about gathering information 00:35:45.000 --> 00:35:46.000 before you 00:35:46.000 --> 00:35:49.000 can commit to one strategy, as opposed to assuming it's always going to be 00:35:49.000 --> 00:35:52.000 this or that. 00:35:52.000 --> 00:35:54.000 So that's the 106b thing. 00:35:54.000 --> 00:35:57.000 I have a couple of slides here on 00:35:57.000 --> 00:35:59.000 things that happen after 106b. I'm 00:35:59.000 --> 00:36:01.000 just going to go ahead and show you. 00:36:01.000 --> 00:36:03.000 00:36:03.000 --> 00:36:06.000 This would be a great time to ask questions, in particular if there's anything you're 00:36:06.000 --> 00:36:07.000 curious about. 00:36:07.000 --> 00:36:10.000 Where do we go from here, and 00:36:10.000 --> 00:36:13.000 how do we make good decisions about things that follow? The 00:36:13.000 --> 00:36:17.000 most obvious follow-on courses are in CS. If you took 106b 00:36:17.000 --> 00:36:19.000 and you hated it, I'm very sorry.
00:36:19.000 --> 00:36:23.000 Hopefully 00:36:23.000 --> 00:36:25.000 the scarring 00:36:25.000 --> 00:36:26.000 will heal over and you'll lead a 00:36:26.000 --> 00:36:28.000 productive and happy life 00:36:28.000 --> 00:36:30.000 elsewhere. If you did like it and you think, 00:36:30.000 --> 00:36:33.000 I'm kind of interested in this, what do I do next to explore further? 00:36:33.000 --> 00:36:35.000 Where else can I go with this? 00:36:35.000 --> 00:36:38.000 I put up the intro 00:36:38.000 --> 00:36:40.000 courses with their numbers and names, and I'll talk a little bit 00:36:40.000 --> 00:36:44.000 about what each of these does to give you an idea of what opportunity it 00:36:44.000 --> 00:36:45.000 offers to you. 00:36:45.000 --> 00:36:49.000 The obvious follow-on is on the programming side. 106a and b 00:36:49.000 --> 00:36:51.000 are programming courses, building your programming mastery. 00:36:51.000 --> 00:36:54.000 107 follows in the same vein. We typically call those systems 00:36:54.000 --> 00:36:56.000 courses in CS nomenclature. 00:36:56.000 --> 00:36:59.000 That means practical, hands-on programming, getting more exposure 00:36:59.000 --> 00:37:00.000 to the system. 00:37:00.000 --> 00:37:05.000 So 107 builds on 106b's knowledge in a class called Programming Paradigms. 00:37:05.000 --> 00:37:08.000 Part of what it's trying to explore is different languages. 00:37:08.000 --> 00:37:11.000 So you actually look at the language C, kind of stepping back to old school. 00:37:11.000 --> 00:37:15.000 You look at the language Scheme, which is a functional language that 00:37:15.000 --> 00:37:17.000 was developed for AI purposes 00:37:17.000 --> 00:37:17.000 and 00:37:17.000 --> 00:37:21.000 has a long history coming from the math side of things. You'll probably look at some 00:37:21.000 --> 00:37:25.000 scripting and flexible languages like Python. We kind of mix them up a 00:37:25.000 --> 00:37:28.000 little bit, so it's not exactly the same things every time, but you're looking at other 00:37:28.000 --> 00:37:30.000 languages, other ways of doing things. 00:37:30.000 --> 00:37:33.000 The other component of 107 is also looking at 00:37:33.000 --> 00:37:35.000 some more low-level things. 00:37:35.000 --> 00:37:37.000 What happens when you 00:37:37.000 --> 00:37:41.000 pass your code off to the compiler? What is it doing? What kind of analysis does it do of your 00:37:41.000 --> 00:37:44.000 code? How does it actually generate machine code? What happens in the machine layer? 00:37:44.000 --> 00:37:47.000 Understanding some more of the performance implications 00:37:47.000 --> 00:37:50.000 and how things are expressed. Getting a better understanding of pointers actually 00:37:50.000 --> 00:37:53.000 comes up because there is a certain amount of low-level coding that's being 00:37:53.000 --> 00:37:54.000 explored in this class. 00:37:54.000 --> 00:37:58.000 So it builds on the programming skill as we move toward a Unix, 00:37:58.000 --> 00:38:00.000 Linux-based 00:38:00.000 --> 00:38:02.000 environment, so away from the Mac and Windows. 00:38:02.000 --> 00:38:03.000 00:38:03.000 --> 00:38:08.000 You just continue building larger programs with complicated 00:38:08.000 --> 00:38:10.000 features to grow your mastery. 00:38:10.000 --> 00:38:13.000 Internally, we call this class programming maturity, even though 00:38:13.000 --> 00:38:14.000 that's not its title.
00:38:14.000 --> 00:38:18.000 We see it that way in the sequence in terms of just growing you into 00:38:18.000 --> 00:38:19.000 a 00:38:19.000 --> 00:38:22.000 more versatile programmer, seeing different languages, different things, 00:38:22.000 --> 00:38:25.000 and getting more understanding of what happens under the hood. 00:38:25.000 --> 00:38:29.000 It is a five-unit class. It's typically offered fall and spring. Most 00:38:29.000 --> 00:38:30.000 people think it is 00:38:30.000 --> 00:38:33.000 about as much work as 106b. Some find it a 00:38:33.000 --> 00:38:37.000 little more or a little less in different parts of the quarter, depending on what your 00:38:37.000 --> 00:38:38.000 background is and 00:38:38.000 --> 00:38:39.000 00:38:39.000 --> 00:38:41.000 how 106b worked for you. 00:38:41.000 --> 00:38:44.000 On the whole, I think most people think of it as comparable. Would you guys 00:38:44.000 --> 00:38:45.000 say 107 is 00:38:45.000 --> 00:38:48.000 about as hard as 106b? 00:38:48.000 --> 00:38:51.000 Some ways, it's harder. Some ways, it's not. Maybe that's 00:38:51.000 --> 00:38:55.000 the deal. Then there's the 108 class, which follows on from 107, though that 00:38:55.000 --> 00:38:57.000 prerequisite is not a particularly strong one. It's a Java class. 00:38:57.000 --> 00:39:00.000 So it moves in the other direction. Rather than moving down in abstraction, it moves 00:39:00.000 --> 00:39:02.000 up in abstraction, 00:39:02.000 --> 00:39:05.000 looking at more large-scale design, modern programming technique, 00:39:05.000 --> 00:39:06.000 using 00:39:06.000 --> 00:39:10.000 object-oriented facilities, thinking about large-scale programming. You do a 00:39:10.000 --> 00:39:11.000 big team project at the 00:39:11.000 --> 00:39:13.000 end, a three-week, 00:39:13.000 --> 00:39:15.000 three-person, massive effort 00:39:15.000 --> 00:39:17.000 that is the first 00:39:17.000 --> 00:39:26.000 really big-scale thing that you'll be exposed to in the lower-division 00:39:26.000 --> 00:39:30.000 curriculum. [Inaudible]. Sometimes there's a 107l, which has some relationship to 00:39:30.000 --> 00:39:33.000 106l in the way that it's more hands-on, more 00:39:33.000 --> 00:39:34.000 C++. 00:39:34.000 --> 00:39:37.000 I don't know whether it will be offered this spring or not, 00:39:37.000 --> 00:39:38.000 but 00:39:38.000 --> 00:39:41.000 if you look in Axess, it will tell you at least whether we have it tentatively on 00:39:41.000 --> 00:39:42.000 the schedule. 00:39:42.000 --> 00:39:46.000 I think that comes and goes with the interest from the students and the 00:39:46.000 --> 00:39:51.000 availability of someone to teach 00:39:51.000 --> 00:39:52.000 it. 00:39:52.000 --> 00:39:55.000 [Inaudible] can we watch [inaudible]? 00:39:55.000 --> 00:39:58.000 Yeah, 107 and 108 are offered on TV, usually 00:39:58.000 --> 00:40:02.000 once a year. 107 is offered on TV this spring. The one in fall is 00:40:02.000 --> 00:40:03.000 not taped. 00:40:03.000 --> 00:40:06.000 It might be that the old lectures from spring are somewhere that 00:40:06.000 --> 00:40:09.000 you could dig them out, but I actually don't happen to know for sure. 00:40:09.000 --> 00:40:11.000 108 is on TV, I think, 00:40:11.000 --> 00:40:13.000 in the fall, not in the winter. So the fall lectures are probably still around for 00:40:13.000 --> 00:40:16.000 108, so you could take a look at them.
You 00:40:16.000 --> 00:40:19.000 could certainly look at their websites, and 00:40:19.000 --> 00:40:23.000 there's often a lot of old materials and handouts 00:40:23.000 --> 00:40:26.000 and stuff that gives you at least some idea of things that have been 00:40:26.000 --> 00:40:29.000 covered in the most recent offering of it. 00:40:29.000 --> 00:40:34.000 Certainly showing up in the first week is one way to get an 00:40:34.000 --> 00:40:36.000 idea of what you're in for. 00:40:36.000 --> 00:40:38.000 108 is a four-unit class. 00:40:38.000 --> 00:40:41.000 Because it's in Java, there are certain things about Java, 00:40:41.000 --> 00:40:44.000 the safety and 00:40:44.000 --> 00:40:48.000 the attentiveness of the compiler, that make the error handling a little 00:40:48.000 --> 00:40:52.000 bit easier. So I don't think most people think of 108 as quite as intense as 00:40:52.000 --> 00:40:55.000 107 or 106b, but that project at the end has quite a reputation 00:40:55.000 --> 00:40:58.000 as being a 00:40:58.000 --> 00:41:02.000 real time suck. So it has lots to do with staying 00:41:02.000 --> 00:41:04.000 scheduled and motivated and working through that team project at 00:41:04.000 --> 00:41:05.000 the end. 00:41:05.000 --> 00:41:09.000 I think that's where most people find the big challenge of 00:41:09.000 --> 00:41:12.000 108. Then on the math side, in the 00:41:12.000 --> 00:41:15.000 nomenclature for CS, we have this idea of theory, introducing you to more of 00:41:15.000 --> 00:41:16.000 00:41:16.000 --> 00:41:19.000 the formalism for big O and discrete structures and doing proofs and 00:41:19.000 --> 00:41:21.000 thinking about logic. 00:41:21.000 --> 00:41:25.000 So we have 103A and B and a 103X, which is a combined, 00:41:25.000 --> 00:41:28.000 accelerated form of A and B. These serve as math classes 00:41:28.000 --> 00:41:29.000 introducing 00:41:29.000 --> 00:41:33.000 the kind of mathematics that's important for computer scientists. 00:41:33.000 --> 00:41:34.000 So 00:41:34.000 --> 00:41:38.000 for people who are thinking about a CS major, that's another good course to get in 00:41:38.000 --> 00:41:39.000 early. Pretty 00:41:39.000 --> 00:41:43.000 much everything in the upper division of the CS major has 00:41:43.000 --> 00:41:47.000 these things layered in as prereqs. So these are the intro courses that serve as the 00:41:47.000 --> 00:41:50.000 gateway to prepping you to go on and then looking at things in a more topical 00:41:50.000 --> 00:41:51.000 way. 00:41:51.000 --> 00:41:53.000 Looking at networks or security, 00:41:53.000 --> 00:41:54.000 HCI or 00:41:54.000 --> 00:41:59.000 artificial intelligence layers on top of a grounding in theory and programming 00:41:59.000 --> 00:42:03.000 that came from the intro courses. So 00:42:03.000 --> 00:42:07.000 that said, we're in the 00:42:07.000 --> 00:42:08.000 midst of a really 00:42:08.000 --> 00:42:10.000 serious curriculum revision, 00:42:10.000 --> 00:42:13.000 which is 00:42:13.000 --> 00:42:15.000 on the table and being designed right now. 00:42:15.000 --> 00:42:18.000 It was voted on by our faculty and approved and has to go 00:42:18.000 --> 00:42:22.000 through the School of Engineering and get approval. Then there's going to be a little bit of 00:42:22.000 --> 00:42:23.000 jostling as everything gets worked out. 00:42:23.000 --> 00:42:27.000 But starting next year, you're going to see some changes in the 00:42:27.000 --> 00:42:29.000 department.
Some of these courses are going to 00:42:29.000 --> 00:42:33.000 get morphed, and 00:42:33.000 --> 00:42:36.000 they won't be exactly the way they are today. 00:42:36.000 --> 00:42:39.000 As a Stanford undergrad, you actually have the option of graduating under any set of 00:42:39.000 --> 00:42:42.000 requirements that are in play any time you're an undergrad. So you could graduate under this year's 00:42:42.000 --> 00:42:43.000 requirements, 00:42:43.000 --> 00:42:45.000 or you could wait. 00:42:45.000 --> 00:42:48.000 I think for most, the right answer's going to be to wait, because I think the new 00:42:48.000 --> 00:42:51.000 arrangement is very much more student-friendly. 00:42:51.000 --> 00:42:53.000 It has more flexibility 00:42:53.000 --> 00:42:57.000 in the program. If you look at the CS program as it is, it has a lot of pick 00:42:57.000 --> 00:43:00.000 one of two, pick one of three, pick one of two, 00:43:00.000 --> 00:43:04.000 where you're very constrained on what you can choose. The new program is going to be 00:43:04.000 --> 00:43:07.000 more track-based, where you pick an area of interest. You say, I'm really interested in 00:43:07.000 --> 00:43:08.000 graphics. You do a little 00:43:08.000 --> 00:43:12.000 depth concentration in that and then just have room for a 00:43:12.000 --> 00:43:15.000 very wide selection of electives to fill up the program, as opposed to 00:43:15.000 --> 00:43:19.000 a smorgasbord where we force you to pick through a bunch of different areas. 00:43:19.000 --> 00:43:24.000 So for most students, I think it gives you flexibility that actually gives you 00:43:24.000 --> 00:43:25.000 the options 00:43:25.000 --> 00:43:26.000 to 00:43:26.000 --> 00:43:30.000 pick the classes that are most interesting to you and get the material that you want. It's a growing 00:43:30.000 --> 00:43:32.000 recognition of the fact that CS 00:43:32.000 --> 00:43:35.000 has really broadened as a field. Some say that our major looks the same way 00:43:35.000 --> 00:43:37.000 it did ten years ago, 00:43:37.000 --> 00:43:40.000 a classic definition of, here are these 00:43:40.000 --> 00:43:42.000 seven things that all matter. The 00:43:42.000 --> 00:43:45.000 things aren't seven anymore. There are now 15 of them, 25 of them. If 00:43:45.000 --> 00:43:49.000 you look at the spectrum of things that computer science is now 00:43:49.000 --> 00:43:52.000 embracing, we can't say you have to take one of all these things without keeping you here 00:43:52.000 --> 00:43:53.000 for eight years. 00:43:53.000 --> 00:43:55.000 So we are 00:43:55.000 --> 00:43:58.000 allowing that our field's footprint has gotten so large that we need to 00:43:58.000 --> 00:44:02.000 let you pick and choose a little bit to get the slice that seems most 00:44:02.000 --> 00:44:03.000 appropriate for where you're headed. 00:44:03.000 --> 00:44:06.000 So you'll see more about that. 00:44:06.000 --> 00:44:10.000 If you are thinking about being a CS major 00:44:10.000 --> 00:44:13.000 and have not yet declared, 00:44:13.000 --> 00:44:17.000 I would advise 00:44:17.000 --> 00:44:22.000 that you add yourself to what's called the Considering CS list, 00:44:22.000 --> 00:44:26.000 and you can get to this from the CS course advisor's page, 00:44:26.000 --> 00:44:27.000 which, 00:44:27.000 --> 00:44:30.000 off the top of my head, I think 00:44:30.000 --> 00:44:32.000 might be 00:44:32.000 --> 00:44:36.000 CSadvising, but I'm not positive.
I'll put the link on 00:44:36.000 --> 00:44:39.000 our webpage when I remember what it is. 00:44:39.000 --> 00:44:42.000 It is a mailing list. So we have a mailing list of the declared undergrads, but we 00:44:42.000 --> 00:44:45.000 also have a list for people who are 00:44:45.000 --> 00:44:48.000 at least potentially interested in the major and just want to be kept up to date on 00:44:48.000 --> 00:44:49.000 things that are happening. 00:44:49.000 --> 00:44:52.000 There's going to be a lot of activity on this list in the 00:44:52.000 --> 00:44:55.000 upcoming months as we publish the information about the new 00:44:55.000 --> 00:44:56.000 curriculum. 00:44:56.000 --> 00:44:57.000 So 00:44:57.000 --> 00:45:00.000 you'll know what's coming down the pipe so you can make good decisions for 00:45:00.000 --> 00:45:01.000 preparing for 00:45:01.000 --> 00:45:04.000 what will be [inaudible], what the transition plan will be as we move into the new 00:45:04.000 --> 00:45:05.000 courses, 00:45:05.000 --> 00:45:06.000 and how to make sure you 00:45:06.000 --> 00:45:09.000 land 00:45:09.000 --> 00:45:13.000 in the right place when all is said 00:45:13.000 --> 00:45:14.000 and done. Then I put 00:45:14.000 --> 00:45:15.000 a little note about 00:45:15.000 --> 00:45:17.000 other majors that exist. 00:45:17.000 --> 00:45:19.000 CS is 00:45:19.000 --> 00:45:20.000 00:45:20.000 --> 00:45:22.000 the home base for people who are 00:45:22.000 --> 00:45:25.000 solving problems using computers. It spans a large variety of people 00:45:25.000 --> 00:45:27.000 doing a lot of different things. 00:45:27.000 --> 00:45:30.000 Then there are a couple of other majors that we participate in, in the interdisciplinary 00:45:30.000 --> 00:45:32.000 sense, 00:45:32.000 --> 00:45:34.000 where CS is part of a larger 00:45:34.000 --> 00:45:37.000 context for doing things. There's Computer Systems Engineering, which is 00:45:37.000 --> 00:45:40.000 between us and EE and is a little bit more about hardware. 00:45:40.000 --> 00:45:43.000 There's the Symbolic Systems program, which is 00:45:43.000 --> 00:45:47.000 run by Linguistics and includes cooperation from psychology, philosophy, communication, and 00:45:47.000 --> 00:45:51.000 CS, looking at artificial intelligence and modeling human 00:45:51.000 --> 00:45:52.000 thought and 00:45:52.000 --> 00:45:53.000 understanding 00:45:53.000 --> 00:45:55.000 00:45:55.000 --> 00:45:57.000 intelligence as expressed in that way. 00:45:57.000 --> 00:45:59.000 Then there's a Mathematical and Computational Science program that's housed 00:45:59.000 --> 00:46:03.000 in the Statistics department and is joint between Math, Computer Science, and 00:46:03.000 --> 00:46:06.000 MS&E. It's more of an applied math degree, 00:46:06.000 --> 00:46:08.000 taking mathematical thinking and 00:46:08.000 --> 00:46:09.000 00:46:09.000 --> 00:46:12.000 using computers and statistical tools to analyze and 00:46:12.000 --> 00:46:16.000 think about things. A lot of financial people end up 00:46:16.000 --> 00:46:20.000 being drawn to that particular one, or people who are doing risk analysis or 00:46:20.000 --> 00:46:21.000 that sort of 00:46:21.000 --> 00:46:24.000 combination of things. CS has a role to play in all of these because it's a 00:46:24.000 --> 00:46:27.000 great tool for solving these kinds of problems. So there are 00:46:27.000 --> 00:46:29.000 different ways to get at 00:46:29.000 --> 00:46:30.000 different pieces 00:46:30.000 --> 00:46:33.000 of the major.
00:46:33.000 --> 00:46:35.000 I've put up a couple of notes on things that 00:46:35.000 --> 00:46:38.000 I think are worth 00:46:38.000 --> 00:46:39.000 looking into 00:46:39.000 --> 00:46:40.000 in case they, 00:46:40.000 --> 00:46:43.000 at some point, are the right thing for you to look at. Section leading, 00:46:43.000 --> 00:46:45.000 so joining the 106 staff 00:46:45.000 --> 00:46:46.000 and shepherding the next generation 00:46:46.000 --> 00:46:49.000 of students through the program, 00:46:49.000 --> 00:46:52.000 is a great way to strengthen your own knowledge. There's nothing like 00:46:52.000 --> 00:46:54.000 trying to explain recursion to someone else who 00:46:54.000 --> 00:46:55.000 doesn't understand it to make sure you understand it yourself. 00:46:55.000 --> 00:46:56.000 It 00:46:56.000 --> 00:46:59.000 can really do a lot for 00:46:59.000 --> 00:47:01.000 your personal connection to the craft, 00:47:01.000 --> 00:47:04.000 as well as just being involved in a great community. The section leaders are a really 00:47:04.000 --> 00:47:05.000 wonderful, 00:47:05.000 --> 00:47:08.000 dedicated group of students who are 00:47:08.000 --> 00:47:08.000 00:47:08.000 --> 00:47:12.000 fun to be around and great people to make friends with. So I 00:47:12.000 --> 00:47:14.000 highly encourage you to take a look at that. 00:47:14.000 --> 00:47:16.000 We interview for that every quarter, 00:47:16.000 --> 00:47:17.000 so usually about week six, 00:47:17.000 --> 00:47:19.000 we put out 00:47:19.000 --> 00:47:21.000 applications and bring people in for interviews. We're always looking for new people 00:47:21.000 --> 00:47:24.000 to join on a quarterly basis. 00:47:24.000 --> 00:47:27.000 There are research opportunities all around the campus. One that the CS 00:47:27.000 --> 00:47:30.000 department sponsors in particular is their [inaudible] summer institute, 00:47:30.000 --> 00:47:33.000 matching professors with undergrads to do work in the summer for 00:47:33.000 --> 00:47:35.000 slave wages. 00:47:35.000 --> 00:47:38.000 But it's a great opportunity. You can get started on 00:47:38.000 --> 00:47:40.000 thinking about how research might be part of your future. 00:47:40.000 --> 00:47:43.000 It's one of the best preparations for thinking about grad school. 00:47:43.000 --> 00:47:46.000 We do have an honors program. 00:47:46.000 --> 00:47:50.000 Like most honors programs, it's a chance to stake out an independent project and 00:47:50.000 --> 00:47:50.000 show your mettle. 00:47:50.000 --> 00:47:53.000 We have a co-terminal master's program 00:47:53.000 --> 00:47:54.000 where you can 00:47:54.000 --> 00:47:57.000 get some of the depth 00:47:57.000 --> 00:47:58.000 before you leave this place. 00:47:58.000 --> 00:48:01.000 Not to mention, there are a million fabulous 00:48:01.000 --> 00:48:03.000 things to do at Stanford, getting involved with 00:48:03.000 --> 00:48:06.000 arts or nonprofit organizations or sports or student leadership 00:48:06.000 --> 00:48:07.000 or whatever. 00:48:07.000 --> 00:48:10.000 Maybe the most important memory I have from my own undergrad is that the 00:48:10.000 --> 00:48:13.000 trickiest part of it was learning what to say no to. 00:48:13.000 --> 00:48:14.000 Realizing that there are 00:48:14.000 --> 00:48:17.000 10,000 fabulous, worthwhile opportunities, 00:48:17.000 --> 00:48:21.000 and doing all of them only means sacrificing your sanity and 00:48:21.000 --> 00:48:23.000 the quality of the things you do.
00:48:23.000 --> 00:48:24.000 00:48:24.000 --> 00:48:29.000 So be careful and thoughtful about choosing the things you do and embracing them. 00:48:29.000 --> 00:48:32.000 Be okay with saying, I won't do this. Even though it's great to go overseas, and I 00:48:32.000 --> 00:48:35.000 would've liked that opportunity, it's not going to fit, and I'm okay with that. Don't 00:48:35.000 --> 00:48:38.000 beat yourself up over it and cause 00:48:38.000 --> 00:48:41.000 yourself too much grief trying to do too many things. 00:48:41.000 --> 00:48:44.000 I won't be here on Friday. Keith is going to come. He's going to teach you about C++ in the real 00:48:44.000 --> 00:48:45.000 world. 00:48:45.000 --> 00:48:47.000 I'm going to see you guys Friday at the final. 00:48:47.000 --> 00:48:57.000 I think that's all I need to say for today.