Lecture 26 | Programming Abstractions (Stanford)

Edit subtitles

0:00 - 0:13
0:13 - 0:16

This presentation is delivered by the Stanford Center for Professional
0:16 - 0:26

Development.
0:26 - 0:27

Hi, there.
0:27 - 0:29

It's
0:29 - 0:36

Wednesday. We are coming to the end of this fine quarter. So
0:36 - 0:39

what I'm going to do today is I
0:39 - 0:43

brought a cup filled with candy. We'll start with that
0:43 - 0:45

because that's going to be helpful. What I'd like to do is it's just a couple
0:45 - 0:50

bits of loose ends, some things that I wanted to touch on that
0:50 - 0:53

have come and gone this quarter. I maybe wanted to pull them together and reach a bit of closure
0:53 - 0:54

on.
0:54 - 0:57

Then hopefully, if we have time, we'll move on to what
0:57 - 1:00

happens after this. If you liked 106b, what do you do? If you don't like
1:00 - 1:01

106b, what do you do?
1:01 - 1:02
1:02 - 1:06

What kind of options and future opportunities are there
1:06 - 1:08

after this
1:08 - 1:10

ends? I'm going to talk a little about some of the things that
1:10 - 1:12

[inaudible], but I do want to talk about the final just
1:12 - 1:15

for a second because it is kind of a
1:15 - 1:17

last
1:17 - 1:19

opportunity for you to show us your mastery of the material
1:19 - 1:22

and get the grade that you've been working hard for all quarter.
1:22 - 1:26

It does carry a certain amount of weight. 30 percent of the weight in the default
1:26 - 1:27

system is actually placed on the final,
1:27 - 1:30

and more of it can move to that if you have a better
1:30 - 1:33

showing on the final than you did in the midterm. We're happy to move some of that
1:33 - 1:34

weight over so it could actually
1:34 - 1:36

account for even more of your total grade.
1:36 - 1:38

So there's a lot riding on it, and you want to be sure you come
1:38 - 1:40

ready to show us the
1:40 - 1:44

things that you know and that you have comprehension and mastery over.
1:44 - 1:47

To note, though, it is going to look a little bit different than the midterm in terms of its
1:47 - 1:51

emphasis. We are going to see a lot more of this nitty gritty dynamic data
1:51 - 1:52

structure,
1:52 - 1:53

Linklis, trees,
1:53 - 1:56

graphs, pointer-based
1:56 - 1:58

structures with the
1:58 - 2:01

inherent complications in dealing with that kind of dynamic structure. So bringing
2:01 - 2:03

your best game to this exam
2:03 - 2:05

is certainly going to pay off here. The
2:05 - 2:08

coding that you will see will look a lot like what's in the assignments. If you already had a
2:08 - 2:11

chance to look at the practice exam, you'll see that there are
2:11 - 2:13

themes there that echo the things you did on the work. You want to try to make sure
2:13 - 2:15

that's we're connecting to the
2:15 - 2:16
2:16 - 2:17

effort you've put in all along.
2:17 - 2:20

There will actually be some opportunity to think a little bit at a higher level
2:20 - 2:23

about making and analyzing choices for design and implementation. I'm going to talk a
2:23 - 2:25

little bit about that today because I feel like
2:25 - 2:26

at any given moment, we're talking about
2:26 - 2:30

how we might implement a stack and how we might implement a queue. Sometimes it helps to
2:30 - 2:32

step back a level and think a little bit about the choices for
2:32 - 2:37

the bigger picture of when to use a stack and queue and how to make
2:37 - 2:40

decisions in the large about those tradeoffs that are intelligent and
2:40 - 2:42

grounded. So I'll do a little bit of that today, and then
2:42 - 2:45

some of that has been built up all quarter long.
2:45 - 2:48

It is open book and open notes. You can bring all your printouts, your readers,
2:48 - 2:50

sections, all those sort of things.
2:50 - 2:53

For some people, that translates to this idea that, well, I don't need to
2:53 - 2:56

prepare because I'll have all my stuff there. If I need to know something
2:56 - 2:59

in the exam, I'll just go look it up. That's certainly true for looking up details. You want to remember how
2:59 - 3:02

this works or you had a piece of code that you think was similar. It could help
3:02 - 3:05

you to brainstorm what you're trying to do this time. But
3:05 - 3:08

you're not likely to have time to learn new things n the exam. So if there's
3:08 - 3:11

something that you really feel you have not
3:11 - 3:12

gotten your head around,
3:12 - 3:17

the investment upfront before you get into the exam is where to get that
3:17 - 3:19

understanding in place rather than trying, in the exam, to be flipping
3:19 - 3:22

through chapter ten, trying to learn something. It
3:22 - 3:23

may not really play out.
3:23 - 3:27

I highly recommend practicing on paper in an exam-like environment. So really
3:27 - 3:28

working
3:28 - 3:31

the way you will be working in the exam to give yourself
3:31 - 3:33

the best,
3:33 - 3:35

consistent prep for what's happening there.
3:35 - 3:39

Then some people have asked about extra problems, just wanting more of them to work on. I'll give you a
3:39 - 3:41

couple places where you can look for things. One is that cs106 is
3:41 - 3:44

being offered this same quarter, and they are just the honors version of our class.
3:44 - 3:45

So they have some
3:45 - 3:49

slightly notched up problems. But their practice and real midterm are posted on their
3:49 - 3:51

website as well as their practice final.
3:51 - 3:52
3:52 - 3:55

They have very similar coverage. They did some assignments that were like
3:55 - 3:58

ours, some that were a little bit different, but still go places to kind of
3:58 - 3:59

just grab
3:59 - 4:00

actual exam problems with their solution.
4:00 - 4:04

The chapter exercises and section problems often are old
4:04 - 4:05

exam problems or
4:05 - 4:08

eventually became exam problems. So they kind of come and go in terms of being used as
4:08 - 4:12

examples or used as exam problems. So they're definitely places to
4:12 - 4:17

pick up extra mileage. Okay. So let me tell
4:17 - 4:18

you a little bit about
4:18 - 4:21

- any questions about the final? Okay. Let me talk about
4:21 - 4:26

this line
4:26 - 4:28

just for a second because I feel like we have
4:28 - 4:31

touched on these themes again and again, but maybe it's good to have
4:31 - 4:32

at least one
4:32 - 4:35

moment of trying to pull this together and think a little bit about
4:35 - 4:38

the closure and the big picture of all these things.
4:38 - 4:41

A lot of what we're talking about in the beginning was we
4:41 - 4:44

were giving you these abstractions, these cool things to do stuff with, a map, a
4:44 - 4:45

stack, a queue, a vector,
4:45 - 4:49

and problems to solve where those abstractions have a role to play.
4:49 - 4:50

Then we spent a long time saying,
4:50 - 4:53

okay, now that it's our job to make those abstractions work,
4:53 - 4:56

what techniques can we use? What algorithms
4:56 - 4:59

and data structures will help us manipulate and
4:59 - 5:02

model those structures in efficient ways. We
5:02 - 5:05

certainly spent a lot of time talking about runtime performance as though that
5:05 - 5:08

was maybe a little too much in your mind to think that was the only thing
5:08 - 5:11

that really mattered. How fast can we make something? O of N is better
5:11 - 5:12

than N squared.
5:12 - 5:15

Log N is better than N, driving to these holy grails of
5:15 - 5:16

bringing the time down.
5:16 - 5:19

It's certainly a very important factor
5:19 - 5:20

that often -
5:20 - 5:23

the difference between something being linear and something being quadratic is a
5:23 - 5:27

matter of it being tractable at all for the problem at hand. You cannot sort a
5:27 - 5:32

million numbers in a quadratic sort in nearly
5:32 - 5:35

the same time. For large enough data sets, it'd
5:35 - 5:37

be completely unfeasible.
5:37 - 5:40

So it is important to have that big picture, to be thinking about it and know about
5:40 - 5:42

those tradeoffs, to understand issues
5:42 - 5:44

of scale here.
5:44 - 5:47

But we also did some stuff on the homework about actual empirical time
5:47 - 5:50

results, which is another way to bolster our understanding. The big O
5:50 - 5:51

tells us these things, but
5:51 - 5:53

what really happens in real life
5:53 - 5:57

matches it, but not precisely. Those constant factors, we threw away,
5:57 - 6:00

and other artifacts that are
6:00 - 6:03

happening in the real world often give us new insights about things and
6:03 - 6:06

challenges that we're facing.
6:06 - 6:09

We also need to think about things like mix of expected operations, having some be
6:09 - 6:11

slow
6:11 - 6:14

at the consequence of other operations being fast. It's often a tradeoff we're
6:14 - 6:16

willing to make.
6:16 - 6:20

On the editor rougher, we drove toward getting all the operations to be
6:20 - 6:23

fast. In an ideal world, that's certainly the best place to be, but
6:23 - 6:26

it's also a matter at what cost
6:26 - 6:29

in terms of code complexity and memory used
6:29 - 6:33

and effort put into it. It may be that we're willing to tolerate some operations
6:33 - 6:34

being less efficient
6:34 - 6:39

as long as the most commonly used ones are very streamlined.
6:39 - 6:42

We talked about these worst cases and degenerates about
6:42 - 6:45

knowing when we can expect our algorithms to degrade and whether we
6:45 - 6:47

want to do something about that
6:47 - 6:50

or choose a different algorithm or provide some
6:50 - 6:54

protection against those kind of cases coming through. To a lesser
6:54 - 6:58

extent, we already talked about memory used, how much overhead
6:58 - 7:01

there is. Seeing that we went to a pointer-based Linklis structure, we added
7:01 - 7:03

the four byte overhead
7:03 - 7:06

for every element. We went to an eight byte overhead once we got minor
7:06 - 7:10

[inaudible] and maybe even a little bit more if we had balance factors included.
7:10 - 7:13

Those things ride along with our data,
7:13 - 7:17

increasing the size and the memory footprint of it.
7:17 - 7:18

So there's a certain amount of
7:18 - 7:21

overhead built into this system. There's also this notion of excess capacity. To what
7:21 - 7:24

extent do we over allocate our
7:24 - 7:27

data structures early, planning for future growth. In some senses, save
7:27 - 7:32

ourselves that trouble of enlarging on demand. It's preparing for
7:32 - 7:33
7:33 - 7:34

a large
7:34 - 7:36

allotment and then using it up when we find it.
7:36 - 7:39

Then when we do exceed that capacity, having you do it again. So where
7:39 - 7:42

do we make those tradeoffs about how much excess capacity and when to do
7:42 - 7:44

the enlarging?
7:44 - 7:48

It has a lot to do with what our general memory constraints are.
7:48 - 7:51

I would say that memory has gotten to be quite
7:51 - 7:53

free
7:53 - 7:56

in recent processors and
7:56 - 7:58

computer systems. You don't really think about
7:58 - 8:01

100 bytes here, 200 bytes there, 1,000 bytes there. Even
8:01 - 8:05

megabytes, you can talk about as thought it's a drop in the bucket.
8:05 - 8:06
8:06 - 8:09

So this is something that isn't given as much weight
8:09 - 8:12

nowadays as it was ten years ago when there were a lot more constraints. But with
8:12 - 8:14

the move for a lot more imbedded device programming, and looking the
8:14 - 8:16

iPhone or
8:16 - 8:19

cell phones or things like that where they don't have that same liberty
8:19 - 8:23

to be wanton about memory, it's come back to being a little more careful and keeping
8:23 - 8:25

an eye on it.
8:25 - 8:28

There's also and issue here which trades off a lot with
8:28 - 8:31

runtime, which is redundancy versus sharing. Having
8:31 - 8:33

two different ways to get at the same piece of data
8:33 - 8:38

often will provide you with quick access. For example, if you have a
8:38 - 8:42

card catalogue for a library where you can look up by author, you can look up by title, you
8:42 - 8:44

probably are maintaining two data structures.
8:44 - 8:46

One that manipulates them,
8:46 - 8:49

sorted by title, one that manipulates them sorted by author. They're both kind
8:49 - 8:52

of looking at the same books in the end.
8:52 - 8:55

So maybe there's this set of pointers in one map over here,
8:55 - 8:58

indexed by author, another
8:58 - 9:00

keyed by title over there.
9:00 - 9:03

That means that I'm repeating all my data.
9:03 - 9:06

Hopefully I'm sharing it with a pointer, so I'm not actually really repeating all the
9:06 - 9:10

data, but I actually have two pointers or potentially two or more
9:10 - 9:11

pointers to the same book,
9:11 - 9:15

depending on the different ways you can get to it. That gives me that fast
9:15 - 9:17

access. I want to know all the books written by Leo
9:17 - 9:22

Lionni, I can find them easily, as well as finding all the books that have King in the
9:22 - 9:26

title. I don't have to do searches on a data set that's been optimized for only
9:26 - 9:27

one
9:27 - 9:28

access. Definitely,
9:28 - 9:32

tradeoffs were more space thrown at it to get at that fast access for
9:32 - 9:34

different ways.
9:34 - 9:40

Then this last one is one that I had mentioned along the way, but I think it's one that's easy to overlook,
9:40 - 9:42

which is to get very excited about how fast you can make it, how small you can
9:42 - 9:43

make it,
9:43 - 9:46

how fancy it is. It
9:46 - 9:50

has a real downside in terms of how hard it was to
9:50 - 9:53

write. Typically, when you go to these fancier strategies
9:53 - 9:54

that are
9:54 - 9:58

space efficient and time efficient, you had to pay for it somewhere.
9:58 - 10:01

It wasn't for free that the easy, obvious code would do that. It tends to
10:01 - 10:05

actually be complicated code that may use bid operations, that probably uses
10:05 - 10:06

more pointers
10:06 - 10:10

that has kind of a density to it that means it's going to take you longer to get it
10:10 - 10:12

right. It's going to take you more time to debug it. It's going to be more likely to
10:12 - 10:15

have lurking bugs. It's going to be harder to maintain.
10:15 - 10:18

If somebody comes along and wants to add a new operation,
10:18 - 10:21

they open up your code, and they're frightened away. It might be that that
10:21 - 10:23

code will never get
10:23 - 10:25

moved forward. Everybody will just open it
10:25 - 10:29

and close it immediately and say, whatever. We'll make it work the way it is,
10:29 - 10:33

rather than trying to improve it or move it forward because it actually is kind of scary
10:33 - 10:34

to look at.
10:34 - 10:35

So thinking about
10:35 - 10:37

what's the simplest thing that could work.
10:37 - 10:40

There's this movement that I think is pretty neat to think about.
10:40 - 10:43

There's this idea of the agile programming methodology.
10:43 - 10:44
10:44 - 10:45

It's based on the idea that
10:45 - 10:49

rather than planning for building the most super stellar
10:49 - 10:51

infrastructure for solving all the worlds problems. You're about to write a chess
10:51 - 10:53

program.
10:53 - 10:55

Actually, something that's even simpler. You're going to write a checkers program. You say,
10:55 - 10:59

I'm going to sit down, and I'm going to design the pieces and the board. Then you think, hey,
10:59 - 11:02

checkers is just one embodiment of this. Why don't I try to make the
11:02 - 11:07

uber board that can be used for strategic conquest or for chess or Chinese
11:07 - 11:11

checkers. You could have these hexagonal things. You start building this crazy
11:11 - 11:12

infrastructure that
11:12 - 11:15

could handle all of these cases equally well, even though all you really needed
11:15 - 11:18

was checkers, a very simple, 2D grid.
11:18 - 11:19

But you imagine
11:19 - 11:23

the future needs way in advance of having them, and then you design something
11:23 - 11:26

overly complicated. Then what happens is you get bogged out of that.
11:26 - 11:28

It never happens, and the project dies.
11:28 - 11:29

You
11:29 - 11:31

lead a lonely life
11:31 - 11:35

by yourself, eating oatmeal.
11:35 - 11:37

I like
11:37 - 11:39

oatmeal. So one of the mottos is
11:39 - 11:42

design the simplest thing that could possibly work.
11:42 - 11:46

I think that's a neat gift to yourself to realize that
11:46 - 11:48

this simplest thing that could possibly work, that meets the needs that
11:48 - 11:52

you have, that has a decent enough [inaudible] and memory constraint right there.
11:52 - 11:54

Meets your needs in the simplest form of that.
11:54 - 11:57

It's going to be the easiest thing to get working and running. If you later decide you want to
11:57 - 11:58

go fancy,
11:58 - 12:01

you can swap it out
12:01 - 12:06

using the principles of distraction and [inaudible]. It should be that it's independent
12:06 - 12:10

when you decide you need that fancier thing. I've got a
12:10 - 12:13

couple questions here because I thought this would just help us frame a little
12:13 - 12:14

bit of thinking about these things.
12:14 - 12:17

These are questions that I'm hoping you can answer for me. That's why I
12:17 - 12:18

brought
12:18 - 12:22

cup full of candy. I'm prepared to offer bribery for people who help
12:22 - 12:24

me answer these questions.
12:24 - 12:26

So let's talk about array
12:26 - 12:28

versus vector.
12:28 - 12:31

So the sequels [inaudible] builds and
12:31 - 12:35

array is what is behind the scenes of a vector. Then the vector adds
12:35 - 12:38

convenience on the outside of it.
12:38 - 12:40

Early on, I had said, once I tell
12:40 - 12:43

you about arrays,
12:43 - 12:46

put that back in your memory, and just use vector instead because everything array
12:46 - 12:48

does, vector does
12:48 - 12:53

nicely and cleanly and adds all sorts of features. Somebody give me a list
12:53 - 12:55

of things that vector does much better than
12:55 - 12:58

array.
12:58 - 12:59

Way in the back. Bounce
12:59 - 13:00

checking.
13:00 - 13:03

Bounce checking. What else? Come on. A
13:03 - 13:06

blow pop is on the line.
13:06 - 13:09

I need to interrupt your - help us
13:09 - 13:10

out.
13:10 - 13:11

You've got bounce checking. What
13:11 - 13:14

else? More? I want more. You've got nothing else for
13:14 - 13:17

me? I don't even want a blow pop. You don't want
13:17 - 13:21

a blow
13:21 - 13:23

pop? All right. I'm going to give your blow pop to somebody else. Right
13:23 - 13:27

here. I'm going to give the blow pop to you. Tell me what makes vector good. It has operations that
13:27 - 13:27

shuffle
13:27 - 13:31

elements for you. Yeah, it does the insert, delete, moving stuff down.
13:31 - 13:32

Also, it does the growing.
13:32 - 13:34

You keep adding, and it just grows. [Inaudible] don't do that. You
13:34 - 13:37

want a [inaudible] to grow, you grow it.
13:37 - 13:40

You take the pointer, you copy stuff over, you delete the old one, you
13:40 - 13:43

rearrange things. You do all the work. So
13:43 - 13:44

vector buys you kind of
13:44 - 13:46

array with benefits.
13:46 - 13:48

So it seems like, then,
13:48 - 13:50

you could kind of take away - the naive point would be like, well,
13:50 - 13:53

just use vector. Always use vector.
13:53 - 13:56

Vector's always better than array. Is there ever a situation where array still
13:56 - 13:57

is a pretty valid choice?
13:57 - 14:00

What is it array has that vector maybe doesn't?
14:00 - 14:05

Well, when you add things dynamically to the vector, it actually doubles the size [inaudible]
14:05 - 14:06

increasing it by one. If you know
14:06 - 14:10

the exact size, and you know it's not going to change, you're using [inaudible]. That's it exactly.
14:10 - 14:14

One of its big advantages. If you know you only have
14:14 - 14:17

- let's say you have a very small array
14:17 - 14:21

planned for. The one on the practice exam was, if you're doing a calendar
14:21 - 14:23

where you had 365 days in a year,
14:23 - 14:26

and you're keeping track of the events on a particular day, you know
14:26 - 14:30

how many days in the year. There's never going to be any change in that. Well, leap
14:30 - 14:31
14:31 - 14:33

year.
14:33 - 14:36

But you know that, and you know that in advance. So why fuss with a data structure that does all
14:36 - 14:39

this growing and shrinking and has some overhead associated with that
14:39 - 14:43

when, in fact, you know there's exactly this number. That's all you need.
14:43 - 14:46

In fact, it's almost a little bit more convenient to set up. You can
14:46 - 14:48

just declare it that size, and it's ready to go. With
14:48 - 14:52

the vector, you have to go through and add all the elements and stuff like that.
14:52 - 14:54

So there are a few situations where you might just say,
14:54 - 14:57

I don't need the advantages.
14:57 - 15:00

It adds some overhead that I don't want to pay,
15:00 - 15:04

and I'm perfectly happy making it a small fixed
15:04 - 15:07

array. You'll still lose bounce checking,
15:07 - 15:08

but
15:08 - 15:11

if you are being careful, maybe you're
15:11 - 15:12

comfortable with
15:12 - 15:15

making that tradeoff. By losing bounce checking, you actually are getting a little bit
15:15 - 15:18

of the speed improvement back. The bounce check does actually cost you
15:18 - 15:20

a little bit on every
15:20 - 15:23

vector access that an array would not actually pay.
15:23 - 15:26

That's really in the noise for most programs. I hesitate to
15:26 - 15:28

even mention it, to get you to think it matters,
15:28 - 15:31

but it is part of the thinking of a
15:31 - 15:34

professional program. Sometimes that may be the thing it comes down to,
15:34 - 15:36
15:36 - 15:37

making your operation
15:37 - 15:41

constant factors streamline enough for the operation you need. So I put here
15:41 - 15:44

stack and queue versus vector.
15:44 - 15:48

Given that stack and queue are really just vectors in disguise, and they're
15:48 - 15:50

vectors with less features, vectors
15:50 - 15:53

that can't do as much.
15:53 - 15:55

One of the implementations for stack and vector
15:55 - 15:58

could easily be just layered on top of a vector.
15:58 - 16:02

Why is it good to take away things?
16:02 - 16:06

[Inaudible]
16:06 - 16:08

tampering with contents of [inaudible]. Yeah. So it's enforcing discipline
16:08 - 16:11

on how you use it. Stacks are
16:11 - 16:12
16:12 - 16:14

lifo, queues are fifo.
16:14 - 16:15

The way things go out, they way they go out
16:15 - 16:17

are very
16:17 - 16:19

rigid. By using stack and queue,
16:19 - 16:22

you are guaranteeing that you won't go in and accidentally put something at the head
16:22 - 16:25

of the line or remove something out of the middle of the stack because you just don't
16:25 - 16:26

have access to that.
16:26 - 16:27

It's been tidied up and
16:27 - 16:29

packaged up into stack, which has push and pop.
16:29 - 16:31

queue, which has MQDQ.
16:31 - 16:34

No other access implied or given,
16:34 - 16:37

totally encapsulated from you. So in fact, it allows you to
16:37 - 16:39

maintain on a discipline
16:39 - 16:45

that otherwise the vector didn't strongly provide for you. Question? [Inaudible]. Another
16:45 - 16:48

thing you could do is you could optimize a stack or a
16:48 - 16:51

queue to be very good at the only active things you're doing, which
16:51 - 16:55

is pushing and popping or RQDQ. I think that's worth a smartie, don't you?
16:55 - 16:57

I think that is.
16:57 - 16:58

Exactly.
16:58 - 17:02

So by restricting what you have available,
17:02 - 17:05

it actually gives you choices in the implementation that wouldn't have worked out as well, [inaudible] full array
17:05 - 17:08

of operations. It's exactly true. Stack and queue only adding from one end
17:08 - 17:11

mean that things like a Linklis, which isn't a good choice for implementing the
17:11 - 17:13

general case of vector,
17:13 - 17:17

are perfectly good choices for the stack and queue case. So giving ourselves
17:17 - 17:22

some freedom as the implementer. So information about how it's being used
17:22 - 17:24

can pay off. It also
17:24 - 17:25

means when you read code - if you see
17:25 - 17:27

code that's using something, call the stack
17:27 - 17:31

or using a queue, you immediately know how it works. You could
17:31 - 17:34

say, here's this search that's actually stuffing a bunch of things into
17:34 - 17:37

a stack. I know they're last and first [inaudible].
17:37 - 17:39

If you see them using a vector,
17:39 - 17:42

you have to verify where they put them in the vector. Where do
17:42 - 17:46

they pull them out? The code doesn't tell you the same things. It doesn't tell you the same story
17:46 - 17:49

as it does when you see a word that has a strong meaning to it, like
17:49 - 17:52

stack and queue.
17:52 - 17:55

Generally, these [inaudible] to make about sorting things or not sorting things. We spent a
17:55 - 17:57

long time talking about sorting
17:57 - 18:01

and how various algorithms approach the sorting problem and what
18:01 - 18:02

they come up with.
18:02 - 18:05

I have a question here. What sorting algorithm is best? Is
18:05 - 18:09

there an
18:09 - 18:12

answer to that question? Why would I ask a question that has no answer? It depends on what you expect it
18:12 - 18:16

to be giving you. It depends. That's a great thing. It depends on
18:16 - 18:18

what you expect it to give you. How big is it?
18:18 - 18:21

What are the contents of it? Is it random? Is it sorted? Is it almost
18:21 - 18:24

sorted? Is it reverse sorted? Is it
18:24 - 18:28

drawn from a range of one to 1,000 values, or is it drawn from one to three
18:28 - 18:31

values? Huge array that has just one, two, three in it
18:31 - 18:34

might necessitate a different approach than one that has
18:34 - 18:36

one to max sine in it.
18:36 - 18:37

So
18:37 - 18:39

there is no answer to that. It is depends. What do you know? If you don't
18:39 - 18:41

know anything, give
18:41 - 18:43

it to someone else.
18:43 - 18:45
18:45 - 18:49

If you don't know, there's certain things that perform well. You know in the
18:49 - 18:52

average case, you don't have degenerate behaviors. For example, [inaudible] is a
18:52 - 18:56

great sort for the general case of an N log N that doesn't use any space
18:56 - 18:56
18:56 - 19:01

or any extra space the way word sort does. It doesn't have any degenerate cases.
19:01 - 19:02

For an
19:02 - 19:05

unknown set of inputs, it works pretty well.
19:05 - 19:08

If you happen to have a little bit of data, it's almost sorted.
19:08 - 19:09
19:09 - 19:10

Even
19:10 - 19:11

Barack Obama's
19:11 - 19:13

not favorite bubble sort
19:13 - 19:14

has a
19:14 - 19:17

chance of being in the run because it's already sorted. Is it worth sorting? Calculating the payoff, we had a problem on
19:17 - 19:20

one of the section
19:20 - 19:24

exercises once. I think it's a really good
19:24 - 19:25

one to
19:25 - 19:27

think through and revisit. This idea of
19:27 - 19:31

certain operations, many operations are much more
19:31 - 19:34

efficiently implemented if the data's sorted.
19:34 - 19:37

You can think of them off the top of your head. Oh, you want to have the minimum? If
19:37 - 19:41

it's sorted, it's very easy. You want the maximum? Very easy if it's sorted. You want to find the median.
19:41 - 19:43

Also very easy if it's sorted. You want to
19:43 - 19:44
19:44 - 19:48

find duplicates. Once they're sorted, they're all lined up together. You want to find the
19:48 - 19:49

mode, which is the most
19:49 - 19:50

frequent element.
19:50 - 19:53

Also easier if it's sorted because they'll all be groups together in a run. So you
19:53 - 19:57

can do a linear pass, counting what you see to find the longest run.
19:57 - 20:00

If you had two arrays and you wanted to merge them or intersect them to find
20:00 - 20:04

their common elements, it's much easier if they're sorted. All these things
20:04 - 20:07

that are hard to do when the data's in
20:07 - 20:08

a random order
20:08 - 20:12

actually have much more streamlined algorithms once you invest in sorting it.
20:12 - 20:13

So there was a problem that was like,
20:13 - 20:15

how do you know it's worth doing that?
20:15 - 20:18

If you only plan on writing one merge, and
20:18 - 20:19

you could do it in N squared,
20:19 - 20:21

is it worth it to
20:21 - 20:24

sort it, N log N, so you can get an O out of merge? So it has a lot to do
20:24 - 20:26

with how often you plan on doing the merge.
20:26 - 20:29

How big is your data set?
20:29 - 20:30

What are your
20:30 - 20:32

constraints on the times there. It
20:32 - 20:35

is interesting to think about that relationship. It isn't for free, but
20:35 - 20:37
20:37 - 20:41

potentially, it's an investment by sorting it so you can do a lot of fast searches
20:41 - 20:44

later.
20:44 - 20:47

This one actually is interesting because they actually solve a lot of the
20:47 - 20:50

same problems, and they differ in one small way. If you had a
20:50 - 20:51

set -
20:51 - 20:55

so set is being backed by a balance by a [inaudible] string that actually is a sorted data
20:55 - 20:56

structure internally - versus a
20:56 - 20:58

sorted vector.
20:58 - 21:03

So if your goal were to keep track of all the
21:03 - 21:07

students at Stanford, and you wanted to be able to look up people by name,
21:07 - 21:08
21:08 - 21:11

then the contains operation of the set
21:11 - 21:15

is actually doing these minor ray
21:15 - 21:16

search on assorted vectors
21:16 - 21:18

using the same path, but working through a vector.
21:18 - 21:21

So we can make the searching operations, like contains
21:21 - 21:23
21:23 - 21:24

and
21:24 - 21:26

other look-up based things,
21:26 - 21:29

algorithmic in both cases.
21:29 - 21:33

So what advantage - this sort
21:33 - 21:36

of vector, where I just used the interchangeably, is there something one
21:36 - 21:41

can do that the other can't or some advantage or disadvantage that's
21:41 - 21:43

assumed in that decision?
21:43 - 21:47

You don't have duplicates in a set, but you can have duplicates in your sorted vector. Yeah,
21:47 - 21:51

so the set has another concern with just how it operates. You can't put duplicates
21:51 - 21:54

in, so if you have two students with the same name, then
21:54 - 21:57

in the set, it turns out you're all coalesced down to one. Your transcripts all get
21:57 - 21:59

merged to your advantage or not. What
21:59 - 22:01

else
22:01 - 22:05

does set buy you that vector doesn't? How hard is it to
22:05 - 22:09

edit a sorted vector? You
22:09 - 22:11

want to put something new in there?
22:11 - 22:12

You're shuffling. You want to
22:12 - 22:13

take something out? You're shuffling.
22:13 - 22:18

The movement to the pointer-based structure there, the research
22:18 - 22:18

stream behind this,
22:18 - 22:21

is giving us this editability of the data structure.
22:21 - 22:25

The convenient logarithmic time to remove and add elements using the
22:25 - 22:28

pointer wiring to do that,
22:28 - 22:30

that vector doesn't have. That tradeoff
22:30 - 22:35

is actually one that
22:35 - 22:39

situation may be that you just don't do a lot of edits. That's actually a very common
22:39 - 22:42

situation. The sorted vector happens to be a very underutilized
22:42 - 22:45

or under appreciated arrangement
22:45 - 22:48

for data because for most things that are getting a lot of
22:48 - 22:49

edits, having a
22:49 - 22:53

linear time edit isn't going to be a painful
22:53 - 22:53

consequence.
22:53 - 22:56

You're doing a lot of searches, and you're getting the finer research there,
22:56 - 22:57

where you really care about it,
22:57 - 23:01

with no overhead. Very little overhead. A little bit of excess capacity,
23:01 - 23:03

but none of the pointer-based
23:03 - 23:06

trouble that came with the set. So the set actually adding onto that another eight
23:06 - 23:08

bytes, potentially even a little bit more,
23:08 - 23:12

of overhead on every element to give us that editability that maybe isn't
23:12 - 23:14

that important to us.
23:14 - 23:17

So if you're looking at that thing and saying, I'd like to be able to have
23:17 - 23:21

a sorted structure for both the set and the sorted vector, let you
23:21 - 23:24

access the elements in order. The interrater of the set goes in sort order,
23:24 - 23:27

going from zero to the end. The vector gives me order.
23:27 - 23:30

I can browse them in sort order. I can search them using sorted.
23:30 - 23:33

Where they differ is where does it take to edit one?
23:33 - 23:37

The one that invested more in the memory had faster
23:37 - 23:39
23:39 - 23:44

editing. The P-queue. This is kind of interesting. The P-queue is also a sorted structure, but not quite.
23:44 - 23:46
23:46 - 23:46
23:46 - 23:48

It's a much more weakly
23:48 - 23:52

sorted structure. So if you're using the [inaudible] heap like we did on the
23:52 - 23:53

P-queue
23:53 - 23:55

[inaudible], it does give you this access to the maximum value,
23:55 - 23:59

and then kind of a quick reshuffling [inaudible] the next
23:59 - 24:00

and so on.
24:00 - 24:02

But it doesn't really give you the full range. For example, if you
24:02 - 24:10

were looking for the minimum in a P-queue, where is it?
24:10 - 24:11

It's somewhere in the
24:11 - 24:14

bottom. It's in the bottom-most level, but where across the bottom,
24:14 - 24:16

no knowledge, right? It could be
24:16 - 24:20

at any of the nodes that are at the bottom of the heap. So in fact it
24:20 - 24:21

gives you this access to
24:21 - 24:23

a partial
24:23 - 24:26

ordering of the data. In this case, it's relative
24:26 - 24:28

to the prioritization, but not
24:28 - 24:31

the fluid form of sorting the way a vector is. For example,
24:31 - 24:33

if P-queue doesn't give you things for
24:33 - 24:35

finding duplicates or finding the minimum or
24:35 - 24:39

doing the mode it's not actually an easy operation because you have
24:39 - 24:42

it in a heap form. I can get to the max
24:42 - 24:42

quickly.
24:42 - 24:46

So if I cared about pulling them out in sorted order
24:46 - 24:47

to browse them,
24:47 - 24:51

I could stuff things into a P-queue and then pull them out, but what's interesting about that is
24:51 - 24:54

it requires destroying the P-queue.
24:54 - 24:57

The P-queue isn't really a place you store things and plan on using them again. It's
24:57 - 25:00

the place you stick things in order to get them out
25:00 - 25:02

in an order that's only
25:02 - 25:05

value is in the de-queuing of them, pulling them out.
25:05 - 25:10

So finding the most promising option to look at in a search or finding the most
25:10 - 25:13

high-priority activity to take on or something.
25:13 - 25:16

But it doesn't really have the - if I wanted to be able to look at all
25:16 - 25:18

the students in my class in sorted order,
25:18 - 25:20

it's like, well, I could stick them in a P-queue and then pull them out, but it's sort of
25:20 - 25:25

funny. I have to put them in the P-queue just to pull them back out. It wasn't a good way to store them for that
25:25 - 25:27

browsing operation to be easily repeated.
25:27 - 25:29

If I put them into a sorted vector, I could just keep
25:29 - 25:33

iterating over it whenever I need to see it again.
25:33 - 25:37

So a lot of the things that it comes down to is having a pretty good
25:37 - 25:41

visualization of what it is that going to a pointer-based data structure buys you, and
25:41 - 25:44

what it costs you, relative to the continuous memory. That's one of the
25:44 - 25:45
25:45 - 25:48

tensions underneath most of these things. It's like,
25:48 - 25:50

putting more memory into it
25:50 - 25:54

by virtue of having a pointer wire as opposed to a
25:54 - 25:55

location
25:55 - 25:57

arrangement.
25:57 - 26:00

There's no information that's stored to get you from a race of zero to a
26:00 - 26:04

race of two. They're just continuous in memory. So it saves you space because you only
26:04 - 26:07

have to know where it starts and access continuously for that. This pointer means
26:07 - 26:11

there's more kerversals. There's more memory. There's more opportunity
26:11 - 26:12

to air, but it gave
26:12 - 26:16

you this flexibility. It broke down this continuous block, that's kind
26:16 - 26:17

of a monolith,
26:17 - 26:21

into these pieces that are more flexible in how you can rearrange
26:21 - 26:25

them. So thinking about these things is
26:25 - 26:28

an important skill to come away from 106b with that,
26:28 - 26:30

yeah, there's never
26:30 - 26:31

one right answer.
26:31 - 26:33
26:33 - 26:36

You're investing to solve this other problem, and
26:36 - 26:40

there's going to be downsides. There's no
26:40 - 26:41

obvious
26:41 - 26:43

triumph over all.
26:43 - 26:47

It's fast, it's cheap and it's small and easy to write
26:47 - 26:51

solutions out there.
26:51 - 26:53

So I put up here - I had to
26:53 - 26:54

whittle
26:54 - 26:57

this down to the five or six things that, at
26:57 - 27:02

the end of 106b, I want you to feel like, that's what I got.
27:02 - 27:04

That's the heart of
27:04 - 27:05
27:05 - 27:06

the
27:06 - 27:10

goal. The MVPs - the most valuable players of the
27:10 - 27:12

106b curriculum here.
27:12 - 27:15

You're looking back at it, and you go, where did I start, where did I end up,
27:15 - 27:18

and how did my knowledge grow?
27:18 - 27:21

I'm hoping these are the things that stand out for you. That was a
27:21 - 27:23

real
27:23 - 27:24

growth area for me. One
27:24 - 27:27

is this idea of abstraction.
27:27 - 27:32

That was stressed early on. Here's a stack, here's a queue, here's a set. They
27:32 - 27:33

do things for you. We don't know how
27:33 - 27:34

they work yet,
27:34 - 27:38

but you make cool things happen with them. So you solve maze problems and
27:38 - 27:40

random writers and
27:40 - 27:43

do neat things with
27:43 - 27:47

dictionaries without knowing how they work. So you're leveraging existing
27:47 - 27:48

material without knowledge of how it works.
27:48 - 27:52

It helps to keep the complexity down. You can solve much more interesting problems
27:52 - 27:56

than you could without them. Try to go back and imagine the Markov random writer
27:56 - 27:57

problem.
27:57 - 27:58

Now imagine doing it
27:58 - 28:00
28:00 - 28:02

without the map in play. So
28:02 - 28:04

you had to figure out that
28:04 - 28:08

given this character, go look it up and find a place to store it, and
28:08 - 28:11

then store the character in some other array that was growing on
28:11 - 28:14

demand, as you put stuff in there. You would never have completed that
28:14 - 28:18

program. You would still be writing it today, having pointer errors and whatnot
28:18 - 28:19

behind the scenes.
28:19 - 28:22

Having those tools already there, it's like having
28:22 - 28:25

the microwave and the food processor, and you're about to cook instead
28:25 - 28:26

of having a
28:26 - 28:27

dull knife
28:27 - 28:31

and doing all your work with a
28:31 - 28:31

dull
28:31 - 28:34

knife and a fire. You actually have
28:34 - 28:36

some real power
28:36 - 28:39

there. Recursion's a big theme in the weeks that followed that.
28:39 - 28:41

How to solve these problems that
28:41 - 28:43

have this similar structure,
28:43 - 28:47

looking at these exhaustive traversals and choice problems that can
28:47 - 28:51

be solved with backtracking. Having this idea of how to think about thing [inaudible]
28:51 - 28:53

using the recursive problem solving
28:53 - 28:55

and how to model
28:55 - 28:56

that
28:56 - 28:59

process using the computer was a
28:59 - 29:02

very important thing that comes back. We look at algorithm
29:02 - 29:05

analysis, this idea of making this
29:05 - 29:08

sloppy map that computer scientists are
29:08 - 29:11

fond of, talking about the scalability and growth, knowing about curves and
29:11 - 29:13

the relationships and how to
29:13 - 29:16

look at algorithms. The more formal analysis.
29:16 - 29:17

Still, in this case, there's still quite
29:17 - 29:21

a bit more formulas to this than we really went into. If you
29:21 - 29:23

were go on in CS and look at it, you will see there's quite a lot
29:23 - 29:26

of rigor that can be
29:26 - 29:27

looked at in that area.
29:27 - 29:31

We were getting more an intuition for how to gauge things about algorithms and
29:31 - 29:33

their growth.
29:33 - 29:36

Then this backside, all about implementation strategies and tradeoffs.
29:36 - 29:38

So now that we've
29:38 - 29:40

benefited from those abstractions, and we've
29:40 - 29:43

solved some cool problems, what's it like to actually be the person building them? What
29:43 - 29:46

are the techniques? What does it take to
29:46 - 29:50

wrestle a pointer or a Linklis or a tree, heap graph into doing something really cool for you. How
29:50 - 29:52

does that
29:52 - 29:57

inform your information, what you expect about it and the
29:57 - 30:00

coding complexity that was inherent in achieving that result?
30:00 - 30:04

Hopefully in continuing with the theme that 106a started you on was this
30:04 - 30:07

appreciation for [inaudible]. In the
30:07 - 30:11

end, getting code that works, that's ugly, just
30:11 - 30:14

isn't, hopefully, what you're satisfied with. You want to
30:14 - 30:17

approach problem solving to produce beautiful, elegant,
30:17 - 30:19

unified, clean,
30:19 - 30:20

readable code
30:20 - 30:24

that you would be happy to show to other people and be proud of and
30:24 - 30:25

maintain and live with.
30:25 - 30:27

The kind of work that
30:27 - 30:29

other people you work with would appreciate
30:29 - 30:32

being involved with.
30:32 - 30:34

Things that are not so much
30:34 - 30:37

- I don't think these are the most valuable pointers, but what
30:37 - 30:38

came along the way,
30:38 - 30:39
30:39 - 30:42

the idea of pointers and C++ being part of the -
30:42 - 30:45

C++ and its pointers, they're one and the same. We choose [inaudible] for a
30:45 - 30:48

language, so as a result, we get to learn some C++
30:48 - 30:51

because it's so heavily
30:51 - 30:53

invested with pointers. We also have to do some practice with the pointers to
30:53 - 30:55

get our jobs done.
30:55 - 30:58

I think of those as being off to the side. I don't think of those being the
30:58 - 31:01

platform of things I worship. They are
31:01 - 31:05

things we need to get our job done. I
31:05 - 31:08

did put a little note about pointers here because there were quite a lot of comments on
31:08 - 31:12

the mid-quarter evals that said, pointers, pointers still scare me. They make me crazy.
31:12 - 31:14

I don't know enough about
31:14 - 31:15

pointers. Here's a fact for you.
31:15 - 31:18

You'll probably never feel like you know enough about pointers.
31:18 - 31:21

Today, tomorrow, ten years from now, you'll still kind of feel like
31:21 - 31:24

there's some real subtlety and trickiness in there.
31:24 - 31:27

They're an extremely powerful facility that gives you that
31:27 - 31:31

direct control over allocation and deallocation and all the wiring and
31:31 - 31:32

sharing
31:32 - 31:34

that
31:34 - 31:37

is important to building these complicated data structures.
31:37 - 31:40

But there are so many pitfalls. I'm sure you've run into
31:40 - 31:43

most of these, if not all of them at some point, where you
31:43 - 31:46

mishandled a pointer, failed the allocator, deallocate,
31:46 - 31:49

and
31:49 - 31:52

the way it gets reported, whether the compiler or the runtime error that
31:52 - 31:56

you see is helpful in sorting it out is a real cause for
31:56 - 31:58

long hours to disappear
31:58 - 32:01

in the name of mastery of this.
32:01 - 32:02

If you're still wary of pointers, you
32:02 - 32:04

probably should be.
32:04 - 32:06

Think of this as your first go-around.
32:06 - 32:07

We're still
32:07 - 32:10

at the journeyman's stage for programming here.
32:10 - 32:14

So although there were pointers in 106a, they were very hidden. This
32:14 - 32:17

is the first time you really see it and are exposed to it and trying to get your head
32:17 - 32:18

around it.
32:18 - 32:20

If you go on to 107 and 108, you'll just
32:20 - 32:23

get more and more practice. It will
32:23 - 32:26

become more solid as you keep working your way through it, but
32:26 - 32:30

at this stage, it's okay
32:30 - 32:31

and
32:31 - 32:35

commonplace to feel still, a little bit - you see the star, and you take a
32:35 - 32:37

double take. That's just where you should be.
32:37 - 32:40

It's just like in the
32:40 - 32:43

Chinese restaurants. They put the little red star by food you want to be
32:43 - 32:49

careful of. That's why they put the star for the
32:49 - 32:52

pointer. I also wanted to
32:52 - 32:54

zoom out farther,
32:54 - 32:56

a decade from
32:56 - 32:59

now, when you are
32:59 - 33:01

thinking back on the
33:01 - 33:04

whole of your education and what you learned. Do I want to really
33:04 - 33:06

remember how to
33:06 - 33:08

type in keyword and the sequel plus generic
33:08 - 33:10

template facility? No.
33:10 - 33:13

If you code in C++ every day, you'll know that. If you don't, you
33:13 - 33:14

can forget it. It's long gone.
33:14 - 33:16

What I hope you'll carry away from you
33:16 - 33:18

is a couple
33:18 - 33:19

concepts that
33:19 - 33:21

broaden
33:21 - 33:23

outside of CS. One is this idea of algorithmic thinking,
33:23 - 33:27

how you approach something logically and step-wise and draw pictures and analyze it to understand
33:27 - 33:29

the cases
33:29 - 33:32

and to debug when things aren't working well. I think that's a skill that
33:32 - 33:33

transcends
33:33 - 33:37

the art of computer programming into other domains, any kind of thinking
33:37 - 33:39

logically and problem solving and analyzing in that
33:39 - 33:43

way. I also hope that some of the experimental things we did with the sort
33:43 - 33:45

lab and the P-queue
33:45 - 33:48

and some of the big O and thing we tried to do in class together help
33:48 - 33:51

you to [inaudible] back of the envelope calculations. This is something I'm
33:51 - 33:53

a little bit of a -
33:53 - 33:55

it's a pet issue of mine, which is,
33:55 - 33:57

I think, in the era of computers
33:57 - 34:00

and whatnot, really easy to get distanced from numbers and start
34:00 - 34:03

thinking, I just punch numbers into formulas and one comes out. It must be
34:03 - 34:06

the truth. Not having any intuition as to whether that's the right
34:06 - 34:10

number, whether that number makes sense, whether it's grounded,
34:10 - 34:14

and whether you actually are [inaudible] or so off because you've
34:14 - 34:17

actually made an entry error, and you don't even notice it.
34:17 - 34:20

So becoming comfortable with this idea of using numbers to drive things,
34:20 - 34:21

being able to
34:21 - 34:26

take some time trials, do some predictions, do some more time trials,
34:26 - 34:28

match those things up, do a little sketching on your numbers and see the
34:28 - 34:32

things that are working out. So being comfortable having the math and the theory
34:32 - 34:34

reinforce each other
34:34 - 34:34

and
34:34 - 34:36

not get too
34:36 - 34:39

distanced from the fact that those numbers do play out in
34:39 - 34:42

the real world in ways that are interesting. Don't just let the numbers themselves
34:42 - 34:44

tell the story. There really is more
34:44 - 34:47

to connect it up with.
34:47 - 34:49

This idea of tradeoffs.
34:49 - 34:51

You may not remember
34:51 - 34:54

how you can write a Linklis or how the hash table does
34:54 - 34:56

the things it does
34:56 - 35:00

and how to get one working, but hopefully you will take away this idea that
35:00 - 35:03

the answer to most questions begins with the phrase, it depends.
35:03 - 35:05

If I say, is it better to use a hash table or [inaudible] search tree, the answer
35:05 - 35:09

will be, it depends. There's no,
35:09 - 35:12

it's always better to use this sort. It's always
35:12 - 35:14

wrong to use this approach.
35:14 - 35:15

It's, it depends. Do
35:15 - 35:18

you have the code written for that, and it works well, and you don't care how
35:18 - 35:19

long it takes?
35:19 - 35:23

Then bubble sort actually could be the right answer.
35:23 - 35:25

Barack Obama not withstanding.
35:25 - 35:27

But knowing whether memory is
35:27 - 35:32

at premium, time is at premium, what your mix of operations are,
35:32 - 35:33

what
35:33 - 35:34

language you're working with, what
35:34 - 35:38

program expertise you have, what timeline you have for developing it all
35:38 - 35:42

can play a role in deciding what choices to make and what strategy to pursue.
35:42 - 35:45

So being very comfortable with this idea that it's about gathering information
35:45 - 35:46

before you
35:46 - 35:49

can commit to one strategy as opposed to it's always going to be
35:49 - 35:52

this or that.
35:52 - 35:54

So that's the 106b thing.
35:54 - 35:57

I have a couple slides here on
35:57 - 35:59

things that happen after 106b. I'm
35:59 - 36:01

just going to go ahead and show you.
36:01 - 36:03
36:03 - 36:06

This would be a great time to ask questions. In particular, if there's anything you're
36:06 - 36:07

curious about.
36:07 - 36:10

Where do we go from here, and
36:10 - 36:13

how do we make good decisions about things that follow? The
36:13 - 36:17

most obvious following courses from CS - if you took 106b
36:17 - 36:19

and you hated it, I'm very sorry.
36:19 - 36:23

Hopefully
36:23 - 36:25

the scarring
36:25 - 36:26

will heal over and you'll lead a
36:26 - 36:28

productive and happy life
36:28 - 36:30

elsewhere. If you did like it and you think,
36:30 - 36:33

I'm kind of interested in this. What do I do next to explore further?
36:33 - 36:35

Where else can I go with this?
36:35 - 36:38

I put up the intro
36:38 - 36:40

courses with their numbers and names, and I'll talk a little bit
36:40 - 36:44

about what each of these does to give you an idea of what opportunity it
36:44 - 36:45

offers to you.
36:45 - 36:49

The obvious follow onto the programming side - 106a and b
36:49 - 36:51

are programming courses, building your programming mastery.
36:51 - 36:54

107 follow in the same vein. We typically call those systems
36:54 - 36:56

courses in CS nomenclature.
36:56 - 36:59

That means practical, hands on programming, getting more exposure
36:59 - 37:00

to the system.
37:00 - 37:05

So 107 builds on 106b's knowledge in a class called programming paradigms.
37:05 - 37:08

Part of what it's trying to explore is different languages.
37:08 - 37:11

So you actually look at the language, C, kind of stepping back to old school.
37:11 - 37:15

You look at the language scheme, which is a functional language that
37:15 - 37:17

was developed for AI purposes
37:17 - 37:17

that
37:17 - 37:21

has a long history coming from the math side of things. Probably looking at some
37:21 - 37:25

scripting and flexible [inaudible] like Python. We kind of mix them up a
37:25 - 37:28

little bit, so it's not exactly the same things, but looking at other
37:28 - 37:30

languages, other ways of doing things.
37:30 - 37:33

The other component of 107 is also looking at
37:33 - 37:35

some more low-level things.
37:35 - 37:37

What happens? When you
37:37 - 37:41

pass your code off to the compiler, what is it doing? What kind of analysis does do of your
37:41 - 37:44

code? How does it actually generate machine code? What happens in the machine layer,
37:44 - 37:47

understanding some more of the performance implications
37:47 - 37:50

and how things are expressed. Getting a better understanding of pointers actually
37:50 - 37:53

comes up because there is a certain amount of low-level coding that's being
37:53 - 37:54

explored in this class.
37:54 - 37:58

So building on the programming skill as we move toward a Unix
37:58 - 38:00

Linux-based
38:00 - 38:02

environment. So away from the Mac and Windows.
38:02 - 38:03
38:03 - 38:08

Just continue building larger programs with complicated
38:08 - 38:10

features to grow your mastery.
38:10 - 38:13

Internally, we call this class programming maturity, even though
38:13 - 38:14

it's not its title.
38:14 - 38:18

We see it that way in the sequence in terms of just growing you into
38:18 - 38:19

a
38:19 - 38:22

more versatile programmer, seeing different languages, different things
38:22 - 38:25

and getting more understanding of what happens under the hood.
38:25 - 38:29

It is a five-unit class. It's typically offered fall and spring. Most
38:29 - 38:30

people thing it is
38:30 - 38:33

about as much work as 106b. Some a
38:33 - 38:37

little more or a little less in different parts of the quarter, depending on what your
38:37 - 38:38

background is and
38:38 - 38:39
38:39 - 38:41

how 106b worked for you.
38:41 - 38:44

In a model, I think most people think of it as being - would you guys
38:44 - 38:45

say 107
38:45 - 38:48

about as hard as 106b?
38:48 - 38:51

Some ways, it's harder. Some ways, it's not. Maybe that's
38:51 - 38:55

the deal. The 108 class which follows onto 107, but that
38:55 - 38:57

prerequisite is particularly strong. It's a Java class.
38:57 - 39:00

So moving in the other direction. Rather than moving down an extraction, moving
39:00 - 39:02

up on the extraction,
39:02 - 39:05

looking at more large-scale design, modern programming technique,
39:05 - 39:06

using
39:06 - 39:10

object-oriented facilities, thinking about large-scale programming. You do a
39:10 - 39:11

big team project in the
39:11 - 39:13

end, so a three week,
39:13 - 39:15

three person, massive effort
39:15 - 39:17

that is the first
39:17 - 39:26

really big scale thing that you'll be exposed to in the lower-division
39:26 - 39:30

curriculum. [Inaudible]. Sometimes in 107l, which has some relationship to the
39:30 - 39:33

106l in the way that it's more hands-on, more
39:33 - 39:34

C++,
39:34 - 39:37

I don't know whether it will be offered this spring or not,
39:37 - 39:38

but
39:38 - 39:41

if you look in Access, it will tell you at least whether we have it tentatively on
39:41 - 39:42

the schedule.
39:42 - 39:46

I think that comes and goes with the interest from the students and the
39:46 - 39:51

availability of someone to teach
39:51 - 39:52

it.
39:52 - 39:55

[Inaudible] can we watch [inaudible]?
39:55 - 39:58

Yeah, 107 and 108 are offered on TV usually
39:58 - 40:02

once a year. 107 is offered on TV this spring. The one in fall is
40:02 - 40:03

not taped.
40:03 - 40:06

It might be that the old lectures from spring are somewhere that
40:06 - 40:09

you could dig them out, but I actually don't happen to know for sure.
40:09 - 40:11

108 is on TV, I think,
40:11 - 40:13

in the fall, not in the winter. So the fall lectures are probably still around for
40:13 - 40:16

108, so you could take a look at them. You
40:16 - 40:19

could certainly look at their websites, and
40:19 - 40:23

there's often a lot of old materials and handouts
40:23 - 40:26

and stuff that gives you at least some idea of things that have been
40:26 - 40:29

covered in the most recent offering of it.
40:29 - 40:34

Certainly showing up in the first week is one way to get an
40:34 - 40:36

idea of what you're in for.
40:36 - 40:38

108 is a four-unit class.
40:38 - 40:41

Because it's in Java, there's certain things about Java,
40:41 - 40:44

the safety and
40:44 - 40:48

the attentiveness of the compiler, makes the error handing a little
40:48 - 40:52

bit easier. So I don't think most people think of 108 as quite as intense as
40:52 - 40:55

107 or 106b, but that project at the end has quite a reputation
40:55 - 40:58

as being a
40:58 - 41:02

real time suck. So it has lots to do with being
41:02 - 41:04

scheduled and motivated and working through that team project at
41:04 - 41:05

the end.
41:05 - 41:09

I think that's where most people find the big challenge of
41:09 - 41:12

108. Then on the math side, in the
41:12 - 41:15

nomenclature for CS, we have this idea of theory, introducing you to more
41:15 - 41:16
41:16 - 41:19

the formalism for big O and discrete structures and doing proofs and
41:19 - 41:21

thinking about logic.
41:21 - 41:25

So we have a 103 a and b and a 103x, which is a combined,
41:25 - 41:28

accelerated form of a and b, which serves as math classes. They are math classes
41:28 - 41:29

for introducing
41:29 - 41:33

the kind of mathematics that are important for computer scientists.
41:33 - 41:34

So that is
41:34 - 41:38

for people who are thinking about a CS major, another good course to get in
41:38 - 41:39

early. Pretty
41:39 - 41:43

much everything in the upper division of the CS major has
41:43 - 41:47

these things layered down as prereqs. So these are the intro courses that serve as the
41:47 - 41:50

gateway to prepping you to go on, and then looking at things in a more topical
41:50 - 41:51

way.
41:51 - 41:53

Looking at networks or security,
41:53 - 41:54

HCI or
41:54 - 41:59

artificial intelligence layers on a grounding in theory and programming
41:59 - 42:03

that came from the intro courses. So
42:03 - 42:07

that said, we're in the
42:07 - 42:08

midst of a really
42:08 - 42:10

serious curriculum revision,
42:10 - 42:13

which is
42:13 - 42:15

on the table and being designed right now,
42:15 - 42:18

and was voted on by our faculty and improved and has to go
42:18 - 42:22

through the school of engineering and get approval. Then there's going to be a little bit of
42:22 - 42:23

jostling as everything gets worked out.
42:23 - 42:27

But started next year, you're going to see some changes in the
42:27 - 42:29

department. Some of these courses are going to
42:29 - 42:33

get morphed, and
42:33 - 42:36

they won't be exactly the way they are today.
42:36 - 42:39

As a Stanford undergrad, you actually have the option of graduating under any set of
42:39 - 42:42

requirements that are in play any time you're an undergrad. So you could graduate under this year's
42:42 - 42:43

requirements,
42:43 - 42:45

or you could wait.
42:45 - 42:48

I think for most, the right answer's going to be wait because I think the new
42:48 - 42:51

arrangement is very much more student-friendly.
42:51 - 42:53

It has more flexibility
42:53 - 42:57

in the program. If you look at the CS program as it is, it has a lot of pick
42:57 - 43:00

one of two, pick one of three, pick one of two,
43:00 - 43:04

where you're very constrained on what you can choose. The new program is going to be
43:04 - 43:07

more track-based where you pick an area of interest. You say, I'm really interested in the
43:07 - 43:08

graphics. You do a little
43:08 - 43:12

depth concentration in that, and then just have room for a
43:12 - 43:15

very wide selection of electives to fill up the program as opposed to
43:15 - 43:19

a smorgasbord where we forced you to pick through a bunch of different areas.
43:19 - 43:24

So for most students, I think it gives you flexibility that actually gives you
43:24 - 43:25

the options
43:25 - 43:26

to
43:26 - 43:30

pick the classes that are most interesting to you and get the material that you want. It's a growing
43:30 - 43:32

recognition of the fact that CS
43:32 - 43:35

has really broadened as a field. Some say that our major looks the same way
43:35 - 43:37

it did ten years ago,
43:37 - 43:40

a classic definition of here's these
43:40 - 43:42

seven things that all matter. The
43:42 - 43:45

things aren't seven anymore. There's no 15 of them, 25 of them. If
43:45 - 43:49

you look at the spectrum of things that computer science is now
43:49 - 43:52

embracing, we can't say, you have to take one of all these things without keeping you here
43:52 - 43:53

for eight years.
43:53 - 43:55

So we are
43:55 - 43:58

allowing that our field's footprint has gotten so large that we need to
43:58 - 44:02

let you pick and choose a little bit to get the slice that seems most
44:02 - 44:03

appropriate for where you're headed.
44:03 - 44:06

So you'll see more about that.
44:06 - 44:10

If you are thinking about being a CS major
44:10 - 44:13

and have not yet declared,
44:13 - 44:17

I would advise
44:17 - 44:22

that you add yourself to what's called the considering CS list,
44:22 - 44:26

and you can get to this from the CS course advisor's page,
44:26 - 44:27

which,
44:27 - 44:30

off the top of my head, I think
44:30 - 44:32

it might be
44:32 - 44:36

CSadvising, but I'm not positive. I'll put the link on
44:36 - 44:39

our webpage, when I remember what it is.
44:39 - 44:42

It is a mailing list. So we have a mailing list of the declared undergrads, but we
44:42 - 44:45

also have a list for people who are
44:45 - 44:48

at least potentially interested in the major and just want to be kept up to date on
44:48 - 44:49

things that are happening.
44:49 - 44:52

There's going to be a lot of activity on this list in the
44:52 - 44:55

upcoming months as we publish the information about the new
44:55 - 44:56

curriculum.
44:56 - 44:57

So
44:57 - 45:00

you'll know what's coming down the pipes so you can make good decisions for
45:00 - 45:01

preparing for
45:01 - 45:04

what will be [inaudible], what the transition plan will be as we move into the new
45:04 - 45:05

courses
45:05 - 45:06

and how to make sure you
45:06 - 45:09

land
45:09 - 45:13

in the right place when all is said
45:13 - 45:14

and done. Then I put
45:14 - 45:15

a little note about
45:15 - 45:17

other majors that exist.
45:17 - 45:19

CS is
45:19 - 45:20
45:20 - 45:22

the home base for people who are
45:22 - 45:25

solving problems using computers. It spans a large variety of people
45:25 - 45:27

doing a lot of different things.
45:27 - 45:30

Then there's a couple other majors that we participate in, in the interdisciplinary
45:30 - 45:32

sense,
45:32 - 45:34

that UCS is part of a larger
45:34 - 45:37

context for doing things. So computer systems engineering, which is
45:37 - 45:40

between us and EE, which is a little bit more about hardware.
45:40 - 45:43

The symbolic systems program, which is
45:43 - 45:47

run by linguistics and includes cooperation from psychology, philosophy, communications,
45:47 - 45:51

CS looking at artificial intelligence and modeling human
45:51 - 45:52

thought and
45:52 - 45:53

understanding
45:53 - 45:55
45:55 - 45:57

intelligence as expressed in that way.
45:57 - 45:59

Then there's a mathematical computational science program that's homed
45:59 - 46:03

in the statistic department that is joint between math, computer science and the
46:03 - 46:06

MS and E. It's more of an applied math degree. So taking
46:06 - 46:08

mathematical thinking and
46:08 - 46:09
46:09 - 46:12

using computers and statistical things to analyze and
46:12 - 46:16

think about things. A lot of financial people end up
46:16 - 46:20

being drawn to that particular one or people who are doing risk analysis or
46:20 - 46:21

that sort of
46:21 - 46:24

combination of things. But where CS has a role to play in all of these because it's a
46:24 - 46:27

great tool for solving these kind of problems. So
46:27 - 46:29

different ways to get at
46:29 - 46:30

different pieces
46:30 - 46:33

of the major.
46:33 - 46:35

I've put a couple notes of things that
46:35 - 46:38

I think are worth
46:38 - 46:39

looking into
46:39 - 46:40

in case they,
46:40 - 46:43

at some point, are the right thing for you to look at. Section leading,
46:43 - 46:45

so joining the 106 staff
46:45 - 46:46

and shepherding the next generation
46:46 - 46:49

of students through the program
46:49 - 46:52

is a great way to strengthen your own knowledge. There's nothing like
46:52 - 46:54

making sure you understand recursion as to try and explain it to someone else who
46:54 - 46:55

doesn't understand it.
46:55 - 46:56

I
46:56 - 46:59

can really do a lot for
46:59 - 47:01

the personal connection to you and your craft
47:01 - 47:04

as well as just being involved in a great community. The section leaders are a really
47:04 - 47:05

wonderful,
47:05 - 47:08

dedicated group of students who are
47:08 - 47:08
47:08 - 47:12

fun to be around and great people to make friends with. So I
47:12 - 47:14

highly encourage you to take a look at that.
47:14 - 47:16

We interview for that every quarter,
47:16 - 47:17

so usually about week six,
47:17 - 47:19

we put out
47:19 - 47:21

applications and bring people in for interviews. We're always looking for new people
47:21 - 47:24

to join on a quarterly basis.
47:24 - 47:27

There are research opportunities all around the campus. One that the CS
47:27 - 47:30

department sponsors in particular is their [inaudible] summer institute.
47:30 - 47:33

So matching professors with undergrads to do work in the summer for
47:33 - 47:35

slave wages.
47:35 - 47:38

But great opportunity. You can get started on
47:38 - 47:40

thinking about how research might be part of your future.
47:40 - 47:43

One of the best preparations for thinking about grad school.
47:43 - 47:46

We do have an honors program.
47:46 - 47:50

Like most, we have a chance to stake out on an independent project and
47:50 - 47:50

show your mettle.
47:50 - 47:53

We have a co-terminal masters program
47:53 - 47:54

where you can
47:54 - 47:57

get some of the depth
47:57 - 47:58

before you leave this place.
47:58 - 48:01

Not to mention, there's a million fabulous
48:01 - 48:03

things to do with Stanford, getting involved with
48:03 - 48:06

arts or nonprofit organizations or sports or the student leadership
48:06 - 48:07

or whatever.
48:07 - 48:10

Maybe the most important memory I have from my undergrad was that the
48:10 - 48:13

trickiest part of my undergrad was learning what to say no to.
48:13 - 48:14

Realizing that there are
48:14 - 48:17

10,000 fabulous, worthwhile opportunities,
48:17 - 48:21

and doing all of them only means sacrificing your sanity and
48:21 - 48:23

the quality of the things you do.
48:23 - 48:24
48:24 - 48:29

So be careful and thoughtful about choosing the things you do and embracing them.
48:29 - 48:32

Be okay with saying, I won't do this. Even though it's great to go overseas, and I
48:32 - 48:35

would've liked that opportunity, it's not going to fit, and I'm okay with that. Don't
48:35 - 48:38

beat yourself up over it and cause
48:38 - 48:41

yourself too much grief trying to do too many things.
48:41 - 48:44

I won't be here on Friday. Keith is going to come. He's going to teach you about C++ in the real
48:44 - 48:45

world.
48:45 - 48:47

I'm going to see you guys Friday at the final.
48:47 - 48:57

I think that's all I need to say for today.

Title:: Lecture 26 | Programming Abstractions (Stanford)
Description:: Lecture 26 by Julie Zelenski for the Programming Abstractions Course (CS106B) in the Stanford Computer Science Department.

Julie ties up the "loose ends" of the course: after a general review of the concepts covered in the course, she asks which of two examples is the better. She then covers manipulation of dynamic data structures (lists, trees, graphs) as well as reviews code from similar assignments. Then she touches on design and efficiency of code. Lastly, she goes over the next courses one might take if continuing in computer science and the different paths possible.

Complete Playlist for the Course:
http://www.youtube.com/view_play_list?p=FE6E58F856038C69

CS 106B Course Website:
http://cs106b.stanford.edu

Stanford Center for Professional Development:
http://scpd.stanford.edu/

Stanford University:
http://www.stanford.edu/

Stanford University Channel on YouTube:
http://www.youtube.com/stanforduniversity/

more » « less
Video Language:: English
Duration:: 49:05

	N. Ueda edited English subtitles for Lecture 26 \| Programming Abstractions (Stanford)
	Eunjeong_Kim added a translation

English subtitles

Revisions

Revision 2

N. Ueda

Lecture 26 | Programming Abstractions (Stanford)

Revisions

Our website uses cookies

Operating cookies (Required)