< Return to Video

Lecture 7 | Programming Abstractions (Stanford)

  • 0:00 - 0:08
  • 0:08 - 0:11
  • 0:11 - 0:22
    This presentation is delivered by the Stanford Center for Professional Development.
  • 0:22 - 0:25
    Okay, so thank you for coming out in the rain.
  • 0:25 - 0:26
    I
  • 0:26 - 0:27
    appreciate you
  • 0:27 - 0:29
    trudging over here and
  • 0:29 - 0:30
    risking
  • 0:30 - 0:31
    your health and life.
  • 0:31 - 0:37
    So I'm gonna do a little diversion here to explain this idea of functions as data before I come back
  • 0:37 - 0:40
    to fixing the problem we had kind of hit upon at the very end of the last lecture
  • 0:40 - 0:42
    about set.
  • 0:42 - 0:44
    So what I'm just gonna show you is
  • 0:44 - 0:45
    a little bit about
  • 0:45 - 0:46
  • 0:46 - 0:49
    the design of something in C++
  • 0:49 - 0:52
    that is gonna helpful in solving the problem we'd run into.
  • 0:52 - 0:55
    What I'm gonna show you here is two pieces of code that plot
  • 0:55 - 0:57
    a
  • 0:57 - 1:01
    single value function across the number line. So in this case, the function is that it's the
  • 1:01 - 1:05
    top one is applying the sine function, the sine wave as it varies across, and then the
  • 1:05 - 1:06
    square root function
  • 1:06 - 1:09
    which goes up and has a sloping curve to it,
  • 1:09 - 1:13
    and that both of these have exactly the same behaviors. They're designed to kind of go into the graphics
  • 1:13 - 1:17
    window, and they're just using a kind of a very simple kind of hand moving strategy. It
  • 1:17 - 1:21
    starts on a particular location based on what the value of sine is here,
  • 1:21 - 1:23
    and then over a particular interval
  • 1:23 - 1:24
    at 0.1
  • 1:24 - 1:28
    of one tenth of an inch, it plots the next point and connects a line between
  • 1:28 - 1:31
    them. So it does a simple kind of line approximation of what
  • 1:31 - 1:34
    that function looks like over the interval you asked it to plot.
  • 1:34 - 1:37
    The same code is being used here for plotting the square root.
  • 1:37 - 1:41
    And the thing to note and I tried to highlight by putting it in blue here was that every other
  • 1:41 - 1:45
    part of the code is exactly the same except for those two calls where I need to know what's
  • 1:45 - 1:48
    the value of square root starting at this particular
  • 1:48 - 1:51
    X. So starting at Value 1,
  • 1:51 - 1:53
    what is sine of one? What is square root of one? As I
  • 1:53 - 1:57
    work my way across the interval, what's a square root of 1.1, 1.2, 1.3 and
  • 1:57 - 1:59
    sine of those same values?
  • 1:59 - 2:03
    And so everything about the code is functionally identical. It's kind of frustrating
  • 2:03 - 2:07
    to look at it, though, and realize if I wanted to plot a bunch of other things -I also wanna be able to plot
  • 2:07 - 2:09
    the cosine function, or
  • 2:09 - 2:12
    some other function of my own creation
  • 2:12 - 2:13
    -across this interval
  • 2:13 - 2:17
    that I keep having to copy and paste this code and duplicate it.
  • 2:17 - 2:21
    And so one of the things that hopefully your 106A and prior experiences really
  • 2:21 - 2:24
    heightened your attention to is
  • 2:24 - 2:26
    if I have the same piece of code
  • 2:26 - 2:30
    in multiple places, I ought to be able to unify it. I ought to be able to make there be a plot
  • 2:30 - 2:31
    function
  • 2:31 - 2:36
    that can handle both sine and square root without actually having to distinguish it by copy and pasting,
  • 2:36 - 2:38
    so I wanna unify these two.
  • 2:38 - 2:43
    And so there is -the mechanism that we're gonna use kinda follows naturally if you don't let yourself get
  • 2:43 - 2:45
    too tripped up by what it means.
  • 2:45 - 2:48
    Just imagine that the parameter for example going into the function right now are the start and the stop,
  • 2:48 - 2:49
    the interval
  • 2:49 - 2:52
    from X is one to five.
  • 2:52 - 2:55
    What we'd like to do is further parameterize the function.
  • 2:55 - 2:58
    We'd like to add a third argument to it, which is to say
  • 2:58 - 3:03
    and when you're ready to know what function you're plotting, here's the one to use.
  • 3:03 - 3:04
    I'd like you to plot
  • 3:04 - 3:08
    sine over the interval start to stop. I'd like you to plot square root over that. So we added the third argument,
  • 3:08 - 3:11
    which was the function we wanted to invoke there.
  • 3:11 - 3:16
    Then we would be able to unify this down to where we had one generic plot function.
  • 3:16 - 3:19
    The good news is that this does actually
  • 3:19 - 3:20
    have
  • 3:20 - 3:23
    support features in the C++ language that let us do this.
  • 3:23 - 3:26
    The syntax for it is a little bit unusual, and
  • 3:26 - 3:29
    if you think about it too much, it can actually make your head hurt a little bit, think about what
  • 3:29 - 3:33
    it's doing. But you can actually use a function, a function's name
  • 3:33 - 3:35
    and the code that is associated with that name,
  • 3:35 - 3:37
    as a piece of data
  • 3:37 - 3:41
    that not just -you think of the code as we're calling this function, and
  • 3:41 - 3:44
    we're moving these things around, and executing things.
  • 3:44 - 3:47
    The things that we tend to be executing and operating on you think of as being integers, and strings,
  • 3:47 - 3:49
    and characters, and files.
  • 3:49 - 3:52
    But you can also extend your notion of what's data
  • 3:52 - 3:54
    to include the code you wrote
  • 3:54 - 3:56
    as part of the possibilities.
  • 3:56 - 3:59
    So in this case, I've added that third argument that I wanted to plot,
  • 3:59 - 4:02
    and this
  • 4:02 - 4:06
    syntax here that's a little bit unusual to you, and I'll kind of identify it, is that the
  • 4:06 - 4:09
    name of the parameter is actually FN.
  • 4:09 - 4:12
    Its name is enclosed in parentheses there.
  • 4:12 - 4:16
    And then to the right would be a list of the arguments
  • 4:16 - 4:17
    or the prototype
  • 4:17 - 4:21
    information about this function, and on the left is its return value.
  • 4:21 - 4:24
    So what this says is you have two doubles, the start and stop,
  • 4:24 - 4:26
    and the third thing in there isn't a double at all. It is a
  • 4:26 - 4:27
    function
  • 4:27 - 4:32
    of one double that returns a double, so a single value function that operates on doubles
  • 4:32 - 4:33
    here.
  • 4:33 - 4:36
    That is the syntax in C++ for specifying what you want coming in here
  • 4:36 - 4:37
    is
  • 4:37 - 4:41
    not a double, not a ray of doubles, anything funky like that. It is a function of a
  • 4:41 - 4:43
    double that returns a double.
  • 4:43 - 4:45
    And then in the body of this code,
  • 4:45 - 4:47
    when I say FN here
  • 4:47 - 4:50
    where I would've said sine, or square root, or identified
  • 4:50 - 4:51
    a particular function,
  • 4:51 - 4:53
    it's using the parameter
  • 4:53 - 4:57
    that was passed in by the client, so it says call the client's function
  • 4:57 - 4:58
    passing these values
  • 4:58 - 5:01
    to plot over the range
  • 5:01 - 5:03
    using their function
  • 5:03 - 5:07
    in. So the idea is that valid calls to plot now become things like plot -and then give it an
  • 5:07 - 5:09
    interval, zero to two,
  • 5:09 - 5:11
    and you give the name of a function:
  • 5:11 - 5:15
    the sine function, which comes out of the math library, the square root function,
  • 5:15 - 5:16
    also in the math library.
  • 5:16 - 5:19
    It could be that the function is something you wrote yourself,
  • 5:19 - 5:21
    the my function.
  • 5:21 - 5:25
    In order for this to be valid though, you can't just put any old function name there. It is
  • 5:25 - 5:27
    actually being quite specific about it
  • 5:27 - 5:30
    that plot was to find [inaudible]. It took a double, returned a double. That's the kind of function
  • 5:30 - 5:33
    that you can give it, so any function that has that
  • 5:33 - 5:36
    prototype, so it matches that
  • 5:36 - 5:40
    format is an acceptable one to pass in. If you try to pass something that actually
  • 5:40 - 5:42
    just does some other kind of function,
  • 5:42 - 5:45
    it doesn't have the same prototype, so the get line that takes no arguments and returns a string just
  • 5:45 - 5:47
    doesn't match on any front.
  • 5:47 - 5:51
    And if I try to say here, plot the get line function of the integral two to five,
  • 5:51 - 5:53
    it will quite rightfully complain to me
  • 5:53 - 5:54
  • 5:54 - 5:56
    that that just doesn't make sense.
  • 5:56 - 5:57
    That's because it doesn't match.
  • 5:57 - 6:02
    So a little bit of syntax here, but actually kind of a very powerful thing, and it allows us to write to
  • 6:02 - 6:05
    an addition to kind of parameterizing on these things you think of as traditional
  • 6:05 - 6:07
    data, integers, and strings, and whatnot. It's
  • 6:07 - 6:12
    also say as part of your operations, you may need to make a call out to some other function.
  • 6:12 - 6:15
    Let's leave it open what that function is and allow the client to specify what function
  • 6:15 - 6:20
    to call at that time. So in this case for a plot, what function that you're trying to plot, let the client
  • 6:20 - 6:21
    tell you,
  • 6:21 - 6:29
    and then based on what they ask you to plot you can plot different things. All right. Way in the back. Is
  • 6:29 - 6:34
    there a similar setup for multivariable functions [inaudible]? Certainly. So all I would
  • 6:34 - 6:37
    need to do if this was a function that took a couple arguments, I would say double comma double comma int. If it
  • 6:37 - 6:42
    returned void, [inaudible] returned. Sometimes it looks a little bit like the prototype kinda taken out of context
  • 6:42 - 6:43
    and stuffed in there,
  • 6:43 - 6:46
    and then those parens around the function are a very important part of that
  • 6:46 - 6:50
    which is telling you yeah, this is a function of these
  • 6:50 - 6:52
    with this prototype information.
  • 6:52 - 6:54
    Behind you. No? You're good now. Somebody else? Over here? Is that FN a fixed
  • 6:54 - 6:56
    name
  • 6:56 - 6:57
  • 6:57 - 7:01
    [inaudible]? No. It's just like any parameter name. You get to pick it. So I could've called it plot function, my function,
  • 7:01 - 7:08
    your function, whatever I wanted. Here
  • 7:08 - 7:12
    in the front. [Inaudible] Java? So Java doesn't really have a similar mechanism that looks like this. C does, so C++ inherits it
  • 7:12 - 7:14
    from
  • 7:14 - 7:18
    C. There are other ways you try to accomplish this in Java. It tries to support the same functionality in the end, but it
  • 7:18 - 7:19
  • 7:19 - 7:22
    uses a pretty different approach than a functions as
  • 7:22 - 7:27
    data approach. [Inaudible]?
  • 7:27 - 7:28
    Can you pass like a
  • 7:28 - 7:31
    method, like an operator? So typically not.
  • 7:31 - 7:35
    This syntax that's being used here is for a free function, a function
  • 7:35 - 7:38
    that's kind of out in the global namespace, that level.
  • 7:38 - 7:42
    There is a different syntax for passing a method, which is a little bit more messy,
  • 7:42 - 7:46
    and we won't tend to need it, so I won't go there with you, but it does exist. It just as it stands
  • 7:46 - 7:50
    does not want a method. It wants a function. So
  • 7:50 - 7:54
    a method meaning member function of a class.
  • 7:54 - 7:56
    Okay, so let me -that was kind of just to
  • 7:56 - 8:00
    set the groundwork for the problem we were trying to solve in set,
  • 8:00 - 8:05
    which was set is holding this collection of elements that the client has stuffed in there. It's
  • 8:05 - 8:05
    a generic
  • 8:05 - 8:07
    templative class, so it doesn't
  • 8:07 - 8:11
    have any preconceived notion about what's being stored. Are the strings? Are they student structures?
  • 8:11 - 8:13
    Are they integers?
  • 8:13 - 8:17
    And that in order to perform its operations efficiently, it actually is using this
  • 8:17 - 8:22
    notion of ordering, keeping them in an order so that it can iterate an order. It can quickly find on the basis
  • 8:22 - 8:27
    of using order to quickly decide where something has to be and if it's present in the collection.
  • 8:27 - 8:29
    So how does it know how to compare
  • 8:29 - 8:31
    something that's of unknown type?
  • 8:31 - 8:36
    Well, what it does is it has a default strategy. It makes an assumption that
  • 8:36 - 8:39
    if I used equals equals and less than that would tell me kinda where to go.
  • 8:39 - 8:42
    And so it's got this idea that it wants to know. We'll give it two things.
  • 8:42 - 8:47
    Are they the same thing? In which case, it uses kinda the zero to show the ordering between them. If one
  • 8:47 - 8:51
    precedes the other, it wants to have some negative number that says well this one precedes it, or some
  • 8:51 - 8:53
    positive number
  • 8:53 - 8:54
    if it follows it.
  • 8:54 - 8:57
    So it applies this operation to the [inaudible] that
  • 8:57 - 9:02
    are there, and for strings, and ints, and doubles, and characters,
  • 9:02 - 9:04
    this works perfectly well.
  • 9:04 - 9:08
    And so that's how without us going out of our way, we can have sets of things
  • 9:08 - 9:11
    that respond to the built in relational operators
  • 9:11 - 9:14
    without any special effort as a client.
  • 9:14 - 9:17
    But what we can get into trouble with right is when
  • 9:17 - 9:20
    equals equals less than don't make sense for type.
  • 9:20 - 9:24
    So let me just go actually type some code, and I'll show you. I have it on the slide, but I'm gonna -wow,
  • 9:24 - 9:29
    this chair suddenly got really short. [Inaudible] fix that. Okay.
  • 9:29 - 9:30
  • 9:30 - 9:33
    We'll go over here because I think it's better just to see it really happening,
  • 9:33 - 9:36
    so I'm gonna ignore this piece of code because it's not what I wanted.
  • 9:36 - 9:40
    But if I make some student T structure,
  • 9:40 - 9:40
    and it's got
  • 9:40 - 9:44
    the first and last name, and it's got the ID number,
  • 9:44 - 9:48
    and maybe that's all I want for now
  • 9:48 - 9:48
    -that if
  • 9:48 - 9:51
    down in my main -can't
  • 9:51 - 9:54
    find my main. There it is. I'm
  • 9:54 - 9:56
    gonna need that piece of code later, so I'm gonna leave it there.
  • 9:56 - 9:58
    If I make a set,
  • 9:58 - 10:05
    and I say I'd like a set of students -my class
  • 10:05 - 10:09
    -and I do this. And so I feel like I haven't gotten [inaudible]. I made this structure. I
  • 10:09 - 10:14
    say I'd like to make a set of students, that each student is in the class exactly once and I don't need any duplicates,
  • 10:14 - 10:18
    and I go through the process of trying to compile this, it's gonna give me some complaints.
  • 10:18 - 10:21
    And its complaint, which is a little bit hard to see up here, is
  • 10:21 - 10:26
    there's no match for operator equals equals in one, equals equals two, and operator less than in
  • 10:26 - 10:27
    one equals
  • 10:27 - 10:30
    two. So it's kind of showing me another piece of code that's kind of a
  • 10:30 - 10:35
    hidden piece of code I haven't actually seen directly, which is this operator compare function.
  • 10:35 - 10:37
    And that is the one that the set is using, as
  • 10:37 - 10:40
    it has this idea of what's the way it should compare two things. It says well I have some of this
  • 10:40 - 10:46
    operator compare that works generically on any type of things using equals equals and less than. And
  • 10:46 - 10:50
    if I click up here on the instantiate from here, it's gonna help me to understand what caused
  • 10:50 - 10:52
    this problem.
  • 10:52 - 10:53
    The problem was caused by
  • 10:53 - 10:57
    trying to create a set who was holding student T.
  • 10:57 - 11:00
    And so this gives you a little bit of an insight on how the template
  • 11:00 - 11:02
    operations are handled by the compiler, that
  • 11:02 - 11:06
    I have built this whole set class, and it depends on there being equals equals and less than
  • 11:06 - 11:08
    working for the type.
  • 11:08 - 11:13
    The fact that it didn't work for student T wasn't a cause for alarm until
  • 11:13 - 11:15
    I actually tried to instantiate
  • 11:15 - 11:18
    it. So at the point where I said I'd like to make a set holding student
  • 11:18 - 11:22
    T, the first point where the compiler actually goes through the process of generating a whole set
  • 11:22 - 11:25
    class, the set angle brackets student T,
  • 11:25 - 11:29
    filling in all the details, kinda working it all out, making the add, and contains, and whatever
  • 11:29 - 11:32
    operations, and then in the process of those, those
  • 11:32 - 11:36
    things are making calls that try to take student T objects and compare them using less than and
  • 11:36 - 11:37
    equals equals.
  • 11:37 - 11:39
    And that causes that code to fail.
  • 11:39 - 11:42
    So the code that's really failing is kind of somewhere in the class library,
  • 11:42 - 11:44
    but it's failing because
  • 11:44 - 11:49
    the things that we're passing through and instantiating for don't work with that
  • 11:49 - 11:50
    setup.
  • 11:50 - 11:52
    So it's something we've gotta
  • 11:52 - 11:54
    fix, we've gotta do something about.
  • 11:54 - 11:59
    Let me go back here and say what am I gonna do. Well, so
  • 11:59 - 12:01
    there's my error message. Same one,
  • 12:01 - 12:04
    right? Saying yeah, you can't do that
  • 12:04 - 12:05
    with a [inaudible]. Well, what
  • 12:05 - 12:06
    we do
  • 12:06 - 12:09
    is we use this notion of functions as data
  • 12:09 - 12:12
    to work out a solution for this problem
  • 12:12 - 12:15
    that if you think about kinda what's going on, the set actually knows everything about how
  • 12:15 - 12:19
    to store things. It's a very fancy efficient structure that says given your things, keeps them
  • 12:19 - 12:20
    in order, and
  • 12:20 - 12:22
    it manages to update, and
  • 12:22 - 12:24
    insert, and search that thing very efficiently.
  • 12:24 - 12:28
    But it doesn't know given any two random things how to compare them other than this assumption
  • 12:28 - 12:32
    it was making about less than and equals equals being a way to tell.
  • 12:32 - 12:37
    If it wants to have sort of a more sophisticated handling of that, what it needs to do is cooperate
  • 12:37 - 12:38
    with the client -that
  • 12:38 - 12:43
    the implementer of the set can't do it all. So there's these two programmers that
  • 12:43 - 12:44
    need to work in harmony.
  • 12:44 - 12:50
    So what the set does is it allows for the client to specify by providing a function.
  • 12:50 - 12:54
    It says well when I need to compare two things, how about you give me the name of a function
  • 12:54 - 12:58
    that when given two elements will return to me their ordering, this integer zero,
  • 12:58 - 13:00
    negative, positive that tells me how to put them in place.
  • 13:00 - 13:05
    And so the set kind of writes all of its operations in terms of well there's some function I can call, this function that
  • 13:05 - 13:07
    will compare two things.
  • 13:07 - 13:11
    If they don't specify one, I'll use this default one that maps them to the relationals, but if they do give
  • 13:11 - 13:15
    me one, I'll just ask them to do the comparison. And so then as its doing its searching and inserting
  • 13:15 - 13:16
    and whatnot,
  • 13:16 - 13:19
    it's calling back. We call that calling back to the client.
  • 13:19 - 13:22
    So the client writes a function. If I wanna put
  • 13:22 - 13:25
    student Ts into a set,
  • 13:25 - 13:28
    then I need to say when you compare two student Ts, what do you look at to know if they're
  • 13:28 - 13:29
    the same or how to order them.
  • 13:29 - 13:31
    So maybe I'm gonna order them by ID number.
  • 13:31 - 13:33
    Maybe I'm gonna use their first and last name.
  • 13:33 - 13:37
    Whatever it means for two things to be equal and have some sense of order,
  • 13:37 - 13:38
    I supply
  • 13:38 - 13:40
    -I write the function,
  • 13:40 - 13:43
    and then I pass it to the set constructor.
  • 13:43 - 13:44
    I say here's the function to use.
  • 13:44 - 13:49
    The set will hold on to that function. So I say here's the compare student structure function.
  • 13:49 - 13:52
    It holds onto that name, and when needed it calls back.
  • 13:52 - 13:53
    It says I'm
  • 13:53 - 13:57
    about to go look for a student structure. Is this the one? Well, I don't know if two student structures are
  • 13:57 - 14:01
    the same. I'll ask the client. Here's two student structures. Are they the same?
  • 14:01 - 14:02
    And then
  • 14:02 - 14:06
    as needed, it'll keep looking or
  • 14:06 - 14:09
    insert and add and do whatever I need to do.
  • 14:09 - 14:13
    So let's go back over here.
  • 14:13 - 14:19
    I'll write a little function. So the
  • 14:19 - 14:21
    prototype for it is it takes two elem Ts
  • 14:21 - 14:24
    and it returns an int.
  • 14:24 - 14:31
    That int is expected to have a value zero if they are the same,
  • 14:31 - 14:33
    and a value that is negative if
  • 14:33 - 14:37
    the first argument precedes the second. So if A is less than B, returns some negative
  • 14:37 - 14:41
    thing. You can return negative one, or negative 100, or negative one million,
  • 14:41 - 14:42
    but
  • 14:42 - 14:45
    you need to return some negative value. And then if A [cuts
  • 14:45 - 14:47
    out] some ordering, so they're not equal
  • 14:47 - 14:51
    [cuts out] later, it will return some positive value: 1, 10, 6 million.
  • 14:51 - 14:55
    So if I do [cuts out] say I use ID num as my comparison.
  • 14:55 - 14:56
    Based on [cuts
  • 14:56 - 14:58
    out] are the same, if they are
  • 14:58 - 15:01
    I can return zero. And if
  • 15:01 - 15:03
    ID num of A
  • 15:03 - 15:07
    is less than the ID num of B, I can
  • 15:07 - 15:08
    return negative one, and then in
  • 15:08 - 15:11
    the other case, I'll return one.
  • 15:11 - 15:14
    So it will compare them on the basis [cuts out]
  • 15:14 - 15:16
    figuring that the name field
  • 15:16 - 15:17
    at that point is
  • 15:17 - 15:19
    nothing new.
  • 15:19 - 15:24
    And then the way I use that is over here when I'm constructing it is there is
  • 15:24 - 15:27
    [cuts out] to the constructor, and so that's how I would pass [cuts out] add
  • 15:27 - 15:29
    parens to the [cuts
  • 15:29 - 15:35
    out] as I'm declaring it, and then I pass the name. Do I call it compare student or compare students? I can't remember. Compare student. Okay.
  • 15:35 - 15:38
    And
  • 15:38 - 15:42
    this then -so now there's nothing going on in the code -causes it to compile,
  • 15:42 - 15:45
    and that if you were to put
  • 15:45 - 15:50
    let's say a see out statement in your comparison function just for fun, you would find out as
  • 15:50 - 15:53
    you were doing adds, and compares, and removes, and whatnot on this set
  • 15:53 - 15:58
    that you would see that your call kept being made. It kept calling back to you as the
  • 15:58 - 16:01
    client saying I need to compare these things. I need to compare these things to decide where to put something,
  • 16:01 - 16:03
    whether it had something,
  • 16:03 - 16:04
    and whatnot.
  • 16:04 - 16:08
    And then based on your ordering, that would control for example how the iterator worked, that
  • 16:08 - 16:12
    the smallest one according to your function would be the one first returned by your
  • 16:12 - 16:14
    iterator, and it would move through larger
  • 16:14 - 16:17
    or sort of later in the ordering ones until the end. So
  • 16:17 - 16:20
    it's a very powerful mechanism
  • 16:20 - 16:24
    that's at work here because it means that for
  • 16:24 - 16:28
    anything you wanna put in a set, as long as you're willing to say how it is you compare
  • 16:28 - 16:28
    it,
  • 16:28 - 16:33
    then the set will take over and do the very efficient storing, and searching, and organizing
  • 16:33 - 16:34
    of that.
  • 16:34 - 16:38
    But you, the only piece you need to supply is this one little thing it can't figure
  • 16:38 - 16:42
    out for itself, which is given your type of thing, how do you compare
  • 16:42 - 16:46
    it. For the built in types string, and int, and double, and car,
  • 16:46 - 16:51
    it does have a default comparison function, that one that was called operator compare. Let
  • 16:51 - 16:55
    me go open it for you
  • 16:55 - 16:56
    [inaudible] the 106.
  • 16:56 - 16:58
    So there is a
  • 16:58 - 17:00
    compare function dot H,
  • 17:00 - 17:02
    and this is actually what
  • 17:02 - 17:05
    the default version of it looks. It's actually written as a template itself
  • 17:05 - 17:07
    that given two things it
  • 17:07 - 17:08
    just
  • 17:08 - 17:11
    turns around and asks the built in operators to help us out with that.
  • 17:11 - 17:15
    And that is the name that's being used
  • 17:15 - 17:17
    if I open the set
  • 17:17 - 17:20
    and you look at its constructor call. I had said that I would come back and tell you about
  • 17:20 - 17:22
    what this was
  • 17:22 - 17:25
    that the argument going into the set constructor
  • 17:25 - 17:31
    is one parameter whose name is CMP function that takes two elem type things -so here's the
  • 17:31 - 17:33
    one in the example of the two argument prototype
  • 17:33 - 17:38
    -returns an int, and then it uses a default assignment for that of operator compare, the one we just
  • 17:38 - 17:39
    looked at,
  • 17:39 - 17:41
    so that if you don't specify it, it goes
  • 17:41 - 17:46
    through and generates the standard comparison function for the type,
  • 17:46 - 17:47
    which
  • 17:47 - 17:49
    for built-ins will work fine, but for
  • 17:49 - 17:53
    user defined things is gonna create compiler errors.
  • 17:53 - 17:56
    So you can also choose if you don't like the default ordering works.
  • 17:56 - 18:01
    So for example, if you wanted to build a set of strings that was case insensitive
  • 18:01 - 18:04
    -so the default string handling would be to use equals equals and less than, which actually
  • 18:04 - 18:08
    does care about case. It doesn't think that capital [inaudible] is the same as lower case.
  • 18:08 - 18:10
    If you wanted it to consider them the same,
  • 18:10 - 18:14
    you could supply your own. A compare case insensitively function took two strings,
  • 18:14 - 18:16
    converted their case, and then compared them.
  • 18:16 - 18:19
    And then when you establish a set of strings, instead of letting the default argument take over,
  • 18:19 - 18:24
    go ahead and use your case insensitive compare, and then now you have a set of strings
  • 18:24 - 18:26
    that operates case insensitively.
  • 18:26 - 18:28
    So you can change
  • 18:28 - 18:31
    the ordering, adapt the ordering, whatever you like for the primitives, as well as supply the necessary
  • 18:31 - 18:32
    one for the
  • 18:32 - 18:34
    things the built-ins
  • 18:34 - 18:35
    don't have
  • 18:35 - 18:40
    properties for. So then that's that piece of code right
  • 18:40 - 18:41
    there. All right. So
  • 18:41 - 18:44
    does that make sense?
  • 18:44 - 18:45
    Well, now you know
  • 18:45 - 18:49
    kind of the whole range of things that are available in the class library. All
  • 18:49 - 18:53
    right, so we saw the four sequential containers, the vector, stack, queue,
  • 18:53 - 18:55
    and the grid that you kinda indexed ordering,
  • 18:55 - 18:58
    and kinda allowed you to throw things in and get them back out.
  • 18:58 - 19:02
    We went through the map, which is the sort of fancy heavy lifter that does that key value
  • 19:02 - 19:04
    lookup, and then we've seen the set, which does kind of
  • 19:04 - 19:06
    aggregate collection management,
  • 19:06 - 19:10
    and very efficient operations for kind of searching, retrieving, ordering,
  • 19:10 - 19:12
  • 19:12 - 19:13
    joining with other kind of sets, and stuff
  • 19:13 - 19:16
    that also has a lot of high utility. I wanna do
  • 19:16 - 19:19
    one quick little program with you before I start recursion
  • 19:19 - 19:21
    just because I think it's kinda cool
  • 19:21 - 19:23
    is to talk a little about this idea of like once
  • 19:23 - 19:26
    you have these ADTs, you can solve a lot of cool problems, and that's certainly what this
  • 19:26 - 19:29
    week's assignment is about. It's like well here are these tasks that
  • 19:29 - 19:31
    if you didn't have
  • 19:31 - 19:34
    -So ADTs, just a reminder -I say this word as though everybody knows exactly what it means
  • 19:34 - 19:37
    -is just the idea of an abstract data type.
  • 19:37 - 19:39
    So an abstract data type is a data type
  • 19:39 - 19:42
    that you think of in terms of what it provides as an abstraction.
  • 19:42 - 19:44
    So a queue is this FIFO line,
  • 19:44 - 19:47
    and how it works internally, what it's implemented as, we're not worried
  • 19:47 - 19:51
    about it at all. We're only worried about the abstraction of enqueue and dequeue, and it coming first
  • 19:51 - 19:53
    in first out. So we talk
  • 19:53 - 19:56
    about ADTs, we say once we have a queue, a stack, or a vector,
  • 19:56 - 19:58
    we know what those things do,
  • 19:58 - 20:00
    what a mathematical set is about.
  • 20:00 - 20:03
    We build on top of that. We write code
  • 20:03 - 20:06
    that leverages those ADTs to do cool things
  • 20:06 - 20:10
    without having to also manage the low level details of where the memory came from for these things,
  • 20:10 - 20:13
    how they grow, how they search, how
  • 20:13 - 20:15
    they store and organize the data.
  • 20:15 - 20:17
    You just get to do real cool things.
  • 20:17 - 20:21
    So you probably got a little taste of that at the end of 106A when you get to use the array list and the
  • 20:21 - 20:24
    hashmap to do things. This set kinda just expands out
  • 20:24 - 20:26
    to fill out some other niches
  • 20:26 - 20:28
    where you can do a lot of really cool things
  • 20:28 - 20:30
    because you have these things around to build on.
  • 20:30 - 20:34
    So one of the things that happens a lot is you tend to do layered things, and you'll see a little bit of this
  • 20:34 - 20:38
    in the assignment you're doing this week where it's not just a set of something, it's a set of a map
  • 20:38 - 20:39
    of something, or
  • 20:39 - 20:43
    a vector of queues, a map of set. So I gave you a couple of examples here of
  • 20:43 - 20:44
    the things
  • 20:44 - 20:48
    that might be useful. Like if you think of what a smoothie is, it's a set of things
  • 20:48 - 20:52
    mixed together, some yogurt, and some different fruits, some wheatgrass, whatever it is you have in
  • 20:52 - 20:53
    it.
  • 20:53 - 20:57
    And that the menu for a smoothie shop is really just a bunch of those sets, so each set
  • 20:57 - 21:00
    of ingredients is a particular smoothie they have, and then
  • 21:00 - 21:04
    the set of all those sets is the menu that they post up on the board you can come
  • 21:04 - 21:06
    in and order.
  • 21:06 - 21:07
    The
  • 21:07 - 21:09
    compiler tends to use a map
  • 21:09 - 21:13
    to keep track of all the variables that are in scope. As you declare variables, it
  • 21:13 - 21:17
    adds them to the map so that when you later use that name, it knows where to find it.
  • 21:17 - 21:19
    Well, it also has to manage though
  • 21:19 - 21:21
    not just one map,
  • 21:21 - 21:25
    but the idea is as you enter and exit scopes, there is this
  • 21:25 - 21:27
    layering of open scopes.
  • 21:27 - 21:32
    So you have some open scope here. You go into a for loop where you open another scope. You add some
  • 21:32 - 21:33
    new variables
  • 21:33 - 21:37
    that when you look it actually shadows then at the nearest definition, so if you had two variables
  • 21:37 - 21:41
    of the same, it needs to look at the one that's closest.
  • 21:41 - 21:44
    And then when you exit that scope, it needs those variables to go away and
  • 21:44 - 21:48
    no longer be accessible. So one model for that could be very much a stack of
  • 21:48 - 21:51
    maps where each of those maps represents the scope
  • 21:51 - 21:52
    that's active,
  • 21:52 - 21:56
    a set of variable names, and maybe their types and information is stored in that map. And then
  • 21:56 - 21:58
    you stack them up. As you
  • 21:58 - 22:01
    open a scope, you push on a new empty map.
  • 22:01 - 22:03
    You put things into it, and then
  • 22:03 - 22:06
    maybe you enter another scope. You push on another new empty map. You stick things into it, but as you
  • 22:06 - 22:11
    exit and hit those closing braces, you pop those things from the stack to get to
  • 22:11 - 22:13
    this previous
  • 22:13 - 22:16
    environment you were in. So let me do a
  • 22:16 - 22:19
    little program with you. I just have
  • 22:19 - 22:22
    this idea of how this would work, and we'll see if
  • 22:22 - 22:26
    I can code something up. So I have -let's go
  • 22:26 - 22:28
    back over
  • 22:28 - 22:32
    here. I'm going to -this is the piece of code that just reads
  • 22:32 - 22:34
    words, so that's a
  • 22:34 - 22:36
    fine piece of code to have here.
  • 22:36 - 22:38
    I have a file here -let
  • 22:38 - 22:39
    me open
  • 22:39 - 22:43
    it up for you -that just contains the contents of the Official Scrabble Players' Dictionary,
  • 22:43 - 22:45
    Edition 2.
  • 22:45 - 22:47
    It's got a lot of words it.
  • 22:47 - 22:48
    It's pretty long.
  • 22:48 - 22:49
    It's still loading.
  • 22:49 - 22:53
    Let's go back and do something else while it's loading.
  • 22:53 - 22:54
    It happened to have
  • 22:54 - 22:55
  • 22:55 - 22:58
    about 120,000 words I think is what it would be right now.
  • 22:58 - 22:59
    So it's busy loading,
  • 22:59 - 23:01
    and I have this question for you.
  • 23:01 - 23:04
    There are certain words that are anagrams of each other.
  • 23:04 - 23:06
  • 23:06 - 23:08
    The word
  • 23:08 - 23:10
    cheap can be anagrammed into the word peach,
  • 23:10 - 23:11
    things like that.
  • 23:11 - 23:12
  • 23:12 - 23:13
    And so I
  • 23:13 - 23:15
    am curious for the
  • 23:15 - 23:16
    Official Scrabble Players' Dictionary
  • 23:16 - 23:20
    what -so if you imagine that some words can be anagrammed a couple times, five
  • 23:20 - 23:24
    or six different words just on how you can rearrange the letters. I'm curious to know what the largest anagram
  • 23:24 - 23:28
    cluster is in the Official Scrabble Players' Dictionary. So I'd like to know across
  • 23:28 - 23:30
    all 127,000 words
  • 23:30 - 23:34
    that they form little clusters, and I'd like to find out what's the biggest of
  • 23:34 - 23:39
    those clusters.
  • 23:39 - 23:41
    Okay. That's my goal. So
  • 23:41 - 23:45
    here's what I've got going. I've got something that's gonna read them one by one.
  • 23:45 - 23:48
    So let's brainstorm for a second.
  • 23:48 - 23:49
  • 23:49 - 23:51
    I want a way to take a particular word and kinda stick
  • 23:51 - 23:55
    it with its other anagram cluster friends.
  • 23:55 - 23:57
    What's a way I might do that?
  • 23:57 - 24:03
    Help me design my data structure.
  • 24:03 - 24:06
    Help me out. [Inaudible]. I've
  • 24:06 - 24:08
    got the word peach.
  • 24:08 - 24:10
    Where should I stick it so I can ? You could treat each string as like a set of
  • 24:10 - 24:13
    ? So I have this
  • 24:13 - 24:17
    string,
  • 24:17 - 24:21
    which represents the letters. I've
  • 24:21 - 24:22
    got the word peach. I wanna be able to stick peach with cheap,
  • 24:22 - 24:26
    so where should I stick peach in such a way that I could find it. And you've got this idea that
  • 24:26 - 24:27
  • 24:27 - 24:29
    the letters there are a set.
  • 24:29 - 24:32
    They're not quite a set, though. Be careful because the word banana has a
  • 24:32 - 24:36
    couple As and a couple Ns, and so it's not that I'd really want it to come down to be
  • 24:36 - 24:39
    the set BAN. I wouldn't wanna coalesce the duplicates on that, so I really do wanna preserve all the letters that
  • 24:39 - 24:43
    are in there,
  • 24:43 - 24:46
    but your idea's getting us somewhere. It's like there is this idea of kind of like for any
  • 24:46 - 24:49
    particular word there's the
  • 24:49 - 24:53
    collection of letters that it is formed from.
  • 24:53 - 24:56
    And somehow if I could use that as an identifier
  • 24:56 - 24:58
    in a way that was reliable
  • 24:58 - 25:01
    -anybody
  • 25:01 - 25:02
    got any
  • 25:02 - 25:03
    ideas
  • 25:03 - 25:08
    about what to do with that? If you did that a vector, each letter and then the frequency of each letter in the word? So I could certainly do that. I could build kind of a vector that
  • 25:08 - 25:12
    had kinda frequencies, that had this little struct that maybe it was number of times it occurs.
  • 25:12 - 25:15
    Then I could try to build something on the basis of that vector that was like
  • 25:15 - 25:18
    here is -do these two vectors
  • 25:18 - 25:22
    match? Does banana match apple? And you'd say well no. It turns out they don't
  • 25:22 - 25:23
    have the same letters. I'm trying to
  • 25:23 - 25:27
    think of a really easy way to represent that. Your idea's good,
  • 25:27 - 25:31
    but I'm thinking really lazy. So somebody help me who's lazy. Could you
  • 25:31 - 25:36
    have a map where the key is all the letters in alphabetical order and the value is a
  • 25:36 - 25:41
    vector of all the ? Yeah. That is a great idea. You're taking his idea, and you're making it easier.
  • 25:41 - 25:43
    You're capitalizing on lazy, which is yeah
  • 25:43 - 25:45
    -I wanna keep track of all the letters that are in the word,
  • 25:45 - 25:48
    but I wanna do it in a way that makes it really easy for
  • 25:48 - 25:52
    example to know whether two have the same -we'll call it the signature. The signature of
  • 25:52 - 25:54
    the word is the letter frequency across it.
  • 25:54 - 25:57
    If I could come up with a way to represent the signature that was really to compare two
  • 25:57 - 26:01
    signatures quickly to see if they're the same, then I will have less work to do. And so your idea is a great
  • 26:01 - 26:04
    one. We take the letters and we alphabetize them.
  • 26:04 - 26:08
    So cheap and peach both turn into the same ACEHP. It's
  • 26:08 - 26:13
    a nonsense thing, but it's a signature. It's unique for any anagram, and
  • 26:13 - 26:14
    if I use a map
  • 26:14 - 26:15
    where that's the key,
  • 26:15 - 26:17
    then I can associate with every word
  • 26:17 - 26:20
    that had that same signature.
  • 26:20 - 26:23
    So
  • 26:23 - 26:27
    let's start building that. So let me write
  • 26:27 - 26:29
    -I wanna call this the signature.
  • 26:29 - 26:31
    Given a string,
  • 26:31 - 26:33
    it's going to
  • 26:33 - 26:34
    alphabetize it.
  • 26:34 - 26:37
    I'm going to write the
  • 26:37 - 26:40
    dumbest version of this ever,
  • 26:40 - 26:47
    but I'm just gonna use a simple sorting routine on this. So
  • 26:47 - 26:51
    smallest is gonna small index -we'll call it min index. That's a better name for it.
  • 26:51 - 26:53
    Min index equals I,
  • 26:53 - 26:57
    and then I'm gonna look through the string, and I'm gonna find the smallest letter that's
  • 26:57 - 26:58
    there and
  • 26:58 - 27:03
    move it to the front. That's basically gonna be my strategy. I've
  • 27:03 - 27:05
    got the wrong place to start, though. I'm gonna
  • 27:05 - 27:11
    look from I to the one. So if
  • 27:11 - 27:13
    S sub J
  • 27:13 - 27:15
    is less than S sub min index,
  • 27:15 - 27:20
    so it is a smaller letter, then the min index gets to be J.
  • 27:20 - 27:23
    So far, what I've done is I've run this loop that kind
  • 27:23 - 27:27
    of starts in this case on the very first iteration and says the first character in Slot 0, that
  • 27:27 - 27:31
    must be the min. And then it looks from there to the end. Is there anything smaller? Whenever it
  • 27:31 - 27:34
    find anything smaller, it updates the min index, so when it's done
  • 27:34 - 27:39
    after that second loop has fully
  • 27:39 - 27:40
    operated,
  • 27:40 - 27:44
    then min index will point to the smallest alphabetically
  • 27:44 - 27:45
    character in the
  • 27:45 - 27:49
    string that we have. Then I'm gonna swap it to the front.
  • 27:49 - 27:52
    So S sub I
  • 27:52 - 27:55
    with S sub min index, and
  • 27:55 - 28:08
    I'll write a swap because swap is pretty easy to write.
  • 28:08 - 28:10
    See about how we do this, okay.
  • 28:10 - 28:12
    So
  • 28:12 - 28:13
    if I
  • 28:13 - 28:16
    -I like how this works, and then let me stop and say
  • 28:16 - 28:17
    right here, now
  • 28:17 - 28:19
    is a good time to test this thing.
  • 28:19 - 28:23
    I just wrote signature. It probably works, but it potentially
  • 28:23 - 28:24
    could not.
  • 28:24 - 28:27
    And I'm just gonna show you a little bit of how I write code. This is good to know. As I
  • 28:27 - 28:28
    say,
  • 28:28 - 28:29
  • 28:29 - 28:32
    I put some code in here that I plan on throwing away.
  • 28:32 - 28:35
    Enter word
  • 28:35 - 28:42
    -and then I throw it through my function. I think I called it
  • 28:42 - 28:44
    signature, didn't I?
  • 28:44 - 28:47
    Signature S
  • 28:47 - 28:51
    -so the idea being if it doesn't work, I wanna find out sooner rather than
  • 28:51 - 28:52
    later. It
  • 28:52 - 28:55
    doesn't like my use of get line. Is that because it's not
  • 28:55 - 28:57
    included?
  • 28:57 - 29:00
    Yes, it's not. So let's go get the right header files. This is half the
  • 29:00 - 29:03
    battle sometimes is figuring out what
  • 29:03 - 29:04
    headers
  • 29:04 - 29:05
    are needed to make
  • 29:05 - 29:10
    your compiler happy. So now it's up
  • 29:10 - 29:14
    here at enter word, and I say what does cheap come out as?
  • 29:14 - 29:16
    It goes into the debugger. That's good.
  • 29:16 - 29:17
    So it wasn't right.
  • 29:17 - 29:19
    Now we're gonna get to find out
  • 29:19 - 29:21
    what does it not like about that.
  • 29:21 - 29:27
    It says
  • 29:27 - 29:29
    -did we forget to return something?
  • 29:29 - 29:30
    Let's go look at our code.
  • 29:30 - 29:34
    So it's complaining about -oh yeah, I can see where we're gonna get in trouble here. It's
  • 29:34 - 29:38
    complaining that the return -it was saying that I'm trying to print something and it looks like garbage, that
  • 29:38 - 29:40
    what I'm trying to print didn't make sense at all.
  • 29:40 - 29:47
    I could say well that's funny. Let's go look at signature. [Inaudible] pass the [inaudible].
  • 29:47 - 29:51
    That's probably a good idea. We ought to fix that while we're there.
  • 29:51 - 29:54
    Let me leave that bug in for a minute because I'm gonna fix my first bug, and then I'll come back
  • 29:54 - 29:55
    to that one.
  • 29:55 - 29:57
    So it turns out what it's really complaining about though is it has to do with
  • 29:57 - 30:01
    -I said well what does signature return. Somehow, what's being returned by signature is being interpreted
  • 30:01 - 30:04
    as total crap when it got back to main, and there's a very good reason for that
  • 30:04 - 30:06
    because I never returned
  • 30:06 - 30:09
    anything. So maybe if I had been less cavalier about the fact that it was giving me a warning
  • 30:09 - 30:11
    over here that said
  • 30:11 - 30:14
    control reaches the end of non-void function, but I was being lazy and I didn't look
  • 30:14 - 30:17
    -was that I
  • 30:17 - 30:18
    didn't pay attention to the fact.
  • 30:18 - 30:20
    So let's leave my other bug in that you've already pointed out
  • 30:20 - 30:24
    because this is exactly what it's like when you're doing it. You enter words and you say cheap,
  • 30:24 - 30:25
    and it says cheap, and you're like
  • 30:25 - 30:28
    no.
  • 30:28 - 30:32
    And then you're like about how about an. Hey look, that's in alphabetical order. And then you spend all your time thinking about words
  • 30:32 - 30:37
    for a while. Well, tux. That's in alphabetical order. It seems to work. You could do that for a while, but
  • 30:37 - 30:39
    then
  • 30:39 - 30:41
    you're like this whole cheap,
  • 30:41 - 30:42
    not good.
  • 30:42 - 30:46
    Okay, so I come back here and my friend who's already one step ahead of me has pointed out that my swap
  • 30:46 - 30:50
    is missing the all important pass by reference
  • 30:50 - 30:53
    that as it is it what swapping the copies that got passed to
  • 30:53 - 30:56
    the function, but of course nothing was happening back here in signature land.
  • 30:56 - 30:57
    So if I fix that, and
  • 30:57 - 30:59
    I come back in here,
  • 30:59 - 31:01
    I'm gonna feel better about this. Oh,
  • 31:01 - 31:04
    my gosh. And even tux still works.
  • 31:04 - 31:08
    And if I say banana, I should get a bunch of As, a bunch of B and an N, so there's various cases to test out if you have
  • 31:08 - 31:09
    multiple letters, and if
  • 31:09 - 31:12
    you have letters that are the same letters like
  • 31:12 - 31:16
    apple so that it doesn't lose your duplicates, and it seems to come out right.
  • 31:16 - 31:17
    So given this,
  • 31:17 - 31:19
    the word peach
  • 31:19 - 31:22
    should come down to the same signature as cheap, and so that seems to indicate we're on
  • 31:22 - 31:24
    the path
  • 31:24 - 31:24
    toward
  • 31:24 - 31:26
    building the thing we wanted to build. So I
  • 31:26 - 31:28
    have this read file thing that I
  • 31:28 - 31:29
    have left over from last time that
  • 31:29 - 31:31
    reads each word,
  • 31:31 - 31:35
    and I wanna change my vector
  • 31:35 - 31:37
    into a map of
  • 31:37 - 31:40
    set of string.
  • 31:40 - 31:48
    What capitalization do you not like? [Inaudible].
  • 31:48 - 31:52
    Yeah. So it turns out this file happens to all be
  • 31:52 - 32:00
    lower case, but there's no harm in doing this.
  • 32:00 - 32:03
    That way even if they weren't, it'll
  • 32:03 - 32:06
    take care of that problem for us if we wanted it to.
  • 32:06 - 32:09
    [Inaudible].
  • 32:09 - 32:11
    I want lower case. Oh
  • 32:11 - 32:12
    yeah, I do. Well, your
  • 32:12 - 32:15
    case. You guys are so picky.
  • 32:15 - 32:18
    All right. Here's my deal. I'm gonna take my map,
  • 32:18 - 32:21
    and this is gonna be a line of code that's gonna make your head spin.
  • 32:21 - 32:31
    Just go with it.
  • 32:31 - 32:38
    This is the do all
  • 32:38 - 32:40
    craziness. Okay.
  • 32:40 - 32:41
    So
  • 32:41 - 32:43
    I've got this map,
  • 32:43 - 32:46
    and what I wanna do is under the signature of this word,
  • 32:46 - 32:50
    I want to look up the set of strings that's associated with it and tack this one in.
  • 32:50 - 32:54
    And the add in this case with the set, I know that's gonna do non-duplication. In fact,
  • 32:54 - 32:58
    the file doesn't contain duplicates, but if it did, I certainly don't want it to record it
  • 32:58 - 33:00
    twice anyway, so I might as well do this.
  • 33:00 - 33:04
    Now this form of this is a heavy lifter for a small piece of
  • 33:04 - 33:04
    code.
  • 33:04 - 33:09
    The signature then went and converted into the ACEHP form.
  • 33:09 - 33:12
    I used that as the key into the table.
  • 33:12 - 33:16
    If it was already there, it's gonna retrieve me an existing set that I'm gonna just go ahead and
  • 33:16 - 33:21
    add a word onto. If it wasn't there, the behavior for the brackets is to create
  • 33:21 - 33:26
    kind of a new empty
  • 33:26 - 33:28
    value for that. So it'll use the key
  • 33:28 - 33:32
    and create a default value for type. Well, the default value for set of string -so if you just create a
  • 33:32 - 33:32
    set
  • 33:32 - 33:36
    without any other information in the default constructor will always create you a
  • 33:36 - 33:40
    nice clean empty set. So in fact, it will get me exactly what I want which is to put it in there with an empty
  • 33:40 - 33:43
    set that I will immediately add the word into. So after
  • 33:43 - 33:46
    I do this, I should have this fully populated map. And
  • 33:46 - 33:49
    then I'm
  • 33:49 - 33:56
    gonna do this just as a little test. I'm gonna say num words when I'm done
  • 33:56 - 33:58
    to feel
  • 33:58 - 34:00
    a little bit confident about what
  • 34:00 - 34:04
    got put in there. How
  • 34:04 - 34:06
    about I call it -that's a good idea. It's
  • 34:06 - 34:11
    so fussy. C++ never does what you want.
  • 34:11 - 34:15
    I think I called this thing OSPD2.txt that has the
  • 34:15 - 34:20
    words in it.
  • 34:20 - 34:29
    And then I need to declare the variable that I'm sticking all this stuff into is a set. So
  • 34:29 - 34:32
    go in, load stuff,
  • 34:32 - 34:35
  • 34:35 - 34:36
    doing it's thing, the
  • 34:36 - 34:38
    number of words 112,882. Okay.
  • 34:38 - 34:42
    That's close enough. I can't remember the number that's in there, but that sounds like a fine
  • 34:42 - 34:43
  • 34:43 - 34:47
    approximation of it. So I feel like it did sort of manage to do something for it. And I can actually do this if I
  • 34:47 - 34:51
    just wanna get a little glimpse of it is to use my iterator to look
  • 34:51 - 34:53
    at
  • 34:53 - 35:04
    something that's in there. Wait, is map a set of string? Iterator --
  • 35:04 - 35:05
    file iter.hasNext. I'm gonna
  • 35:05 - 35:07
    say key
  • 35:07 - 35:10
    equals iter.next,
  • 35:10 - 35:13
  • 35:13 - 35:18
    and I'm going to print that key, and I'm going to print the size of the
  • 35:18 - 35:28
    set because I'm at this
  • 35:28 - 35:33
    point -I should see gobs of printing come out of this thing. It
  • 35:33 - 35:37
    takes a little bit of a while to process the thing, and then
  • 35:37 - 35:42
    see gobs and gobs of stuff going by. It looks like a lot of things are ones if you can
  • 35:42 - 35:45
    imagine reading them as they go by because a lot of words are really just not anagrams
  • 35:45 - 35:48
    of anything else. But
  • 35:48 - 35:52
    some of the shorter ones have sort of a better chance.
  • 35:52 - 35:56
    So you can find out here at the end that there are EEIKLPST. I don't know what that is. Leakiest? No. I don't
  • 35:56 - 35:58
    know
  • 35:58 - 35:59
    what that word is. I
  • 35:59 - 36:03
    should write it in for any of them.
  • 36:03 - 36:09
    This dictionary has a bunch of really crazy words in it too, so it makes it especially challenging. What is that? I don't know. That one
  • 36:09 - 36:10
    almost looks
  • 36:10 - 36:14
    like beginners, but it's got an F in it. It's the F beginners, a very famous
  • 36:14 - 36:16
    word. You guys have probably heard
  • 36:16 - 36:19
    of it. So I've seen that, and now I wanna do the thing where I will pick the
  • 36:19 - 36:21
    largest one. I'd like to know.
  • 36:21 - 36:23
    Somebody should tell me.
  • 36:23 - 36:25
    Int max size [cuts
  • 36:25 - 36:28
    out]
  • 36:28 - 36:30
    max key, so I'll
  • 36:30 - 36:33
    set this to be zero, and max
  • 36:33 - 36:34
    key is that.
  • 36:34 - 36:35
    And then I'm gonna do this. If
  • 36:35 - 36:38
    the
  • 36:38 - 36:40
    size of this key is
  • 36:40 - 36:43
    greater than my max size,
  • 36:43 - 36:55
    then it's gonna get to be the new key.
  • 36:55 - 37:06
    So after I did all of that then [cuts out]
  • 37:06 - 37:08
    key.
  • 37:08 - 37:13
    And then I probably wanna see what they are, so why don't I go ahead and take a look. Ah, I have to go to
  • 37:13 - 37:17
    the type, though. [Inaudible]
  • 37:17 - 37:21
    to equals
  • 37:21 - 37:22
    M of
  • 37:22 - 37:27
    key -max key, I guess, dot
  • 37:27 - 37:29
    iterator,
  • 37:29 - 37:30
    [inaudible] IT
  • 37:30 - 37:37
    has next, CLIT.next [inaudible].
  • 37:37 - 37:43
    So it went back. It found the maximum, in this case using the size of the sets as the distinguishing
  • 37:43 - 37:47
    feature. And
  • 37:47 - 37:49
    then max is AEPRS,
  • 37:49 - 37:51
    which
  • 37:51 - 37:53
    it's got a big old list of about 12.
  • 37:53 - 37:55
    I think that's
  • 37:55 - 37:58
    12, actually. Maybe 13.
  • 37:58 - 37:59
    So now you know.
  • 37:59 - 38:02
    You can impress your friends at parties. This is the kind of thing you can win bar bets on.
  • 38:02 - 38:03
    Oh,
  • 38:03 - 38:06
    yeah. What's the size of the largest
  • 38:06 - 38:07
    anagram cluster?
  • 38:07 - 38:10
    Everybody wants to know this kind of stuff. I can't believe you guys can sleep at night without
  • 38:10 - 38:12
    actually knowing this.
  • 38:12 - 38:16
    And what's neat to know though [inaudible] just to point out a couple of things that -you can use a little decomposition
  • 38:16 - 38:18
    on this code,
  • 38:18 - 38:23
    but there's kind of a very small amount of things we're having to do. For example, one of the things that's really
  • 38:23 - 38:27
    powerful, things like the map where we can just figure out how to key the things
  • 38:27 - 38:29
    to store the collection right under that key,
  • 38:29 - 38:33
    then looking it up and adding something is a very
  • 38:33 - 38:36
    efficient sort of direct operation, just building on these things and it going through and doing all
  • 38:36 - 38:37
    the work of
  • 38:37 - 38:41
    storing them, sorting them, making it efficient for us to retrieve them and look them up such that I can
  • 38:41 - 38:44
    process 100,000 words
  • 38:44 - 38:49
    in the blink of an eye, and then go back through, look at them all, find the biggest one,
  • 38:49 - 38:51
    get my information out.
  • 38:51 - 38:55
    When you make the call to M.size, is
  • 38:55 - 39:00
    that the number of words? [Inaudible]. That is the number of keys. Keys, right. So that's not actually [inaudible].
  • 39:00 - 39:01
    Yeah. So it doesn't know anything about
  • 39:01 - 39:04
    everything else that was [inaudible], but in fact that's why it's
  • 39:04 - 39:07
    [inaudible]. I know the dictionary has about 127,000 words. It turns out they form about
  • 39:07 - 39:08
  • 39:08 - 39:11
    112 unique
  • 39:11 - 39:14
    signatures, and so there's actually another 20,000 words that are
  • 39:14 - 39:17
    clung onto some existing signature. That's the number of unique signatures across
  • 39:17 - 39:20
    the dictionary, not the number of words, so that's probably the wrong name to call
  • 39:20 - 39:22
    it. For the M
  • 39:22 - 39:24
  • 39:24 - 39:28
    signature word thing where [inaudible] of the default just to create a new stack, that
  • 39:28 - 39:32
    works as well for vectors [inaudible]? Yeah. It works for anything if you were just to declare
  • 39:32 - 39:34
    it on the stack and the right thing happened,
  • 39:34 - 39:37
    so vectors, set, maps. All those things do.
  • 39:37 - 39:41
    But the primitive types like int and double, it doesn't. So it would work for string. String
  • 39:41 - 39:45
    is an empty string. So for some of the fancier, more modern types tend to actually
  • 39:45 - 39:49
    know how to just default construct themselves into a good state, but the primitives don't do that.
  • 39:49 - 39:53
    So if you were having a map of ints and you wanted to have them start at zero, you need to really
  • 39:53 - 39:58
    start them at zero. You can call just M sub this. It would be garbage, and it would just be operating with
  • 39:58 - 40:02
    garbage from that way forward.
  • 40:02 - 40:06
    All right. Well, we're good. What
  • 40:06 - 40:08
    I'm gonna give you
  • 40:08 - 40:10
    is the
  • 40:10 - 40:11
    eight minute
  • 40:11 - 40:12
    discussion of recursion
  • 40:12 - 40:14
    that whets your appetite for
  • 40:14 - 40:18
    the things we're gonna be doing next week.
  • 40:18 - 40:21
    So recursion is one of those things I think that when you haven't yet
  • 40:21 - 40:24
    had a chance to explore it first hand and other people tell you about it, it
  • 40:24 - 40:30
    has sort of an awe inspiring sort of mystery, some fear, and whatnot.
  • 40:30 - 40:31
    So
  • 40:31 - 40:34
    first off, I wanna kinda shake that fear off. It is a little bit
  • 40:34 - 40:35
    hard to wrap your head around
  • 40:35 - 40:38
    the first time you see it, but we're gonna have a whole week's worth of time to spend on it, so
  • 40:38 - 40:41
    we're gonna
  • 40:41 - 40:42
    try to give you a lot of
  • 40:42 - 40:46
    different ways to think about it, and different problems to see to kinda help you do it. And I think
  • 40:46 - 40:48
    once you do get your head around it, it turns out then
  • 40:48 - 40:53
    you'll discover how infinitely powerful it is, that there is kind of a simple idea in it
  • 40:53 - 40:57
    that once you kinda fully get your head around, you can explore and solve lots of different
  • 40:57 - 41:00
    problems using this just one technique again and again.
  • 41:00 - 41:02
    So in itself, it's
  • 41:02 - 41:04
    a little bit mysterious at first glance, but then
  • 41:04 - 41:06
    once you kind of
  • 41:06 - 41:09
    master it, you'll be amazed at the kind of neat things you can do with it.
  • 41:09 - 41:15
    So it is certainly what I'd consider an indispensable tool in a programmer's tool kit. The kind of
  • 41:15 - 41:19
    problems you can solve using the techniques you have so far is fundamentally limited. And
  • 41:19 - 41:22
    part of what we need to do in this class is expose you to these new ways of solving
  • 41:22 - 41:26
    harder, more sophisticated problems that the old techniques don't work for. One of the
  • 41:26 - 41:31
    cool things about recursion is it actually lends very simple, elegant, short solutions
  • 41:31 - 41:35
    to problems that at first glance seem completely unsolvable.
  • 41:35 - 41:36
    That if you can
  • 41:36 - 41:40
    formulate a structure for [inaudible], you will discover that the code is not long
  • 41:40 - 41:45
    to write. It's not tedious to write. The tricky part is to figure out how
  • 41:45 - 41:47
    to express it,
  • 41:47 - 41:50
    so it's more of a thinking puzzle than it is a coding puzzle. I
  • 41:50 - 41:55
    certainly like thinking puzzles as much as coding puzzles if not more.
  • 41:55 - 41:58
    The general sort of idea is that
  • 41:58 - 42:02
    you are going to try to solve a problem
  • 42:02 - 42:05
    -instead of sort of breaking it down into component tasks like if I need to make dinner, I need
  • 42:05 - 42:08
    to go to the store and buy things, and I need to come home, and chop them, and get
  • 42:08 - 42:12
    the recipe. You think of what your standard decomposition is all about
  • 42:12 - 42:16
    -breaking down your tasks into A, B, and C, and D, and then you add them all together to get
  • 42:16 - 42:17
  • 42:17 - 42:20
    the whole task done. Recursion has this kind of very different way of thinking about the problem, which is like
  • 42:20 - 42:22
    well if I needed to get Task A done,
  • 42:22 - 42:26
    and I had Task A prime, which was somehow a lot like the task I was trying to solve, but it somehow
  • 42:26 - 42:29
    was a little bit simpler, a little bit easier,
  • 42:29 - 42:32
    a little bit more manageable than the one I started out to solve,
  • 42:32 - 42:34
    and if I had that solution
  • 42:34 - 42:37
    -so if somehow I could delegate it off to some
  • 42:37 - 42:39
  • 42:39 - 42:43
    minion who works for me, and then I could use that to solve my problem, then my
  • 42:43 - 42:45
    job would be made much easier by using that result of
  • 42:45 - 42:48
    solving a similar problem that's a little bit [inaudible].
  • 42:48 - 42:51
    Okay, that seems a little bit wacky.
  • 42:51 - 42:55
    Let me give you sort of an example of how this might work.
  • 42:55 - 42:58
    So your standard problem, I said yeah, it's like you do these dissimilar sub problems.
  • 42:58 - 43:04
    Let's imagine I had this goal where I wanted to survey the Stanford student body.
  • 43:04 - 43:06
    I don't want just like a
  • 43:06 - 43:08
    haphazard
  • 43:08 - 43:11
    most of the people involved. Let's say I really wanted to get input from every single person
  • 43:11 - 43:16
    on campus whether they think having cardinal as your mascot is a ridiculous choice or not. So
  • 43:16 - 43:19
    let's imagine I really wanna hear from all 10,000 students.
  • 43:19 - 43:21
    Now I can stand out in White Plaza with a big
  • 43:21 - 43:26
    note pad and try to accost people and sort of work my way down the list.
  • 43:26 - 43:29
    And then I'd be there for eons and never solve my problem.
  • 43:29 - 43:31
    Instead, what I decide to do is I say well I'm gonna
  • 43:31 - 43:32
    recruit some people to help me
  • 43:32 - 43:34
    because I'm lazy
  • 43:34 - 43:35
    as we've already established,
  • 43:35 - 43:37
    and I would like to get
  • 43:37 - 43:41
    some other people to join in my quest to answer these burning questions and to
  • 43:41 - 43:43
    solve the survey.
  • 43:43 - 43:45
    So what I do is I round up ten people let's say,
  • 43:45 - 43:49
    and I say would you help me, and I decide to divide the campus into kind of ten
  • 43:49 - 43:51
    partitions. And I say if you
  • 43:51 - 43:54
    could survey all the people whose names begin with A, B, and C, that would really help.
  • 43:54 - 43:56
    And if you could do [inaudible] Gs, and
  • 43:56 - 44:00
    if you would do -and if I divide up the alphabet that way, give each of them two or three letters, and I
  • 44:00 - 44:02
    say if you would go get the data, it'd be really
  • 44:02 - 44:05
    easy for me to do my job then. If I just took all their
  • 44:05 - 44:08
    data [inaudible]. Well, being
  • 44:08 - 44:12
    the kind of lazy person that I am, it's likely that the ten people I recruit would have similar lazy qualities
  • 44:12 - 44:15
    because lazy people hang out with other lazy people.
  • 44:15 - 44:18
    And so the person who was in charge of A, B, C, the first thing they do is turn around and find ten
  • 44:18 - 44:22
    more friends, and then they divide it up and say could you do the
  • 44:22 - 44:23
    AA through AM and so on.
  • 44:23 - 44:25
    If they divide it into these pools of
  • 44:25 - 44:30
    one tenth of what they were responsible for, and say you can go get the information from these people,
  • 44:30 - 44:31
    and if they did the same thing
  • 44:31 - 44:35
    -so if everybody along the road. We started with 10,000. Now each person had 1,000
  • 44:35 - 44:39
    to survey. They asked their friend to do 100. Their friend asked ten people to do
  • 44:39 - 44:39
    ten.
  • 44:39 - 44:43
    And then at some point, the person who has ten says well I just need to ask these ten people.
  • 44:43 - 44:45
  • 44:45 - 44:47
    Once I get their data, we don't need to do anything further.
  • 44:47 - 44:50
    So at some point the problem becomes so small, so simple,
  • 44:50 - 44:55
    even though it was kind of the same problem all along. I just reproduced the same problem we had, but in
  • 44:55 - 44:58
    a slightly more tractable form, but then I divided it around. Divide and conquer
  • 44:58 - 45:00
    sometimes they call this to where
  • 45:00 - 45:01
    I spread out the work
  • 45:01 - 45:05
    around a bunch of people to where each person's contribution is just a little part of the whole.
  • 45:05 - 45:09
    You had to find the ten volunteers around underneath you and
  • 45:09 - 45:11
    get their
  • 45:11 - 45:14
    help in solving the problem, but nobody had to do much work, and that's kind of a really
  • 45:14 - 45:18
    interesting way to solve a problem. It sounds like a very big problem of surveying
  • 45:18 - 45:19
    10,000 people,
  • 45:19 - 45:23
    but by dividing and conquer, everybody has a little tiny role to play. You leverage a lot
  • 45:23 - 45:24
    of people getting it done,
  • 45:24 - 45:29
    and there is this self-similarity to it, which is kind of intriguing that
  • 45:29 - 45:34
    everybody is trying to solve the same problem but just at different levels of scale.
  • 45:34 - 45:36
    And so this idea
  • 45:36 - 45:38
    applies to things like phone trees rather than
  • 45:38 - 45:42
    trying to get the message out to everybody in my class, it might be that I call two people who call two
  • 45:42 - 45:45
    friends who call two friends until everybody gets covered.
  • 45:45 - 45:48
    Nobody does the whole job. Everybody just does a little part of it. Sometimes you'll see these
  • 45:48 - 45:50
    fractal drawings where
  • 45:50 - 45:55
    there is a large leaf which when you look closer actually consists of littler leaves,
  • 45:55 - 45:57
    which themselves are littler leaves, so that at
  • 45:57 - 45:58
    every level of scale
  • 45:58 - 46:01
    the same image is being reproduced,
  • 46:01 - 46:05
    and the result kind of on the outside is something that in itself if you look deeper has the
  • 46:05 - 46:08
    same problem but smaller embedded in it.
  • 46:08 - 46:10
    So it's a pretty neat
  • 46:10 - 46:11
    sort of
  • 46:11 - 46:13
    way of solving things.
  • 46:13 - 46:14
    I am
  • 46:14 - 46:18
    gonna tell you a little bit about how the code looks, and then I really am not gonna be able to show you much code today.
  • 46:18 - 46:23
    I think it's actually even better to show you this code on Monday when we can come back fresh,
  • 46:23 - 46:27
    but that it involves taking -So we're looking at functional recursion first,
  • 46:27 - 46:29
    and functional recursion is
  • 46:29 - 46:33
    taking some sort of functions, a function that takes an input and returns an answers, returns
  • 46:33 - 46:36
    non-void thing that comes back.
  • 46:36 - 46:39
    And we're gonna be looking at problems where you're trying to solve this big problem,
  • 46:39 - 46:44
    and that if you had the answer to making a call to yourself on a smaller version of
  • 46:44 - 46:45
    the problem
  • 46:45 - 46:48
    -maybe one call, maybe two calls, maybe several calls
  • 46:48 - 46:52
    -that you could add those, or multiply those, or combine those in some way to answer the
  • 46:52 - 46:53
    bigger problem. So
  • 46:53 - 46:56
    if I were trying to solve
  • 46:56 - 47:03
    this campus survey, having the answers to these smaller campus surveys gives me that total result. And
  • 47:03 - 47:05
    so the -I really should not try to do this in two minutes. What
  • 47:05 - 47:09
    I should do is try to tell you on Monday. We'll come back and we'll talk about recursion, and it
  • 47:09 - 47:11
    will be an impressive week. Meanwhile, work on
  • 47:11 - 47:14
    your ADT homework.
Title:
Lecture 7 | Programming Abstractions (Stanford)
Description:

Lecture 7 by Julie Zelenski for the Programming Abstractions Course (CS106B) in the Stanford Computer Science Department.

Julie explains the idea of functions as data and specific plot functions and continues onto client feedback functions and ADTs. She then delves into recursion and solving problems using recursion.

Complete Playlist for the Course:
http://www.youtube.com/view_play_list?p=FE6E58F856038C69

CS 106B Course Website:
http://cs106b.stanford.edu

Stanford Center for Professional Development:
http://scpd.stanford.edu/

Stanford University:
http://www.stanford.edu/

Stanford University Channel on YouTube:
http://www.youtube.com/stanforduniversity/

more » « less
Video Language:
English
Duration:
47:32
N. Ueda edited English subtitles for Lecture 7 | Programming Abstractions (Stanford)
Eunjeong_Kim added a translation

English subtitles

Revisions