Count II (15 mins)

  • 0:00 - 0:06
    So up to now, we've looped over data and
    written if statements to count one thing
  • 0:06 - 0:10
    or another. But really what you want to do
    is be able to count multiple things, so
  • 0:10 - 0:14
    that you can compare them, like are there
    more boys or girls, or whatever, that have
  • 0:14 - 0:18
    some quality. So that's what we're going
    to do in this section. So in order to do
  • 0:18 - 0:22
    this basically what I want to do is just
    have multiple counter variables. So
  • 0:22 - 0:26
    instead of just having count, have a few
    of them. So let me just show you in this
  • 0:26 - 0:31
    code section. So let's say I wanna go
    through and count. My goal is to find out: do
  • 0:31 - 0:35
    more boy or girl names end in Y, like who
    knows the answer? And we'll use the
  • 0:35 - 0:40
    computer. So what I'm gonna do is
    introduce two count variables. So outside
  • 0:40 - 0:44
    the loop, whereas before I just said count
    equals zero I'm just gonna use the simple
  • 0:44 - 0:49
    form of calling my variables count1,
    count2, and so on. Pretty unimaginative,
  • 0:49 - 0:53
    but it's simple. So, in this
    case I'm gonna have two variables. So I'll
  • 0:53 - 0:58
    say count1 equals zero and count2 equals
    zero. And my intention is that, well, in
  • 0:58 - 1:02
    count1 I'll keep track of the boy case,
    and in count2 I'll keep track of the girl
  • 1:02 - 1:07
    case. So, inside the loop, this looks very
    much like what we did before. So, I have
  • 1:07 - 1:11
    an if test where I'm looking for rows
    where the name ends with Y, and the gender
  • 1:11 - 1:15
    is boy. And so when that's true, I'll bump
    up count one. So, count one being sort of
  • 1:15 - 1:20
    a boy counter. And then I have, following
    it, I have a very similar if statement,
  • 1:20 - 1:24
    that looks for any name ending in Y, but
    I'm looking for a gender that equals girl.
  • 1:25 - 1:29
    And in that case, I'll bump up count two.
    So this, so the loop runs, and it's just
  • 1:29 - 1:34
    going to be counting up the girl, it's,
    you know, at the same time, it's counting
  • 1:34 - 1:39
    up the boy and girl cases. And then when
    it gets to the end here. I'll just run it.
  • 1:39 - 1:44
    Then it just says boy count colon count
    one and girl count, count two. So we see
  • 1:44 - 1:48
    it turns out that more girl
    names end in Y than boy names, or
  • 1:48 - 1:53
    whatever it is. Obviously with this
    formula you could test any number of
  • 1:53 - 1:57
    things. One thing I should point out is
    that, this is gonna be sort of our
  • 1:57 - 2:02
    official, class format for how complicated
    I wanna make things. So, I've got the one
  • 2:02 - 2:06
    loop. And then I could have, you know,
    generally just two or three variables. But
  • 2:06 - 2:09
    any number of variables, I set them to
    zero right here. And then for each one, I
  • 2:09 - 2:13
    have an if statement. And I wanna point
    out, the if statements are one after
  • 2:13 - 2:16
    another. And, in fact, the order doesn't
    really matter. What I would point out
  • 2:16 - 2:20
    is, the if statements are not inside of
    each other. I'm not gonna do that. That's
  • 2:20 - 2:24
    more complicated. We can do
    perfectly interesting work just sticking
  • 2:24 - 2:28
    with this form. Alright. So let me just
    try a natural extension of this, so
  • 2:28 - 2:32
    we'll just go to three variables, I'll
    just show you how that works. So, for
  • 2:32 - 2:37
    three variables, I'm just gonna stick with
    my trivial naming convention. Count one,
  • 2:37 - 2:41
    count two, count three. So I set these
    three variables to zero outside the loop.
  • 2:42 - 2:46
    And in this case, the question I wanna ask,
    or answer, is: do more names end in A
  • 2:46 - 2:50
    or I or O? Who knows. So I've got the
    three counters and I'll use count one to
  • 2:50 - 2:54
    count the A case, and count two for the I
    case, and count three for the O case. So
  • 2:54 - 2:59
    here's the sort of obvious if statement.
    If name ends with A, and then [inaudible]
  • 2:59 - 3:03
    count one equals count one plus one. And
    then likewise there is an if statement for
  • 3:03 - 3:07
    the I case that bumps up count two, and
    an if statement for the O case that bumps
  • 3:07 - 3:11
    up count three. And here I've got these
    three print statements, outside the loop.
  • 3:11 - 3:15
    So these run right after the loop has completed,
    so it's bumped up all the counters to
  • 3:15 - 3:20
    whatever they're going to be, and then we
    just print them out. So it's trapped. Huh.
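
The three-counter pattern described above can be sketched in Python. This is only an illustration under stated assumptions: the course's own code and table API are not shown in this transcript, and the list of names below is made up rather than taken from the baby-name data.

```python
# Made-up sample names for illustration; not the course's data set.
names = ["emma", "olivia", "levi", "mario", "sofia", "naomi", "hugo", "jack"]

# One counter per case, all set to zero before the loop.
count1 = 0  # names ending in "a"
count2 = 0  # names ending in "i"
count3 = 0  # names ending in "o"

for name in names:
    # The if statements sit one after another, not nested inside each other.
    if name.endswith("a"):
        count1 = count1 + 1
    if name.endswith("i"):
        count2 = count2 + 1
    if name.endswith("o"):
        count3 = count3 + 1

# The prints run once, after the loop has finished.
print("a count:", count1)
print("i count:", count2)
print("o count:", count3)
```

Each if statement bumps its own counter, which is exactly the spot where a copy-paste slip (bumping count1 in the second test, say) would go unnoticed.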
  • 3:20 - 3:25
    [inaudible]. So A. Totally dominates. 377
    in or just like, yeah, whatever. Thanks
  • 3:25 - 3:32
    for playing. Nice try. Just a little bit
    of a stylistic thing I'll point out here. My
  • 3:32 - 3:37
    naming convention here is, I mean, it's
    kind of lame, you know, just one, two,
  • 3:37 - 3:41
    three. Another way we could have done this
    is, in this case we could have called this
  • 3:41 - 3:45
    one count A and count I. I like this,
    count A and count I. So, it would be more
  • 3:45 - 3:49
    mnemonic of what it was counting. But
    then it has the disadvantage of whenever
  • 3:49 - 3:54
    you copy pasted or switched from one
    example to another you would have to
  • 3:54 - 3:58
    remember to rename. So, I decided to go
    with this very trivial simple just one,
  • 3:58 - 4:02
    two, three scheme but we could have done
    something more complicated there. The
  • 4:02 - 4:05
    other thing I'll point out is that
    it's natural to use copy-paste
  • 4:05 - 4:09
    for these so you kind of get your first
    case working and then you [inaudible].
  • 4:09 - 4:12
    However, when you do that there's this very
    natural error where you have to be very
  • 4:12 - 4:15
    careful that you're manipulating the right
    variable. So that in this "if" statement
  • 4:15 - 4:19
    I'm manipulating count one and then in
    this "if" statement count two, and so on.
  • 4:19 - 4:22
    That's the sort of thing that
    happening to [inaudible]. Might help with,
  • 4:22 - 4:25
    but it's, that is, no matter what you're
    doing it's a common error. So you just
  • 4:25 - 4:29
    have to be a little bit careful about
    that. Right. So now that we've got the
  • 4:29 - 4:33
    ability to count multiple things, I wanna
    kinda expand our data set a little bit. So
  • 4:34 - 4:38
    I did this survey, which actually I didn't
    bring up here. Um, so in Google
  • 4:38 - 4:42
    spreadsheets, since it's all, everything
    is just intrinsically online, it works
  • 4:42 - 4:46
    really well for sharing data and doing
    easy stuff. Um, so it has this feature,
  • 4:46 - 4:50
    where you can put up a little, sort of,
    form survey in front of people. So I made
  • 4:50 - 4:54
    this trivial survey where I ask gender,
    and favorite color, and favorite TV show
  • 4:54 - 4:59
    that's on, and I just sent this out to my
    class. Um, and the way it works is every
  • 4:59 - 5:03
    time someone submitted a set of answers,
    and this is anonymous, it would go into the
  • 5:03 - 5:08
    spreadsheet. And so this is a Google Docs
    spreadsheet, and what you see is,
  • 5:08 - 5:13
    it's a table. So here's a column for
    gender, and here's a column for color. And
  • 5:13 - 5:17
    these are just the answers. And what you see is
    that every time someone types in an answer
  • 5:17 - 5:22
    to the survey that just goes in as one
    row. And so we have data to sort of play
  • 5:22 - 5:26
    around with. We have favorite color,
    favorite TV show, favorite book, and what
  • 5:26 - 5:31
    not. What I found is that it's easiest to
    do stuff with color, and sport, and,
  • 5:31 - 5:36
    favorite soda drink, 'cause there's enough
    repetition there. If you look at book,
  • 5:36 - 5:40
    there's, there's just so many books
    published that there, you know, most books
  • 5:40 - 5:44
    just appear once. Anyway this is
    interesting data to look at, just to see
  • 5:44 - 5:49
    what's going on with people who are, I
    guess about twenty years old in 2012. So,
  • 5:49 - 5:53
    from Google spreadsheets you can export
    that data in CSV format. I think I
  • 5:53 - 5:57
    mentioned that before. It's a really
    common interchange format, and I just
  • 5:57 - 6:02
    cleaned up the data a little bit so I
    removed dots. There was a problem where
  • 6:02 - 6:06
    people would type in Dr. Pepper either
    with a dot after the first R or not, so I
  • 6:06 - 6:11
    just removed all dots, but other than that the
    data just looks like whatever people typed
  • 6:11 - 6:16
    in. So with that data we can do all
    sorts of interesting problems, so here
  • 6:16 - 6:21
    I've set some up. So this data is
    available as survey-2012.csv, so we could
  • 6:21 - 6:25
    load that into a table. There's a function
    I haven't talked about before that the table
  • 6:25 - 6:30
    has called convert to lowercase. What that
    does is it goes through the table and it
  • 6:30 - 6:34
    modifies all the text to just be lowercase
    letters. So, for example if we want to
  • 6:34 - 6:39
    count, oh, well how many people have blue
    as their favorite color. Well there's this
  • 6:39 - 6:42
    problem that did they type, upper case B
    blue, or lower case, or you know, all
  • 6:42 - 6:46
    lowercase or whatever. So by calling this
    function, all the data is now
  • 6:46 - 6:50
    gonna be lowercase. So we just don't have
    to think about that variation of what
  • 6:50 - 6:54
    people typed. So, I'm gonna do that as a
    simplification here. Alright. So let me
  • 6:54 - 6:58
    look at, so there's some sample problems
    here, and as usual we've got the,
  • 6:58 - 7:02
    solutions available, so I'll just
    try this out. So it says write code to
  • 7:02 - 7:06
    print the soda field of each row. So,
    what this, what I'm gonna do here. I could
  • 7:06 - 7:10
    just print the whole row but it, it's so
    much data it doesn't make a lot of sense.
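
The lowercase step described above can be sketched in Python. This is a sketch under stated assumptions, not the course's own code: the inline CSV text and the helper name `load_lowercase` are hypothetical, standing in for the exported survey file and the table's convert-to-lowercase function.

```python
import csv
import io

# Hypothetical stand-in for the exported survey file; the rows are made up.
SAMPLE_CSV = """gender,color,soda
Female,Blue,Sprite
male,blue,Coke
Male,BLUE,Dr Pepper
female,Green,coke
"""


def load_lowercase(text):
    """Read CSV text into a list of dicts, lowercasing every value."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({field: value.lower() for field, value in row.items()})
    return rows


rows = load_lowercase(SAMPLE_CSV)

# Because every value is lowercase now, one comparison covers
# "Blue", "blue", and "BLUE" alike.
count = 0
for row in rows:
    if row["color"] == "blue":
        count = count + 1

print("blue count:", count)
```

Normalizing once at load time keeps each if test down to a single plain comparison.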
  • 7:10 - 7:15
    But what could be interesting is, suppose
    you were curious about what people put for
  • 7:15 - 7:19
    the soda data. What you could say is, get
    field, and you need to know what the names
  • 7:19 - 7:23
    of the fields are. They're printed
    [inaudible] somewhere up here. Anyway the
  • 7:23 - 7:28
    name of the field that has the soda drink
    answer is soda. So I'm just gonna print
  • 7:28 - 7:34
    those. And I'm not gonna count anything.
    Now we'll comment that out. So if I just
  • 7:34 - 7:38
    print that, what we get is it just goes
    through all the rows. And, and remember,
  • 7:38 - 7:42
    now, it's all lower case, and we can just
    kinda see what's there. This is maybe a
  • 7:42 - 7:46
    good first step if you sorta wanna see,
    oh, well, what things seem to come up
  • 7:46 - 7:51
    [inaudible]? Or just, kind of, if you're
    just curious, about TV shows or movies or
  • 7:51 - 7:55
    whatever. [inaudible]. This also shows, I
    guess, ultimately that row.getField really
  • 7:55 - 8:00
    does just return a string. And so, print
    understands strings, so if we just put it
  • 8:00 - 8:04
    there, [inaudible]. Alright, so what I'd
    like to do is, count the favorite sodas.
  • 8:04 - 8:09
    So I wanna say, I'm just gonna say Sprite,
    Dr. Pepper and Coke. Lets count those
  • 8:09 - 8:14
    three. So what I'm going to do is, following
    my previous strategy, I'll say count
  • 8:14 - 8:20
    one equals zero, count two equals zero,
    and count three equals zero. So my
  • 8:20 - 8:25
    intention is that I'll, you know, I'll
    follow, whatever the order is, what I say.
  • 8:26 - 8:32
    Sprite, Dr. Pepper and Coke. So what do I
    wanna say? If row dot get field soda is
  • 8:32 - 8:37
    equal to Sprite. And then, right, I can
    just write it in lower case letters,
  • 8:37 - 8:44
    'cause I know that it has been changed. So
    I wanna say, count one is equal to count
  • 8:44 - 8:52
    one plus one. So that's, that's counting
    one drink. We'll just try it. So this is
  • 8:52 - 8:59
    count Sprite. Print count one, and I'm
    gonna, I'm gonna get rid of this line.
  • 8:59 - 9:06
    I'm, I'm not gonna print all the sodas as
    we go. Okay, let's try that one. Okay.
  • 9:06 - 9:12
    Sprite eight, that seems to work. So I
    wanna check it, because then I'm gonna do
  • 9:12 - 9:19
    some merciless copy-paste. So then I'm
    gonna count Dr. Pepper. And I'm gonna
  • 9:19 - 9:25
    manipulate count two in that case. And I'm
    gonna count Coke. And I'm gonna use count
  • 9:25 - 9:30
    three. So this is the case I was saying,
    where you have to be careful. Copy-paste is
  • 9:30 - 9:35
    great, but you gotta make sure you're
    doing the right thing. And here I'll
  • 9:35 - 9:42
    make two copies of this line. So then this
    is gonna be Dr. Pepper, count two. And
  • 9:42 - 9:49
    Coke is count three. I have to say, there
    was one other cleanup I did in the data.
  • 9:49 - 9:53
    [inaudible], ... Okay. That looks
    right. One other is how people spelled Coke.
  • 9:53 - 9:57
    Sometimes they wrote Coca Cola, with a
    dash, or not, or whatever, so I just
  • 9:57 - 10:01
    changed those all to Coke. So. This is a
    different data set, but this is following
  • 10:01 - 10:06
    kind of my earlier example of counting
    three things. So it should do well. So the
  • 10:06 - 10:10
    case I'd like to, the complexity I'd like
    to work out is, like, well, you know,
  • 10:10 - 10:14
    really, if you look at the data, sometimes
    people would say Coke, and sometimes they
  • 10:14 - 10:18
    would say Diet Coke. And sometimes they
    would say Dr. Pepper, and sometimes they
  • 10:18 - 10:22
    would say Diet Dr. Pepper. So how could I
    get count two, let's say, to include both.
  • 10:22 - 10:28
    I want to include Dr. Pepper and also Diet
    Dr. Pepper. And the way to make it more
  • 10:28 - 10:38
    inclusive there is to use the or. Oops.
    So I'm going to say, or. Grab this. Just
  • 10:38 - 10:45
    as we've done before. I'll say, well, so
    this is Dr. Pepper, or Diet Dr.
  • 10:45 - 10:55
    Pepper. And do the same thing for this.
    [inaudible] Be on this. So I'm gonna say
  • 10:55 - 11:05
    Diet Sprite. Oops, [inaudible]. Diet Dr.
    Pepper and here we'll say or Diet Coke.
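
The or-based counting described above can be sketched in Python. This is a minimal sketch, assuming rows are plain dictionaries with a lowercase `soda` field; the rows are made up and do not reproduce the survey counts mentioned in the video. Dots are left out of "dr pepper" to match the cleanup described earlier.

```python
# Made-up rows for illustration; each row has an already-lowercased soda field.
rows = [
    {"soda": "sprite"},
    {"soda": "dr pepper"},
    {"soda": "diet dr pepper"},
    {"soda": "coke"},
    {"soda": "diet coke"},
    {"soda": "water"},
    {"soda": "coke"},
]

count1 = 0  # sprite or diet sprite
count2 = 0  # dr pepper or diet dr pepper
count3 = 0  # coke or diet coke

for row in rows:
    soda = row["soda"]
    # "or" widens each test so the diet variant counts as well.
    if soda == "sprite" or soda == "diet sprite":
        count1 = count1 + 1
    if soda == "dr pepper" or soda == "diet dr pepper":
        count2 = count2 + 1
    if soda == "coke" or soda == "diet coke":
        count3 = count3 + 1

print("sprite:", count1)
print("dr pepper:", count2)
print("coke:", count3)
```

Rows that match none of the tests, like "water" here, simply leave all three counters unchanged.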
  • 11:05 - 11:13
    Alright, so without diet it was eight four
    eight. Assuming this code is correct,
  • 11:13 - 11:22
    let's read it. So we see it was eight four
    eight. So, then Dr. Pepper [inaudible]. No
  • 11:22 - 11:26
    one drinks Diet Sprite in my class,
    apparently. Or likes it best. So Dr.
  • 11:26 - 11:31
    Pepper went up from four to seven.
    So it about doubled. And actually, Coke
  • 11:31 - 11:35
    also about doubled. So I guess we've
    we've learned something a little bit
  • 11:35 - 11:39
    there. That, for those drinks, diet
    represents about half. So I also obviously
  • 11:39 - 11:43
    like this example, 'cause now we're sort
    of combining multiple techniques. That
  • 11:43 - 11:47
    we're doing the counting, and then we're
    doing things like using or, and, or
  • 11:47 - 11:52
    whatever on the if test that's controlling
    the count inside the loop. Okay, so let me
  • 11:52 - 11:57
    try, these are kind of, you know,
    non-trivial examples. Let me try another
  • 11:57 - 12:03
    one. Let's try, let's try another one of
    the fields. Let's try the sport field. So
  • 12:03 - 12:09
    this is where there were a bunch of
    different sports identified, but I'm going
  • 12:09 - 12:14
    to look at soccer and
    football, which were common ones identified. So
  • 12:14 - 12:19
    I'll just use count one and count two. So
    I'll say if sport is equal to soccer,
  • 12:19 - 12:25
    let's say we'll do that on count
    one, and football is the
  • 12:25 - 12:30
    other one, so that one we'll do in count
    two. So here we'll say soccer and
  • 12:30 - 12:36
    football. And this count three, I'm just
    gonna stop doing. Okay, so we're gonna, so
  • 12:36 - 12:42
    this should just go through, and count how
    many times soccer was the sport that was
  • 12:42 - 12:48
    named. And how many times football, and
    then we'll see if the assumptions work
  • 12:48 - 12:54
    out. So you see, we've got soccer seven,
    football twelve. So football's pretty
  • 12:54 - 13:00
    significantly ahead there. So the last thing
    we want to try here is, well, we also
  • 13:00 - 13:06
    have this gender data. So what I wanna do
    is let's just take this seven number for
  • 13:06 - 13:11
    soccer and let's break it down by gender.
    And so when I say I can break it
  • 13:11 - 13:16
    down, what I am doing is like let's have
    one counter for women playing soccer and
  • 13:16 - 13:21
    one counter for men playing soccer. So
    let's say count one will be women playing
  • 13:21 - 13:26
    soccer. So what do I do here?
    This is gonna be an and. So, I
  • 13:26 - 13:32
    say, well, if it's row dot get field
    gender. And then you just
  • 13:32 - 13:39
    have to know how the data is coded. In
    this case, the data is coded with female as
  • 13:39 - 13:47
    the word used for that. So, count one
    is gonna be answers where the row is
  • 13:47 - 13:55
    also where they said [inaudible] and here
    I'll say soccer and male. And so that will
  • 13:55 - 14:04
    go into count two. So here I'll say soccer
    and female. And then here I'll say soccer,
  • 14:04 - 14:11
    oops, M. Okay. So, let's try that. So,
    soccer without looking at gender, it was
  • 14:11 - 14:16
    seven. So now if you look at it you'll see
    there's actually this huge gender gap. So I
  • 14:16 - 14:20
    don't know if maybe there's just a lot of
    women's soccer team people in my class
  • 14:20 - 14:24
    this quarter, or who knows. But anyway,
    yeah. So of those seven, six were
  • 14:24 - 14:29
    female, and there was one male who
    identified soccer as their favorite sport
  • 14:29 - 14:33
    [inaudible]. So, this is just another
    example of using, you know, combining, the
  • 14:33 - 14:38
    counting with and, and or. My previous one
    was or. This one, I used and. So, that's,
  • 14:38 - 14:42
    that's sort of as complicated as I wanna
    get with this table data. I think it has a
  • 14:42 - 14:46
    nice very kind of realistic feeling that
    you have a data set. And then you just,
  • 14:46 - 14:50
    the computer rips over it with a little
    bit of this logic that you write. And then
  • 14:50 - 14:54
    eventually just comes up with a couple
    numbers, that, you know, are gonna help
  • 14:54 - 14:58
    you analyze what's going on. So that's a
    very realistic way that computers are
  • 14:58 - 15:01
    used, and an excellent format for writing
    exercises.
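
The gender breakdown described above, where each test combines two conditions with and, can be sketched in Python. This is a sketch under stated assumptions: rows are plain dictionaries with lowercase `sport` and `gender` fields, the data is made up, and the coding of gender as "female" and "male" follows what the video says about the survey.

```python
# Made-up rows for illustration; not the survey results from the video.
rows = [
    {"sport": "soccer", "gender": "female"},
    {"sport": "soccer", "gender": "male"},
    {"sport": "football", "gender": "male"},
    {"sport": "soccer", "gender": "female"},
    {"sport": "football", "gender": "female"},
    {"sport": "tennis", "gender": "male"},
]

count1 = 0  # soccer and female
count2 = 0  # soccer and male

for row in rows:
    # "and" narrows each test: both conditions must hold for the row to count.
    if row["sport"] == "soccer" and row["gender"] == "female":
        count1 = count1 + 1
    if row["sport"] == "soccer" and row["gender"] == "male":
        count2 = count2 + 1

print("soccer and female:", count1)
print("soccer and male:", count2)
```

As long as every soccer row is coded as one of the two gender values, the two counters add up to the overall soccer count, which is a quick sanity check on the breakdown.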
Title:
Count II (15 mins)
Video Language:
English