-
So up to now, we've looped over data and
written if statements to count one thing
-
or another. But really what you want to do
is be able to count multiple things, so
-
that you can compare them, like are there
more boys or girls who have
-
some quality. So that's what we're going
to do in this section. So in order to do
-
this basically what I want to do is just
have multiple counter variables. So
-
instead of just having count, have a few
of them. So let me just show you in this
-
code section. So let's say I wanna go
through and count. My goal is: do
-
more boy or girl names end in Y? Like, who
knows the answer? Well, we'll use the
-
computer. So what I'm gonna do is
introduce two count variables. So outside
-
the loop, whereas before I just said count
equals zero, I'm just gonna use the simple
-
form of calling my variables count1,
count2, and so on. Pretty unimaginative,
-
but it's simple. So in this
case I'm gonna have two variables. So I'll
-
say count1 equals zero and count2 equals
zero. And my intention is that, well, in
-
count1 I'll keep track of the boy case,
and in count2 I'll keep track of the girl
-
case. So, inside the loop, this looks very
much like what we did before. So, I have
-
an if test where I'm looking for rows
where the name ends with Y, and the gender
-
is boy. And so when that's true, I'll bump
up count one. So, count one being sort of
-
a boy counter. And then I have, following
it, I have a very similar if statement,
-
that looks for any name ending in Y, but
I'm looking for a gender that equals girl.
-
And in that case, I'll bump up count two.
So the loop runs, and,
-
you know, at the same time, it's counting
-
up the boy and girl cases. And then when
it gets to the end here... I'll just run it.
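As a rough sketch of the two-counter loop just described, here is a Python version. The lecture's own code and its baby-name table are not shown in this transcript, so the `rows` list, its field names, and the resulting counts are made-up stand-ins, not the real data.

```python
# Hypothetical sample rows standing in for the baby-name table.
rows = [
    {"name": "Emily", "gender": "girl"},
    {"name": "Anthony", "gender": "boy"},
    {"name": "Lily", "gender": "girl"},
    {"name": "Jacob", "gender": "boy"},
    {"name": "Avery", "gender": "girl"},
]

# Both counters are set to zero outside the loop.
count1 = 0  # boy names ending in "y"
count2 = 0  # girl names ending in "y"

for row in rows:
    # Two separate if statements, one after another, not nested.
    if row["name"].endswith("y") and row["gender"] == "boy":
        count1 = count1 + 1
    if row["name"].endswith("y") and row["gender"] == "girl":
        count2 = count2 + 1

# The prints run once, after the loop has finished.
print("boy count:", count1)
print("girl count:", count2)
```

Each row is checked by both if statements, so the two counts build up in the same pass over the data.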
-
Then it just prints "boy count:", count
one, and "girl count:", count two. So we see
-
it turns out more girl
names end in Y than boy names, or
-
whatever it is. Obviously with this
formula you could test any number of
-
things. One thing I should point out is
that, this is gonna be sort of our
-
official, class format for how complicated
I wanna make things. So, I've got the one
-
loop. And then I could have, you know,
generally just two or three variables. But
-
any number of variables, I set them to
zero right here. And then for each one, I
-
have an if statement. And I wanna point
out, the if statements are one after
-
another. And, in fact, the order doesn't
really matter. What I would point
-
out is, the if statements are not inside of
each other. I'm not gonna do that. That's
-
more complicated. We can do
perfectly interesting work just sticking
-
with this form. Alright. So let me just
try a natural extension of this, so
-
we'll just go to three variables, I'll
just show you how that works. So, for
-
three variables, I'm just gonna stick with
my trivial naming convention. Count one,
-
count two, count three. So I set these
three variables to zero outside the loop.
-
And in this case, the question I wanna ask,
or answer, is: do more names end in A
-
or I or O? Who knows. So I've got the
three counters and I'll use count one to
-
count the A case, and count two for the I
case, and count three for the O case. So
-
here's the sort of obvious if statement.
If name ends with A, and then [inaudible]
-
count one equals count one plus one. And
then likewise there is an if statement for
-
the I case that bumps up count two, and
an if statement for the O case that bumps
-
up count three. And here I've got these
three print statements, outside the loop.
-
So these run right after the loop has completed,
so it's bumped up all the counters to
-
whatever they're going to be, and then we
just print them out. So let's try it. Huh.
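The three-counter version follows the same shape. This is a minimal Python sketch, assuming a small made-up list of names in place of the lecture's table, so the numbers it prints are illustrative only.

```python
# Hypothetical names standing in for the "name" field of each row.
names = ["Emma", "Sophia", "Naomi", "Leo", "Jacob", "Olivia"]

# Three counters, all set to zero outside the loop.
count1 = 0  # names ending in "a"
count2 = 0  # names ending in "i"
count3 = 0  # names ending in "o"

for name in names:
    # One if statement per counter; the order does not matter.
    if name.endswith("a"):
        count1 = count1 + 1
    if name.endswith("i"):
        count2 = count2 + 1
    if name.endswith("o"):
        count3 = count3 + 1

# Printed only after the loop has bumped the counters to their final values.
print("a count:", count1)
print("i count:", count2)
print("o count:", count3)
```

When copy-pasting an if statement to make the next case, the line to double-check is the one that updates the counter, since it is easy to leave `count1` in a block that should update `count2`.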
-
[inaudible]. So A totally dominates: 377.
I and O are just like, yeah, whatever. Thanks
-
for playing. Nice try. Just a little
stylistic thing I'll point out here. My
-
naming convention here is, I mean, it's
kind of lame, you know, just one, two,
-
three. Another way we could have done this
is, in this case we could have called this
-
one count A and count I. Like this,
count A and count I. So, it would be more
-
mnemonic of what it was counting. But
then it has the disadvantage of whenever
-
you copy-pasted or switched from one
example to another, you would have to
-
remember to rename. So, I decided to go
with this very trivial, simple, just one,
-
two, three scheme, but we could have done
something more complicated there. The
-
other thing I'll point out is that
it's natural to use copy-paste
-
for these so you kind of get your first
case working and then you [inaudible].
-
However you do that, there's this very
natural error, where you have to be very
-
careful that you're manipulating the right
variable. So that in this "if" statement
-
I'm manipulating count one and then in
this "if" statement count two, and so on and
-
so on. That's the sort of thing that
happening to [inaudible]. Might help with,
-
but it's, that is, no matter what you're
doing it's a common error. So you just
-
have to be a little bit careful about
that. Right. So now that we've got the
-
ability to count multiple things, I wanna
kinda expand our data set a little bit. So
-
I did this survey, which actually I didn't
bring up here. Um, so in Google
-
spreadsheets, since it's all, everything
is just intrinsically online, it works
-
really well for sharing data and doing
easy stuff. Um, so it has this feature,
-
where you can put up a little, sort of,
form survey in front of people. So I made
-
this trivial survey where I ask gender,
and favorite color, and favorite TV show
-
that's on, and I just sent this out to my
class. Um, and the way it works is every
-
time someone submitted a set of answers,
and this is anonymous, it would go into the
-
spreadsheet. And so this is a Google Docs
spreadsheet, and what you see is
-
it's a table. So here's a column for
gender, and here's a column for color. And
-
these are just the answers. And what you see is
that every time someone types in an answer
-
to the survey that just goes in as one
row. And so we have data to sort of play
-
around with. We have favorite color,
favorite TV show, favorite book, and what
-
not. What I found is that it's easiest to
do stuff with color, and sport, and,
-
favorite soda drink, 'cause there's enough
repetition there. If you look at book,
-
there's, there's just so many books
published that there, you know, most books
-
just appear once. Anyway this is
interesting data to, look at just to see
-
what's going on with people who are, I
guess about twenty years old in 2012. So,
-
from Google spreadsheets you can export
that data in CSV format. I think I
-
mentioned that before. It's a really
common interchange format, and I just
-
cleaned up the data a little bit so I
removed dots. There was a problem where
-
people would type in Dr. Pepper either
with a dot after the first R or not, so I
-
just removed all dots, but other than that the
data just looks like whatever people typed
-
in. So with that data we can do all
sorts of interesting problems, so here
-
I've set some up. So this data is
available as survey-2012.csv, so we could
-
load that into a table. There's a function
I haven't talked about before that the table
-
has, called convert to lowercase. What that
does is it goes through the table and it
-
modifies all the text to just be lowercase
letters. So, for example if we want to
-
count, oh, well how many people have blue
as their favorite color. Well there's this
-
problem: did they type upper case B
blue, or lower case, or, you know, all
-
lowercase or whatever? So by calling this
function, all the data is now
-
gonna be lowercase. So we just don't have
to think about that variation of what
-
people typed. So, I'm gonna do that as a
simplification here. Alright. So let me
-
look at, so there's some sample problems
here, and as usual we've got the,
-
solutions available, so I'll just
try this out. So it says: write code to
-
print the soda field of each row. So,
what I'm gonna do here: I could
-
just print the whole row but it, it's so
much data it doesn't make a lot of sense.
-
But what could be interesting is, suppose
you were curious about what people put for
-
the soda data. What you could say is, get
field, and you need to know what the names
-
of the fields are. They're printed
[inaudible] somewhere up here. Anyway the
-
name of the field with the soda drink
answer is soda. So I'm just gonna print
-
those. And I'm not gonna count anything.
Now we'll comment that out. So if I just
-
print that, what we get is it just goes
through all the rows. And, and remember,
-
now, it's all lower case, and we can just
kinda see what's there. This is maybe a
-
good first step if you sorta wanna see,
oh, well, what things seem to come up
-
[inaudible]? Or just, kind of, if you're
just curious, about TV shows or movies or
-
whatever. [inaudible]. This also shows, I
guess, ultimately that row.getField really
-
does just return a string. And so, print
understands strings, so if we just put it
-
there, [inaudible]. Alright, so what I'd
like to do is, count the favorite sodas.
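Before the counting starts, the loading, lowercasing, and field-printing steps described above can be sketched in Python. This is not the class's table code: the `load_lowercase` helper, the inline `SURVEY_CSV` text, and its column layout are hypothetical stand-ins, and only the `soda` and `gender` field names come from the lecture.

```python
import csv
import io

# Hypothetical stand-in for the exported survey data (dots already removed).
SURVEY_CSV = """gender,color,soda,sport
Female,Blue,Sprite,Soccer
Male,blue,Dr Pepper,Football
Female,Green,Diet Coke,soccer
"""

def load_lowercase(text):
    """Parse CSV text into a list of dicts, lowercasing every value."""
    reader = csv.DictReader(io.StringIO(text))
    return [{field: value.lower() for field, value in row.items()}
            for row in reader]

rows = load_lowercase(SURVEY_CSV)

# Print just the soda field of each row rather than the whole row.
for row in rows:
    print(row["soda"])
```

Because every value is lowercased up front, later tests can compare against "blue" or "sprite" without worrying about how each person capitalized their answer.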
-
So I wanna say, I'm just gonna say Sprite,
Dr. Pepper and Coke. Lets count those
-
three. So what I'm going to do is, following
my previous strategy, I'll say count
-
one equals zero, count two equals zero,
and count three equals zero. So my
-
intention is that I'll, you know,
follow whatever the order is that I said:
-
Sprite, Dr. Pepper and Coke. So what do I
wanna say? If row dot get field soda is
-
equal to Sprite. And, right, I can
just write it in lower case letters,
-
'cause I know that it has been changed. So
I wanna say, count one is equal to count
-
one plus one. So that's, that's counting
one drink. We'll just try it. So this is
-
count Sprite. Print count one, and I'm
gonna, I'm gonna get rid of this line.
-
I'm, I'm not gonna print all the sodas as
we go. Okay, let's try that one. Okay.
-
Sprite eight, that seems to work. So I
wanna check it, because then I'm gonna do
-
some merciless copy-paste. So then I'm
gonna count Dr. Pepper. And I'm gonna
-
manipulate count two in that case. And I'm
gonna count Coke. And I'm gonna use count
-
three. So this is the case I was talking about,
where you have to be careful. Copy-paste is
-
great, but you gotta make sure you're
doing the right thing. And here I'll
-
make two copies of this line. So then this
is gonna be Dr. Pepper, count two. And
-
Coke is count three. I have to say, there
was one other cleanup I did in the data.
-
[inaudible] Okay. That looks
right. The other cleanup is how people spelled Coke:
-
sometimes they wrote Coca Cola, with a
dash, or not, or whatever, so I just
-
changed those all to Coke. So, this is a
different data set, but this is following
-
kind of my earlier example of counting
three things. So it should do well. So the
-
case I'd like to, the complexity I'd like
to work out is, like, well, you know,
-
really, if you look at the data, sometimes
people would say Coke, and sometimes they
-
would say Diet Coke. And sometimes they
would say Dr. Pepper, and sometimes they
-
would say Diet Dr. Pepper. So how could I
get count two, let's say, to include both.
-
I want to include Dr. Pepper and also Diet
Dr. Pepper. And the way to make it more
-
inclusive there is to use the or. Oops.
So I'm going to say, or. Grab this. Just
-
as we've done before. I'll say, well, so
this is Dr. Pepper, or Diet Dr.
-
Pepper. And do the same thing for this.
[inaudible] Be on this. So I'm gonna say
-
diet sprite. Oops, [inaudible]. Diet Dr.
Pepper and here we'll say or Diet Coke.
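Widening each test with "or" can be sketched like this in Python. The `sodas` list is a made-up sample of already-lowercased answers with dots removed, so the counts are illustrative and are not the class results.

```python
# Hypothetical, already-lowercased soda answers (dots removed, as in the cleanup).
sodas = ["sprite", "dr pepper", "diet coke", "coke",
         "diet dr pepper", "water", "coke"]

count1 = 0  # sprite, regular or diet
count2 = 0  # dr pepper, regular or diet
count3 = 0  # coke, regular or diet

for soda in sodas:
    # "or" makes each test more inclusive: either spelling bumps the counter.
    if soda == "sprite" or soda == "diet sprite":
        count1 = count1 + 1
    if soda == "dr pepper" or soda == "diet dr pepper":
        count2 = count2 + 1
    if soda == "coke" or soda == "diet coke":
        count3 = count3 + 1

print("sprite:", count1)
print("dr pepper:", count2)
print("coke:", count3)
```

Note that each side of the "or" is a full comparison. Writing `soda == "coke" or "diet coke"` would always be true in Python, because a non-empty string counts as true on its own.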
-
Alright, so without diet it was eight, four,
eight. Assuming this code is correct,
-
let's run it. So we see it was eight, four,
eight. So, then Dr. Pepper [inaudible]. No
-
one drinks Diet Sprite in my class,
apparently. Or likes it best. So Dr.
-
Pepper went up from four to seven.
So it about doubled. And actually, Coke
-
also about doubled. So I guess that we've,
we've learned something a little bit
-
there. That, for those drinks, diet
represents about half. So I also obviously
-
like this example, 'cause now we're sort
of combining multiple techniques. That
-
we're doing the counting, and then we're
doing things like using or, and, or
-
whatever on the if test that's controlling
the count inside the loop. Okay, so
-
these are kind of, you know,
non-trivial examples. Let me try another
-
one. Let's try, let's try another one of
the fields. Let's try the sport field. So
-
this is where there were a bunch of
different sports identified, but I'm going
-
to look at soccer and
football, which were common ones identified. So
-
I'll just use count one and count two. So
I'll say if sport is equal to soccer,
-
let's say we'll do that on count
one, and football is the
-
other one, so that one we'll do in count
two. So here we'll say soccer and
-
football. And this count three, I'm just
gonna stop doing. Okay, so we're gonna, so
-
this should just go through, and count how
many times soccer was the sport that was
-
named. And how many times football, and
then we'll see if the assumptions work
-
out. So you see, we've got soccer seven,
football twelve. So football's pretty
-
significantly ahead there. So the last thing
we want to try here is, well, we also
-
have this gender data. So what I wanna do
is, let's just take this seven, the number for
-
soccer, and let's break it down by gender.
And when I say break it
-
down, what I am doing is, like, let's have
one counter for women playing soccer and
-
one counter for men playing soccer. So
let's say count one will be women playing
-
soccer. So how do I, what do I do here.
And this is gonna be an, an and. So, I
-
say, well if it's row dot, get field
gender. And then you just
-
have to know how the data is coded. In
this case, the data is coded as female;
-
that's the word used for that. So, count one
is gonna be answers where the row is
-
also where they said [inaudible] and here
I'll say soccer and male. And so that will
-
go into count two. So here I'll say soccer
and female. And then here I'll say soccer,
-
oops, M. Okay. So, let's try that. So,
soccer without looking at gender, it was
-
seven. So now if you look at it you'll see
there's actually this huge gender gap. So I
-
don't know if maybe there's just a lot of
women's soccer team people in my class
-
this quarter, or who knows. But anyway,
yeah. So of those seven, six were
-
female, and there was one male who
identified soccer as their favorite sport
-
[inaudible]. So, this is just another
example of using, you know, combining, the
-
counting with and, and or. My previous one
was or. This one, I used and. So, that's,
-
that's sort of as complicated as I wanna
get with this table data. I think it has a
-
nice very kind of realistic feeling that
you have a, data set. And then you just,
-
the computer rips over it with a little
bit of this logic that you write. And then
-
eventually just comes up with a couple
numbers, that, you know, are gonna help
-
you analyze what's going on. So that's a
very realistic way that computers are
-
used, and an excellent format for writing
exercises.
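To make the soccer-by-gender breakdown from earlier concrete, here is a small Python sketch of counting with "and". The rows are invented for illustration. The lecture says the gender data is coded as "female", and the "male" coding is assumed here to match.

```python
# Hypothetical, already-lowercased survey rows.
rows = [
    {"gender": "female", "sport": "soccer"},
    {"gender": "male", "sport": "soccer"},
    {"gender": "female", "sport": "football"},
    {"gender": "female", "sport": "soccer"},
    {"gender": "male", "sport": "football"},
]

count1 = 0  # soccer and female
count2 = 0  # soccer and male

for row in rows:
    # "and" narrows each test: both conditions must hold for the row to count.
    if row["sport"] == "soccer" and row["gender"] == "female":
        count1 = count1 + 1
    if row["sport"] == "soccer" and row["gender"] == "male":
        count2 = count2 + 1

print("soccer and female:", count1)
print("soccer and male:", count2)
```

A quick sanity check is that the two counts should add up to the plain soccer count from before the breakdown, as long as every row uses one of the two gender codes.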