So, this is a pretty tough question. If you weren't able to
get it yourself, that's okay. What I'd encourage you to do is stop
in the middle of the explanation, once the part where you got
stuck on makes sense, and see if you can finish it yourself. So,
what we need to do is define a function, so we're going to
define a function hash-string, and it takes two inputs, the keyword and number
of buckets. And we're going to keep track of where we are in the
circle. We need to introduce a new variable to do that, and we
should start at position zero. So, we'll initialize that variable.
We'll use h to represent the hash, and we'll initialize that
to zero. Now, we want to go through the characters in
the key word. So, we'll have a for loop that goes
through each character in keyword. And for each character, we
want to add to the hash. So, we're going to add to
the hash, the value of that character. We could do the
modulo here, so we could, at this point, use modulo buckets.
We have to be careful to have the parentheses here. If we
just have the modulo buckets here, we wouldn't get the right result.
because it would do ord c modulo buckets, but what we really
need to do is modulo buckets, the sum that we get from h
and ord c. And then at the end, we're going to return the
hash value. We could, instead of doing the modulo each time here, we
could do the modulo just once at the end. That would be
computing a big number if we have a really big string, and then,
at the end, compute our modulo buckets. Either way should work.
This way's a little better, in the sense that if our
string is very long, we would have to compute a really,
really big number, which gets to be more expensive. And we
might even run out if it's a super long keyword. So,
it's better to do the modulo here. We get the same
result either way though. Let's try this in the Python interpreter.
So, here's the code that we wrote out on the sketchpad.
We have our variable, h, which is going to keep track
of the hash value. We're going to go through all the
characters in the keyword, adding each one into the hash
value, modulo the number of buckets. So let's try the examples.
So a, with 12 buckets, hashes the bucket one as
we expect, and if we look at b. And hash just
the bucket two, also as we expect. So, one other thing
we should try, if you remember, when we had the really
bad hash string function. One of the many problems it had
was it didn't work on all strings, in particular, did not
work on empty strings. Do you think our hash string function
here will work on the empty string? Try to guess what
the result should be before I run it. And then I
will run it. And you see the result is zero. So, no
error. And it makes sense that the result is zero. We
start with h is zero. When there are no characters in the
string, we don't go through this loop at all. So, h
is still zero when we return. And let's also try the longer
example. When we hash the string udacity with 12 buckets, we get
bucket 11. We should be able to increase the number of buckets.
So, let's increase the number of buckets. Let's suppose we had 1,000 buckets,
and we get bucket 755. This isn't enough to convince us that our hash string
function is distributing all strings well between
all buckets. But at least we're getting
a, a fairly large number that says we might be using all the buckets.