 ## ← Better Hash Functions Solution - Intro to Computer Science

• 2 Followers
• 57 Lines

### Get Embed Code x Embed video Use the following code to embed this video. See our usage guide for more details on embedding. Paste this in your document somewhere (closest to the closing body tag is preferable): ```<script type="text/javascript" src='https://amara.org/embedder-iframe'></script> ``` Paste this inside your HTML body, where you want to include the widget: ```<div class="amara-embed" data-url="http://www.youtube.com/watch?v=9vGXclxo8Kc" data-team="udacity"></div> ``` 2 Languages

• English [en] original
• Japanese [ja]

Showing Revision 5 created 05/24/2016 by Udacity Robot.

1. So, this is a pretty tough question. If you weren't able to
2. get it yourself, that's okay. What I'd encourage you to do is stop
3. in the middle of the explanation, once the part where you got
4. stuck on makes sense, and see if you can finish it yourself. So,
5. what we need to do is define a function, so we're going to
6. define a function hash-string, and it takes two inputs, the keyword and number
7. of buckets. And we're going to keep track of where we are in the
8. circle. We need to introduce a new variable to do that, and we
9. should start at position zero. So, we'll initialize that variable.
10. We'll use h to represent the hash, and we'll initialize that
11. to zero. Now, we want to go through the characters in
12. the key word. So, we'll have a for loop that goes
13. through each character in keyword. And for each character, we
14. want to add to the hash. So, we're going to add to
15. the hash, the value of that character. We could do the
16. modulo here, so we could, at this point, use modulo buckets.
17. We have to be careful to have the parentheses here. If we
18. just have the modulo buckets here, we wouldn't get the right result.
19. because it would do ord c modulo buckets, but what we really
20. need to do is modulo buckets, the sum that we get from h
21. and ord c. And then at the end, we're going to return the
22. hash value. We could, instead of doing the modulo each time here, we
23. could do the modulo just once at the end. That would be
24. computing a big number if we have a really big string, and then,
25. at the end, compute our modulo buckets. Either way should work.
26. This way's a little better, in the sense that if our
27. string is very long, we would have to compute a really,
28. really big number, which gets to be more expensive. And we
29. might even run out if it's a super long keyword. So,
30. it's better to do the modulo here. We get the same
31. result either way though. Let's try this in the Python interpreter.
32. So, here's the code that we wrote out on the sketchpad.
33. We have our variable, h, which is going to keep track
34. of the hash value. We're going to go through all the
35. characters in the keyword, adding each one into the hash
36. value, modulo the number of buckets. So let's try the examples.
37. So a, with 12 buckets, hashes the bucket one as
38. we expect, and if we look at b. And hash just
39. the bucket two, also as we expect. So, one other thing
40. we should try, if you remember, when we had the really
41. bad hash string function. One of the many problems it had
42. was it didn't work on all strings, in particular, did not
43. work on empty strings. Do you think our hash string function
44. here will work on the empty string? Try to guess what
45. the result should be before I run it. And then I
46. will run it. And you see the result is zero. So, no
47. error. And it makes sense that the result is zero. We
48. start with h is zero. When there are no characters in the
49. string, we don't go through this loop at all. So, h
50. is still zero when we return. And let's also try the longer
51. example. When we hash the string udacity with 12 buckets, we get
52. bucket 11. We should be able to increase the number of buckets.
53. So, let's increase the number of buckets. Let's suppose we had 1,000 buckets,
54. and we get bucket 755. This isn't enough to convince us that our hash string
55. function is distributing all strings well between
56. all buckets. But at least we're getting
57. a, a fairly large number that says we might be using all the buckets.