
Title:
Keywords and Buckets Solution - Intro to Computer Science

Description:

So there are two correct answers: the third one and the fifth one. And this is why a hash table is such a great advance over the linear index: we can double the number of keywords and double the number of buckets, and the lookup time stays the same. With the linear index, if we double the number of keywords, then for each lookup we need to go through the loop once for each keyword.
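That loop over the linear index can be sketched in Python, assuming the index is a list of [keyword, value] pairs (the function name here is illustrative, not the course's exact code):

```python
def linear_lookup(index, keyword):
    # index is a list of [keyword, value] pairs; in the worst case
    # we examine every entry, so lookup time grows in proportion to
    # the number of keywords in the index
    for entry in index:
        if entry[0] == keyword:
            return entry[1]
    return None
```

Doubling the number of entries in the index doubles the worst-case number of loop iterations.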

If the keyword was near the end, or was one that wasn't in the table, the time to look up the keyword would double as we double the number of keywords. With a hash table, if we also double the number of buckets when we double the number of keywords, then the number of keywords in each bucket stays the same. We're dividing the keywords evenly between the buckets, so the number of keywords per bucket is the number of keywords divided by the number of buckets.

If we double both, that number stays approximately the same. The lookup time only depends on the number of keywords per bucket. The time to find the bucket is very fast, right? We just need to run the hash function and find that element of the list, and neither of those depends on the size of the list. And then we have to look through the bucket: we have to look through each element in the bucket one at a time.
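A minimal sketch of that lookup, assuming the table is represented as a list of buckets where each bucket is a list of [keyword, value] pairs (the hash function and names are illustrative):

```python
def hash_string(keyword, nbuckets):
    # Simple hash: sum the character codes, modulo the bucket count
    total = 0
    for c in keyword:
        total += ord(c)
    return total % nbuckets

def hashtable_add(table, keyword, value):
    # Append the new entry to the bucket its keyword hashes to
    bucket = table[hash_string(keyword, len(table))]
    bucket.append([keyword, value])

def hashtable_lookup(table, keyword):
    # Finding the bucket is constant time: run the hash function,
    # then index into the list of buckets
    bucket = table[hash_string(keyword, len(table))]
    # Scanning the bucket takes time proportional to the bucket size
    for entry in bucket:
        if entry[0] == keyword:
            return entry[1]
    return None
```

Only the final scan depends on how many keywords landed in the bucket, which is why the expected lookup time tracks the keywords-per-bucket ratio.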

So if we keep the number of keywords per bucket the same, the lookup time stays essentially the same. So that's the great property that hash tables have: if we double the number of buckets as we double the number of keywords, the expected lookup time doesn't change.

For the other possibilities: if we double the number of keywords and keep the same number of buckets, that's going to get slower, because the number of keywords per bucket will approximately double. So it's going to take about twice as long for each lookup. If we keep the same number of keywords but double the number of buckets, it's actually going to get faster. We'll have the same number of keywords and double the number of buckets, so this value will be approximately half of what it was before, and the expected lookup time will be about half of what it was before we doubled the number of buckets. If we halve the number of keywords, keeping the same number of buckets, that has essentially the same effect: the average number of keywords per bucket will be half of what it was before, so the expected lookup time will be about half of what it was. And finally, if we halve both, that's going to keep the ratio the same, so the expected lookup time will be about the same.

So that's why these two are the correct answers: those are the ones expected to leave the lookup time essentially unchanged.
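The case analysis above can be checked with simple arithmetic on the expected bucket size (the counts of 1000 keywords and 100 buckets are made-up numbers for illustration):

```python
def expected_bucket_size(keywords, buckets):
    # Expected number of keywords per bucket, assuming an even split
    return keywords / buckets

print(expected_bucket_size(1000, 100))  # starting point: 10.0
print(expected_bucket_size(2000, 100))  # double keywords only: 20.0, slower
print(expected_bucket_size(1000, 200))  # double buckets only: 5.0, faster
print(expected_bucket_size(2000, 200))  # double both: 10.0, unchanged
print(expected_bucket_size(500, 100))   # halve keywords only: 5.0, faster
print(expected_bucket_size(500, 50))    # halve both: 10.0, unchanged
```

Only the two cases that scale keywords and buckets together leave the ratio, and therefore the expected lookup time, unchanged.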