2

I've been learning about using a hashtable to efficiently check for items in a list without looping through the whole thing, but there's one thing that I don't get:

Why hashed keys?

It seems like:

var wordList = { 'aa' : ['aa'], 'aah' : ['aah'], 'ahhed' : ['aahed'] }; 

Would work just as well as:

var wordList = { '/* hashed value of aa*/' : ['aa'], '/* hashed value of aah*/' : ['aah'], '/* hashed value of aahed*/' : ['aahed'] }; 

What's the performance difference between looking up a hashed key and a simple name key?

5
  • 2
    Your first version is the way a hash table looks to the programmer. The hashes are dealt with under the hood; you don't manipulate the hash values directly.CommentedJul 22, 2014 at 21:29
  • @RobertHarvey right, I'm just using the structure in the open to show my meaning. In my case, where the table has no collisions (a very long list of unique words), is there any benefit to hashing the keys?CommentedJul 22, 2014 at 21:33
  • 1
    @RobertHarvey explains it well. When you convert the string into a number, the number is just an index into the array. Then you don't have to search at all, or do any string comparisons at all. You just go straight to that index and get your answer. BOOYA!
    – sea-rob
    CommentedJul 22, 2014 at 21:55
  • uhm, you do have to search from the index returned from the hashing function, and you have to do string comparisons until you get a match. the number of entries into the hash table is far less than the number of possible strings. the hashing function is not injective (one-to-one). two different strings might hash to the same starting index. when the hash table is first filled, the string that is first placed gets that starting index and strings with the same hash index (that's a collision) that are later placed get bumped over to higher (and, hopefully nearby) addresses.CommentedAug 4, 2015 at 20:43

2 Answers 2

7

Because you still need a way to search the associative array. The hash table is your search mechanism; it gives you O(1) performance for any given key search, by computing an index into the bucket containing your item.

What is your underlying search mechanism, if it's not a hash table? A Binary Search Tree? That's good for very large tables, but it's not O(1); it's O(log n). If you don't have a search mechanism at all (i.e. you're searching the entire array from top to bottom to find your key), your performance is O(n).

If your keys are already ordered, you can use a Binary Search without maintaining a tree; that is also O(log n) performance.

14
  • Ok, I'm going to add this to my question: What's the performance difference between looking up a hashed key and a simple name key?CommentedJul 22, 2014 at 21:38
  • What is your underlying search mechanism for the simple name key? A hashed key is still a "simple name key," as you call it; it just has a hash value associated with it.CommentedJul 22, 2014 at 21:38
  • I'm not sure. My goal is to learn to reference a long list of words efficiently. I guess by using indexOf(), I can take advantage of the keyed pairs, without needing to loop. That's the performance increase, as I understand it. I just don't understand the difference between a hashed key and a simple name key for my situation, where all names would be unique.CommentedJul 22, 2014 at 21:41
  • 3
    Apparently you're still not getting that you need some sort of search mechanism to make it fast. Otherwise, you have to loop through every key in your set to find the one you're looking for. Lists that retrieve values based on a key, as in list[wordName] always have some such search mechanism operating under the hood.CommentedJul 22, 2014 at 22:37
  • 1
    @gnat: As you've already adequately demonstrated, that information is easily obtained.CommentedAug 4, 2015 at 6:29
0

Well, if you need to find something in your associative array, you need to iterate through each of the keys, and find equality. This means that in the worst case, you'll need to iterate through your whole collection to find the good key.

Accessing an array with an index (myArray[0] for example) is instant; it doesn't require any searching, because the runtime of the languages knows exactly where to find this value.

When using a hashtable, you compute the hash code of a given key to find the associated value. The hashcode is an index in the underlying array of the hashtable. This means that finding a key in a hashtable is as fast as accessing an array by index.

An hashtable's code is a bit more complex, but this is the general idea.

2

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.