Why a hashtable? Why not just a non-hashed associative array?

Question

I've been learning about using a hashtable to efficiently check for items in a list without looping through the whole thing, but there's one thing that I don't get:

Why hashed keys?

It seems like:

var wordList = { 'aa' : ['aa'], 'aah' : ['aah'], 'ahhed' : ['aahed'] };

Would work just as well as:

var wordList = { '/* hashed value of aa*/' : ['aa'], '/* hashed value of aah*/' : ['aah'], '/* hashed value of aahed*/' : ['aahed'] };

What's the performance difference between looking up a hashed key and a simple name key?

Your first version is the way a hash table looks to the programmer. The hashes are dealt with under the hood; you don't manipulate the hash values directly. — Robert Harvey, CommentedJul 22, 2014 at 21:29
@RobertHarvey right, I'm just using the structure in the open to show my meaning. In my case, where the table has no collisions (a very long list of unique words), is there any benefit to hashing the keys? — CuriousWebDeveloper, CommentedJul 22, 2014 at 21:33
@RobertHarvey explains it well. When you convert the string into a number, the number is just an index into the array. Then you don't have to search at all, or do any string comparisons at all. You just go straight to that index and get your answer. BOOYA! — sea-rob, CommentedJul 22, 2014 at 21:55
uhm, you do have to search from the index returned from the hashing function, and you have to do string comparisons until you get a match. the number of entries into the hash table is far less than the number of possible strings. the hashing function is not injective (one-to-one). two different strings might hash to the same starting index. when the hash table is first filled, the string that is first placed gets that starting index and strings with the same hash index (that's a collision) that are later placed get bumped over to higher (and, hopefully nearby) addresses. — robert bristow-johnson, CommentedAug 4, 2015 at 20:43
see also: I'm trying to understand hash tables - can someone explain it to me - clearly? — gnat, CommentedAug 20, 2015 at 11:26

Robert Harvey · Accepted Answer · 2015-08-04 14:36:51Z

7

Because you still need a way to search the associative array. The hash table is your search mechanism; it gives you O(1) performance for any given key search, by computing an index into the bucket containing your item.

What is your underlying search mechanism, if it's not a hash table? A Binary Search Tree? That's good for very large tables, but it's not O(1); it's O(log n). If you don't have a search mechanism at all (i.e. you're searching the entire array from top to bottom to find your key), your performance is O(n).

If your keys are already ordered, you can use a Binary Search without maintaining a tree; that is also O(log n) performance.

edited Aug 4, 2015 at 14:36

answered Jul 22, 2014 at 21:33

Robert Harvey

200k55 gold badges469 silver badges680 bronze badges

Ok, I'm going to add this to my question: What's the performance difference between looking up a hashed key and a simple name key?
– CuriousWebDeveloper
CommentedJul 22, 2014 at 21:38
What is your underlying search mechanism for the simple name key? A hashed key is still a "simple name key," as you call it; it just has a hash value associated with it.
– Robert Harvey
CommentedJul 22, 2014 at 21:38
I'm not sure. My goal is to learn to reference a long list of words efficiently. I guess by using indexOf(), I can take advantage of the keyed pairs, without needing to loop. That's the performance increase, as I understand it. I just don't understand the difference between a hashed key and a simple name key for my situation, where all names would be unique.
– CuriousWebDeveloper
CommentedJul 22, 2014 at 21:41
3
Apparently you're still not getting that you need some sort of search mechanism to make it fast. Otherwise, you have to loop through every key in your set to find the one you're looking for. Lists that retrieve values based on a key, as in list[wordName] always have some such search mechanism operating under the hood.
– Robert Harvey
CommentedJul 22, 2014 at 22:37
1
@gnat: As you've already adequately demonstrated, that information is easily obtained.
– Robert Harvey
CommentedAug 4, 2015 at 6:29

| Show 9 more comments

Jazzwave06 · Accepted Answer · 2014-07-22 21:52:31Z

Well, if you need to find something in your associative array, you need to iterate through each of the keys, and find equality. This means that in the worst case, you'll need to iterate through your whole collection to find the good key.

Accessing an array with an index (myArray[0] for example) is instant; it doesn't require any searching, because the runtime of the languages knows exactly where to find this value.

When using a hashtable, you compute the hash code of a given key to find the associated value. The hashcode is an index in the underlying array of the hashtable. This means that finding a key in a hashtable is as fast as accessing an array by index.

An hashtable's code is a bit more complex, but this is the general idea.

Oh and btw, the javascript runtime treats all objects as hashtables. It can be useful to know. code var o = {}; o["i"] = 0; — Jazzwave06, CommentedJul 22, 2014 at 23:11

Stack Exchange Network

Why a hashtable? Why not just a non-hashed associative array?

2 Answers 2

Linked

Hot Network Questions

Why a hashtable? Why not just a non-hashed associative array?

2 Answers 2

Linked

Related

Hot Network Questions