Using HashTable without overriding hashcode()

Question

Today I was asked this question during an interview:

What will happen if we do not override the hashcode method for our class, then add it to HashTable and then try to get objects?

What could go wrong?

Read this book if you want to get a job programming java.
– Martin Schröder
CommentedApr 17, 2012 at 6:48 — Martin Schröder, CommentedApr 17, 2012 at 6:48
@MartinSchröder I have a job already)
– VextoR
CommentedApr 25, 2012 at 7:22 — VextoR, CommentedApr 25, 2012 at 7:22

SRKX · Accepted Answer · 2012-04-16 08:24:30Z

The idea with a HashTable when you try retrieving an object is that the data structure computes the hash code of the object using the GetHashCode() method and then goes through a list using the Equals() method.

With default GetHashCode() implementation, two perfectly similar objects might end up yielding different hash codes which means that if you do not use the exact same instance, you will never find the your object in the HashTable.

In general, you want to make sure of two things when implementing hash codes:

If A.Equals(B) then A.GetHashCode()==B.GetHashCode()
Try to get a distribution of hash codes properly spread to get the maximum efficiency from the hash table (if too few hash codes are possible, you'll end up searching a list).

I believe @SRKX was using C# in his example, where hashCode is the Java equivalent. The answer is valid in both languages. — DevSolo, CommentedApr 16, 2012 at 16:02
@SpencerK: yeah as DevSolo said, I was trying to explain the concept not to get into language-specific details which is, I hope, what the interviewer is looking for. — SRKX, CommentedApr 16, 2012 at 17:06

Joonas Pulakka · Accepted Answer · 2012-04-16 18:18:28Z

What's gonna happen if we do not override hashcode method for our class, then add it to HashTable and then try to get objects?

It depends on what "adding to HashTable" means. Java's Hashtable doesn't have any add method. The interviewer probably meant put method, which takes in a key and a value. The value can be anything (could be even null in HashMap, which is the present-day version of Hashtable). Nothing special happens regardless of whether or not you override the value object's hashcode, or any other method.

The interviewer probably meant that the key object's hashcode wouldn't be overridden. Only then the object identity issues, as pointed out in other answers, come into play. Even then, you don't necessarily have to override the key's hashcode. For example, if you use Strings as keys, they already have an appropriate hashcode implementation in them. Besides, they can't be subclassed. Furthermore, if you do override hashcode but don't override equals, you may get some amazing behavior...

If the question really was exactly what you wrote, I would have teased the interviewer with these questions. A good programmer doesn't assume that the interviewer probably meant this or that. He asks instead.

Hashtable does not support null.
– Spencer Kormos
CommentedApr 16, 2012 at 15:57 — Spencer Kormos, CommentedApr 16, 2012 at 15:57

JeremyP · Accepted Answer · 2012-04-16 15:22:47Z

The answer is "nothing bad unless you have overridden equals()".

The general point is that if two objects compare equal i.e. if

 a.equals(b)

then they must have the same hashcode i.e.

 a.hashCode() == b.hashCode()

also if two objects have different hash codes, they must not compare equal.

This is especially relevant when you put objects in a hash table. This is because a hash table is an array of lists (called buckets usually). The hash bucket is indexed using the hash code, typically you use hashCode % arraySize.

So when you put an object in the hash table, you take the hash code of the key and use it to determine the bucket. You then put a key-value pair in the bucket. When you want to retrieve an object from the hash table, you take the hash code of the key to find the bucket and test the key of all the key-value pairs in the bucket with .equals() to determine which object is the one you want.

So if you have two key objects which compare equal but have different hash codes and you use one as a key in a hash table, you won't be able to search for it using the other key object because you'll be looking in the wrong bucket.

The implementation of equals() in Object only returns true if the two objects are actually the same object and hashCode() returns the object reference. However, if you override equals() (e.g. String does so that different strings containing the same character sequence compare equal) then you must override hashCode()

Good answer, but I think the first sentence is misleading. If you are using a HashMap or Hashtable, you should override hashCode() and equals(). The first sentence is only true if you really intend to use object identity as equality e.g. you really want to compare objects with == instead of equals(). — scarfridge, CommentedApr 16, 2012 at 19:14
@scarfridge: I kind of agree that just using object identity to identify the keys is pretty much useless. I was thinking nothing bad in terms of (for example) having the value of hashCode() change while the object is being used as a key in a map. — JeremyP, CommentedApr 17, 2012 at 8:59

ratchet freak · Accepted Answer · 2012-04-16 08:29:01Z

you won't find your object if you get with a different object that is equal to the object you put

or to give an example:

MyClass obj1 = new MyClass(1); MyClass obj2 = new MyClass(1); assert obj1.equals(obj2); assert obj1.hashcode()!=obj2.hashcode(); //this is wat happens if you don't inclde hashcode table.put(obj1,2); table.get(obj2) // will likely return null but that is a gamble table.get(obj1) // but this will return the object passed in

the reason for this is that HashTable (and HashMap) will use the hashcode to limit the space it has to search through to find the object and that relies on the assumption that if obj1.equals(obj2) then obj1.hashcode == obj2.hashcode()

rich_markle · Accepted Answer · 2012-04-16 21:21:46Z

I would only add, that all of these concepts have to with identity and comparison.

There is a contract with hash code:

1.) Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

2.) If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

3.) It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

The upshot is that if the hash codes are the same the entries in the table overwrite each other and that could be surprising to some...

Hope this helps.

user281377 · Accepted Answer · 2012-04-16 08:18:31Z

The answer is: similar objects (all fields having equal values) would not create the same hash code, so you would need exactly the same (identical) object that was used for put to retrieve from hashtable, which is not feasible in most cases.

Joachim Sauer · Accepted Answer · 2012-04-16 08:20:23Z

Assuming your class extends only Object, then the hashCode() implementation of your class will depend on the object identity. I.e. that the hash code of two different instances will (almost certainly) be different, even if they hold the exact same value.

This means that you will most likely not find the object again in the Map (you might find it by chance, however).

java_mouse · Accepted Answer · 2012-04-16 15:42:57Z

If the hashcode method is not overridden, the answer to this question really depends on if the same key object which was used to "put" is going to used for "get" as well:

a) If the same key object is used - "get" will find the value. Because it will locate the bucket using the same "key" and hence will find the value object.

b) If some another "equivelent" key object is used - Since possibly the hashcode is going to be different due to default implementation of the hashcode method in Object and hence it might get into a different bucket and might not be able to get the value object.

Stack Exchange Network

Using HashTable without overriding hashcode()

8 Answers 8

Hot Network Questions

Using HashTable without overriding hashcode()

8 Answers 8

Related

Hot Network Questions