Thursday, December 6, 2007

Java, HashSet, Equals and hashCode

Today, I was working on the code that attempts to add 2D points to java.util.HashSet and test if HashSet contains them.
Point implemented equals method correctly, and accoding to java.util.Set documentation contains should: "Returns true if this set contains the specified element. More formally, returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e))."

Well, documentation is wrong - and for a good reason - to implement specification as documented one would need to iterate all objects (in the set) for check for equality. What java.util.Set does is to use hashCode() value to index objects and compare them.

Therefore, if your class does not implement hashCode correctly, following test will fail:

ArrayList points = new ArrayList() ;
points.add( new IntPoint2D( 2, 2 ) ) ;
assertTrue(points.contains(new IntPoint2D(2,2)));


A dilemma is what to return as hashCode? I opted, knowing that my data will be local to implement hashCode as:

public int hashCode()
{
return (x & 0xffff) | ((y & 0xffff)<<16) ;
}


That will be good enough solution for the problem I'm facing.

There is corollary of this implementation:
1. if one assumes that hashCode() is natively implemented to return a pointer to underlying objects, and one tests for identity of the objects contained in the set (often the case) than Set works great.
2. if one needs to overload hashCode() and stuff more complex object in it, it needs to be aware of remote or not-so-remote chance that hashCode will return same value for two different objects, thereby yielding incorrect code.