.net – How to implement a GetHashCode compatible Equals method, when the space is greater than 32 bits

equalsgethashcodeiequatablenet

In .NET you need that Equals(object) and GetHashCode() are compatible. But sometimes you can't:

public class GreaterThan32Bits
{
    public int X { get; set; }
    public int Y { get; set; }
}

Because the data density is greater than 32 bits, and GetHashCode returns an Int32, you will have 3 solutions (assuming a correctly implemented GetHashCode):

  1. Avoid code duplication discarded as incorrect

    public override bool Equals(object other)
    {
        if(ReferenceEquals(null, other)) return false;
        if(ReferenceEquals(this, other)) return true;
        return this.GetHashCode() == other.GetHashCode();
    }
    
  2. Implement Equals separately from GetHashCode()

    public override bool Equals(object obj)
    {
        if(ReferenceEquals(null, other)) return false;
        if(ReferenceEquals(this, other)) return true;
        var other = obj as GreaterThan32Bits;
        if(this.X == other.X) return this.Y == other.Y;
        return false;
    }
    
  3. Implement a greater precision GetHashCode64, the overrided GetHashCode (32 bits) will return (int)GetHashCode64(), and Equals will return this.GetHashCode64() == other.GetHashCode64()

Which one would you implement?

The first solution is imprecise incorrect, but cleaner. The second option seems clean, but gets very complex when the class has more properties. The third option is a compromise.

Best Answer

The requirement is as follows: if (a.Equals(b)), then a.GetHashCode() == b.GetHashCode()

Not the other way round.

You shouldn't implement Equals() in term of GetHashCode(), ever. It's perfectly valid for GetHashCode to have collisions, but Equals() must not return false positives.

I'd suggest this implementation:

public override int GetHashCode()
{
    return unchecked( this.X * p1 + this.Y * p2 );
}

public override bool Equals(object obj) 
{
    var other = obj as GreaterThan32Bits;
    // you must do the null test after the cast, otherwise the
    // function crashes when obj is not a GreaterThan32Bits instance
    if (ReferenceEquals(other, null)) return false;
    return this.X == other.X && this.Y == other.Y;
}

Where p1 and p2 are large primes. This usually results in a good hash function (few hash collisions -> Dictionary becomes efficient). If the X and Y values are independent (for example, you don't expect many points on a straight line like X=Y), then even something simple like X ^ Y can be a good hash function.

But then again, you only need a good hash function if you actually use the class as keys in a dictionary (or other hash table).

In fact, it's perfectly fine to always return 0 in GetHashCode() and only implement Equals(). Dictionary will still work correctly with such objects as keys, it'll just be inefficient.

Related Topic