Handling collections in GetHashCode implementation

I'm implementing GetHashCode() based on the HashCode struct in this answer. Since my Equals method compares collections using Enumerable.SequenceEqual(), I need to include the collections' elements in my GetHashCode() implementation.
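
To make the requirement concrete, here is a minimal sketch (the class name SequenceEqualityDemo is only illustrative) of why the list's own GetHashCode() cannot simply be folded in: List<T> does not override GetHashCode(), so two lists that SequenceEqual() treats as equal will normally report different hash codes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceEqualityDemo
{
    public static void Main()
    {
        var first = new List<string> { "Foo", "Bar", "Baz" };
        var second = new List<string> { "Foo", "Bar", "Baz" };

        // Element-wise comparison: same items in the same order.
        Console.WriteLine(first.SequenceEqual(second)); // True

        // List<T> inherits object.GetHashCode(), which is per-instance,
        // so this comparison will normally print False despite equal contents.
        Console.WriteLine(first.GetHashCode() == second.GetHashCode());
    }
}
```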

As a starting point, I'm using Jon Skeet's embedded GetHashCode() implementation to check the output of the HashCode struct implementation. This works as expected in the following test:

private class MyObjectEmbeddedGetHashCode
{
    public int x;
    public string y;
    public DateTimeOffset z;

    public List<string> collection;

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;

            hash = hash * 31 + x.GetHashCode();
            hash = hash * 31 + y.GetHashCode();
            hash = hash * 31 + z.GetHashCode();

            return hash;
        }
    }
}

private class MyObjectUsingHashCodeStruct
{
    public int x;
    public string y;
    public DateTimeOffset z;

    public List<string> collection;

    public override int GetHashCode()
    {
        return HashCode.Start
            .Hash(x)
            .Hash(y)
            .Hash(z);
    }
}

[Test]
public void GetHashCode_CollectionExcluded()
{
    DateTimeOffset now = DateTimeOffset.Now;

    MyObjectEmbeddedGetHashCode a = new MyObjectEmbeddedGetHashCode() 
    { 
        x = 1, 
        y = "Fizz",
        z = now,
        collection = new List<string>() 
        { 
            "Foo", 
            "Bar", 
            "Baz" 
        } 
    };

    MyObjectUsingHashCodeStruct b = new MyObjectUsingHashCodeStruct()
    {
        x = 1,
        y = "Fizz",
        z = now,
        collection = new List<string>() 
        { 
            "Foo", 
            "Bar", 
            "Baz" 
        }
    };

    Console.WriteLine("MyObjectEmbeddedGetHashCode::GetHashCode(): {0}", a.GetHashCode());
    Console.WriteLine("MyObjectUsingHashCodeStruct::GetHashCode(): {0}", b.GetHashCode());

    Assert.AreEqual(a.GetHashCode(), b.GetHashCode());
}

The next step is to consider the collection in the GetHashCode() calculation. This requires a small addition to the GetHashCode() implementation in MyObjectEmbeddedGetHashCode.

public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;

        hash = hash * 31 + x.GetHashCode();
        hash = hash * 31 + y.GetHashCode();
        hash = hash * 31 + z.GetHashCode();

        int collectionHash = 17;

        foreach (var item in collection)
        {
            collectionHash = collectionHash * 31 + item.GetHashCode();
        }

        hash = hash * 31 + collectionHash;

        return hash;
    }
}

However, this is harder to do in the HashCode struct. In this example, when a collection of type List<string> is passed into the Hash<T> method, T is inferred as List<string>, so casting obj to ICollection<T> or IEnumerable<T> (that is, ICollection<List<string>> or IEnumerable<List<string>>) doesn't work.
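
The mismatch can be reproduced in isolation. The sketch below uses two hypothetical helper methods (they are not part of the HashCode struct) to show which type checks succeed when T is inferred from the argument:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CastDemo
{
    // Hypothetical helper: with T inferred as List<string>, this asks whether
    // obj is an IEnumerable<List<string>>, which a List<string> is not.
    public static bool IsEnumerableOfT<T>(T obj)
    {
        return obj is IEnumerable<T>;
    }

    // Hypothetical helper: the non-generic interface matches any enumerable,
    // including string.
    public static bool IsNonGenericEnumerable<T>(T obj)
    {
        return obj is IEnumerable;
    }

    public static void Main()
    {
        var list = new List<string> { "Foo", "Bar", "Baz" };

        Console.WriteLine(IsEnumerableOfT(list));          // False
        Console.WriteLine(IsNonGenericEnumerable(list));   // True
        Console.WriteLine(IsNonGenericEnumerable("Fizz")); // True
    }
}
```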

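I can successfully cast to the non-generic IEnumerable, but going through it can cause boxing (of value-type elements during enumeration, or of obj itself when T is a struct), and I found I also have to exclude types like string that implement IEnumerable.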

Is there a way to reliably cast obj to ICollection<T> or IEnumerable<T> in this scenario?

public struct HashCode
{
    private readonly int hashCode;

    public HashCode(int hashCode)
    {
        this.hashCode = hashCode;
    }

    public static HashCode Start
    {
        get { return new HashCode(17); }
    }

    public static implicit operator int(HashCode hashCode)
    {
        return hashCode.GetHashCode();
    }

    public HashCode Hash<T>(T obj)
    {
        // I am able to detect if obj implements one of the lower level
        // collection interfaces. However, I am not able to cast obj to
        // one of them since T in this case is defined as List<string>,
        // so using as to cast obj to ICollection<T> or IEnumerable<T>
        // doesn't work.
        var isGenericICollection = obj != null && obj.GetType().GetInterfaces().Any(
            x => x.IsGenericType && 
            x.GetGenericTypeDefinition() == typeof(ICollection<>));

        var c = EqualityComparer<T>.Default;

        // This works but using IEnumerable causes boxing.
        // var h = c.Equals(obj, default(T)) ? 0 : ( !(obj is string) && (obj is IEnumerable) ? GetCollectionHashCode(obj as IEnumerable) : obj.GetHashCode());

        var h = c.Equals(obj, default(T)) ? 0 : obj.GetHashCode();
        unchecked { h += this.hashCode * 31; }
        return new HashCode(h);
    }

    public override int GetHashCode()
    {
        return this.hashCode;
    }
}

brdmllr asked Sep 03 '25 17:09


1 Answer

You can address the collection issue in a couple of ways:

  1. Use a non-generic interface, e.g. ICollection or IEnumerable.
  2. Add an overload for the Hash() method, e.g. Hash<T>(IEnumerable<T> list) { ... }
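
Option 2 could look roughly like the sketch below. It is only an illustration under a few assumptions: the method name HashSequence is hypothetical, a null sequence is folded in as 0, and the elements are combined into a nested hash starting from 17, mirroring the embedded implementation in the question. A distinct name is used rather than a true overload because, as far as I can tell, a call such as .Hash(collection) with collection declared as List<string> would still bind to Hash<T>(T obj): the exact match on T = List<string> is preferred over the conversion to IEnumerable<string>.

```csharp
using System.Collections.Generic;

public struct HashCode
{
    private readonly int hashCode;

    public HashCode(int hashCode) { this.hashCode = hashCode; }

    public static HashCode Start { get { return new HashCode(17); } }

    public static implicit operator int(HashCode hashCode) { return hashCode.hashCode; }

    // Single-value step, as in the question's struct.
    public HashCode Hash<T>(T obj)
    {
        var h = EqualityComparer<T>.Default.Equals(obj, default(T)) ? 0 : obj.GetHashCode();
        unchecked { h += this.hashCode * 31; }
        return new HashCode(h);
    }

    // Hypothetical addition: folds each element, in order, into a nested hash,
    // then combines that nested hash with the running value.
    public HashCode HashSequence<T>(IEnumerable<T> items)
    {
        if (items == null) return Hash(0); // assumption: null contributes 0

        HashCode inner = Start;
        foreach (T item in items)
        {
            inner = inner.Hash(item);
        }
        return Hash((int)inner);
    }

    public override int GetHashCode() { return this.hashCode; }
}

public static class Demo
{
    public static void Main()
    {
        var a = new List<string> { "Foo", "Bar", "Baz" };
        var b = new List<string> { "Foo", "Bar", "Baz" };

        int hashA = HashCode.Start.Hash(1).HashSequence(a);
        int hashB = HashCode.Start.Hash(1).HashSequence(b);

        System.Console.WriteLine(hashA == hashB); // True
    }
}
```

With something like this in place, the GetHashCode() from the question could end with .Hash(z).HashSequence(collection). Because the elements are folded in order, the result is order-sensitive, which matches the semantics of Enumerable.SequenceEqual().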

That said, IMHO it would be better to just leave the struct HashCode alone and put the collection-specific code in your actual GetHashCode() method. E.g.:

public override int GetHashCode()
{
    HashCode hash = HashCode.Start
        .Hash(x)
        .Hash(y)
        .Hash(z);

    foreach (var item in collection)
    {
        hash = hash.Hash(item);
    }

    return hash;
}

If you do want a full-featured version of the struct HashCode type, it looks to me as though that same page you referenced has one: https://stackoverflow.com/a/2575444/3538012

The member names are different, but it's basically the same idea as the struct HashCode type, with added overloads for other complex types (as in my suggestion #2 above). You could use that, or just apply the techniques there to your implementation of struct HashCode, preserving its naming conventions.

Peter Duniho answered Sep 07 '25 07:09