I need to use a hash function which belongs to a family of k-wise independent hash functions. Any pointers on any library or toolkit in C, C++ or python which can generate a set of k-wise independent hash functions from which I can pick a function.
Background: I am trying to implement this algorithm here: http://researcher.watson.ibm.com/researcher/files/us-dpwoodru/knw10b.pdf for the Distinct Elements problem.
I have looked at this thread: Generating k pairwise independent hash functions which mentions using Murmur hash to generate a pairwise independent hash function. I was wondering if there is anything similar for k-wise independent hash functions. If there is none available, would it be possible for me to construct such a set of k-wise independent hash functions.
Thanks in advance.
The simplest k-wise independent hash function (mapping positive integer x < p to one of m buckets) is just
where p is some big random prime (261-1 will work)
and ai are some random positive integers less than p, a0 > 0.
2-wise independent hash:
h(x) = (ax + b) % p % m
again, p is prime, a > 0, a,b < p (i.e. a can't be zero but b can when that is a random choice)
These formulas define families of hash functions. They work (in theory) if you select a hash function randomly from corresponding family (i.e. if you generate random a's and b) each time you run your algorithm.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With