Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Binning a set into subsets deterministically

I have an unordered set of n unique, positive integers. I want to partition it into ceil(n / k) unordered sets of up to k numbers (k << n) in a deterministic way (I want to get a sequence of the same ceil(n / k) output sets without depending on a fixed order of the input).

Is there a way to do so that has superior algorithmic complexity to sorting the set and partitioning the resulting sequence?

like image 819
Mike Graham Avatar asked Jan 23 '26 08:01

Mike Graham


1 Answers

Here is an approach whose average performance is O(n), which beats O(n log(n)).

Pick m with r < m < n, where r = ceil(n / k) is the number of buckets. For each number, hash it and stick it into one of m bags. (I'm deliberately using bags vs buckets, with bags smaller.) This is a O(n) task. Now we can construct our r buckets by running through the m bags, and when fill a bucket up up, we quickselect a single one of the m bags. Running through those bags is O(m), putting them into the r is O(n), and splitting the r-1 bags on a boundary with quickselect is O(r * (n/m)) which is definitely contained within O(n).

like image 55
btilly Avatar answered Jan 25 '26 06:01

btilly



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!