Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Weighted Random Selection from a Sorted List

I have a problem in which I have a large list of items sorted by "weights". I need to be able to randomly select items from this list, but the items closer to the start (greater weights) must have a greater chance of being selected based on an "elitism" factor.

I realize that similar questions have been asked before, but the catch here is that this list will be changing over time. New values will be sorted into the list as the last item is deleted (to keep a constant sized pool of "optimized" values).

First off, what would be the most efficient way of doing the selection? Selection must happen in real time from a list anywhere from 50 to 1000 item long.

Second, what would be the best data structure to use here? I'm using C#.

I just thought of a possible solution, but I'd like some feedback on the idea. What if I were to generate a random float value within a certain range, and then do something along the lines of squaring it? Small values would return small values, and large values would return MUCH larger values. From what I can tell, mapping this result to the length of the list should give the desired effect. Does this sound right?

like image 824
Camander Avatar asked Dec 04 '25 10:12

Camander


2 Answers

Unfortunately, I can't offer any code right now, but some ideas:

As your list is sorted from high weighted to low weighted, you should be able to use a random number generator based on a normal distribution. If you don't have such a random number generator at hand, you can transform a uniform distribution into a normal distribution using the code found here: Random Gaussian Variables

I'm terrible at explaining, but I'l try: You can define the bias (the mean value) at 0 and the sigma (the deviation) at, let's say, 3. Then you take the absolute value from the generated number, as you may get negative numbers.

This will give you a number generator, that has a high probably around the bias number (0 in the above example), and a lower probability for numbers that deviate much from there.

As I said, I'm terrible at explaining

like image 61
Shelling Avatar answered Dec 06 '25 22:12

Shelling


This is what I would do:

private static int GetPosition(double value, int startPosition, int maxPosition, double weightFactor, double rMin)
{
    while (true)
    {
        if (startPosition == maxPosition) return maxPosition;

        var limit = (1 - rMin)*weightFactor + rMin;
        if (value < limit) return startPosition;
        startPosition = startPosition + 1;
        rMin = limit;
    }
}

static void Main()
{
    const int maxIndex = 100;
    const double weight = 0.1;

    var r = new Random();
    for (var i = 0; i < 200; i++)
        Console.Write(GetPosition(r.NextDouble(), 0, maxIndex, weight, 0) + " ");
}

A 0.1 weight factor means that the first item has a 10% chance to be chosen. All the other items have 90%.

The 2nd item has 10% of the remaining 90% = 9%

The 3rd item has 10% of the remaining 81% = 8.1%

...

As you increase the weight factor, it will be more likely that the first items are chosen over the last ones in the list. At a factor of 1, only the 1st item will be chosen.

For a weight of 0.1 and 10 items, here are the probabilities for each index:

0: 10%
1: 9%
2: 8.1%
3: 7.29%
4: 6.56%
5: 5.9%
6: 5.31%
7: 4.78%
8: 4.3%
9: 3.87%

EDIT

Of course, this would work only for many indexes (at least 10 for 0.1) otherwise it will give greater probabilities for the last index. For example if weight = 0.1 and maxIndex = 1, index 0 will have a probability of 10% but index 1 will have 90%.

like image 28
Andrei Tătar Avatar answered Dec 06 '25 22:12

Andrei Tătar



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!