Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast random selection algorithm

Given an array of true/false values, what is the most efficient algorithm to select an index with a true value at random.

A sketch simple algorithm is

a <- the array
c <- 0
for i in a:
    if a[i] is true: c++
e <- random number in (0, c-1)
j <- 0
for i in e:
    while j is false: j++
return j

Can anyone come up with a faster algorithm? Maybe there is a way to only walk through the list once even if the number of true elements is not known at first?

like image 249
momeara Avatar asked Feb 05 '26 19:02

momeara


1 Answers

Use the "pick a random element from an infinite list" algorithm.

Keep an index of your current pick, and also a count of how many true values you've seen.

When you see a true value, increment the count and then replace your pick with the current index with a probability of P=(1/count). (So you always pick the first one you find... then you might switch to the second one, with probability 1/2, then you might switch to the third one with probabilty 1/3 etc.)

This requires only one scan over the list and constant storage. (It does require you to work out a larger number of random numbers, however.) In particular, it doesn't ever require you to either buffer the list or go back to the start - so it can work on an unbounded input stream.

See this answer for a sample LINQ implementation of the simple "pick a random element" algorithm; it would just need minor tweaks.

like image 160
Jon Skeet Avatar answered Feb 07 '26 10:02

Jon Skeet



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!