I know this might be a "old" question, but I want to focus on the probability.
My first question is:
in C, rand() will give a number from 0 to RAND_MAX, does each number in this interval have the same probability to be chosen by rand()?
The second question:
if rand() lets each number from 0 to RAND_MAX to have the same (or approximately same) probability to be chosen, when I want to get a random number from 0 to N-1 (N-1 < RAND_MAX), I'll do this generally:
rand()%N
But if RAND_MAX is NOT the multiple of N, the probability of random number chosen from 0 to N-1 might not be same
For instance, suppose RAND_MAX=150 and N=100, when I do rand()%100, the number from 0 to 49 will have a higher probability to be chosen than the number from 50 to 99 because 150 is not the multiple of 100.
Is there a algorithm or function in C, which can let each random number have the same probability to be chosen?
Leaving aside the fact that rand()%N is a very bad way to get a random number in the range 0..N-1, the question you ask is quite simple. Find the largest number, M say, that satisfies both M <= RAND_MAX and M % N == 0. Then when you call rand(), reject a value if it is >= M and call rand() again until you get a value that is < M.
But this particular nuance is pointless because rand()%N will be hopelessly biased. You need to use as many of the bits returned by in rand() as you possibly can.
Assuming rand() itself is evenly distributed (not always a good assumption.), call rand() again as needed.
Based on Trying to take 1 random digit at a time
#include <stdlib.h>
int rand_Flat_Distribution_0_to_Nm1(int N) {
assert(N <= RAND_MAX);
assert(N > 0);
int rmax = RAND_MAX - (RAND_MAX % N) - 1;
int r;
while ((r = rand()) > rmax);
return r%N;
}
Example:
If RAND_MAX was 32767 and N was 100, rmax would have the value of 32699. Any random value in the range 32700 to 32767 would get tossed and a new random value would be fetched, eliminated the bias %N typically causes.
This does not compensate for deficiencies in rand(). C does not specify the quality of rand(), just that it generates values 0 to RAND_MAX and RAND_MAX is a least 32767.
For values of N greater than RAND_MAX, a different solution is needed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With