Good Enough Pseudo-Random Numbers
Particularly for test automation, one is always making use of Random functions – either to generate random inputs, or to jump from state to state in an FSM, and so on. The main problem with Random functions that I’ve observed is that they are seeded based on the current time – call it quick enough, and you’ll end up with the same results, which is undesirable. Randomness of numbers is a complex theoretical math and computer science problem, but my goal here is to see if there is an alternative way to generate pseudo-random numbers that doesn’t suffer from the initial seed problem. One easy way that I found to implement it in C# was to make use of GUIDs, which C# conveniently gives that to you. What if we take the GUID, convert it to a number, and Mod it to fall within the range that we want? Would that be pseudo-random enough? The code to create such a random function is very straightforward:
static int MyRandom(int minInclusive, int maxExclusive)
{
int len = maxExclusive - minInclusive;
if (len <= 0)
{
return -1;
}
string guid = Guid.NewGuid().ToString().ToUpper();
int sum = 0;
foreach (char c in guid)
{
if (c != '-')
{
sum = (10 * sum + ((c >= '0' && c <= '9') ? (int)(c - '0') : (int)(c - 'A' + 10))) % len;
}
}
return sum + minInclusive;
}
Slower than the Random C# function, but seemed to produce OK results. In order to test its randomness, what I did was create 25 buckets (from 0 to 24) and fill these buckets using the MyRandom(0, 25) calls, doing it 100,000,000 times. For an ideal Random function, one should see each bucket holding 4,000,000 at the end of the execution. 4,000,000 is equivalent to 4% of the data being evenly distributed across all buckets. Hence this is the magic number that we’re looking for: ~4% for each bucket. Here are the results:
0: 5.08%
1: 3.91%
2: 3.91%
3: 3.91%
4: 3.91%
5: 5.08%
6: 3.91%
7: 3.91%
8: 3.91%
9: 3.91%
10: 5.08%
11: 3.90%
12: 3.91%
13: 3.91%
14: 3.91%
15: 5.08%
16: 3.51%
17: 3.51%
18: 3.52%
19: 3.52%
20: 4.68%
21: 3.52%
22: 3.51%
23: 3.52%
24: 3.51%
For the purists, far from ideal. For the pragmatics, perfect. For me and my projects, good enough.
Thanks,
Marcelo
static int MyRandom(int minInclusive, int maxExclusive)
{
int len = maxExclusive - minInclusive;
if (len <= 0)
{
return -1;
}
string guid = Guid.NewGuid().ToString().ToUpper();
int sum = 0;
foreach (char c in guid)
{
if (c != '-')
{
sum = (10 * sum + ((c >= '0' && c <= '9') ? (int)(c - '0') : (int)(c - 'A' + 10))) % len;
}
}
return sum + minInclusive;
}
Slower than the Random C# function, but seemed to produce OK results. In order to test its randomness, what I did was create 25 buckets (from 0 to 24) and fill these buckets using the MyRandom(0, 25) calls, doing it 100,000,000 times. For an ideal Random function, one should see each bucket holding 4,000,000 at the end of the execution. 4,000,000 is equivalent to 4% of the data being evenly distributed across all buckets. Hence this is the magic number that we’re looking for: ~4% for each bucket. Here are the results:
0: 5.08%
1: 3.91%
2: 3.91%
3: 3.91%
4: 3.91%
5: 5.08%
6: 3.91%
7: 3.91%
8: 3.91%
9: 3.91%
10: 5.08%
11: 3.90%
12: 3.91%
13: 3.91%
14: 3.91%
15: 5.08%
16: 3.51%
17: 3.51%
18: 3.52%
19: 3.52%
20: 4.68%
21: 3.52%
22: 3.51%
23: 3.52%
24: 3.51%
For the purists, far from ideal. For the pragmatics, perfect. For me and my projects, good enough.
Thanks,
Marcelo
The fact that your multiplier is 10 (and your range is 25; their MCD is 5) is skewing the results towards multiples of 5; try using another multiplier (a larger prime number) and you'll get a better distribution without losing the simplicity of your solution. Depending on the language (C# and Java would work, JavaScript won't) you can also use a single random generator object (instead of recreating) and use it to generate your data.
ReplyDeleteprivate static Random rndGen = new Random();
static int MyRandom(int minInclusive, int maxExclusive) {
int len = maxExclusive - minInclusive;
return rndGen.Next(minInclusive, len);
}
Yeah, I noticed the skewing towards multiples of 5... good idea with the prime number, that should indeed do it. I modified the code, instead of 10 I used 1073676287, and voila!
ReplyDelete0: 4.00%
1: 4.00%
2: 4.00%
3: 4.00%
4: 4.00%
5: 4.00%
6: 4.00%
7: 4.00%
8: 4.00%
9: 4.00%
10: 4.00%
11: 4.00%
12: 4.00%
13: 4.00%
14: 4.00%
15: 4.00%
16: 4.00%
17: 4.00%
18: 4.00%
19: 4.00%
20: 4.00%
21: 4.00%
22: 4.00%
23: 4.00%
24: 4.00%