
Pseudo random number generator gives same first output but then behaves as expected

An answer to this question on Stack Overflow.

Question

Using the <random> header with a seed of time(NULL), the uniform distribution always gives the same first output, even across recompiles, but after the first output it behaves as you would expect a pseudorandom number generator to behave.

Is this by construction, or am I using it incorrectly?

MWE:

#include <ctime>
#include <iostream>
#include <random>
using namespace std;
default_random_engine gen(time(NULL));
uniform_int_distribution<int> dist(10,200);
int main()
{
    for(int i = 0; i < 5; i++)
        cout<<dist(gen)<<endl;
    return 0;
}

The first three times I ran this program I got these outputs:

57
134
125
136
112

Before the second try I deleted the empty line between uniform_int_distribution and int main() to see whether the seed was based on compile time. As you can see, that didn't matter.

57
84
163
42
146

Just running again:

57
73
181
160
46

So on my runs I keep getting 57 first. That isn't the end of the world: if I want different outputs I can throw away the first one. But it raises the question of whether this is by design (if so, why?) or whether I am misusing the generator somehow (if so, how?).

Answer

I'm not sure what's going wrong (yet!), but you can still initialize by time as follows without hitting the problem (borrowed from here).

#include <ctime>
#include <iostream>
#include <random>
#include <chrono>
using namespace std;
unsigned seed1 = std::chrono::system_clock::now().time_since_epoch().count();
default_random_engine gen(seed1); //gen(time(NULL));
uniform_int_distribution<int> dist(10,200);
int main()
{
    for(int i = 0; i < 5; i++)
        cout<<dist(gen)<<endl;
    return 0;
}

You can also use std::random_device, which is typically non-deterministic (on many systems it draws on timing information from your key strokes, mouse movements, and other sources to generate unpredictable numbers). This is the strongest seed you can choose, but the computer's clock is the better way to go if you don't need strong guarantees, because the computer can run out of "randomness" if you use it too often (it takes many key strokes and mouse movements to generate a single truly random number).

std::random_device rd;
default_random_engine gen(rd());
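If you would rather not rely on a single seed source, one option is to mix both into a std::seed_seq. The following is only a sketch: the helper name make_engine is hypothetical, and it uses std::mt19937 instead of default_random_engine, which is my substitution and not part of the original program.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical helper (not from the original answer): build an engine
// seeded from both std::random_device and the system clock.
// Note: std::mt19937 is swapped in here in place of default_random_engine.
std::mt19937 make_engine() {
    std::random_device rd;
    // Tick count since the clock's epoch; the tick length is implementation-defined.
    const auto ticks =
        std::chrono::system_clock::now().time_since_epoch().count();
    // Feed one random_device word plus the low and high halves of the tick count.
    std::seed_seq seq{
        static_cast<std::uint32_t>(rd()),
        static_cast<std::uint32_t>(ticks),
        static_cast<std::uint32_t>(static_cast<std::uint64_t>(ticks) >> 32)};
    return std::mt19937(seq);
}
```

Calling make_engine() once at startup and passing the result to dist(...) would take the place of the global gen above.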

Running

cout<<time(NULL)<<endl;
cout<<std::chrono::system_clock::now().time_since_epoch().count()<<endl;
cout<<rd()<<endl;

on my machine generates

1413844318
1413844318131372773
3523898368

so the chrono library provides a significantly larger and more rapidly changing number (in nanoseconds on my machine) than the ctime library, which counts whole seconds. The random_device produces non-deterministic numbers which are all over the map. So it seems as though the seeds ctime produces may be too close together and thus map partly to the same internal state?

I made another program which looks like this:

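#include <ctime>
#include <iostream>
#include <random>
using namespace std;
int main(){
  int oldval           = -1;
  unsigned int oldseed = -1;  // wraps around, so the first row prints a difference of 1
  // Upper bound for the seeds to test; the cast avoids a signed/unsigned comparison
  const unsigned int now = static_cast<unsigned int>(time(NULL));
  cout<<"Seed\tValue\tSeed Difference"<<endl;
  for(unsigned int seed=0;seed<now;seed++){
    default_random_engine gen(seed);
    uniform_int_distribution<int> dist(10,200);
    int val = dist(gen);
    // Print a row only when the first output changes
    if(val!=oldval){
      cout<<seed<<"\t"<<val<<"\t"<<(seed-oldseed)<<endl;
      oldval  = val;
      oldseed = seed;
    }
  }
}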

As you can see, this simply prints out the first output value for every possible random seed up to the current time along with the seed and number of previous seeds which had the same value. An excerpt of the output looks like this:

Seed  Value Seed Difference
0 10  1
669 11  669
1338  12  669
2007  13  669
2676  14  669
3345  15  669
4014  16  669
4683  17  669
5352  18  669
6021  19  669
6690  20  669
7359  21  669
8028  22  669
8697  23  669
9366  24  669
10035 25  669
10704 26  669
11373 27  669
12042 28  669
12711 29  669
13380 30  669
14049 31  669

So for every new first number there are 669 seeds which give that first number. Because the second and third numbers are different we are still generating unique internal states. I think we would have to understand much more about the default_random_engine in order to understand what is special about 669 (which can be factored into 3 and 223).
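One possible explanation, sketched under assumptions the text above does not confirm: in libstdc++, default_random_engine is an alias for std::minstd_rand0, whose first raw output for a seed s is 16807*s mod (2^31 - 1). If the distribution then maps raw values to results by integer division into 191 equal buckets (10 through 200), each bucket spans 2147483645 / 191 = 11243369 raw values, and 11243369 / 16807 is about 668.97, so the first result would advance roughly once every 669 consecutive seeds. The helper modeled_first_value below is hypothetical; it only models that behaviour and ignores the rejection step a real implementation applies to raw values at the very top of the range.

```cpp
#include <cstdint>
#include <random>

// Hypothetical model (an assumption, not the library's actual code) of the
// first value that uniform_int_distribution<int>(10, 200) yields from a
// freshly seeded std::minstd_rand0, using equal-width buckets.
int modeled_first_value(std::uint32_t seed) {
    std::minstd_rand0 gen(seed);  // x -> 16807 * x mod (2^31 - 1); seed 0 becomes 1
    // Shift the raw output so it starts at 0.
    const std::uint64_t raw = gen() - std::minstd_rand0::min();
    // Width of the engine's output range: 2147483645.
    const std::uint64_t engine_range =
        std::minstd_rand0::max() - std::minstd_rand0::min();
    const std::uint64_t buckets = 200 - 10 + 1;          // 191 possible results
    const std::uint64_t width = engine_range / buckets;  // 11243369
    // Simplification: no rejection of raw values beyond the last full bucket.
    return 10 + static_cast<int>(raw / width);
}
```

Under this model, seeds 0 through 668 give 10, seed 669 gives 11, and seed 1338 gives 12, which agrees with the excerpt above.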

Given this, it's clear why chrono and random_device work better: the seeds they generate are almost always more than 669 apart. Keep in mind that even if the first number is the same, what matters in many programs is that the sequences of numbers generated be distinct.
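To see that adjacent seeds still yield different streams, you can compare the engine's raw outputs directly. This sketch uses std::minstd_rand0 explicitly, on the assumption that it is what default_random_engine aliases on the machine in question; the helper name first_raw_outputs is hypothetical.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: collect the first `count` raw outputs of
// std::minstd_rand0 for a given seed, before any distribution is applied.
std::vector<std::uint64_t> first_raw_outputs(std::uint32_t seed, int count) {
    std::minstd_rand0 gen(seed);
    std::vector<std::uint64_t> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(gen());  // each call advances the engine's state
    }
    return out;
}
```

For seeds 1 and 2 the raw streams begin 16807, 282475249 and 33614, 564950498 respectively. The raw values already differ on the first draw; it is only the mapping into the narrow range 10 to 200 that can make the first results coincide.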