Using operator>> to seed mt19937

In a blog post entitled "C++ Seeding Surprises," Melissa E. O'Neill reports, "When std::seed_seq tries to “fix” high-quality seed data, it actually makes it worse." According to O'Neill, truly random seed data makes every state possible, but if you push that data through std::seed_seq, it becomes less random, and certain states become unreachable through seeding.

So, if you have a good source of entropy, why not bypass seed_seq entirely?

That's what the function seed_randomly() below does. It's taken from my rand_replacement repository on GitHub. It uses operator>> to overwrite all 624 state words of mt19937.

#include <random>
#include <sstream>

template <typename ResultType>
class rand_replacement
{
public:
    using urbg_type = std::mt19937;
    using seed_type = typename std::mt19937::result_type;
private:
    urbg_type eng_{ seed_type{1u} };  // By default, rand() uses seed 1u.

    // ...

    void seed_randomly()
    {
        std::random_device rd;
        std::stringstream ss;
        // Write state_size (624) random words as text, then read them
        // back into the engine through its stream extraction operator.
        for (auto i{ std::mt19937::state_size }; i--;)
            ss << rd() << ' ';
        ss >> eng_;
    }
};

Is this a novel and interesting idea, or is it really foolish?

Regarding std::stringstream: I understand that it is relatively slow, but that's okay. Seeding should be an infrequent operation.

Regarding std::random_device: I understand that random_device may be deterministic on some systems, may block on others, and has a checkered history with MinGW, but for now, at least, I am satisfied with it. My question is not about random_device; it is strictly about the idea of bypassing seed_seq using operator>>, a technique that could be used with any entropy source.

Are there any downsides?

By the way, the alternative, which uses seed_seq, is a bit more complex and looks something like the following. Is it a better choice than what I coded above?

    void seed_randomly()
    {
        std::random_device rd;
        std::array<seed_type, std::mt19937::state_size> seeds;
        for (auto& s : seeds)
            s = rd();
        std::seed_seq const sseq{ std::cbegin(seeds), std::cend(seeds) };
        eng_.seed(sseq);
    }
Asked by tbxfreeware, Oct 25 '25 01:10

1 Answer

As alluded to at the end of the article, it makes sense to bypass std::seed_seq, but using operator>> doesn't seem like a great way of going about it. Providing an alternative implementation of a SeedSequence allows the MT's state to be populated directly from a std::random_device.

Something like:

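#include <random>

// Not a full SeedSequence: it provides only what mt19937::seed() needs
// in practice, namely result_type and generate().
struct rd_seed {
    using result_type = std::random_device::result_type;

    // Fill [begin, end) with words taken straight from std::random_device.
    template< class RandomIt >
    void generate( RandomIt begin, RandomIt end ) {
        for ( std::random_device rd; begin != end; ++begin )
            *begin = rd();
    }
};

void seed(std::mt19937 &rng) {
    rd_seed source;  // named "source" so it doesn't shadow the function seed()
    rng.seed(source);
}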
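
Since the question mentions that the technique should work with any entropy source, the same idea can be generalised. The following is only a sketch: direct_seed, counting_source and seed_directly are hypothetical names introduced here for illustration, and counting_source is a deterministic stand-in that exists purely so the behaviour can be checked.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical adaptor: exposes any callable that returns 32-bit words
// through generate(), the member that the engine's seed(Sseq&) overload calls.
template <typename Source>
struct direct_seed {
    using result_type = std::uint_least32_t;

    explicit direct_seed(Source& s) : source(s) {}

    // Copy one word from the source into each slot of [first, last).
    template <typename RandomIt>
    void generate(RandomIt first, RandomIt last) {
        for (; first != last; ++first, ++words_written)
            *first = static_cast<result_type>(source());
    }

    Source& source;
    std::size_t words_written = 0;
};

// Deterministic stand-in for an entropy source (for testing only).
struct counting_source {
    std::uint32_t next = 1;
    std::uint32_t operator()() { return next++; }
};

// Seeds rng from source and reports how many words the engine asked for.
template <typename Source>
std::size_t seed_directly(std::mt19937& rng, Source& source) {
    direct_seed<Source> seq{source};
    rng.seed(seq);
    return seq.words_written;
}
```

With a real entropy source this would be called as `std::random_device rd; seed_directly(rng, rd);`. For std::mt19937 the engine requests one 32-bit word per state word, so the returned count should equal std::mt19937::state_size.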

Melissa also suggested that it would be better if something like random_device provided a generate() method like this directly, rather than requiring many calls into the OS to collect state 32 bits at a time.
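
As for why operator>> is a shaky route: the extraction only succeeds if the stream holds text in the form that operator<< would have written for the same engine type, and a failure is reported only through the stream's failbit, which the code in the question never inspects. I believe some implementations (libstdc++, for example) write an extra position value after the 624 state words, so the number of tokens expected may differ between standard libraries; treat that as an assumption to verify on your own toolchain. A minimal sketch of checking the result, with hypothetical helper names:

```cpp
#include <cstddef>
#include <random>
#include <sstream>

// Copies the state of `from` into `to` through the textual form written by
// operator<<. This round trip is the case the stream operators are meant for.
bool copy_state_via_stream(const std::mt19937& from, std::mt19937& to) {
    std::stringstream ss;
    ss << from;
    ss >> to;
    return !ss.fail();
}

// Hypothetical variant of the question's seed_randomly(): feeds state_size
// words from any callable and reports whether the extraction succeeded.
// Whether it succeeds is implementation-dependent, so callers should check.
template <typename Source>
bool try_seed_via_stream(std::mt19937& eng, Source& next_word) {
    std::stringstream ss;
    for (std::size_t i = 0; i < std::mt19937::state_size; ++i)
        ss << next_word() << ' ';
    ss >> eng;
    return !ss.fail();
}
```

If try_seed_via_stream returns false, the engine should not be assumed to hold the intended state, and falling back to a SeedSequence-style approach is the safer choice.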

Answered by Sam Mason, Oct 26 '25 15:10
