Random number generation - merging bot articles #79
ProfessionalMenace wants to merge 3 commits into TCCPP:main
jeremy-rifkin
left a comment
Thanks for taking the time to write this! I have given it a quick look and left some comments below.
This is a very in-depth article on random number generators. If this is going to be put in cpp-tutorial, it should be much more beginner-focused. I would recommend splitting this page in two: a beginner introduction in cpp-tutorial and an RNG deep-dive somewhere in the resources section.
wiki/cpp-tutorial/random.md
Outdated
| Mersenne Twister engine are large internal state (624 \* `sizeof(std::uint_fast32_t)` bytes) and difficulty of it's | ||
| seeding. |
Seeding is no more difficult than any other C++ RandomNumberEngine
The articles I've read online suggested otherwise, but after reading through the standard it seems that there is a specific overload that solves this issue.
https://eel.is/c++draft/rand.eng.mers#6
wiki/cpp-tutorial/random.md
Outdated
| for generating a deterministic sequence of numbers that appear random. PRNGs maintain an internal state that is updated | ||
| each time a new number is generated. The initial state of the PRNG is called the seed and the process of setting the | ||
| initial state is called seeding. Two instances of the same PRNG initialized with the same seed will produce identical | ||
| sequences of numbers, which is important for reproducibility of statistical simulations. |
Probably much more detail here than is necessary for beginners
Simplified the wording a bit but kept it mostly the same.
wiki/cpp-tutorial/random.md
Outdated
| | default_random_engine | implementation defined | | ||
| | minstd_rand0 | linear congruential | | ||
| | minstd_rand | linear congruential | | ||
| | mt19937 | mersenne twister | | ||
| | mt19937_64 | mersenne twister | | ||
| | ranlux24 | subtract with carry | | ||
| | ranlux48 | subtract with carry | | ||
| | knuth_b | minstd_rand0 with shuffle | | ||
| | philox4x32 (C++26) | counter-based philox | | ||
| | philox4x64 (C++26) | counter-based philox | |
I'd recommend focusing just on mt19937 but mentioning as an aside that these other options exist
Kept the table but added some notes
| initial state is called seeding. Two instances of the same PRNG initialized with the same seed will produce identical | ||
| sequences of numbers, which is important for reproducibility of statistical simulations. | ||
|
|
||
| In the [C++ standard library](https://en.cppreference.com/w/cpp/header/random), random engines are callable objects that |
We should explain why rand() is bad / insufficient
I added one paragraph about it. I'm quite unsure about the wording.
jeremy-rifkin
left a comment
Thanks for the changes! I've left some more comments below. The main structural issue from before still stands: this is too much of a deep-dive into random number generation for a tutorial article. There is a lot of great information here that would fit well in the resources section of the wiki. I would say LCG, seed_seq, the predefined generators table, and philox all belong in a resources page. For the beginner article, focusing on just rand(), a random engine, a random device, std::default_random_engine/std::mt19937, and the three most important distributions should be good. Linking to the resources page for more in-depth information would then be perfect.
| { | ||
| text: "Random", | ||
| link: "/cpp-tutorial/random", | ||
| collapsed: true, |
Suggested change
- collapsed: true,
| --- | ||
| bot_article: | | ||
| # Generating Random Numbers in C++ | ||
| ## Example: Printing Ten Random Dice Rolls |
Suggested change
- ## Example: Printing Ten Random Dice Rolls
+ Simple example of random number generation in C++:
| #include <random>
| #include <iostream>
|
| int main() {
|     // initialize a random device
|     std::random_device dev;
|
|     // seed default_random_engine
|     std::default_random_engine gen{dev()};
|
|     // initialize a uniform integer distribution
|     std::uniform_int_distribution<int> dis{1, 6};
|
|     // roll the dice
|     for (int i = 0; i < 10; ++i) {
|         std::cout << dis(gen) << ' ';
|     }
| }
Succinctness is important in a bot article
Suggested change
- #include <random>
- #include <iostream>
- int main() {
-     // initialize a random device
-     std::random_device dev;
-     // seed default_random_engine
-     std::default_random_engine gen{dev()};
-     // initialize a uniform integer distribution
-     std::uniform_int_distribution<int> dis{1, 6};
-     // roll the dice
-     for (int i = 0; i < 10; ++i) {
-         std::cout << dis(gen) << ' ';
-     }
- }
+ #include <random>
+ #include <iostream>
+ int main() {
+     std::random_device dev; // for seeding
+     std::default_random_engine rng{dev()};
+     std::uniform_int_distribution<int> dist{1, 6};
+     for (int i = 0; i < 10; ++i) {
+         std::cout << dist(rng) << ' ';
+     }
+ }
| Unlike the C `rand()` function, which relies on common shared seed via `srand()`, the C++ random engines are independent | ||
| and each one has its own seed. This ensures thread safety, whereas C `rand()` does not. Another thing worth mentioning | ||
| about `rand()` is that it is implementation defined and can vary system to system. The C++ random library should be | ||
| always preferred. |
This doesn't really capture why rand() is bad. Yes, it is implementation-defined and global, but the most important practical reason is poor randomness quality in all major implementations. rand() % n being biased could also be mentioned.
| # Generating Random Numbers | ||
|
|
||
| A [Pseudorandom Number Generator (PRNG)](https://en.wikipedia.org/wiki/Pseudorandom_number_generator) is an algorithm | ||
| for generating a sequence of (almost) random numbers. PRNGs maintain an internal state that is updated each time a new |
How about "numbers that appear random" instead of "almost random"?
|
|
||
| ## Seed Sequence | ||
|
|
||
| Seed sequence (`std::seed_seq`) is an utility in the standard library for converting a small number of inputs into a |
Suggested change
- Seed sequence (`std::seed_seq`) is an utility in the standard library for converting a small number of inputs into a
+ Seed sequence (`std::seed_seq`) is a utility in the standard library for converting a small number of inputs into a
| ## Seed Sequence | ||
|
|
||
| Seed sequence (`std::seed_seq`) is an utility in the standard library for converting a small number of inputs into a | ||
| higher quality seed (does not contain large areas of zeros/ones) suitable for seeding PRNGs with large internal state |
I don't think "does not contain large areas of zeros/ones" is really useful for beginners. It isn't really accurate to say it creates a "higher quality seed" as it can't magically increase entropy, but the mixing it does can improve distribution / bias.
| // initialize a uniform real distribution | ||
| std::uniform_real_distribution<double> dis{0.0, 1.0}; | ||
|
|
||
| // generate random numbers in the interval [0, 1) |
Might be good to talk about the range
| double random_number = dis(gen); | ||
| std::cout << random_number << std::endl; |
Suggested change
- double random_number = dis(gen);
- std::cout << random_number << std::endl;
+ std::cout << dis(gen) << std::endl;
| | default_random_engine | implementation defined | Reproducibility is not important | | ||
| | minstd_rand0 | linear congruential | Small internal state | | ||
| | minstd_rand | linear congruential | Small internal state | | ||
| | mt19937 | mersenne twister | Generally should be preferred | | ||
| | mt19937_64 | mersenne twister | Generally should be preferred | | ||
| | ranlux24 | subtract with carry | Statistical quality | | ||
| | ranlux48 | subtract with carry | Statistical quality | | ||
| | knuth_b | minstd_rand0 with shuffle | Almost no reason to use it | | ||
| | philox4x32 (C++26) | counter-based philox | Good for multi-threaded applications\* | | ||
| | philox4x64 (C++26) | counter-based philox | Good for multi-threaded applications\* | | ||
|
|
||
| \*Not yet implemented |
The summary column here is largely editorializing. Noting that philox isn't implemented will quickly become outdated (and gcc actually already supports it).
#69