# 2 Introduction to Pseudorandom Numbers

We shall not provide a comprehensive treatise on random numbers; there are many good references which already do so. Knuth [32] is the definitive reference, although it may be too abstract for the typical computational scientist. Implementations of several algorithms (some of which hold up well at large scale, and some of which do not) can be found in Numerical Recipes [49]. Perhaps the most accessible and lucid exposition is that of Anderson [3], who presents excellent illustrative graphics, an approach we emulate to illustrate the properties of some random number generators. Our focus here, however, is on efficiency and effectiveness when using random number generators in large-scale computations.

> Random number generators should not be chosen at random. -- Donald Knuth (1986)

Consistent with Knuth's admonition, we advise a modicum of caution in the use of pseudorandom numbers, especially in large-scale problems. There is an interesting anecdote from Knuth, who went to great lengths to implement what he thought would be a superior random number generator. Upon testing, however, it was found to produce very poor random numbers, illustrating that even experts can easily misjudge quality a priori. The following comments derive from painful personal experiences of one of the authors.

• When problems arise with random number generators, they are exceedingly difficult to isolate. Often, the problem can be isolated only by replacing the existing random number generator with one which is definitely superior (although perhaps much slower). Selecting a superior generator requires knowledge of the generator currently in use and sometimes in-depth knowledge of the properties of random number generators in general.

• Problems rarely occur when solving small test problems (those for which analytical or experimental answers are known). Instead, problems arise in large-scale, substantial examples involving perhaps millions or even billions of random numbers, where debugging is difficult due to the massive amount of data.

• Finally, in large-scale problems where one is porting a scalar algorithm to a vector or massively parallel one, the random numbers usually are accessed in a different order. Thus it is sometimes not possible to duplicate the run exactly. The results agree only asymptotically, as the number of trials increases.
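The last point can be made concrete with a small sketch (in Python, purely for illustration; the generator, seeds, and chunk counts are our own choices, not drawn from any particular production code). A Monte Carlo estimate of pi is computed once serially and once with the stream split into per-worker substreams, mimicking a parallel decomposition. The two runs consume random numbers in different orders, so their answers differ run-for-run even though both converge to pi as the sample count grows.

```python
import random

def mc_pi(n, seed, chunks=1):
    """Monte Carlo estimate of pi. `chunks` > 1 mimics splitting the
    random stream across parallel workers, which changes the order in
    which random numbers are consumed."""
    # One independent generator per (simulated) worker.
    rngs = [random.Random(seed + i) for i in range(chunks)]
    per = n // chunks
    hits = 0
    for rng in rngs:
        for _ in range(per):
            x, y = rng.random(), rng.random()
            hits += x * x + y * y < 1.0  # point falls inside quarter circle
    return 4.0 * hits / (per * chunks)

serial = mc_pi(100_000, seed=1, chunks=1)
parallel = mc_pi(100_000, seed=1, chunks=4)
# `serial` and `parallel` are both near pi, but they are not equal:
# the same seed no longer reproduces the same answer once the draw
# order changes. Only the statistics, not the bits, are reproducible.
```

This is why exact regression testing of a parallelized stochastic code against its scalar ancestor is often impossible; one must instead compare statistical summaries within confidence intervals.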

Thus, when problems occur, it is very difficult to trace them to the random number generator, because one tends to trace program execution one step at a time, whereas the generator's flaws appear only in the aggregate, over many random numbers. In effect, one "loses sight of the forest for all of the trees." Typically, in desperation and as a last resort after many days of debugging, one changes the random number generator and, voilà, the problem disappears!
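The "flawed only in aggregate" phenomenon is well illustrated by the infamous RANDU generator (a linear congruential generator with multiplier 65539 and modulus 2^31, shipped with early IBM systems). Any single output, inspected one step at a time in a debugger, looks plausibly random; yet every three consecutive outputs satisfy an exact linear relation, which confines all triples to a handful of planes in the unit cube. A minimal sketch:

```python
# RANDU: x_{k+1} = 65539 * x_k (mod 2^31), a classically flawed LCG.
M = 2 ** 31

def randu(seed, n):
    """Generate n successive RANDU values starting from an odd seed."""
    x = seed
    out = []
    for _ in range(n):
        x = (65539 * x) % M
        out.append(x)
    return out

xs = randu(1, 1000)

# Because 65539 = 2^16 + 3, squaring the multiplier mod 2^31 gives
# x_{k+2} = 6*x_{k+1} - 9*x_k (mod 2^31) for EVERY k: each value is an
# exact linear combination of the two before it. Stepping through the
# sequence one number at a time, this is invisible; over the whole
# stream, it is fatal.
violations = sum(
    (9 * xs[k] - 6 * xs[k + 1] + xs[k + 2]) % M != 0
    for k in range(len(xs) - 2)
)
# `violations` is 0: the relation holds for all 998 consecutive triples.
```

No single-step inspection would ever reveal this defect; only an aggregate test (here, checking the known lattice relation, or in general a spectral or chi-square test over many samples) exposes it.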