5 Vectorization and Access via Multiple Processors: LCGs



next up previous
Next: 6 Lagged Fibonacci Generators Up: RN Chapter Previous: 4 N-tuple Generation With

5 Vectorization and Access via Multiple Processors: LCGs

Many Monte Carlo applications have characteristics that make them easy to map onto computers having multiple processors. Some of these parallel implementations require little or no interprocessor communication (such applications are called ``embarrassingly parallel'') and are typically easy to code on a parallel machine. Others require frequent communication and synchronization among processors and in general are more difficult to write and debug. In developing any parallel Monte Carlo code, it is important to be able to reproduce runs exactly in order to trace program execution. Processors in MIMD machines are subject to asynchronous and unbalanced external effects and are thus, for all practical purposes, impossible to keep aligned in time in a predictable way. If the assumption is made that random number generation from a single generator will occur, across processors, in a certain predictable order, then that assumption will quite likely be wrong. A number of techniques have been developed that guarantee reproducibility in multiprocessor settings and with various types of Monte Carlo problems. We will consider only simple extensions to our previous discussion of LCGs, but acknowledge that there are many approaches to parallel random number generation in the literature. The first situation we address involves using LCGs in a fixed number of MIMD processes, where that number is known at the beginning of a run.

Suppose we know in advance that we will have independent processes and that we will need independent streams of random numbers. Then the best strategy for using an LCG is to split its period into nonoverlapping segments, each of which will be accessed by a different process. This amounts to finding seeds which are known to be far apart on the cycle produced by the LCG. To find such seeds, first consider (for = 0), the LCG rule successively applied:

Thus we can ``leap ahead'' places of the period by multiplying the current seed value by mod . For our purposes, we would like starting seeds, spaced at roughly steps apart. Since is likely to be quite large, it is not practical to compute one step at a time. Instead we compute an -long array, , the power-of-two powers of :

where is the largest power-of-two power of that is still smaller than . I.e. is the integer part of the log (base 2) of . For example, assume that (very small, but big enough to show how it works). Then

and since 91 = 64 + 16 + 8 + 2 + 1, then

Thus for any , for all for which bit in the base-two representation of is a one. Therefore we can leap ahead by cycle steps with no more than multiplies. Once is computed, the seeds can be determined by the procedure:

With these seeds, each of the processes will generate random numbers from nearly equally-spaced points on the cycle. As long as no process needs more than random numbers, a condition easily met for some applications, then no overlap will occur. Everything just said for MIMD processes applies equally well to SIMD programs, where the number of random number streams needed is (usually) known at run time.

The development of the leap ahead technique just described assumed that in the LCG rule. For , Leap ahead can still be accomplished in a similar way, if one constructs the -long array of partial sums of the form: where, as before, is a power of two. The details are left as an exercise for the reader.

The second and more difficult case to consider is when we do not know at the beginning of program execution how many processes (generators) we will need. The splitting of processes in such programs are data driven and in most cases occur as the result of prior Monte Carlo trials taken many steps earlier. The problem is to spawn new LCG seeds in a way that is both reproducible and which yields independent new streams.

Here we only mention a generalized approach that works within limits. Further details can be found in [20]. Consider an LGC with the property that each has two successors, a ``left'' successor, , and a ``right'' successor, . These are generated as follows:

Figure 7 shows the action of these operations with respect to a starting seed . Taken separately, the and sequences are simple LCGs that traverse the set in different order. Alternatively (and the method in which these generators are typically used), the rule produces a pseudo-random leap-ahead for the sequence, thus deterministically producing a seed for a new, spawned, subsequence of the ``right'' cycle. With such a mechanism that uses only local information from a process, reproducibility can be established. Frederickson gives a formula for the selection of the constants in the succession rules that satisfies a particular independence criterion, given some constraints. The interested reader is referred to Frederickson, et al., 1984 [20] for further enlightenment.


Figure 7 Tree Generated from ``Left'' and ``Right'' Generators. View Figure

(See exercises 8 and 9.)



next up previous
Next: 6 Lagged Fibonacci Generators Up: RN Chapter Previous: 4 N-tuple Generation With



verena@csep1.phy.ornl.gov