As mentioned at the beginning of this section, there is at least limited potential for vectorization of a single lagged Fibonacci generator. Figure 12 illustrates vectorization applied to our generator. As the figure shows, the vector algorithm advances the register ahead by steps, so that the vector length of most of the operations is . Note that there is a vector copy operation of length .

Care should be taken that no item of data is destroyed before it is needed. The easiest way to avoid unintentionally overwriting needed data is to keep two copies of the Fibonacci register and, for each ``vector'' advance, use the old copy to construct the new one. None of the data in the old copy will be destroyed until the next vector advance, when it becomes the new copy.

If vectorization of the Fibonacci generator is important (and it could be, if random number generation consumes a large fraction of the execution time), then clearly a long vector length is better than a short one. Processing with a vector length of 6, as in our example, would not yield much improvement over the scalar method. For vectorization to provide meaningful improvement over scalar processing, the vector operations should be long enough to make good use of the machine hardware. For example, on Cray machines where the vector registers are 64 words long (128 on the new models), this usually means vector lengths of tens of elements. For these machines, the generators and would be good choices, with respective vector lengths of 64 and 127.
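
The two-copy scheme can be illustrated with a minimal Python sketch (Python is used here only for illustration and is not the language of Figure 13). The sketch assumes an additive generator x(n) = x(n-l) + x(n-k) mod 2^32 with the register stored oldest value first, and it assumes that each vector advance moves ahead by k steps, the short lag. The step count and the names `scalar_advance` and `vector_advance` are assumptions of this sketch and are not taken from Figure 12.

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits of each sum


def scalar_advance(reg, l, k):
    """Advance the register by one step.

    reg holds the last l values, oldest first:
    reg[0] is x(n-l) and reg[l-k] is x(n-k).
    """
    new_value = (reg[0] + reg[l - k]) & MASK32
    return reg[1:] + [new_value]


def vector_advance(old, l, k):
    """Advance the register by k steps, building a new copy from the old one.

    For i = 0..k-1, x(n+i) = x(n+i-l) + x(n+i-k) uses old[i] and old[l-k+i],
    so every operand is already present in the old copy and the k sums are
    independent of one another (one elementwise add of length k).
    """
    fresh = [(a + b) & MASK32 for a, b in zip(old[:k], old[l - k:])]
    # "Vector copy" of the l-k surviving values, followed by the k new ones.
    # The old list is never modified, so no operand is overwritten early.
    return old[k:] + fresh


if __name__ == "__main__":
    l, k = 17, 5
    start = [0xFFFFFF00 + i for i in range(l)]  # arbitrary test contents
    stepped = start
    for _ in range(k):
        stepped = scalar_advance(stepped, l, k)
    assert vector_advance(start, l, k) == stepped
```

Under these assumptions, with l = 17 and k = 5 one vector advance performs a 5-element add and a 12-element copy, and it produces the same register as five scalar steps.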

In Figure 13 we list sample Fortran code for initializing and generating
random numbers from the generator LFG(17,5,32). Note that the register is
maintained as a set of 32-bit numbers, but that the number returned to the user
has only 31 bits. The initialization of the register is accomplished using the
Park and Miller LCG described in [47].
The seed, `iseed0`, supplied by the user may be any integer greater than or equal to
zero and less than or equal to 2^31 - 2 = 2,147,483,646. The register is
initialized in canonical form, so each value of `iseed0` results in a
distinct cycle of random numbers. Since the function `irnd175()` was
written to work on 64-bit machines, as well as 32-bit machines, the mask
operations were included to add clarity to the code. In many situations, the
32-bit mask operation could be eliminated, since the hardware would simply
ignore any overflow. The 31-bit mask could also be eliminated on any system
that zero-fills on right-shift operations. If the system instead performs a ``sign
extension'' type of fill, then the 31-bit mask is required.
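
The role of the two masks can be shown with a small Python sketch. It is not a transcription of Figure 13: the helper names are hypothetical, and it assumes that the 31-bit result is obtained by shifting the 32-bit register word right by one bit. Python integers are unbounded, so the 32-bit mask is always needed here, and the sign-extending shift of a 32-bit machine word is emulated explicitly.

```python
MASK32 = 0xFFFFFFFF  # low 32 bits: the width of a register word
MASK31 = 0x7FFFFFFF  # low 31 bits: the width of the returned number


def add_mod32(a, b):
    """Add two register words and discard any overflow past 32 bits."""
    return (a + b) & MASK32


def shift_zero_fill(word):
    """Right shift by one bit on a system that zero-fills.

    word is a non-negative Python int, so >> brings in a zero at the top
    and the 31-bit mask would be redundant.
    """
    return word >> 1


def shift_sign_extend(word):
    """Right shift by one bit on a system that sign-extends, then mask.

    The 32-bit word is first reinterpreted as a signed value; Python's >>
    on a negative int copies the sign bit, which the 31-bit mask removes.
    """
    signed = word - (1 << 32) if word & 0x80000000 else word
    return (signed >> 1) & MASK31


if __name__ == "__main__":
    word = add_mod32(0xFFFFFFFF, 0xFFFFFFFF)  # overflows, wraps to 0xFFFFFFFE
    assert shift_zero_fill(word) == shift_sign_extend(word)
```

With the mask applied, both shift behaviours yield the same 31-bit value. Without it, the sign-extending variant would return a negative number whenever the top bit of the register word is set.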

Figure 13: FORTRAN implementation of LFG(17,5,32).