Then, dividing the above two equations yields **R**, the speedup ratio:

Note that, because **F** and **D** appear in conjunction, we can combine their
effects into a single term.
Thus, hereafter, we let **D** denote the combined overhead, that is, the sum of the original **F** and **D**.
Moreover, it is generally possible to select long enough vector lengths so
that **F** is negligible; however, data motion, **D**, is always significant.
In effect, we pay the overhead of data motion (useless work "gathering"
data elements into contiguous memory locations), so that we can perform
subsequent operations in the much faster vector hardware.

In Figure 21, we depict **R** versus the vectorized fraction for a fixed value of V/S (representative of Cray hardware), with **D** varying parametrically.

Figure 21: Modified Amdahl's Law.

Several things are noteworthy from the graph:

- High values of **R** are attainable, even if there is significant overhead in data motion,
- the vectorized fraction must be large to obtain good speedup, and
- the curves are *relatively* insensitive to **D**.

If we fix V/S (nominally fixed by the architecture), we can consider equation
(1) to contain two independent parameters, the vectorized fraction and **D**.
Therefore, if we measure **R** for two different architectures with known but
different values of V/S, we can determine both the vectorized fraction and **D**.
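
The two-measurement idea can be sketched numerically. The exact form of equation (1) is not reproduced here, so the sketch below assumes a common modified Amdahl form, R = 1 / ((1 - f) + f (1/v + d)), where f is the vectorized fraction, v is V/S, and d is the data-motion overhead per element in units of scalar time. This form, the names `speedup` and `solve_f_and_d`, and the sample values of f and d are illustrative assumptions, not the measured results.

```python
def speedup(f, d, v):
    """Assumed model (not the source's equation): R = 1 / ((1 - f) + f * (1/v + d)).

    f: vectorized fraction, d: data-motion overhead, v: V/S for the machine.
    """
    return 1.0 / ((1.0 - f) + f * (1.0 / v + d))


def solve_f_and_d(r1, v1, r2, v2):
    """Recover (f, d) from two measured speedups at two known, different V/S values."""
    if v1 == v2:
        raise ValueError("the two V/S values must differ")
    # Under the assumed model, 1/R = 1 - f*(1 - 1/v) + f*d is linear in f and f*d.
    # Subtracting the two equations eliminates the f*d term.
    f = (1.0 / r1 - 1.0 / r2) / (1.0 / v1 - 1.0 / v2)
    if f == 0.0:
        raise ValueError("identical speedups: d cannot be determined")
    fd = 1.0 / r1 - 1.0 + f * (1.0 - 1.0 / v1)
    return f, fd / f


if __name__ == "__main__":
    # Hypothetical program parameters, used only to generate synthetic "measurements".
    true_f, true_d = 0.9, 0.05
    r_a = speedup(true_f, true_d, 12.0)  # machine with V/S = 12
    r_b = speedup(true_f, true_d, 25.0)  # machine with V/S = 25
    f, d = solve_f_and_d(r_a, 12.0, r_b, 25.0)
    print(f"recovered f = {f:.3f}, d = {d:.3f}")
```

Because the assumed model is linear in f and f·d once inverted, two measurements suffice and no iterative fitting is needed; a different form of equation (1) would change the algebra but not the approach.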

Burns et al. [3] have performed such an experiment on a Cray Y/MP and an ETA 10-G for the Monte Carlo simulation GAMTEB, one of Los Alamos' benchmark programs [2]. We measured (by turning vectorization "on" and "off" for critical loops) values of V/S of 12 and 25 for the Cray Y/MP and the ETA 10-G, respectively.