
3.6 Performance Models for Vector and Parallel Machines     continued...

In the 1960s, Amdahl [1] noted that speedup is limited by the size of the portion of a problem that is not executed faster. For example, suppose a program that executes in 10.0 seconds contains a key subroutine that accounts for 80% of the execution time. The rest of the program uses 20% of the total time, or 2.0 seconds. If we use a more efficient version of the subroutine that runs twice as fast, total execution time will drop to 6.0 seconds (8.0/2 = 4.0 seconds for the subroutine, 2.0 seconds for the remainder of the program). If we find a parallel subroutine that speeds up perfectly on P processors and run the program on an 8-processor machine, execution time will drop to 3.0 seconds (8.0/8 = 1.0 second for the parallel portion, 2.0 seconds for the sequential portion). If we run on 100 processors, the total execution time will be 2.0 + 8.0/100 = 2.08 seconds. As more processors are used, the execution time gets closer to the time required for the sequential part, but it can never get lower than this. Since the fastest this program will ever run is 2.0 seconds, no matter how many processors are used, the maximum speedup is a factor of 10.0/2.0 = 5.0.
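The arithmetic in this example can be sketched as a short script (the function names and default values are our own, taken from the 10.0-second program above with its 8.0-second parallelizable portion):

```python
def total_time(p, serial=2.0, parallel=8.0):
    """Execution time when the 8.0-second portion runs perfectly on p processors."""
    return serial + parallel / p

def speedup(p, original=10.0):
    """Speedup over the original 10.0-second sequential run."""
    return original / total_time(p)

# Reproduce the times quoted in the text: P=8 gives 3.0 s, P=100 gives 2.08 s,
# and the speedup approaches (but never reaches) 10.0 / 2.0 = 5.0.
for p in (1, 8, 100, 10**6):
    print(f"P={p:>7}  time={total_time(p):6.3f} s  speedup={speedup(p):5.3f}")
```

Even at a million processors the speedup is still slightly below 5.0, because the 2.0-second sequential portion dominates.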

If we normalize the formula that defines speedup by letting the sequential execution time be 1 and expressing the other times as fractions of the sequential time, we derive the following formulation of Amdahl's law for parallel processors:
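For reference, the standard normalized statement (symbols are our choice: $f$ is the fraction of execution time that remains sequential, $P$ the number of processors; in the example above $f = 0.2$) is:

$$
\mathrm{Speedup}(P) \;=\; \frac{1}{f + \dfrac{1-f}{P}} \;\le\; \frac{1}{f}
$$

With $f = 0.2$ the bound $1/f = 5$ matches the maximum speedup found in the example.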