A second reason clock rate by itself is an inadequate measure of performance is that it doesn't take into account what happens during a clock cycle. This is especially true when comparing systems with different instruction sets. It is possible that a machine might have a lower clock rate, but because it requires fewer cycles to execute the same program it would have higher performance. For example, consider two machines, A and B, that are almost identical except that A has a multiply instruction and B does not. A simple loop that multiplies a vector by a scalar (the constant 3 in this example) is shown in the table below. The number of cycles for each instruction is given in parentheses next to the instruction.
Table 3
The first instruction loads an element of the vector into an internal register, X. Next, machine A multiplies the vector element by 3, leaving the result in the register. Machine B does the same operation by shifting and adding, i.e. 3x = 2x + x. B copies the contents of X to another register, Y, shifts X left one bit (which multiplies it by 2), and then adds Y, again leaving the result in X. Both machines then store the result back into the vector in memory and branch back to the top of the loop if the vector index is not at the end of the vector (the comparison and branch are done by the dbr instruction). Machine A might have a slightly lower clock rate than B, but since it takes fewer cycles it will execute the loop faster. For example, if A's clock rate is 9 MHz (0.11 µs per cycle) and B's clock rate is 10 MHz (0.10 µs per cycle), A will execute one pass through the loop in about 1.1 µs, but B will require 1.2 µs.
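
The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the per-pass cycle counts (10 for A, 12 for B) are an assumption inferred from the loop times and cycle times quoted above, not read from the table.

```python
def times_three_shift_add(x: int) -> int:
    """Compute 3*x the way machine B does: copy, shift left, add."""
    y = x          # copy X into a second register Y
    x = x << 1     # shift X left one bit, which doubles it (2x)
    return x + y   # add Y back in, giving 2x + x = 3x


def loop_time_us(cycles: int, clock_mhz: float) -> float:
    """Time for one pass through the loop, in microseconds.

    A clock rate of f MHz means f cycles per microsecond,
    so the time for n cycles is n / f microseconds.
    """
    return cycles / clock_mhz


# Assumed cycle counts per loop pass, inferred from the figures in the text
# (1.1 us / 0.11 us per cycle and 1.2 us / 0.10 us per cycle).
CYCLES_A = 10
CYCLES_B = 12

time_a = loop_time_us(CYCLES_A, 9.0)    # machine A: lower clock rate, fewer cycles
time_b = loop_time_us(CYCLES_B, 10.0)   # machine B: higher clock rate, more cycles

print(f"A: {time_a:.2f} us per pass, B: {time_b:.2f} us per pass")
```

With these assumed counts, A finishes a pass sooner than B even though its clock is slower, which is the point of the example: time per pass is the number of cycles divided by the clock rate, so neither factor alone determines performance.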