One of the advantages of this style of parallel machine organization is a savings in the amount of logic. Anywhere from 20% to 50% of the logic on a typical processor chip is devoted to control, namely to fetching, decoding, and scheduling instructions. The remainder is used for on-chip storage (registers and cache) and the logic required to implement the data processing (adders, multipliers, etc.). In an SIMD machine, only one control unit fetches and processes instructions, so more logic can be dedicated to arithmetic circuits and registers. For example, 32 PEs fit on one chip in the MasPar MP-1, and a 1024-processor system is built from 32 chips, all of which fit on a single board (the control unit occupies a separate board).

Vector processing is performed on an SIMD machine by distributing
elements of vectors across all data memories. For example,
suppose we have two vectors, **a** and **b**, and
a machine with 1024
PEs. We would store a_i in location 0 of memory i and b_i in
location 1
of memory i. To add **a** and **b**, the machine would tell each PE to
load the contents of location 0 into one register, the contents
of location 1 into another register, add the two registers, and
write the result. As long as the number of PEs is at least
the length of the vectors, vector processing on an SIMD machine
is done in constant time, i.e., it does not depend on the length
of the vectors. Vector operations on a pipelined SISD vector
processor, however, take time that is a linear function of the
length of the vectors.
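
The vector addition described above can be sketched as a small Python simulation. This is only an illustrative model, not the instruction set of any real machine: the `PE` class, the four instruction functions, and `simd_add` are hypothetical names, and the choice of location 2 for the result is an assumption, since the text does not say where the sum is written. Each pass over all PEs stands in for one instruction broadcast by the control unit and executed in lockstep.

```python
NUM_PES = 1024  # size of the machine used in the example in the text


class PE:
    """Hypothetical processing element: a tiny local memory and two registers."""

    def __init__(self):
        # location 0 holds a_i, location 1 holds b_i,
        # location 2 receives the sum (assumed; the text names no location)
        self.memory = [0, 0, 0]
        self.r0 = 0
        self.r1 = 0


# The four instructions that the single control unit broadcasts to every PE.
def load_a(pe):
    pe.r0 = pe.memory[0]


def load_b(pe):
    pe.r1 = pe.memory[1]


def add(pe):
    pe.r0 = pe.r0 + pe.r1


def store(pe):
    pe.memory[2] = pe.r0


def simd_add(a, b):
    """Add vectors a and b; return (result, number of broadcast instructions)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if len(a) > NUM_PES:
        raise ValueError("this sketch assumes one vector element per PE")

    pes = [PE() for _ in range(NUM_PES)]

    # Distribute the vectors: element i goes to the memory of PE i.
    for i, (x, y) in enumerate(zip(a, b)):
        pes[i].memory[0] = x
        pes[i].memory[1] = y

    steps = 0
    for instruction in (load_a, load_b, add, store):
        # On real hardware all PEs execute this instruction at the same time;
        # the inner loop only simulates that lockstep step sequentially.
        for pe in pes:
            instruction(pe)
        steps += 1

    return [pes[i].memory[2] for i in range(len(a))], steps


if __name__ == "__main__":
    result, steps = simd_add([1, 2, 3], [10, 20, 30])
    print(result, steps)
```

In this model the number of broadcast instructions is four whether the vectors have 3 elements or 1024, which is the sense in which the operation takes constant time. Vectors longer than the number of PEs are rejected here rather than handled, because the text only covers the case where every element has its own PE.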