3.1.2 SIMD Computers




 


Figure 4: Flynn's Taxonomy

SIMD machines have one instruction processing unit, sometimes called a controller and indicated by a K in the PMS notation, and several data processing units, generally called D-units or processing elements (PEs). The first operational machine of this class was the ILLIAC-IV, a joint project by DARPA, Burroughs Corporation, and the University of Illinois Institute for Advanced Computation [5]. Later machines included the Distributed Array Processor (DAP) from the British corporation ICL, and the Goodyear MPP. Two recent machines, the Thinking Machines CM-1 and the MasPar MP-1, are discussed in detail in Section 3.1.2.

The control unit is responsible for fetching and interpreting instructions. When it encounters an arithmetic or other data processing instruction, it broadcasts the instruction to all PEs, which then all perform the same operation. For example, the instruction might be "add R3,R0". Each PE would add the contents of its own internal register R3 to its own R0. To allow for needed flexibility in implementing algorithms, a PE can be deactivated. Thus on each instruction, a PE is either idle, in which case it does nothing, or active, in which case it performs the same operation as all other active PEs. Each PE has its own memory for storing data. A memory reference instruction, for example "load R0,100", directs each PE to load its internal register R0 with the contents of memory location 100, meaning the 100th cell in its own local memory.
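The broadcast-and-mask behaviour described above can be sketched as a small Python simulation. This is an illustrative model only, not the instruction set of any real SIMD machine: the names `PE` and `broadcast`, the choice of 4 registers and 128 memory cells per PE, and the reading of "add R3,R0" as R0 <- R0 + R3 are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PE:
    """Hypothetical processing element: private registers, private memory, activity flag."""
    registers: list = field(default_factory=lambda: [0] * 4)   # R0..R3 (assumed size)
    memory: list = field(default_factory=lambda: [0] * 128)    # local cells (assumed size)
    active: bool = True


def broadcast(pes, op, *args):
    """Control unit: send one decoded instruction to every PE; idle PEs do nothing."""
    for pe in pes:  # this sequential loop stands in for lock-step hardware
        if not pe.active:
            continue
        if op == "load":      # load Rd,addr : Rd <- this PE's memory[addr]
            rd, addr = args
            pe.registers[rd] = pe.memory[addr]
        elif op == "add":     # add Rs,Rd : Rd <- Rd + Rs (assumed operand order)
            rs, rd = args
            pe.registers[rd] += pe.registers[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")


# Four PEs, each with different data in its own location 100.
pes = [PE() for _ in range(4)]
for i, pe in enumerate(pes):
    pe.memory[100] = 10 * i
    pe.registers[3] = 1
pes[2].active = False          # deactivate one PE

broadcast(pes, "load", 0, 100)  # "load R0,100"
broadcast(pes, "add", 3, 0)     # "add R3,R0"
r0_values = [pe.registers[0] for pe in pes]
print(r0_values)
```

Every active PE executes the same two instructions on its own data, while the deactivated PE keeps its R0 unchanged.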

One of the advantages of this style of parallel machine organization is a saving in the amount of logic. Anywhere from 20% to 50% of the logic on a typical processor chip is devoted to control, namely to fetching, decoding, and scheduling instructions. The remainder is used for on-chip storage (registers and cache) and the logic required to implement the data processing (adders, multipliers, etc.). In an SIMD machine, only one control unit fetches and processes instructions, so more logic can be dedicated to arithmetic circuits and registers. For example, 32 PEs fit on one chip in the MasPar MP-1, and a 1024-processor system is built from 32 chips, all of which fit on a single board (the control unit occupies a separate board).

Vector processing is performed on an SIMD machine by distributing elements of vectors across all data memories. For example, suppose we have two vectors, a and b, and a machine with 1024 PEs. We would store a_i in location 0 of memory i and b_i in location 1 of memory i. To add a and b, the machine would tell each PE to load the contents of location 0 into one register, the contents of location 1 into another register, add the two registers, and write the result. As long as the number of PEs is at least as large as the length of the vectors, vector processing on an SIMD machine is done in constant time, i.e. it does not depend on the length of the vectors. Vector operations on a pipelined SISD vector processor, however, take time that is a linear function of the length of the vectors.
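The vector addition just described can be sketched in Python to show why the instruction count does not grow with the vector length. This is a rough model under stated assumptions: the function name `simd_vector_add` is hypothetical, the result is written to location 2 of each local memory (the text does not say where the result goes), and the count of broadcast instructions stands in for machine time. The Python loops themselves are of course sequential.

```python
def simd_vector_add(a, b, num_pes=1024):
    """Add vectors a and b on a simulated SIMD machine; return (sum, broadcast count)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    n = len(a)
    if n > num_pes:
        raise ValueError("this sketch only covers vectors that fit one element per PE")

    # Memory i holds a[i] in location 0 and b[i] in location 1;
    # location 2 is an assumed destination for the result.
    memories = [[0, 0, 0] for _ in range(num_pes)]
    for i in range(n):
        memories[i][0] = a[i]
        memories[i][1] = b[i]

    regs = [[0, 0] for _ in range(num_pes)]        # two registers per PE
    active = [i < n for i in range(num_pes)]       # PEs without data are idle

    def load_a(i):
        regs[i][0] = memories[i][0]

    def load_b(i):
        regs[i][1] = memories[i][1]

    def add(i):
        regs[i][0] += regs[i][1]

    def store(i):
        memories[i][2] = regs[i][0]

    broadcasts = 0
    for instruction in (load_a, load_b, add, store):
        broadcasts += 1                    # one broadcast from the control unit
        for i in range(num_pes):           # conceptually simultaneous on all PEs
            if active[i]:
                instruction(i)

    return [memories[i][2] for i in range(n)], broadcasts


short_sum, short_steps = simd_vector_add([1, 2, 3], [10, 20, 30])
long_sum, long_steps = simd_vector_add(list(range(1000)), list(range(1000)))
print(short_sum, short_steps, long_steps)
```

Both calls issue the same four broadcast instructions, whether the vectors have 3 elements or 1000, which is the sense in which the operation takes constant time on the SIMD machine.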






verena@csep1.phy.ornl.gov