In an interleaved memory, the memory is divided into a set of
banks. An interleaved memory with $n$ banks is said to be
$n$-way interleaved. One way of allocating virtual addresses
to memory modules is to divide the memory space (the set of all
possible addresses a processor can generate) into contiguous blocks.
If there are $n$ banks, each holding $m$ locations, memory location
$i$ would reside in bank number $\lfloor i/m \rfloor$ (the quotient,
ignoring the remainder). In an interleaved memory, however,
consecutive addresses reside in different banks: memory location
$i$ is in bank number $i \bmod n$. For example,
suppose there are 4 banks, each containing 256 bytes. The
block-oriented scheme would assign virtual addresses 0 through 255
to the first bank, 256 through 511 to the second bank, and so on.
The interleaved scheme would assign addresses 0, 4, 8, ... to the
first bank, 1, 5, 9, ... to the second bank, etc. (Figure 6).
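
The two allocation schemes can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function
names block_bank and interleaved_bank and the parameter bank_size are
hypothetical and do not come from the text, and banks are numbered from
0, so the "first bank" of the example is bank 0.

```python
def block_bank(address, bank_size):
    """Block-oriented scheme: each bank holds one contiguous address range."""
    return address // bank_size  # integer quotient, remainder ignored


def interleaved_bank(address, num_banks):
    """Interleaved scheme: consecutive addresses fall in consecutive banks."""
    return address % num_banks


# The example from the text: 4 banks of 256 bytes each.
NUM_BANKS = 4
BANK_SIZE = 256

block_example = [block_bank(a, BANK_SIZE) for a in (0, 255, 256, 511, 512)]
interleaved_example = [interleaved_bank(a, NUM_BANKS) for a in range(10)]

print(block_example)        # [0, 0, 1, 1, 2]
print(interleaved_example)  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
```

Under the block scheme a run of consecutive addresses stays in one bank
until the bank boundary is crossed; under the interleaved scheme it
cycles through all the banks.
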

Regardless of how the memory space is split up among the banks,
requests sent to two different banks can be handled simultaneously.
The processor can request a transfer from location $i$ on one cycle,
and on the next cycle request information from location $j$. If $i$
and $j$ are in different banks, the information will be returned on
successive cycles. Note that the latency of a request, i.e. the
number of cycles a processor has to wait before receiving the
contents of location $i$, is not affected. The bandwidth, however, is
improved: if there are enough banks, the memory system can
potentially send information at a rate of one word per processor
cycle, regardless of the memory cycle time.
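
The difference between latency and bandwidth can be made concrete with
a small timing model. The sketch below is a deliberate simplification
and not a description of any real machine: it assumes the processor
issues at most one request per cycle, in order, and that a bank which
accepts a request at cycle t delivers the word at t + memory_cycle and
can accept nothing else until then. The name completion_times and its
parameters are hypothetical.

```python
def completion_times(addresses, num_banks, memory_cycle):
    """Return the cycle at which each requested word arrives.

    Assumed model: one request issued per processor cycle at most, in
    program order; each bank is busy for `memory_cycle` cycles per request.
    """
    bank_free_at = [0] * num_banks  # earliest cycle each bank accepts a request
    next_issue = 0                  # earliest cycle the processor can issue
    arrivals = []
    for address in addresses:
        bank = address % num_banks  # interleaved mapping
        issue = max(next_issue, bank_free_at[bank])  # stall if the bank is busy
        done = issue + memory_cycle
        bank_free_at[bank] = done
        next_issue = issue + 1
        arrivals.append(done)
    return arrivals


# Eight consecutive addresses, memory cycle time of 4 processor cycles.
print(completion_times(range(8), 1, 4))  # [4, 8, 12, 16, 20, 24, 28, 32]
print(completion_times(range(8), 4, 4))  # [4, 5, 6, 7, 8, 9, 10, 11]
```

In both runs the first word arrives after 4 cycles, so the latency is
unchanged. With a single bank a word arrives every 4 cycles; with 4
banks a word arrives every cycle once the first one has been delivered.
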

The decision to allocate addresses as contiguous blocks or in
interleaved fashion depends on how one expects information to be
accessed. Programs are compiled so instructions reside in successive
addresses, so there is a high probability that after a processor
executes the instruction at location $i$ it will execute the
instruction at $i+1$ (Section 2.2). Compilers can also
allocate vector elements to
successive addresses, so operations on entire vectors can take
advantage of interleaving. For these reasons, vector processors
universally have some form of interleaved memory. However, shared
memory multiprocessors use the block-oriented scheme since
memory referencing patterns in an MIMD system are quite
different. There the goal is to connect a processor to a single
memory and use as much information as possible from that memory
before switching to another memory.

Systems often provide some flexibility in fetching vector elements.
In some systems it is possible to load every $k$th element; for
example, when fetching elements of a vector $v$ that is stored in
consecutive memory cells with a spacing of $k$, the memory would
return $v_0$, $v_k$, $v_{2k}$, and so on. The interval $k$ between
elements is known as the stride. One interesting use of this
feature is in
accessing matrices. If the stride is set to one more than the
number of rows, a single memory request will return the diagonal
elements (assuming column major layout and the columns are stored
contiguously). A carelessly chosen stride can cancel the benefits of
interleaving. In the extreme case, setting the stride to the degree
of interleaving (or any multiple of it) means every item is fetched
from the same bank, and the time between successive elements will be
the memory cycle time.
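
Both uses of the stride described above, picking out a matrix diagonal
and accidentally hitting a single bank, can be checked with a short
Python sketch. The helper names strided_addresses and banks_touched are
hypothetical, the matrix is a made-up 3 x 3 example with one element
per address, and banks are numbered from 0.

```python
def strided_addresses(base, stride, count):
    """Addresses touched when loading `count` elements spaced `stride` apart."""
    return [base + k * stride for k in range(count)]


def banks_touched(addresses, num_banks):
    """Bank number of each address under the interleaved mapping."""
    return [a % num_banks for a in addresses]


# A 3 x 3 matrix in column-major order: element (row, col) is stored at
# offset col * ROWS + row, so each column occupies consecutive addresses.
ROWS = 3
matrix = ["a00", "a10", "a20",   # column 0
          "a01", "a11", "a21",   # column 1
          "a02", "a12", "a22"]   # column 2

# A stride of one more than the number of rows walks down the diagonal.
diagonal = [matrix[a] for a in strided_addresses(0, ROWS + 1, ROWS)]
print(diagonal)  # ['a00', 'a11', 'a22']

NUM_BANKS = 4
unit_stride = banks_touched(strided_addresses(0, 1, 8), NUM_BANKS)
bad_stride = banks_touched(strided_addresses(0, NUM_BANKS, 8), NUM_BANKS)
print(unit_stride)  # [0, 1, 2, 3, 0, 1, 2, 3]
print(bad_stride)   # [0, 0, 0, 0, 0, 0, 0, 0]
```

Note that the diagonal of this particular matrix has stride 4, so on a
4-way interleaved memory every diagonal element would also come from
bank 0. Whether a given stride is harmful depends on how it relates to
the number of banks.
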