The decision to allocate addresses as contiguous blocks or in interleaved fashion depends on how one expects information to be accessed. Programs are compiled so that instructions reside in successive addresses, so there is a high probability that after a processor executes the instruction at location i it will execute the instruction at i + 1 (Section 2.2). Compilers can also allocate vector elements to successive addresses, so operations on entire vectors can take advantage of interleaving. For these reasons, vector processors universally have some form of interleaved memory. Shared memory multiprocessors, however, use the block-oriented scheme, since memory referencing patterns in an MIMD system are quite different: there the goal is to connect a processor to a single memory and use as much information as possible from that memory before switching to another memory.
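The difference between the two allocation schemes can be sketched as a mapping from an address to a bank number. The sketch below assumes low-order interleaving (bank = address modulo the number of banks) and equally sized banks for the block scheme; the function names and the constants NUM_BANKS and BANK_SIZE are illustrative choices, not taken from any particular machine.

```python
def interleaved_bank(addr, num_banks):
    """Low-order interleaving: consecutive addresses fall in consecutive banks.

    Returns (bank, offset within bank).
    """
    return addr % num_banks, addr // num_banks


def block_bank(addr, bank_size):
    """Block allocation: each bank holds one contiguous range of addresses.

    Returns (bank, offset within bank).
    """
    return addr // bank_size, addr % bank_size


NUM_BANKS = 4   # assumed degree of interleaving
BANK_SIZE = 8   # assumed number of cells per bank in the block scheme

addrs = list(range(6))  # six successive addresses, e.g. a run of instructions
interleaved = [interleaved_bank(a, NUM_BANKS)[0] for a in addrs]
blocked = [block_bank(a, BANK_SIZE)[0] for a in addrs]

print(interleaved)  # [0, 1, 2, 3, 0, 1]
print(blocked)      # [0, 0, 0, 0, 0, 0]
```

Under interleaving, successive addresses rotate through all the banks, so sequential fetches can overlap. Under block allocation the same addresses all stay in one bank, which matches the MIMD goal of keeping a processor attached to a single memory.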
Systems often provide some flexibility in fetching vector elements. In some systems it is possible to load every nth element. For example, when fetching elements of a vector v that is stored in consecutive memory cells with n = 4, the memory would return every fourth element: v_0, v_4, v_8, ... (numbering the elements from 0). The interval between elements is known as the stride. One interesting use of this feature is in accessing matrices. If the stride is set to one more than the number of rows, a single memory request will return the diagonal elements (assuming column major layout with the columns stored contiguously). Using a stride may cancel any benefits of interleaving if programmers are not careful. In an extreme case, setting the stride to the degree of interleaving means every item is fetched from the same bank, and the time between successive elements will be the full memory cycle time.
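A small sketch makes the effect of the stride concrete. It assumes low-order interleaving across NUM_BANKS banks, a base address of 0, and a 3 x 3 matrix stored in column major order; the helper names and the sample values are hypothetical and serve only as illustration.

```python
def strided_addresses(base, stride, count):
    """Addresses generated by one strided request: base, base+stride, ..."""
    return [base + k * stride for k in range(count)]


def banks_touched(addresses, num_banks):
    """Bank hit by each address, assuming low-order interleaving."""
    return [a % num_banks for a in addresses]


NUM_BANKS = 4  # assumed degree of interleaving

# Stride 1: successive elements rotate through all banks.
unit = banks_touched(strided_addresses(0, 1, 8), NUM_BANKS)

# Stride equal to the degree of interleaving: every element is in bank 0.
worst = banks_touched(strided_addresses(0, NUM_BANKS, 8), NUM_BANKS)

# Diagonal of a 3 x 3 matrix in column major order.
# Entry "rc" stands for row r, column c; columns are stored contiguously.
ROWS = 3
matrix_flat = [11, 21, 31,   # column 1
               12, 22, 32,   # column 2
               13, 23, 33]   # column 3
diag = [matrix_flat[a] for a in strided_addresses(0, ROWS + 1, ROWS)]

print(unit)   # [0, 1, 2, 3, 0, 1, 2, 3]
print(worst)  # [0, 0, 0, 0, 0, 0, 0, 0]
print(diag)   # [11, 22, 33]
```

With these particular numbers the diagonal stride (ROWS + 1 = 4) happens to equal the assumed degree of interleaving, so the diagonal fetch itself would land entirely in one bank. This is the kind of interaction between stride and interleaving that calls for care.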
Figure 7: Single Bus Microprocessor.