Let us examine the BLAS more carefully. For a more complete list of the BLAS,
see section 12.
Table 2 counts the number of
memory references and floating point operations performed by three related BLAS.
The last column gives the ratio **q** of flops to memory references. The significance
of **q** is that it tells us roughly how many flops we can perform per memory reference,
or how much useful work we can do compared to the time spent moving data; therefore,
algorithms with larger **q** values are better building blocks for other
algorithms.
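
As a back-of-the-envelope illustration, the **q** ratios can be computed from the standard operation counts for problems of dimension n (counting each read and write of a scalar as one memory reference; these counts are an assumption here, stated explicitly in Table 2):

```python
# Rough q = flops / memory-references ratios for the three related BLAS,
# using the standard operation counts (an assumption; see Table 2).

def q_saxpy(n):
    # y = a*x + y: read x, y, a and write y -> 3n + 1 references;
    # n multiplies + n adds -> 2n flops.
    return (2 * n) / (3 * n + 1)

def q_matvec(n):
    # y = y + A x: read A, x, y and write y -> n^2 + 3n references; 2n^2 flops.
    return (2 * n**2) / (n**2 + 3 * n)

def q_matmul(n):
    # C = C + A B: read A, B, C and write C -> 4n^2 references; 2n^3 flops.
    return (2 * n**3) / (4 * n**2)

for n in (10, 100, 1000):
    print(n, q_saxpy(n), q_matvec(n), q_matmul(n))
```

Note that `q_saxpy` is bounded by 2/3 and `q_matvec` by 2 no matter how large n grows, while `q_matmul` grows like n/2: only the matrix-matrix operation lets the useful work per memory reference increase with problem size.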

Table 2 reflects a hierarchy of operations: Operations like `saxpy`
operate on vectors and offer the worst **q** values; these are called Level 1
BLAS [10], and include inner products and other simple operations.
Operations like matrix-vector multiplication operate on matrices and vectors,
and offer slightly better **q** values; these are called Level 2 BLAS
[11], and include solving triangular systems of equations and
rank-1 updates of matrices (**A** = **A** + **xy**ᵀ, with **x** and **y** column vectors). Operations
like matrix-matrix multiplication operate on pairs of matrices, and
offer the best **q** values; these are called Level 3 BLAS [12], and
include solving triangular systems of equations with many right-hand sides.
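
One way to see the hierarchy concretely is with NumPy stand-ins for the corresponding BLAS routines (a sketch: the actual BLAS are compiled routines such as `saxpy`, `sgemv`, `sger`, and `sgemm`, not NumPy code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
a = 0.5
x, y = rng.standard_normal(n), rng.standard_normal(n)
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

# Level 1 (vector-vector): saxpy-style update y <- a*x + y.
y1 = a * x + y

# Level 2 (matrix-vector): gemv-style update y <- y + A x,
# and a rank-1 (ger-style) update A <- A + x y^T.
y2 = y + A @ x
A1 = A + np.outer(x, y)

# Level 3 (matrix-matrix): gemm-style update C <- C + A B.
C1 = C + A @ B
```

Each level touches the same order of data as the one below but performs asymptotically more arithmetic per element touched, which is exactly the **q** hierarchy in Table 2.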

Table 2: Basic Linear Algebra Subroutines (BLAS).

| Operation | Definition | Memory references | Flops | **q** ≈ flops/memory references |
|---|---|---|---|---|
| `saxpy` (Level 1) | **y** = α**x** + **y** | 3n + 1 | 2n | 2/3 |
| matrix-vector mult (Level 2) | **y** = **y** + **Ax** | n² + 3n | 2n² | 2 |
| matrix-matrix mult (Level 3) | **C** = **C** + **AB** | 4n² | 2n³ | n/2 |

Since the Level 3 BLAS have the highest **q** values, we endeavor to reorganize
our algorithms in terms of operations like matrix-matrix multiplication, rather
than `saxpy` (the LINPACK Cholesky is already constructed in terms of calls to
`saxpy`).
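
A toy sketch of this reorganization idea (illustrative NumPy, not code from the text): the same product can be organized as many Level 2 calls, one per column, or as a single Level 3 call that computes the identical result but with a far better flops-to-memory-references ratio.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# Level 2 organization: build C one column at a time; each column is a
# matrix-vector product, so each call has a low q ratio.
C_by_columns = np.empty((n, n))
for j in range(n):
    C_by_columns[:, j] = A @ B[:, j]

# Level 3 organization: the same result as one matrix-matrix multiply,
# whose high q ratio is why we prefer it as a building block.
C_level3 = A @ B

assert np.allclose(C_by_columns, C_level3)
```

The arithmetic is identical in both organizations; what changes is how many times each matrix entry is moved through the memory hierarchy per flop, which is the quantity **q** measures.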