Recall that a band matrix is one which is nonzero only on certain diagonals. We say a matrix A has lower bandwidth lb if for i>j+lb, and upper bandwidth ub if for i<j-ub. If lb and/or ub are large, then the techniques of the Section 6 are still applicable, and the LAPACK routines for band matrices ( sgbsv and spbsv) have been optimized for this situation [37,21]. Different techniques are needed when the bandwidth is small. The reason is that proportionately less and less parallel work is available in updating the trailing submatrix, which is where we use the Level 3 BLAS. In the limiting case of tridiagonal matrices, the parallel algorithm derived as in section 6.1 and the standard serial algorithm are nearly identical.
The problem of solving banded linear systems with a narrow band has been studied by many authors, see for instance the references in [9,38]. We will only sketch some of the main ideas and we will do so for rather simple problems. The reader should keep in mind that these ideas can easily be generalized for more complicated situations, and many have appeared in the literature.