Thus far we have not said what we mean by a `sparse' matrix. A good
operational definition is that a matrix is sparse if it contains enough
zero entries to be worth taking advantage of them to reduce both the
storage and work required in solving a linear system. Ideally, we
would like to store and operate on only the nonzero entries of the
matrix, but such a policy is not necessarily a clear win in either
storage or work. The difficulty is that sparse data structures include
more overhead (to store indices as well as numerical values of nonzero
matrix entries) than the simple arrays used for dense matrices, and
arithmetic operations on the data stored in them usually cannot be
performed as rapidly either (due to indirect addressing of operands).
There is therefore a tradeoff in memory requirements between sparse and
dense representations and a tradeoff in performance between the
algorithms that use them. For this reason, a practical requirement for
a family of matrices to be `usefully' sparse is that they have only
nonzero entries, i.e. a (small) constant number of nonzeros
per row or column, independent of the matrix dimension. For example,
most matrices arising from finite difference or finite element
discretizations of PDEs satisfy this condition. In addition to the
number of nonzeros, their particular locations, or pattern, in the
matrix also has a major effect on how well sparsity can be exploited.
Sparsity arising from physical problems usually exhibits some
systematic pattern that can be exploited effectively, whereas the same
number of nonzeros located randomly might offer relatively little
advantage.