We note that many variations and hybrid implementations
that lie somewhere between pure column-Cholesky and pure
submatrix-Cholesky are possible. Perhaps the most important of these are the
multifrontal methods (see, e.g., [55]), in which
updating operations are accumulated in and propagated through a series
of *front matrices* until finally being incorporated into the
ultimate target columns. Multifrontal methods have a number of
advantages, most of which accrue from the localization of
memory references in the front matrices, thereby facilitating the
effective use of memory hierarchies, including cache, virtual memory
with paging, and explicit out-of-core solution techniques (the latter
was the original motivation for these methods [58]).
In addition,
since the front matrices are essentially dense, the operations on them
can be done using optimized kernels, such as the BLAS, to take
advantage of vectorization or any other available architectural
features. For example, such techniques have been used to attain very
high performance for sparse factorization on conventional vector
supercomputers [59]
and on RISC workstations [60].
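The core mechanism described above, accumulating updates in a dense front matrix and propagating a Schur complement onward, can be sketched as follows. This is a minimal illustration in Python/NumPy under simplifying assumptions (one front, pivots taken in order, no assembly-tree bookkeeping); the function name and interface are hypothetical, not taken from the cited papers:

```python
import numpy as np

def partial_factor_front(F, npiv):
    """Factor the first npiv pivots of a dense frontal matrix F
    (assumed symmetric positive definite on its pivot block).

    Returns (L_cols, update): the computed columns of the Cholesky
    factor, and the Schur-complement update matrix that a multifrontal
    method would propagate to the parent front.  Hypothetical sketch,
    not an implementation from [55] or [58].
    """
    F = F.astype(float).copy()
    for k in range(npiv):
        F[k, k] = np.sqrt(F[k, k])          # pivot
        F[k+1:, k] /= F[k, k]               # scale the pivot column
        # Rank-1 update of the trailing block: a dense operation that
        # maps directly onto optimized BLAS kernels.
        F[k+1:, k+1:] -= np.outer(F[k+1:, k], F[k+1:, k])
    L_cols = np.tril(F[:, :npiv])           # finished factor columns
    update = F[npiv:, npiv:]                # Schur complement for the parent
    return L_cols, update
```

Because the rank-1 (or, in blocked variants, rank-k) updates touch only the dense front, the memory-reference locality and BLAS-friendliness noted above fall out naturally from this structure.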