The sequential versions contain no array operations (except for the initialization of G) and are characterized by the familiar scalar do-loops over the matrix. The data-parallel replacements for these loops immediately follow so that the sequential and parallel versions can be conveniently compared. Some liberty (though not much) has been taken with the presentation of these examples in an attempt to make these comparisons easy and most useful. Also in order to facilitate these comparisons, the entire matrix is reduced for each pivot, rather than just those columns needing reduction, so that each version of each algorithm involves about twice as many (scalar element) operations as are really necessary; with some loss of clarity the algorithms can easily be adjusted to limit the number of operations accordingly.

An analysis of these algorithms shows that the sequential version of Simple_Gauss has about scalar operations, and the parallel version has about scalar operations in 10 parallel operations. The sequential version of Pivot_Gauss has about scalar operations, and the parallel version has about scalar operations in parallel operations. For reasonably large values of N, these results are summarized in Table 6. The last column of the table (idealistically) assumes that a data-parallel operation takes the same time as a scalar operation.

Table 6: Comparison of Gaussian Elimination Algorithms.