The convergence properties of an algorithm are described by two analytic quantities: the convergence order and the convergence ratio. A sequence $\{x_k\}$ is said to converge to $x^*$ if the following holds: $\lim_{k \to \infty} \|x_k - x^*\| = 0$. The sequence is said to converge to $x^*$ with order $p$ if $p$ is the largest nonnegative number for which a finite limit $\beta \geq 0$ exists, where
$$\beta = \lim_{k \to \infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^p}.$$
When $p = 1$ and $0 < \beta < 1$, the sequence is said to converge linearly (e.g., $x_k = \gamma^k$ for $0 < \gamma < 1$); when $p = 1$ and $\beta = 0$, the sequence converges superlinearly (e.g., $x_k = k^{-k}$); and when $p = 2$, the convergence is quadratic (e.g., $x_k = 2^{-2^k}$). Thus quadratic convergence is more rapid than superlinear convergence, which in turn is faster than linear convergence. The constant $\beta$ is the associated convergence ratio.
(See exercise 4.)
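These definitions can be checked numerically by computing the successive ratios $\|x_{k+1} - x^*\|/\|x_k - x^*\|^p$ for sequences converging to $x^* = 0$. The sketch below uses the illustrative sequences $2^{-k}$ (linear, $\beta = \tfrac{1}{2}$), $k^{-k}$ (superlinear, $\beta = 0$), and $2^{-2^k}$ (quadratic); the helper `ratios` is not from the text.

```python
def ratios(seq, p, n=8):
    """Successive ratios x_{k+1} / x_k^p for a positive sequence
    converging to 0 (so the error at step k is just x_k)."""
    xs = [seq(k) for k in range(1, n + 2)]
    return [xs[k + 1] / xs[k] ** p for k in range(n)]

# Linear (p = 1): x_k = 2^{-k}; every ratio equals beta = 1/2.
lin = ratios(lambda k: 2.0 ** -k, p=1)

# Superlinear (p = 1): x_k = k^{-k}; the ratios tend to beta = 0.
sup = ratios(lambda k: float(k) ** -k, p=1)

# Quadratic (p = 2): x_k = 2^{-2^k}; every ratio equals 1.
quad = ratios(lambda k: 2.0 ** -(2 ** k), p=2)
```

Printing the three lists shows the qualitative difference: the linear ratios stay fixed at $\tfrac{1}{2}$, the superlinear ratios shrink toward zero, and the quadratic ratios (with $p = 2$ in the denominator's exponent) stay bounded.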
Convergence properties of most minimization algorithms are analyzed through their application to convex quadratic functions. Such functions can be written in the form of equation (3), where $Q$ is a positive-definite matrix. We refer to this convex quadratic function throughout this chapter by $q$. For such a function, the unique global minimizer $x^*$ satisfies the linear system $Q x^* = b$. Since a general function can be approximated by a convex quadratic function in the neighborhood of a local minimum, the convergence properties obtained for convex quadratic functions usually apply locally to general functions. However, such generalizations do not guarantee good behavior in practice on complex, large-scale problems.
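As a small sketch of this fact, assume the standard convex quadratic form $q(x) = \tfrac{1}{2} x^T Q x - b^T x$ (the particular $Q$ and $b$ below are made-up illustrative values, not from the text). Setting the gradient $Qx - b$ to zero reduces minimization to one linear solve:

```python
import numpy as np

# Illustrative positive-definite Q and vector b (assumed values).
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def q(x):
    """Convex quadratic q(x) = (1/2) x^T Q x - b^T x."""
    return 0.5 * x @ Q @ x - b @ x

# The unique global minimizer solves the linear system Q x* = b.
x_star = np.linalg.solve(Q, b)

# The gradient Q x - b vanishes at x*, and q increases away from it.
grad = Q @ x_star - b
```

Evaluating `q` at `x_star` and at any perturbed point confirms that the solution of the linear system is indeed the minimizer.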