SD is one of the oldest and simplest methods. It is more important as a theoretical reference against which to test other methods than as a practical tool. However, `steepest descent' steps are often incorporated into other methods (e.g., Conjugate Gradient, Newton) when roundoff destroys some desirable theoretical properties, progress is slow, or regions of indefinite curvature are encountered.
At each iteration of SD, the search direction is taken as $p_k = -\nabla f(x_k)$, the negative gradient of the objective function at $x_k$. Recall that a descent direction $p_k$ satisfies $\nabla f(x_k)^T p_k < 0$. The simplest way to guarantee the negativity of this inner product is to choose $p_k = -\nabla f(x_k)$. This choice also minimizes the inner product $\nabla f(x_k)^T p_k$ over unit-length vectors and thus gives rise to the name Steepest Descent.
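As a minimal sketch of the iteration described above, the following Python/NumPy fragment applies a fixed-step steepest descent update to a simple quadratic objective; the function and parameter names, the fixed step size, and the test objective are illustrative assumptions, not part of the original text (a practical implementation would use a line search to choose the step).

```python
import numpy as np

def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=1000):
    """Fixed-step steepest descent: x_{k+1} = x_k - step * grad f(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        # Stop once the gradient is (numerically) zero.
        if np.linalg.norm(g) < tol:
            break
        # Move along the search direction p_k = -grad f(x_k).
        x = x - step * g
    return x

# Illustrative objective: f(x) = 0.5 * x^T A x with A = diag(1, 4),
# whose gradient is A @ x and whose minimizer is the origin.
A = np.diag([1.0, 4.0])
x_min = steepest_descent(lambda x: A @ x, x0=[2.0, 1.0], step=0.2)
```

Even on this well-conditioned quadratic the iterates only approach the minimizer geometrically; the slow zig-zagging of SD on ill-conditioned problems is the main reason it serves as a baseline rather than a method of choice.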