Next: The steepest-descent method Up: Globalizing Newton's method: Descent Previous: Introduction

Descent Directions

The reader will recall that $p\in{\bf {\rm R}}^n$ is a descent direction for $f$ at $x^{(k)}$ if

\begin{displaymath}\nabla f(x^{(k)})\cdot p<0.
\end{displaymath}

This condition implies that

\begin{displaymath}f(x^{(k)}+\alpha p)<f(x^{(k)})\ \mbox{for all}\ \alpha>0\ \mbox{sufficiently small}.
\end{displaymath}

Indeed, if $\phi$ represents the one-dimensional ``slice'' of $f$ in the direction of $p$,

\begin{displaymath}\phi(\alpha)=f(x^{(k)}+\alpha p)
\end{displaymath}

then the descent condition implies that $\phi'(0)<0$ and hence that $\phi(\alpha)<\phi(0)$ for all $\alpha>0$ sufficiently small (see Figure 1).
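
The step from the descent condition to $\phi'(0)<0$ can be spelled out with the chain rule and a first-order Taylor expansion. This is only a sketch, and it assumes that $f$ is continuously differentiable near $x^{(k)}$:

\begin{displaymath}\phi'(\alpha)=\nabla f(x^{(k)}+\alpha p)\cdot p,\qquad \phi'(0)=\nabla f(x^{(k)})\cdot p<0,
\end{displaymath}

\begin{displaymath}\phi(\alpha)=\phi(0)+\alpha\phi'(0)+o(\alpha)\ \mbox{as}\ \alpha\rightarrow 0^+.
\end{displaymath}

Since $\phi'(0)<0$, the linear term $\alpha\phi'(0)$ dominates the remainder when $\alpha>0$ is small enough, which gives $\phi(\alpha)<\phi(0)$.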
  
Figure 1: The function $\phi(\alpha)=f(x^{(k)}+\alpha p)$, where $p$ is a descent direction.
\includegraphics[height=2in,width=3in]{{neg.eps}}

Given $x^{(k)}$ and a descent direction $p$, it is possible to reduce $f$ by moving in the direction of $p$, that is, by choosing an appropriate $\alpha>0$ and defining $x^{(k+1)}=x^{(k)}+\alpha p$. A procedure for choosing $\alpha$ is referred to as a line search (since $x^{(k+1)}$ is found on the (half-)line parametrized as $x^{(k)}+\alpha p$). I will discuss line searches (a solution to the third difficulty described in the Introduction) later. For now, I want to concentrate on methods for producing descent directions.
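
To see why $\alpha$ must be chosen with some care, the following small Python sketch evaluates $f(x^{(k)}+\alpha p)-f(x^{(k)})$ for a few step lengths. The test function $f(x_1,x_2)=x_1^2+10x_2^2$, the point $x^{(k)}=(1,1)$, the choice $p=-\nabla f(x^{(k)})$, and all helper names are assumptions made for illustration only; they do not come from the text.

```python
def f(x):
    # Hypothetical test function: f(x1, x2) = x1^2 + 10*x2^2
    return x[0] ** 2 + 10.0 * x[1] ** 2

def grad_f(x):
    # Gradient of the test function above
    return [2.0 * x[0], 20.0 * x[1]]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def step(x, alpha, p):
    # Returns x + alpha * p
    return [xi + alpha * pi for xi, pi in zip(x, p)]

x_k = [1.0, 1.0]
g = grad_f(x_k)
p = [-gi for gi in g]  # negative gradient, one possible descent direction

# The descent condition: grad f(x_k) . p < 0
print("slope phi'(0) =", dot(g, p))

# Small steps reduce f; a step that is too long increases it.
for alpha in (0.01, 0.05, 0.5):
    change = f(step(x_k, alpha, p)) - f(x_k)
    print("alpha =", alpha, " f(x + alpha p) - f(x) =", change)
```

In this example the two shorter steps decrease $f$, while $\alpha=0.5$ overshoots and increases it, which is the situation a line search is meant to avoid.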

Newton's method produces the direction

\begin{displaymath}p=-\nabla^2f(x^{(k)})^{-1}\nabla f(x^{(k)}).
\end{displaymath}

This is a descent direction if

\begin{displaymath}\nabla f(x^{(k)})\cdot p<0,
\end{displaymath}

that is, if

\begin{displaymath}-\nabla f(x^{(k)})\cdot\nabla^2f(x^{(k)})^{-1}\nabla f(x^{(k)})<0,
\end{displaymath}

that is, if

 \begin{displaymath}
\nabla f(x^{(k)})\cdot\nabla^2f(x^{(k)})^{-1}\nabla f(x^{(k)})>0.
\end{displaymath} (1)

Condition (1) will hold if $\nabla^2f(x^{(k)})^{-1}$ is positive definite (and $\nabla f(x^{(k)})\neq 0$). The reader may recall that, for a nonsingular symmetric matrix $H$, the eigenvalues of $H^{-1}$ are simply the reciprocals of the eigenvalues of $H$, and therefore $\nabla^2f(x^{(k)})^{-1}$ is positive definite if and only if $\nabla^2f(x^{(k)})$ is positive definite. Of course, if $\nabla^2f(x^{(k)})$ is positive definite, then it is nonsingular, and in this case the first two difficulties mentioned in the Introduction disappear. If $\nabla^2f(x^{(k)})$ is positive definite, then the Newton step is well-defined and represents a descent direction.
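
The role of positive definiteness can be checked numerically. The sketch below computes the Newton direction for two hypothetical $2\times 2$ cases: a positive definite Hessian (from $f(x_1,x_2)=x_1^2+10x_2^2$ at $(1,1)$) and an indefinite one (from $f(x_1,x_2)=x_1^2-x_2^2$ at $(1,2)$). The functions, points, and helper names are assumptions chosen for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def newton_direction(H, g):
    """Solve H p = -g for a 2x2 matrix H using the explicit inverse."""
    (a, b), (c, d) = H
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("Hessian is singular; the Newton step is undefined")
    return [-(d * g[0] - b * g[1]) / det,
            -(-c * g[0] + a * g[1]) / det]

def is_positive_definite(H):
    # Sylvester's criterion for a symmetric 2x2 matrix
    (a, b), (c, d) = H
    return a > 0 and a * d - b * c > 0

# Case 1: f = x1^2 + 10*x2^2 at (1, 1); Hessian is positive definite.
H_pd, g_pd = [[2.0, 0.0], [0.0, 20.0]], [2.0, 20.0]
p_pd = newton_direction(H_pd, g_pd)

# Case 2: f = x1^2 - x2^2 at (1, 2); Hessian is indefinite.
H_ind, g_ind = [[2.0, 0.0], [0.0, -2.0]], [2.0, -4.0]
p_ind = newton_direction(H_ind, g_ind)

print(is_positive_definite(H_pd), dot(g_pd, p_pd))     # slope is negative
print(is_positive_definite(H_ind), dot(g_ind, p_ind))  # slope is positive
```

In the second case the Newton step is well-defined but points uphill, so nonsingularity of the Hessian alone does not guarantee a descent direction.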

The following observation is essential: If $H$ is any symmetric positive definite matrix, then $p=-H^{-1}\nabla f(x^{(k)})$ is a descent direction for $f$ at $x^{(k)}$ (provided $\nabla f(x^{(k)})\neq 0$). This suggests the following modification of Newton's method:

 \begin{displaymath}
x^{(k+1)}=x^{(k)}-\alpha_kH_k^{-1}\nabla f(x^{(k)}),
\end{displaymath} (2)

where $H_k$ is a positive definite approximation of $\nabla^2f(x^{(k)})$ and $\alpha_k$ is a step-length parameter that is chosen by a line search. An iteration of the form (2) is referred to as a quasi-Newton iteration. In order for the rapid local convergence of Newton's method to be preserved, $H_k$ should be a good approximation to (or equal to) $\nabla^2f(x^{(k)})$ when $x^{(k)}$ is close to $x^*$.
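
A minimal Python sketch of iteration (2) in two variables follows. It is not one of the specific methods described later: the choice of $H_k$ (the Hessian with its first diagonal entry raised to at least $0.5$), the backtracking rule used in place of a proper line search, the test function $f(x_1,x_2)=x_1^4/4-x_1^2/2+x_2^2$, and every function name are assumptions made only to show the structure of the iteration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def solve2(H, b):
    # Solve the 2x2 system H y = b using the explicit inverse
    (a, b12), (c, d) = H
    det = a * d - b12 * c
    return [(d * b[0] - b12 * b[1]) / det, (-c * b[0] + a * b[1]) / det]

def quasi_newton(f, grad, choose_H, x, iters=50, tol=1e-8):
    for _ in range(iters):
        g = grad(x)
        if dot(g, g) ** 0.5 < tol:
            break
        p = [-s for s in solve2(choose_H(x), g)]  # p = -H_k^{-1} grad f
        fx, slope, alpha = f(x), dot(g, p), 1.0
        # Assumed stand-in for a line search: halve alpha until f
        # decreases sufficiently (with a floor to guarantee termination).
        while alpha > 1e-12 and \
                f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * alpha * slope:
            alpha *= 0.5
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
    return x

# Hypothetical test problem: the Hessian is indefinite for |x1| < 1/sqrt(3).
def f(x):
    return 0.25 * x[0] ** 4 - 0.5 * x[0] ** 2 + x[1] ** 2

def grad(x):
    return [x[0] ** 3 - x[0], 2.0 * x[1]]

def choose_H(x):
    # Exact Hessian where it is safely positive definite,
    # otherwise a positive definite replacement.
    return [[max(3.0 * x[0] ** 2 - 1.0, 0.5), 0.0], [0.0, 2.0]]

x_star = quasi_newton(f, grad, choose_H, [0.1, 1.0])
print(x_star)
```

Because this $H_k$ coincides with the exact Hessian near the minimizer $(1,0)$, the iteration behaves like Newton's method once it gets close, while the modification keeps every step a descent direction when the starting point lies in the region where the Hessian is indefinite.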

I will now describe two specific quasi-Newton methods.


Mark S. Gockenbach
2003-01-30