Symmetric matrices have many special properties, the most important of
which are expressed in the following theorem:
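Theorem 2.1
Suppose $A \in \mathbf{R}^{n \times n}$ is symmetric. Then
1. every eigenvalue $\lambda$ of $A$ is a real number and there exists a (real) eigenvector $u \in \mathbf{R}^n$ corresponding to $\lambda$: $Au = \lambda u$;
2. eigenvectors corresponding to distinct eigenvalues are necessarily orthogonal:
\[ Au^{(1)} = \lambda_1 u^{(1)},\quad Au^{(2)} = \lambda_2 u^{(2)},\quad \lambda_1 \neq \lambda_2 \quad\Longrightarrow\quad u^{(1)} \cdot u^{(2)} = 0; \]
3. there exists a diagonal matrix $D \in \mathbf{R}^{n \times n}$ and an orthogonal matrix $U \in \mathbf{R}^{n \times n}$ such that $A = UDU^T$. The diagonal entries of $D$ are the eigenvalues of $A$ and the columns of $U$ are the corresponding eigenvectors:
\[ D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n),\quad U = [\,u^{(1)} \,|\, \cdots \,|\, u^{(n)}\,],\quad Au^{(i)} = \lambda_i u^{(i)},\ i = 1, \ldots, n. \]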
An orthogonal matrix $U$ satisfies, by definition, $U^T = U^{-1}$, which
means that the columns of $U$ are orthonormal (that is, any two of them
are orthogonal and each has norm one). The expression $A = UDU^T$ of a
symmetric matrix in terms of its eigenvalues and eigenvectors is referred to
as the spectral decomposition of $A$.
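For a 2 x 2 symmetric matrix the decomposition can be written in closed form, which gives a quick way to check the theorem numerically. The following is a minimal sketch in plain Python, not part of the original notes. The function names `spectral_decomposition_2x2` and `reconstruct`, and the choice of a rotation angle $\theta = \frac{1}{2}\,\mathrm{atan2}(2b,\, a - c)$ to build $U$, are assumptions made here for illustration.

```python
import math

def spectral_decomposition_2x2(a, b, c):
    """Decompose the symmetric matrix [[a, b], [b, c]] as U D U^T.

    Returns (lam1, lam2, u1, u2), where u1 and u2 are the orthonormal
    columns of U and lam1, lam2 are the matching diagonal entries of D.
    """
    # Rotation angle that zeroes the off-diagonal entry of U^T A U.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    u1 = (cs, sn)
    u2 = (-sn, cs)
    # Each eigenvalue is u . (A u) for the corresponding unit eigenvector.
    lam1 = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return lam1, lam2, u1, u2

def reconstruct(lam1, lam2, u1, u2):
    """Form U D U^T = lam1 * u1 u1^T + lam2 * u2 u2^T as nested lists."""
    return [[lam1 * u1[i] * u1[j] + lam2 * u2[i] * u2[j] for j in range(2)]
            for i in range(2)]

# Illustrative matrix A = [[2, 1], [1, 3]] (chosen here, not from the notes).
lam1, lam2, u1, u2 = spectral_decomposition_2x2(2.0, 1.0, 3.0)
A_rebuilt = reconstruct(lam1, lam2, u1, u2)
```

Up to rounding, `A_rebuilt` should agree with the original entries, and `u1`, `u2` should be orthogonal unit vectors. For larger matrices a library eigenvalue routine would be the practical choice.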
The spectral theorem implies that there is a change of variables which
transforms A into a diagonal matrix. Before explaining this change of
variables, I will show why it is important. The reader will recall that
every (homogeneous) quadratic function in the $n$ variables $x_1, x_2, \ldots, x_n$ can
be expressed in the form
\[ q(x) = x \cdot Hx = \sum_{i=1}^{n} \sum_{j=1}^{n} H_{ij} x_i x_j. \]
The formula for $q(x)$ involves $n^2$ terms, and the variables are typically
coupled. However, if $H$ happens to be a diagonal matrix, then the formula
for $q(x)$ simplifies considerably:
\[ q(x) = \sum_{i=1}^{n} H_{ii} x_i^2. \]
Such a quadratic is easy to understand: in each coordinate direction $x_i$,
the graph is a parabola, opening upward if $H_{ii} > 0$ and opening downward
if $H_{ii} < 0$. There is also the degenerate case $H_{ii} = 0$, in which case
$q$ is constant with respect to $x_i$ and the graph in that direction is
a horizontal line.
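As a small worked instance, take $n = 2$ and assume the form $q(x) = x \cdot Hx$, with $H_{ij}$ denoting the entries of $H$. Writing out all $n^2 = 4$ terms shows the coupling between $x_1$ and $x_2$, and how it disappears when the off-diagonal entries vanish:

```latex
\[
q(x) = H_{11} x_1^2 + H_{12} x_1 x_2 + H_{21} x_2 x_1 + H_{22} x_2^2 ,
\]
\[
H_{12} = H_{21} = 0 \quad\Longrightarrow\quad q(x) = H_{11} x_1^2 + H_{22} x_2^2 .
\]
```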
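Therefore, in two variables (the only case that can be visualized), a
quadratic function defined by
\[ H = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \]
has six possible shapes, corresponding to the following cases:
1. $\lambda_1 > 0$, $\lambda_2 > 0$;
2. $\lambda_1 < 0$, $\lambda_2 < 0$;
3. $\lambda_1 > 0$, $\lambda_2 < 0$ or $\lambda_1 < 0$, $\lambda_2 > 0$;
4. $\lambda_1 > 0$, $\lambda_2 = 0$ or $\lambda_1 = 0$, $\lambda_2 > 0$;
5. $\lambda_1 < 0$, $\lambda_2 = 0$ or $\lambda_1 = 0$, $\lambda_2 < 0$;
6. $\lambda_1 = 0$, $\lambda_2 = 0$.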
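The six cases depend only on the signs of the two diagonal entries, so they can be told apart mechanically. The sketch below is one way to do that in plain Python. The function name `classify_diagonal_quadratic`, the `tol` parameter, and the case numbering (which follows the order two positive, two negative, mixed signs, positive and zero, negative and zero, both zero) are assumptions made here for illustration.

```python
def classify_diagonal_quadratic(lam1, lam2, tol=0.0):
    """Classify the graph of q(x) = lam1*x1**2 + lam2*x2**2.

    Returns (case_number, description). Entries with absolute value
    <= tol count as zero; tol is a hypothetical convenience for
    floating-point input.
    """
    positives = sum(1 for lam in (lam1, lam2) if lam > tol)
    negatives = sum(1 for lam in (lam1, lam2) if lam < -tol)

    if positives == 2:
        return 1, "opens upward in both coordinate directions"
    if negatives == 2:
        return 2, "opens downward in both coordinate directions"
    if positives == 1 and negatives == 1:
        return 3, "opens upward in one direction and downward in the other"
    if positives == 1:
        return 4, "opens upward in one direction and is constant in the other"
    if negatives == 1:
        return 5, "opens downward in one direction and is constant in the other"
    return 6, "is constant in both directions"

case, description = classify_diagonal_quadratic(4.0, -2.0)
```

For the sample call, the entries have opposite signs, so the function reports the mixed case.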
Four of the possibilities are graphed in Figure 1.
Figure 1: The graphs of four quadratic functions: two positive
eigenvalues (upper left), two negative eigenvalues (upper right),
one positive and one negative eigenvalue (lower left),
one positive and one zero eigenvalue (lower right).
Now I will explain the change of variables that diagonalizes a
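symmetric matrix. A vector $x \in \mathbf{R}^n$
is implicitly expressed in terms of the standard basis
$\{e^{(1)}, e^{(2)}, \ldots, e^{(n)}\}$:
\[ x = x_1 e^{(1)} + x_2 e^{(2)} + \cdots + x_n e^{(n)}, \]
where $e^{(i)}$ is the vector whose $i$th component is one and whose other components are zero.
If $\{u^{(1)}, u^{(2)}, \ldots, u^{(n)}\}$ is an orthonormal set in $\mathbf{R}^n$, then it is
an alternate basis: every $x \in \mathbf{R}^n$ can be expressed as
\[ x = y_1 u^{(1)} + y_2 u^{(2)} + \cdots + y_n u^{(n)}. \]
Moreover, the coefficients $y_1, y_2, \ldots, y_n$ are easy to
compute:
\[ y_i = u^{(i)} \cdot x, \quad i = 1, 2, \ldots, n. \]
When the orthonormal basis vectors form the columns of a matrix
$U = [\,u^{(1)} \,|\, u^{(2)} \,|\, \cdots \,|\, u^{(n)}\,]$,
the computation of the coefficients
takes the form of a matrix-vector product:
\[ y = U^T x. \]
The key point here is that the numbers $y_1, y_2, \ldots, y_n$ can be
thought of as new variables representing the vector $x$. Specifically,
$x_1, x_2, \ldots, x_n$ represent $x$ in the standard basis
$\{e^{(1)}, e^{(2)}, \ldots, e^{(n)}\}$,
while $y_1, y_2, \ldots, y_n$ represent $x$ in the alternate basis
$\{u^{(1)}, u^{(2)}, \ldots, u^{(n)}\}$.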
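The passage from standard coordinates to coordinates in an orthonormal basis, and back again, can be sketched in a few lines of plain Python. Each new coefficient is the dot product of $x$ with one basis vector, and $x$ is recovered as the corresponding linear combination. The helper names `to_alternate_basis` and `from_alternate_basis`, the sample basis, and the sample vector are illustrative assumptions, not taken from the notes.

```python
import math

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def to_alternate_basis(basis, x):
    """Coefficients of x in an orthonormal basis: one dot product per basis vector."""
    return [dot(u, x) for u in basis]

def from_alternate_basis(basis, y):
    """Rebuild x as the linear combination of the basis vectors with weights y."""
    n = len(basis[0])
    return [sum(y_i * u[k] for y_i, u in zip(y, basis)) for k in range(n)]

s = 1.0 / math.sqrt(2.0)
basis = [(s, s), (-s, s)]     # an orthonormal basis of R^2 (illustrative)
x = [3.0, 1.0]

y = to_alternate_basis(basis, x)          # same result as U^T x
x_again = from_alternate_basis(basis, y)  # same result as U y
```

Because the basis is orthonormal, the round trip should return the original vector up to rounding, and the Euclidean length of `y` should equal that of `x`.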
I now digress to remind the reader of the following fundamental property
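of matrices, vectors, and the dot product: If
$A \in \mathbf{R}^{m \times n}$, $v \in \mathbf{R}^n$, and $w \in \mathbf{R}^m$,
then
\[ (Av) \cdot w = v \cdot (A^T w). \]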
This is really the reason that the transpose of a matrix is important.
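The property can be checked by writing both sides in components and exchanging the order of summation. In the sketch below, the symbols $A \in \mathbf{R}^{m \times n}$, $v \in \mathbf{R}^n$, and $w \in \mathbf{R}^m$ are names assumed for illustration:

```latex
\[
(Av) \cdot w
  = \sum_{i=1}^{m} \Bigl( \sum_{j=1}^{n} A_{ij} v_j \Bigr) w_i
  = \sum_{j=1}^{n} v_j \Bigl( \sum_{i=1}^{m} A_{ij} w_i \Bigr)
  = \sum_{j=1}^{n} v_j \, (A^T w)_j
  = v \cdot (A^T w).
\]
```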
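Assuming $H \in \mathbf{R}^{n \times n}$
is symmetric, it has a spectral decomposition
$H = UDU^T$. Therefore,
\[ q(x) = x \cdot Hx = x \cdot (UDU^T x) = (U^T x) \cdot (DU^T x) = y \cdot Dy = \sum_{i=1}^{n} \lambda_i y_i^2, \]
where I have applied the change of variables $y = U^T x$ and $\lambda_1, \ldots, \lambda_n$ are the diagonal entries of $D$.
Thus the quadratic $q(x) = x \cdot Hx$ is a simple decoupled quadratic when expressed
in terms of the alternate basis $\{u^{(1)}, u^{(2)}, \ldots, u^{(n)}\}$ formed by the columns of $U$.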
Since every symmetric matrix has a spectral decomposition, this means that
every quadratic function $q(x) = x \cdot Hx$ (with $H$ symmetric) can be expressed as a simple
decoupled quadratic, provided the correct coordinate system is chosen.
In particular, this shows that the graph of every quadratic in two variables
looks like one of the graphs in Figure 1 (or like one of the two
other possibilities not illustrated in that figure), possibly rotated
from the standard coordinates.
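Example 2.2
Define $q : \mathbf{R}^2 \to \mathbf{R}$ by $q(x) = x_1^2 + 6x_1x_2 + x_2^2$.
Then $q(x) = x \cdot Hx$, where
\[ H = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}. \]
The spectral decomposition of $H$ is $H = UDU^T$, where
\[ U = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}, \quad
   D = \begin{bmatrix} 4 & 0 \\ 0 & -2 \end{bmatrix}. \]
The vectors
\[ u^{(1)} = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \quad
   u^{(2)} = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \]
define the coordinate system illustrated in Figure 2.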
Figure 2: Standard coordinates and a rotated coordinate system.
The graph of $q$, which is shown in Figure 3, is now predictable:
it curves up in the direction of $u^{(1)}$ and down in the direction of $u^{(2)}$.
Figure 3: The function $q(x) = x_1^2 + 6x_1x_2 + x_2^2$.
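The decoupling in this example can be checked numerically. The sketch below, in plain Python, compares the quadratic in standard coordinates with a decoupled form in rotated coordinates. The eigenvalues 4 and -2 and the unit eigenvectors $(1, 1)/\sqrt{2}$ and $(-1, 1)/\sqrt{2}$ of the matrix with rows (1, 3) and (3, 1) are computed here and should be read as assumptions of this sketch. The sign of each eigenvector is a free choice, and the function names and sample points are illustrative.

```python
import math

def q(x1, x2):
    """The quadratic of the example, in standard coordinates."""
    return x1 ** 2 + 6.0 * x1 * x2 + x2 ** 2

# Assumed orthonormal eigenvectors of H = [[1, 3], [3, 1]]:
# u1 belongs to the eigenvalue 4, u2 to the eigenvalue -2.
s = 1.0 / math.sqrt(2.0)
u1 = (s, s)
u2 = (-s, s)

def q_decoupled(x1, x2):
    """The same quadratic written in the rotated coordinates y = U^T x."""
    y1 = u1[0] * x1 + u1[1] * x2
    y2 = u2[0] * x1 + u2[1] * x2
    return 4.0 * y1 ** 2 - 2.0 * y2 ** 2

samples = [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0), (-1.5, 0.25)]
max_gap = max(abs(q(a, b) - q_decoupled(a, b)) for a, b in samples)
```

The two forms should agree at every sample point up to rounding. Evaluating `q` at `u1` and at `u2` gives a positive and a negative value respectively, which matches the upward and downward curvature described in the example.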
Mark S. Gockenbach
2003-01-22