THE LANCZOS AND CONJUGATE GRADIENT ALGORITHMS
From Theory to Finite Precision Computations
Gérard MEURANT
SIAM
2006
Software, Environments, and Tools
ISBN-13: 978-0-89871-616-0
ISBN-10: 0-89871-616-0
365 pages


This book is intended for applied mathematicians, computational scientists, engineers,
and physicists who have an interest in linear algebra, numerical analysis,
and partial differential equations. It will be of interest to engineers and scientists
using the Lanczos algorithm to compute eigenvalues and the CG algorithm to solve linear systems,
and to researchers in Krylov subspace methods for symmetric matrices, especially those concerned
with floating point error analysis.
The Lanczos algorithm is one of the most frequently used numerical
methods for computing a few eigenvalues (and possibly
eigenvectors) of a large sparse symmetric matrix A. If
the extreme eigenvalues of A are well separated, the Lanczos
algorithm usually delivers good approximations to these
eigenvalues in a few iterations. One attraction of this method is
that it needs the matrix A, whose eigenvalues are sought, only in
the form of matrix-vector products. Hence, it can be used even in
some applications for which the matrix cannot be stored in the
main memory of the computer, as long as one is able to produce the
result of multiplying the matrix A by a given vector. Another useful
property is that when one needs only the eigenvalues, the Lanczos
algorithm requires the storage of just a few vectors,
besides possibly storing the matrix. At iteration k the
algorithm constructs an orthonormal basis of a Krylov
subspace of dimension k,
whose basis vectors are the columns of an orthogonal matrix
Vk, together with a tridiagonal matrix Tk whose eigenvalues are
approximations of the eigenvalues of A.
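The iteration just described can be sketched in a few lines. The following is our own minimal NumPy version, not code from the book: k steps of the three-term recurrence produce the basis matrix Vk and the tridiagonal matrix Tk whose eigenvalues (the Ritz values) approximate eigenvalues of A.

```python
# Minimal sketch of the Lanczos iteration (our own illustration, not the
# book's code): k steps build a basis V (n x k) and a tridiagonal T (k x k)
# whose eigenvalues approximate eigenvalues of A.
import numpy as np

def lanczos(A, v0, k):
    """Run k Lanczos steps; return the basis V and the tridiagonal T."""
    n = A.shape[0]
    V = np.zeros((n, k))
    alpha = np.zeros(k)        # diagonal of T
    beta = np.zeros(k - 1)     # off-diagonal of T
    v = v0 / np.linalg.norm(v0)
    v_prev = np.zeros(n)
    b = 0.0
    for j in range(k):
        V[:, j] = v
        w = A @ v - b * v_prev             # three-term recurrence
        alpha[j] = v @ w
        w -= alpha[j] * v
        b = np.linalg.norm(w)
        if j < k - 1:
            beta[j] = b
            v_prev, v = v, w / b
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T

# Demo: well-separated extreme eigenvalues are found in few iterations.
rng = np.random.default_rng(0)
eigs = np.concatenate(([0.1], np.linspace(1.0, 2.0, 58), [10.0]))
A = np.diag(eigs)                          # eigenvalues known exactly
V, T = lanczos(A, rng.standard_normal(60), 15)
ritz = np.linalg.eigvalsh(T)               # Ritz values, sorted ascending
```

After only 15 of the 60 possible steps, the extreme Ritz values already agree with the extreme eigenvalues 0.1 and 10 to many digits, illustrating the fast convergence when the extreme eigenvalues are well separated.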
However, it has been known
since the introduction of the method in 1950 by Cornelius Lanczos
that when used in finite precision arithmetic
this algorithm does not fulfill its theoretical properties. In
particular, the computed basis vectors which are mathematically
orthogonal lose their orthogonality in practical
computations. Moreover, additional copies of eigenvalues that have
already converged reappear numerically among the computed
approximate eigenvalues as the iterations continue. This delays
the computation of the
eigenvalues which have not yet converged. In fact, in some cases,
to obtain all the distinct eigenvalues of A, one has to do many
more iterations than the order n of the matrix. It is because of
these problems in finite precision arithmetic that the
algorithm was dismissed or ignored for some time until some
people, mainly Chris Paige in the seventies, showed that despite
its departure from theory, the algorithm can be used efficiently
for computing eigenvalues and eigenvectors, especially of large
and sparse matrices. Therefore, it is important to understand why
the problems mentioned above occur and to obtain as clear a view
as possible of the behavior of the Lanczos algorithm
in finite precision arithmetic. To approach this goal, we
shall consider the problem from different points of view. We shall
describe the matrix point of view that has been used by Paige to
derive elegant matrix relations describing the propagation of
rounding errors in the algorithm. But, we shall also consider the
projections of the Lanczos vectors on the eigenvectors of A and
derive solutions for their recurrences. Even though this may seem
more difficult and less fruitful than the matrix approach, we
shall obtain some additional insights on the behavior of the
algorithm in finite precision arithmetic.
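The loss of orthogonality described above is easy to observe numerically. Below is a small experiment of our own (not from the book): run plain Lanczos without reorthogonalization and watch the largest deviation from orthonormality among the computed vectors grow from machine-precision level to order one.

```python
# Our own small experiment (not the book's code): plain Lanczos without
# reorthogonalization; the computed basis vectors lose orthogonality.
import numpy as np

def lanczos_basis(A, v0, k):
    """Return the n x k matrix of computed Lanczos vectors."""
    n = A.shape[0]
    V = np.zeros((n, k))
    v = v0 / np.linalg.norm(v0)
    v_prev = np.zeros(n)
    b = 0.0
    for j in range(k):
        V[:, j] = v
        w = A @ v - b * v_prev
        a = v @ w
        w -= a * v
        b = np.linalg.norm(w)
        if j < k - 1:
            v_prev, v = v, w / b
    return V

def max_offdiag(V):
    """Largest entry of |V^T V - I|: deviation from orthonormality."""
    G = V.T @ V - np.eye(V.shape[1])
    return np.max(np.abs(G))

rng = np.random.default_rng(1)
n = 60
A = np.diag(np.concatenate(([0.1], np.linspace(1.0, 2.0, n - 2), [10.0])))
V = lanczos_basis(A, rng.standard_normal(n), n)
early = max_offdiag(V[:, :5])   # first few vectors: still orthogonal
late = max_offdiag(V)           # after n steps: orthogonality is lost
```

The first few vectors are orthogonal to roundoff level, but once an extreme Ritz value has converged the later vectors acquire large components along its eigenvector, and over a full n steps the deviation grows to order one.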
The Lanczos algorithm is closely linked to the Conjugate Gradient
(CG) method for solving linear systems Ax=b. This numerical
method was introduced by Magnus Hestenes and Eduard Stiefel at
about the same time Lanczos published his own algorithm. As of 2006,
CG is the state-of-the-art algorithm for solving large sparse
linear systems with a symmetric positive definite matrix
A. Today CG is generally used in connection with a
preconditioner to speed up convergence. As we shall see there is a
very close relationship between CG convergence and approximation
of the eigenvalues of A in the Lanczos algorithm. In finite
precision arithmetic, like the Lanczos algorithm, CG does not
fulfill its mathematical properties. The residual vectors, which
should be mutually orthogonal, lose their orthogonality. Moreover,
when compared to what would be obtained in exact arithmetic, in
most cases convergence is delayed in finite precision
computations. Hence, here also it is of interest to understand the
origin of these problems which are, of course, closely linked to
the ones encountered with the Lanczos algorithm.
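For reference, the CG method discussed above can be sketched as follows. This is our own minimal NumPy version of the classical Hestenes-Stiefel recurrences, not code from the book; the variable names are ours.

```python
# Minimal sketch of unpreconditioned CG (our own illustration) for
# A x = b with A symmetric positive definite.
import numpy as np

def cg(A, b, tol=1e-12, maxiter=None):
    """Hestenes-Stiefel CG; returns the approximate solution x."""
    n = b.shape[0]
    maxiter = maxiter or n
    x = np.zeros(n)
    r = b.copy()                   # residual r = b - A x (since x = 0)
    p = r.copy()                   # search direction
    rr = r @ r
    for _ in range(maxiter):
        Ap = A @ p
        gamma = rr / (p @ Ap)      # step length along p
        x += gamma * p
        r -= gamma * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rr_new / rr) * p  # new direction, A-conjugate to p
        rr = rr_new
    return x

# Demo on a small, well-conditioned SPD system.
rng = np.random.default_rng(2)
n = 30
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # symmetric positive definite
b = rng.standard_normal(n)
x = cg(A, b, maxiter=100)
```

In exact arithmetic CG terminates in at most n steps; in floating point it is used as an iterative method, stopped when the residual norm is small enough, and in practice combined with a preconditioner as noted above.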
This book presents some known and a few new results about the
Lanczos and CG algorithms both in exact and finite precision
arithmetic. Our aim is to describe and
explain the "generic" behavior of these algorithms, that is, what
happens in most cases. Therefore, we shall not pay too much
attention to some subtleties that may arise in some contrived
examples. We are particularly interested in the
eigenvalues of the Lanczos tridiagonal matrices Tk and
how well they approximate the eigenvalues of the matrix A. This
is because our main goal is to study the convergence rate of the
Conjugate Gradient algorithm and we shall see that the behavior of
the norms of the error and the residual depends on how well the
eigenvalues of A are approximated in the Lanczos algorithm.
Therefore, when studying the Lanczos algorithm we shall put the
emphasis on the eigenvalue approximations and not too much on
eigenvectors.
The Lanczos and Conjugate Gradient algorithms are among the most
fascinating numerical algorithms since, even though their finite
precision arithmetic behavior is far from theory and there can be
a very large growth of the rounding errors, they can nevertheless
be used very efficiently to compute eigenvalues and for solving
linear systems with large sparse matrices.
1.1. Introduction to the Lanczos algorithm . . . . . . . . . . . . 2
1.2. Lanczos polynomials . . . . . . . . . . . . . . . . . . . . . 9
1.3. Interlacing properties and approximations of eigenvalues. . . 16
1.4. The components of the eigenvectors of Tk. . . . . . . . . . . 20
1.5. Study of the pivot functions. . . . . . . . . . . . . . . . . 22
1.6. Bounds for the approximation of the eigenvalues . . . . . . . 25
1.7. A priori bounds . . . . . . . . . . . . . . . . . . . . . . . 41
1.8. Computation of the approximate eigenvalues. . . . . . . . . . 43
1.9. Harmonic Ritz values. . . . . . . . . . . . . . . . . . . . . 43
2.1. Derivation of the CG algorithm from the Lanczos algorithm . . 46
2.2. Relations between residuals and descent directions. . . . . . 53
2.3. The norm of the residual. . . . . . . . . . . . . . . . . . . 55
2.4. The A-norm of the error . . . . . . . . . . . . . . . . . . . 58
2.5. The l2 norm of the error. . . . . . . . . . . . . . . . . . . 68
2.6. Other forms of the CG algorithm . . . . . . . . . . . . . . . 74
2.7. Bounds for the norm of the error. . . . . . . . . . . . . . . 77
3.1. The tools of the trade. . . . . . . . . . . . . . . . . . . . 83
3.2. Numerical example . . . . . . . . . . . . . . . . . . . . . . 89
3.3. The work of Chris Paige . . . . . . . . . . . . . . . . . . . 93
3.4. Illustration of the work of Chris Paige . . . . . . . . . . . 102
3.5. The work of Joe Grcar . . . . . . . . . . . . . . . . . . . . 105
3.6. Illustration of the work of Joe Grcar . . . . . . . . . . . . 109
3.7. The work of Horst Simon . . . . . . . . . . . . . . . . . . . 110
3.8. Illustration of the work of Horst Simon . . . . . . . . . . . 118
3.9. The work of Anne Greenbaum and Zdenek Strakos . . . . . . . . 121
3.10. The work of J. Cullum and R. Willoughby . . . . . . . . . . . 130
3.11. The work of V. Druskin and L. Knizhnerman . . . . . . . . . . 130
3.12. Recent related results. . . . . . . . . . . . . . . . . . . . 136
4.1. Numerical examples. . . . . . . . . . . . . . . . . . . . . . 139
4.2. Solution of three-term recurrences . . . . . . . . . . . . . 149
4.3. The Lanczos vectors in finite precision . . . . . . . . . . . 152
4.4. Another solution to three-term recurrences . . . . . . . . . 166
4.5. Bounds for the perturbation terms . . . . . . . . . . . . . . 174
4.6. Some more examples. . . . . . . . . . . . . . . . . . . . . . 176
4.7. Conclusions of the chapter. . . . . . . . . . . . . . . . . . 184
5.1. Numerical examples. . . . . . . . . . . . . . . . . . . . . . 187
5.2. Relations between CG and Lanczos in finite precision. . . . . 191
5.3. Study of the CG three-term recurrences . . . . . . . . . . . 198
5.4. Study of the CG two-term recurrences . . . . . . . . . . . . 210
5.5. Relations between pk and rk . . . . . . . . . . . . . . . . . 214
5.6. Local orthogonality in finite precision . . . . . . . . . . . 215
5.7. CG convergence in finite precision. . . . . . . . . . . . . . 223
5.8. Numerical examples of convergence in finite precision . . . . 230
6.1. Difference of residuals . . . . . . . . . . . . . . . . . . . 239
6.2. Numerical experiments for Ax=0. . . . . . . . . . . . . . . . 241
6.3. Estimate of the maximum attainable accuracy . . . . . . . . . 249
6.4. Some ways to improve the maximum attainable accuracy. . . . . 252
7.1. Computation of estimates of the norms of the error in exact
arithmetic. . . . . . . . . . . . . . . . . . . . . . . . . . 257
7.2. The A-norm of the error in finite precision . . . . . . . . . 263
7.3. The l2 norm of the error in finite precision. . . . . . . . . 265
7.4. Numerical examples. . . . . . . . . . . . . . . . . . . . . . 268
7.5. Comparison of two-term and three-term CG . . . . . . . . . . 275
8.1. PCG in exact arithmetic . . . . . . . . . . . . . . . . . . . 281
8.2. Formulas for norms of the error . . . . . . . . . . . . . . . 282
8.3. PCG in finite precision . . . . . . . . . . . . . . . . . . . 284
8.4. Numerical examples of convergence . . . . . . . . . . . . . . 287
8.5. Numerical examples of estimation of norms . . . . . . . . . . 292
9.1. Choice of the starting vector . . . . . . . . . . . . . . . . 297
9.2. Variants of CG and multiple right-hand sides . . . . . . . . 298
9.2.1. Constrained CG . . . . . . . . . . . . . . . . . . . . . . . 298
9.2.2. Block CG . . . . . . . . . . . . . . . . . . . . . . . . . . 299
9.2.3. Init, augmented and deflated CG. . . . . . . . . . . . . . . 300
9.2.4. Global CG. . . . . . . . . . . . . . . . . . . . . . . . . . 302
9.3. Residual smoothing. . . . . . . . . . . . . . . . . . . . . . 302
9.4. Inner-outer iterations and relaxation strategies . . . . . . 304
9.5. Numerical experiments with starting vectors . . . . . . . . . 305
9.6. Shifted matrices. . . . . . . . . . . . . . . . . . . . . . . 308
9.7. CG on indefinite systems. . . . . . . . . . . . . . . . . . . 310
9.8. Examples with PCG . . . . . . . . . . . . . . . . . . . . . . 313
A.1. A short biography of Cornelius Lanczos . . . . . . . . . . . 323
A.2. A short biography of M.R. Hestenes and E. Stiefel . . . . . . 324
A.3. Examples in exact arithmetic. . . . . . . . . . . . . . . . . 325
There are 208 references (14 pages).