

THE LANCZOS AND CONJUGATE GRADIENT ALGORITHMS
from theory to finite precision computations

Gérard MEURANT

SIAM
2006

Software, Environments, and Tools
ISBN-13 978-0-89871-616-0
ISBN-10 0-89871-616-0
365 pages



This book is intended for applied mathematicians, computational scientists, engineers, and physicists who have an interest in linear algebra, numerical analysis, and partial differential equations. It will be of interest to engineers and scientists using the Lanczos algorithm to compute eigenvalues and the CG algorithm to solve linear systems, and to researchers in Krylov subspace methods for symmetric matrices, especially those concerned with floating point error analysis.

Preface

The Lanczos algorithm is one of the most frequently used numerical methods for computing a few eigenvalues (and possibly eigenvectors) of a large sparse symmetric matrix A. If the extreme eigenvalues of A are well separated, the Lanczos algorithm usually delivers good approximations to them in a few iterations. One attraction of the method is that it needs the matrix A, whose eigenvalues are sought, only in the form of matrix-vector products. Hence, it can be used even in applications where the matrix cannot be stored in the main memory of the computer, as long as one is able to compute the product of A with a given vector. Another interesting property is that when only the eigenvalues are needed, the Lanczos algorithm requires the storage of just a few vectors, besides possibly storing the matrix. At iteration k the algorithm constructs an orthonormal basis of a Krylov subspace of dimension k, whose basis vectors are the columns of a matrix Vk, together with a tridiagonal matrix Tk whose eigenvalues are approximations of eigenvalues of A.
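The recurrence just described can be sketched in a few lines of NumPy. This is an illustrative sketch of the basic (unpreconditioned, non-reorthogonalized) Lanczos iteration, not code taken from the book; the function name and interface are our own.

```python
import numpy as np

def lanczos(A, v, k):
    """Run k steps of the Lanczos algorithm on a symmetric matrix A.

    Returns V (n x k, the computed basis of the Krylov subspace,
    orthonormal in exact arithmetic) and the k x k symmetric
    tridiagonal matrix T whose eigenvalues (the Ritz values)
    approximate eigenvalues of A.
    """
    n = A.shape[0]
    V = np.zeros((n, k))
    alpha = np.zeros(k)          # diagonal of T
    beta = np.zeros(k - 1)       # off-diagonal of T
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ V[:, j]          # the only access to A: a matrix-vector product
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T
```

On a matrix with well-separated extreme eigenvalues, the extreme eigenvalues of Tk converge to those of A after only a few iterations, long before k approaches n.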

However, it has been known since the introduction of the method by Cornelius Lanczos in 1950 that, when used in finite precision arithmetic, the algorithm does not fulfill its theoretical properties. In particular, the computed basis vectors, which are mathematically orthogonal, lose their orthogonality in practical computations. Moreover, as the iterations continue, additional copies of eigenvalues that have already converged reappear among the computed approximate eigenvalues. This delays the computation of the eigenvalues that have not yet converged. In fact, in some cases, obtaining all the distinct eigenvalues of A requires many more iterations than the order n of the matrix. It is because of these problems in finite precision arithmetic that the algorithm was dismissed or ignored for some time, until some people, mainly Chris Paige in the seventies, showed that despite its departure from theory the algorithm can be used efficiently for computing eigenvalues and eigenvectors, especially of large sparse matrices. Therefore, it is important to understand why the problems mentioned above happen and to have as clear a view as possible of the behavior of the Lanczos algorithm in finite precision arithmetic. To reach this goal, we shall consider the problem from different points of view. We shall describe the matrix point of view used by Paige to derive elegant matrix relations describing the propagation of rounding errors in the algorithm. But we shall also consider the projections of the Lanczos vectors on the eigenvectors of A and derive solutions for their recurrences. Even though this may seem more difficult and less fruitful than the matrix approach, it yields additional insight into the behavior of the algorithm in finite precision arithmetic.
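The loss of orthogonality is easy to reproduce in double precision. The sketch below (an illustration of the phenomenon, not an experiment from the book) runs a plain Lanczos recurrence, without reorthogonalization, on a small diagonal matrix whose eigenvalues are clustered at the low end with a few large, well-separated ones, and measures how far the computed basis is from orthonormal.

```python
import numpy as np

def lanczos_basis(A, v, k):
    """Plain Lanczos recurrence with no reorthogonalization; returns the
    computed basis V, orthonormal only in exact arithmetic."""
    n = A.shape[0]
    V = np.zeros((n, k))
    V[:, 0] = v / np.linalg.norm(v)
    beta = 0.0
    for j in range(k):
        w = A @ V[:, j]
        if j > 0:
            w -= beta * V[:, j - 1]
        alpha = V[:, j] @ w
        w -= alpha * V[:, j]
        if j < k - 1:
            beta = np.linalg.norm(w)
            V[:, j + 1] = w / beta
    return V

# Eigenvalues clustered near 0.1 with the largest few well separated,
# a distribution known to make orthogonality deteriorate quickly.
n, rho, lam1, lamn = 30, 0.9, 0.1, 100.0
i = np.arange(1, n + 1)
d = lam1 + (i - 1) / (n - 1) * (lamn - lam1) * rho ** (n - i)
A = np.diag(d)
v = np.ones(n)

for k in (5, n):
    V = lanczos_basis(A, v, k)
    print(k, np.linalg.norm(V.T @ V - np.eye(k)))
```

In the first few iterations the computed vectors are orthogonal to roundoff, but once the largest Ritz values converge the orthogonality is completely destroyed; the deviation grows by many orders of magnitude between k = 5 and k = n.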

The Lanczos algorithm is closely linked to the Conjugate Gradient (CG) method for solving linear systems Ax=b. This numerical method was introduced by Magnus Hestenes and Eduard Stiefel at about the same time Lanczos published his own algorithm. In 2006, CG is the state-of-the-art algorithm for solving large sparse linear systems with a symmetric positive definite matrix A. Today CG is generally used with a preconditioner to speed up convergence. As we shall see, there is a very close relationship between CG convergence and the approximation of the eigenvalues of A in the Lanczos algorithm. In finite precision arithmetic, like the Lanczos algorithm, CG does not fulfill its mathematical properties. The residual vectors, which should be mutually orthogonal, lose their orthogonality. Moreover, in most cases convergence is delayed compared to what would be obtained in exact arithmetic. Hence, here too it is of interest to understand the origin of these problems, which are, of course, closely linked to the ones encountered with the Lanczos algorithm.
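For reference, a minimal NumPy sketch of the classical two-term (Hestenes-Stiefel) form of CG follows; it is illustrative only, with our own naming, and omits the preconditioning discussed later in the book. Like Lanczos, it touches A only through matrix-vector products.

```python
import numpy as np

def cg(A, b, tol=1e-10, maxit=None):
    """Conjugate Gradient for A x = b, A symmetric positive definite.

    Stops when the recursively computed residual norm falls below
    tol * ||b||, or after maxit iterations (default: the order of A).
    """
    n = b.shape[0]
    maxit = n if maxit is None else maxit
    x = np.zeros(n)
    r = b.copy()                 # residual r0 = b - A x0 with x0 = 0
    p = r.copy()                 # first descent direction
    rho = r @ r
    for _ in range(maxit):
        if np.sqrt(rho) <= tol * np.linalg.norm(b):
            break
        Ap = A @ p
        gamma = rho / (p @ Ap)   # step length minimizing the A-norm of the error
        x += gamma * p
        r -= gamma * Ap
        rho_new = r @ r
        p = r + (rho_new / rho) * p   # new direction, A-conjugate to the previous ones
        rho = rho_new
    return x
```

In exact arithmetic the residuals r are mutually orthogonal and the directions p mutually A-conjugate, so the method terminates in at most n steps; in floating point both properties are lost, which is precisely what Chapters 5 to 7 investigate.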

This book presents some known and a few new results about the Lanczos and CG algorithms, both in exact and finite precision arithmetic. Our aim is to describe and explain the "generic" behavior of these algorithms, that is, what happens in most cases. Therefore, we shall not pay too much attention to subtleties that may arise in some contrived examples. We are particularly interested in the eigenvalues of the Lanczos tridiagonal matrices Tk and how well they approximate the eigenvalues of the matrix A. This is because our main goal is to study the convergence rate of the Conjugate Gradient algorithm, and we shall see that the behavior of the norms of the error and of the residual depends on how well the eigenvalues of A are approximated in the Lanczos algorithm. Therefore, when studying the Lanczos algorithm we shall put the emphasis on the eigenvalue approximations and not so much on the eigenvectors.

The Lanczos and Conjugate Gradient algorithms are among the most fascinating numerical algorithms: even though their finite precision behavior departs far from the theory and the rounding errors can grow very large, they can nevertheless be used very efficiently to compute eigenvalues and to solve linear systems with large sparse matrices.


Table of Contents

1/ The Lanczos algorithm in exact arithmetic

1.1.  Introduction to the Lanczos algorithm . . . . . . . . . . . .   2
1.2.  Lanczos polynomials . . . . . . . . . . . . . . . . . . . . .   9
1.3.  Interlacing properties and approximations of eigenvalues. . .  16
1.4.  The components of the eigenvectors of Tk. . . . . . . . . . .  20
1.5.  Study of the pivot functions. . . . . . . . . . . . . . . . .  22
1.6.  Bounds for the approximation of the eigenvalues . . . . . . .  25
1.7.  A priori bounds . . . . . . . . . . . . . . . . . . . . . . .  41
1.8.  Computation of the approximate eigenvalues. . . . . . . . . .  43
1.9.  Harmonic Ritz values. . . . . . . . . . . . . . . . . . . . .  43

2/ The CG algorithm in exact arithmetic

2.1.  Derivation of the CG algorithm from the Lanczos algorithm . . 46
2.2.  Relations between residuals and descent directions. . . . . . 53
2.3.  The norm of the residual. . . . . . . . . . . . . . . . . . . 55
2.4.  The A-norm of the error . . . . . . . . . . . . . . . . . . . 58
2.5.  The l2 norm of the error. . . . . . . . . . . . . . . . . . . 68
2.6.  Other forms of the CG algorithm . . . . . . . . . . . . . . . 74
2.7.  Bounds for the norm of the error. . . . . . . . . . . . . . . 77

3/ A historical perspective on the Lanczos algorithm in finite precision

3.1.  The tools of the trade. . . . . . . . . . . . . . . . . . . .  83
3.2.  Numerical example . . . . . . . . . . . . . . . . . . . . . .  89
3.3.  The work of Chris Paige . . . . . . . . . . . . . . . . . . .  93
3.4.  Illustration of the work of Chris Paige . . . . . . . . . . . 102
3.5.  The work of Joe Grcar . . . . . . . . . . . . . . . . . . . . 105
3.6.  Illustration of the work of Joe Grcar . . . . . . . . . . . . 109
3.7.  The work of Horst Simon . . . . . . . . . . . . . . . . . . . 110
3.8.  Illustration of the work of Horst Simon . . . . . . . . . . . 118
3.9.  The work of Anne Greenbaum and Zdenek Strakos . . . . . . . . 121
3.10. The work of J. Cullum and R. Willoughby . . . . . . . . . . . 130
3.11. The work of V. Druskin and L. Knizhnerman . . . . . . . . . . 130
3.12. Recent related results. . . . . . . . . . . . . . . . . . . . 136

4/ The Lanczos algorithm in finite precision

4.1.  Numerical examples. . . . . . . . . . . . . . . . . . . . . . 139
4.2.  Solution of three-term recurrences. . . . . . . . . . . . . . 149
4.3.  The Lanczos vectors in finite precision . . . . . . . . . . . 152
4.4.  Another solution to three-term recurrences. . . . . . . . . . 166
4.5.  Bounds for the perturbation terms . . . . . . . . . . . . . . 174
4.6.  Some more examples. . . . . . . . . . . . . . . . . . . . . . 176
4.7.  Conclusions of the chapter. . . . . . . . . . . . . . . . . . 184

5/ The CG algorithm in finite precision

5.1.  Numerical examples. . . . . . . . . . . . . . . . . . . . . . 187
5.2.  Relations between CG and Lanczos in finite precision. . . . . 191
5.3.  Study of the CG three-term recurrences. . . . . . . . . . . . 198
5.4.  Study of the CG two-term recurrences. . . . . . . . . . . . . 210
5.5.  Relations between pk and rk . . . . . . . . . . . . . . . . . 214
5.6.  Local orthogonality in finite precision . . . . . . . . . . . 215
5.7.  CG convergence in finite precision. . . . . . . . . . . . . . 223
5.8.  Numerical examples of convergence in finite precision . . . . 230

6/ The maximum attainable accuracy

6.1.  Difference of residuals . . . . . . . . . . . . . . . . . . . 239
6.2.  Numerical experiments for Ax=0. . . . . . . . . . . . . . . . 241
6.3.  Estimate of the maximum attainable accuracy . . . . . . . . . 249
6.4.  Some ways to improve the maximum attainable accuracy. . . . . 252

7/ Estimates of norms of the error in finite precision

7.1.  Computation of estimates of the norms of the error in exact 
      arithmetic. . . . . . . . . . . . . . . . . . . . . . . . . . 257
7.2.  The A-norm of the error in finite precision . . . . . . . . . 263
7.3.  The l2 norm of the error in finite precision. . . . . . . . . 265
7.4.  Numerical examples. . . . . . . . . . . . . . . . . . . . . . 268
7.5.  Comparison of two-term and three-term CG. . . . . . . . . . . 275

8/ The preconditioned CG algorithm

8.1.  PCG in exact arithmetic . . . . . . . . . . . . . . . . . . . 281
8.2.  Formulas for norms of the error . . . . . . . . . . . . . . . 282
8.3.  PCG in finite precision . . . . . . . . . . . . . . . . . . . 284
8.4.  Numerical examples of convergence . . . . . . . . . . . . . . 287
8.5.  Numerical examples of estimation of norms . . . . . . . . . . 292

9/ Miscellaneous

9.1.  Choice of the starting vector . . . . . . . . . . . . . . . . 297
9.2.  Variants of CG and multiple right hand sides. . . . . . . . . 298
9.2.1. Constrained CG. . . . . . . . . . . . . . . . . . . . . . . . 298
9.2.2. Block CG . . . . . . . . . . . . . . . . . . . . . . . . . . 299
9.2.3. Init, augmented and deflated CG. . . . . . . . . . . . . . . 300
9.2.4. Global CG. . . . . . . . . . . . . . . . . . . . . . . . . . 302
9.3.  Residual smoothing. . . . . . . . . . . . . . . . . . . . . . 302
9.4.  Inner-outer iterations and relaxation strategies. . . . . . . 304
9.5.  Numerical experiments with starting vectors . . . . . . . . . 305
9.6.  Shifted matrices. . . . . . . . . . . . . . . . . . . . . . . 308
9.7.  CG on indefinite systems. . . . . . . . . . . . . . . . . . . 310
9.8.  Examples with PCG . . . . . . . . . . . . . . . . . . . . . . 313

Appendix

A.1.  A short biography of Cornelius Lanczos  . . . . . . . . . . . 323
A.2.  A short biography of M.R. Hestenes and E. Stiefel . . . . . . 324
A.3.  Examples in exact arithmetic. . . . . . . . . . . . . . . . . 325

Bibliography

There are 208 references (14 pages).