A PARALLELIZABLE PRECONDITIONER FOR THE ITERATIVE
SOLUTION OF IMPLICIT RUNGE-KUTTA TYPE METHODS
LAURENT O. JAY

AND THIERRY BRACONNIER
y
Abstract. The main diculty in the implementation of most standard implicit Runge-Kutta
(IRK) methods applied to (sti) ordinary dierential equations (ODE's) is to eciently solve the
nonlinear system of equations. In this article we propose the use of a preconditioner whose decompo-
sition cost for a parallel implementation is equivalent to the cost for the implicit Euler method. The
preconditioner is based on the W-transformation of the RK coecients matrices discovered by Hairer
and Wanner. For sti ODE's the preconditioner is by construction asymptotically exact for methods
with an invertible RK coecients matrix. The methodology is particularly useful when applied to
super partitioned additive Runge-Kutta (SPARK) methods. The nonlinear system can be solved by
inexact Newton iterations: at each simplied Newton step the linear system can be approximately
solved by an iterative method applied to the preconditioned linear system.
Key words. Iterative methods, linear system, Newton iterations, nonlinear system, parallel
implementation, preconditioning, Runge-Kutta methods, stiness, W-transformation
AMS subject classications. 34A65, 65F10, 65L05, 65L06, 65Y05
1. Introduction. This article is concerned with the implementation of implicit
Runge-Kutta (IRK) methods such as those based on Gauss, Radau, and Lobatto points
[20] applied to (sti) ordinary dierential equations (ODE's). The main diculty is to
eciently solve the nonlinear system of equations. For an s-stage IRK method and a
dierential system of dimension n, the nonlinear system is of size s n and it is usually
solved by simplied Newton iterations. A direct decomposition of the s  n  s  n
simplied Jacobian matrix is generally inecient when s  2. The diagonalization
of the RK coecients matrix can drastically reduce the number of operations and it
also allows for parallelism. Nevertheless, the presence in general of pairs of complex
eigenvalues in the RK coecients matrix for most standard IRK methods not only
impairs their parallelism, but also signicantly increases the decomposition cost of the
Jacobian. Moreover, if several distinct IRK methods are used in a partitioned and/or
additive way, this diagonalization procedure cannot be applied since the dierent RK
matrices generally possess distinct eigenvectors. In this article we propose the use of a
preconditioner requiring s independent decompositions of submatrices of dimension n,
i.e., whose decomposition cost for a parallel implementation is equivalent to the cost
for the implicit Euler method. The preconditioner is based on the W-transformation
of the RK coecients matrices discovered by Hairer and Wanner [20]. For sti ODE's
the preconditioner is by construction asymptotically exact for methods with an in-
vertible RK coecients matrix. The methodology is particularly useful when applied
to super partitioned additive Runge-Kutta (SPARK) methods [22]. The nonlinear
system can be solved by inexact Newton iterations: at each simplied Newton step
the linear system can be approximately solved by an iterative method applied to the
preconditioned linear system.

Institute for Mathematics and its Applications, University of Minnesota, 514 Vincent Hall,
206 Church Street S.E., Minneapolis, MN 55455, USA (na.ljay@na-net.ornl.gov). The work of
this author was supported in part by a research grant of the Holderbank Foundation, Switzerland
(Holderbank-Stiftung zur Forderung der wissenschaftlichen Fortbildung).
y
Department of Computer Science and Engineering, University of Minnesota, 4-192 EE/CS Bldg.,
200 Union Street S.E., Minneapolis, MN 55455-0159, USA (braconni@cs.umn.edu).
1
2 L. O. Jay and T. Braconnier
In section 2 we give the denition of IRK methods, we discuss the approximate
Jacobian matrix used in the simplied Newton iterations, and we succinctly describe
the W-transformation. In section 3 we present the preconditioner used for the solution
of the linear systems occuring in the simplied Newton iterations. The preconditioner
is analyzed on the scalar linear test equation y
0
= y in section 4. In section 5 we
show how to extend the preconditioner from IRK to SPARK methods. In section 6
we present some numerical results illustrating the behaviour of the considered precon-
ditioner using dierent iterative methods.
2. IRK methods, approximate Jacobian, and W-transformation. We
consider the system of (sti) ODE's
y
0
= f(t; y)(1)
where y = (y
1
; : : : ; y
n
)
T
2 R
n
. The denition of IRK methods is as follows.
Definition 2.1. One step of an s-stage implicit Runge-Kutta (IRK) method
applied to the system (1) with initial values y
0
at t
0
and stepsize h reads
Y
i
  y
0
  h
s
X
j=1
a
ij
f(t
0
+ c
j
h; Y
j
) = 0 for i = 1; : : : ; s;(2)
y
1
= y
0
+ h
s
X
i=1
b
i
f(t
0
+ c
i
h; Y
i
):
The equations (2) dene a nonlinear system of dimension s  n to be solved for the s
internal stages Y
i
. The numerical approximation at t
0
+ h is then given by y
1
. The
RK coecients are usually expressed using a Butcher-tableau notation
c
1
a
11
   a
1s
.
.
.
.
.
.
.
.
.
.
.
.
c
s
a
s1
   a
ss
b
1
   b
s
where b = (b
1
; : : : ; b
s
)
T
is the weight vector, c = (c
1
; : : : ; c
s
)
T
is the node vector,
and A = (a
ij
)
ij=1;:::;s
is the RK coecients matrix. A detailed presentation of the
construction of IRK methods can be found in [20, Section IV.5]. We will assume s  2
for the remainder of the article. The nonlinear system can be solved by simplied
Newton iterations with approximate Jacobian matrix
I
s

 I
n
  hA
 J with J :=
@f
@y
(t
0
; y
0
)(3)
where 
 is the tensor product and I
m
denotes the identity matrix in R
m
. Each
iteration requires the solution of an s  n-dimensional linear system with matrix (3)
whose direct decomposition is generally inecient for s  2 as it can be drastically
improved by exploiting its special structure. By diagonalizing the RK coecients
matrix A
S
 1
AS =  = diag(
1
; : : : ; 
s
)
A preconditioner for the solution of IRK methods 3
the linear system can be tranformed into a decoupled linear system with block-diagonal
matrix
S
 1

 I
n
(I
s

 I
n
  hA
 J)S 
 I
n
=
I
s

 I
n
  h
 J =
0
B
@
I
n
  
1
hJ O
.
.
.
O I
n
  
s
hJ
1
C
A
:(4)
Nevertheless, the presence in general of pairs of complex eigenvalues in the RK co-
ecients matrix for most standard IRK methods, not only impairs the parallelism,
but also signicantly increases the decomposition cost of (4) [20, Section IV.8]. This
cost for such methods should be ideally equivalent to s independent decompositions of
submatrices of dimension n, as for diagonally implicit Runge-Kutta (DIRK) methods
[1, 10, 23] and multi-implicit Runge-Kutta (MIRK) methods [2, 5, 24] for which the
eigenvalues are real. Various iterations schemes have been proposed, some of them
requiring the decomposition of only one submatrix of dimension n [8, 9, 16, 17] or of
s submatrices of dimension n [21, 27]. These methods do not usually iterate at the
linear algebra level and they can be considered as modied Newton iterations. Unfor-
tunately, none of these methods is asymptotically exact for sti systems, whereas the
method presented in this article gives by construction an asymptotically exact result
in this situation.
In this article we propose a dierent approach aimed at reducing the amount of
computations. Instead of solving exactly a linear system at each simplied Newton
step, we apply an iterative method to a corresponding preconditioned linear system.
The use of iterative methods for the numerical solution of sti ODE's were already
considered in [3, 7, 12], with an emphasis on preconditioning in [4]. Such inexact
Newton methods are generally considered to be among the most ecient ways to
solve nonlinear system of equations [11, 25]. We construct a preconditioner requiring
s independent decompositions of matrices of dimension n, i.e., whose decomposition
cost for a parallel implementation is equivalent to the cost for the implicit Euler
method. The preconditioner is based on the W-transformation of the RK coecients
matrices discovered by Hairer and Wanner [18, 19]. This transformation is given by
X := W
T
BAW(5)
where B = diag(b
1
; : : : ; b
s
) and the coecients of the matrix W are given by w
ij
=
P
j 1
(c
i
) with P
k
(x) being the k-th shifted Legendre polynomial
P
k
(x) =
p
2k + 1
k!

d
k
dx
k

x
k
(x  1)
k

=
p
2k + 1
k
X
j=0
( 1)
j+k
 
k
j
! 
j + k
j
!
x
j
:
For recent references and more details about the W-transformation we refer the reader
to [20, Section IV.5] and [6, 13]. In the remainder of the article we will assume that
X := W
T
BAW is tridiagonal and D :=W
T
BW is diagonal and nonsingular;
two conditions which are satised for most IRK methods of interest, such as Gauss,
Radau IA & IIA, Lobatto IIIA & IIIB & IIIC & IIIC

schemes [6, 13, 20, 22]. For
4 L. O. Jay and T. Braconnier
these IRK methods the transformed matrix X and the matrix D read
X =
0
B
B
B
B
B
B
B
@
1=2  
1
O

1
0
.
.
.
.
.
.
.
.
.
 
s 2

s 2
0 
s 1;s
O 
s;s 1

ss
1
C
C
C
C
C
C
C
A
; D = diag(1; 1; : : : ; 1; d
s
);(6)
where 
k
= 1=

2
p
4k
2
  1

and the missing coecients 
s;s 1
; 
s 1;s
; 
ss
; d
s
are given
in Table 1. Note that the inverse of W is simply given by W
 1
= D
 1
W
T
B. We will
actually assume the specic forms (6) in the remainder of the article.
Table 1
Missing coecients of the transformed matrix X for some IRK methods,  =
2s 1
s 1
.
IRK method 
s;s 1

s 1;s

ss
d
s
Gauss 
s 1
 
s 1
0 1
Radau IA 
s 1
 
s 1
1
4s 1
1
Radau IIA 
s 1
 
s 1
1
4s 1
1
Lobatto IIIA 
s 1
 0 0 
Lobatto IIIB 0  
s 1
 0 
Lobatto IIIC 
s 1
  
s 1


2s 2

Lobatto IIIC


s 1
  
s 1
  

2s 2

3. Preconditioning the linear system. Using the W-transformation (5) in
(3), at each simplied Newton step we should solve a linear system
Kx = b(7)
for a block-tridiagonal matrix
K = D 
 I
n
  hX 
 J =
0
B
B
B
B
B
B
@
E
1
F
1
O
G
1
E
2
F
2
.
.
.
.
.
.
.
.
.
G
s 2
E
s 1
F
s 1
O G
s 1
E
s
1
C
C
C
C
C
C
A
(8)
with n n blocks given as follows
E
1
= I
n
 
1
2
hJ; E
i
= I
n
for i = 2; : : : ; s  1; E
s
= d
s
I
n
  
ss
hJ;(9a)
F
i
= 
i
hJ for i = 1; : : : ; s  2; F
s 1
=  
s 1;s
hJ;(9b)
G
i
=  
i
hJ for i = 1; : : : ; s  2; G
s 1
=  
s;s 1
hJ:(9c)
A way to solve (7) would be to use the block-LU decomposition [14, 15] of (8)
K =
0
B
B
B
B
B
B
@
I
n
O
G
1
H
 1
1
I
n
.
.
.
.
.
.
G
s 2
H
 1
s 2
I
n
O G
s 1
H
 1
s 1
I
n
1
C
C
C
C
C
C
A
0
B
B
B
B
B
B
@
H
1
F
1
O
H
2
F
2
.
.
.
.
.
.
H
s 1
F
s 1
O H
s
1
C
C
C
C
C
C
A
A preconditioner for the solution of IRK methods 5
where the blocks H
i
are recursively given by
H
1
= E
1
; H
i
= E
i
 G
i 1
H
 1
i 1
F
i 1
for i = 2; : : : ; s(10)
and are assumed to be nonsingular. Subdividing the solution vector x, the right-hand
side b of (7), and an intermediate vector y into s n-dimensional subvectors
x =
0
B
B
B
B
B
B
@
x
1
x
2
.
.
.
x
s 1
x
s
1
C
C
C
C
C
C
A
; b =
0
B
B
B
B
B
B
@
b
1
b
2
.
.
.
b
s 1
b
s
1
C
C
C
C
C
C
A
; y =
0
B
B
B
B
B
B
@
y
1
y
2
.
.
.
y
s 1
y
s
1
C
C
C
C
C
C
A
; x
i
; b
i
; y
i
2 R
n
for i = 1; : : : ; s;
the linear system (7) can be solved using block forward and backward substitutions:
y
1
= b
1
; y
i
= b
i
 G
i 1
H
 1
i 1
y
i 1
for i = 2; : : : ; s;
x
s
= H
 1
s
y
s
; x
i
= H
 1
i
(y
i
  F
i
x
i+1
) for i = s  1; : : : ; 1:
From (9) and (10) the blocks H
i
are given in our situation by
H
1
= I
n
 
1
2
hJ; H
i
= I
n
+ 
2
i 1
h
2
JH
 1
i 1
J for i = 2; : : : ; s  1;
H
s
= d
s
I
n
  
ss
hJ   
s;s 1

s 1;s
h
2
JH
 1
s 1
J:
Since each block H
i
for i  2 depends on H
 1
i 1
the above recursion is not parallelizable.
Moreover, we should also assume that all blocks H
i
are nonsingular, a condition which
can actually be violated even if I
n
  hJ is supposed to be invertible for all h  0. In
terms of computational cost at each step i for i  2 we should compute JH
 1
i 1
J . For
example if the LU decomposition ofH
i 1
would be performed, this would require 7n
3
=3
operations. Thus, the total block-LU decomposition of K would require (7s  6)n
3
=3
operations. This is clearly inecient as it would still be a factor from 3:5 to 7 more
costly than if the block-diagonal-LU decomposition of (4) would be used (3:5 at best
if all the eigenvalues of the RK coecients matrix consist only of conjugate complex
pairs, 7 at worst if all those eigenvalues are real).
We now present the main idea of this article. Instead of solving (7) directly, we
apply an iterative method for example to the left-preconditioned linear system
P
 1
Kx = P
 1
b:(11)
We choose the preconditioner P to be given by the approximate block-LU decompo-
sition of K based on independent approximations
e
H
i
of H
i
, i.e., we set
P :=
0
B
B
B
B
B
B
@
I
n
O
G
1
e
H
 1
1
I
n
.
.
.
.
.
.
G
s 2
e
H
 1
s 2
I
n
O G
s 1
e
H
 1
s 1
I
n
1
C
C
C
C
C
C
A
0
B
B
B
B
B
B
B
@
e
H
1
F
1
O
e
H
2
F
2
.
.
.
.
.
.
e
H
s 1
F
s 1
O
e
H
s
1
C
C
C
C
C
C
C
A
(12)
with
e
H
i
:= I
n
  
i
hJ for i = 1; : : : ; s  1;
e
H
s
:= d
s
I
n
  
s
hJ;(13)
6 L. O. Jay and T. Braconnier
where

1
=
1
2
; 
i
=

2
i 1

i 1
for i = 2; : : : ; s  1; 
s
= 
ss
 

s;s 1

s 1;s

s 1
:(14)
Each
e
H
i
can be formed and decomposed independently, making these operations fully
parallelizable. The coecients 
i
have been chosen so that
e
H
 1
i
H
i
 I
n
when (I
n
 
hJ)
 1
( hJ)  I
n
for all h  h
0
> 0. Hence, if the RK coecients matrix is invertible,
i.e., if 
s
6= 0, the preconditioner is asymptotically exact for sti systems. Note that
for the Lobatto IIIA, IIIB, and IIIC

coecients we have 
s
= 0. Since the Lobatto
IIIC

methods behave like explicit RK methods, they should not be considered to
treat sti terms. For the Lobatto IIIA and IIIB methods the last block is
e
H
s
= d
s
I
n
and needs not be decomposed.
We can interpret the above preconditioner P by obtaining an explicit expression
from the above formulas and we get
P =
0
B
B
B
B
B
B
B
@
e
E
1
F
1
O
G
1
e
E
2
F
2
.
.
.
.
.
.
.
.
.
G
s 2
e
E
s 1
F
s 1
O G
s 1
e
E
s
1
C
C
C
C
C
C
C
A
where the blocks
e
E
i
are recursively given by
e
E
1
=
e
H
1
= H
1
= E
1
;
e
E
i
=
e
H
i
+G
i 1
e
H
 1
i 1
F
i 1
for i = 2; : : : ; s:
We note that the above preconditioner is consistent in the sense that for h = 0 we have
P = K = I
s

 I
n
. It is also asymptotically exact for y
0
= y when jhj ! 1 if 
s
6= 0
(see also next section). We would like to stress the point that this preconditioner
cannot be interpreted as being of the form
I
s

 I
n
  h
e
A
 J
for a modied coecients matrix
e
A as considered in [21, 27]. Note that the linear
systems involving the matrices
e
H
i
may also be approximately solved by iterative tech-
niques, especially if a good preconditioner to I
n
  hJ is available.
4. Linear analysis of the preconditioner. We consider the scalar linear test
equation y
0
= y and we denote z := h. The preconditioner presented in the previous
section is by construction exact for z = 0 and asymptotically exact for large jzj if

s
6= 0, i.e., P
 1
(z)K(z)! I
s
when jzj ! 1, with P (z) and K(z) given below. Here,
we consider intermediate values Re(z)  0. The matrix K(z) (8) is given by
K(z) =
0
B
B
B
B
B
B
B
@
1  z=2 
1
z O
 
1
z 1
.
.
.
.
.
.
.
.
.

s 2
z
 
s 2
z 1  
s 1;s
z
O  
s;s 1
z d
s
  
ss
z
1
C
C
C
C
C
C
C
A
:
A preconditioner for the solution of IRK methods 7
The left-preconditioned linear system (11) reads
P
 1
(z)K(z)x = P
 1
(z)b
where
P (z) =
0
B
B
B
B
B
B
@
1 O
g
1
(z)
e
h
 1
1
(z) 1
.
.
.
.
.
.
g
s 2
(z)
e
h
 1
s 2
(z) 1
O g
s 1
(z)
e
h
 1
s 1
(z) 1
1
C
C
C
C
C
C
A

0
B
B
B
B
B
B
B
@
e
h
1
(z) f
1
(z) O
e
h
2
(z) f
2
(z)
.
.
.
.
.
.
e
h
s 1
(z) f
s 1
(z)
O
e
h
s
(z)
1
C
C
C
C
C
C
C
A
=
0
B
B
B
B
B
B
B
@
e
1
(z) f
1
(z) O
g
1
(z)
e
2
(z)
.
.
.
.
.
.
.
.
.
f
s 2
(z)
g
s 2
(z)
e
s 1
(z) f
s 1
(z)
O g
s 1
(z)
e
s
(z)
1
C
C
C
C
C
C
C
A
with function coecients given by
f
i
(z) = 
i
z for i = 1; : : : ; s  2; f
s 1
(z) =  
s 1;s
z;
g
i
(z) =  
i
z for i = 1; : : : ; s  2; g
s 1
(z) =  
s;s 1
z;
e
h
i
(z) = 1  
i
z for i = 1; : : : ; s  1;
e
h
s
(z) = d
s
  
s
z;
e
1
=
e
h
1
(z);
e
i
(z) =
e
h
i
(z) + g
i 1
(z)
e
h
 1
i 1
(z)f
i 1
(z) for i = 2; : : : ; s:
A good measure for the quality of the preconditioner is given by (P
 1
(z)K(z)) where
 denotes the condition number of a matrix. The closer to one is this quantity, the
better is the preconditioner. For the 2-stage Lobatto IIIA and IIIB methods we
trivially have (P
 1
(z)K(z)) = 1 since the preconditioner is exact for those two
methods, i.e., P = K for any J . For the 2-stage Lobatto IIIC method we have
P
 1
(z)K(z) =
 
1
p
3z
2
(z 1)(z 2)
2
0 1 +
z
(z 1)(z 2)
!
:
Hence, we get

1
(P
 1
(z)K(z)) = max
 
1 +
p
3jzj
2
jz   1j  jz   2j
2
;




1 +
z
(z   1)(z   2)




!

max
 
1 +
p
3jzj
2
jz   2j  jz
2
  2z + 2j
;
jz   1j  jz   2j
jz
2
  2z + 2j
!
:
For jzj ! 1 we thus have 
1
(P
 1
(z)K(z)) = 1 + 2
p
3=jzj + O(1=jzj
2
). In Fig. 1
we have plotted this condition number for purely negative values z =  r and purely
imaginary values z = ir with r = jzj. In Fig. 2 we give a similar plot for the 5-stage
Lobatto IIIB method.
8 L. O. Jay and T. Braconnier
0 10 20 30 40 50 60 70 80 90 100
1
1.5
2
r=|z|
c
o
n
di
tio
n 
nu
m
be
r
z=−r
z=ir
Fig. 1. Condition number 
1
(P
 1
(z)K(z)) for the 2-stage Lobatto IIIC method.
5. A preconditioner for SPARK methods. The methodology described in
this article to solve the nonlinear system of equations for IRK methods is particularly
useful when considering SPARK methods such as the Lobatto IIIA-B-C-C

methods
[22]. In this section we consider the following system of (sti) ODE's
y
0
= f(t; y) =
M
X
m=1
f
m
(t; y)(15)
where y = (y
1
; : : : ; y
n
)
T
2 R
n
. Such a decomposition
P
M
m=1
f
m
(t; y) may come from
a splitting and/or a partitioning of f(t; y) into dierent terms. The functions f
m
(t; y)
are supposed to have distinct properties and may therefore be numerically treated
dierently. Several motivations were given in [22] to introduce a more general class of
methods than IRK methods. The denition of SPARK methods applied to (15) is as
follows.
A preconditioner for the solution of IRK methods 9
0 50 100 150 200 250 300 350 400 450 500
1
1.5
2
2.5
3
3.5
r=|z|
c
o
n
di
tio
n 
nu
m
be
r
z=−r
z=ir
Fig. 2. Condition number 
1
(P
 1
(z)K(z)) for the 5-stage Lobatto IIIB method.
Definition 5.1. One step of an s-stage super partitioned and additive Runge-
Kutta (SPARK) method, based on the same underlying quadrature formula (b
i
; c
i
)
s
i=1
,
applied to the system (15) with initial value y
0
at t
0
and stepsize h reads
Y
i
  y
0
  h
M
X
m=1
s
X
j=1
a
(m)
ij
f
m
(t
0
+ c
j
h; Y
j
) = 0 for i = 1; : : : ; s;(16)
y
1
= y
0
+ h
s
X
i=1
b
i
f(t
0
+ c
i
h; Y
i
):
The equations (16) dene a nonlinear system of dimension s  n to be solved for
the s internal stages Y
i
. The numerical approximation at t
0
+ h is then given by y
1
.
This system can be solved by simplied Newton iterations with approximate Jacobian
10 L. O. Jay and T. Braconnier
matrix
I
s

 I
n
  h
L
X
m=1
A
(m)

 J
m
with J
m
:=
@f
m
@y
(t
0
; y
0
) for m = 1; : : : ; L(17)
where L  M and the sti terms are treated with the L rst RK methods. Since in
general the RK coecients matrices A
(m)
possess distinct eigenvectors the diagonal-
ization procedure cannot be applied. However, using the same W-transformation for
all RK coecients matrices we may assume that
X
(m)
:=W
T
BA
(m)
W =
0
B
B
B
B
B
B
B
B
@
1=2  
1
O

1
0
.
.
.
.
.
.
.
.
.
 
s 2

s 2
0 
(m)
s 1;s
O 
(m)
s;s 1

(m)
ss
1
C
C
C
C
C
C
C
C
A
for possibly distinct 
(m)
s;s 1
; 
(m)
s 1;s
; 
(m)
ss
, an assumption which is satised for the Lo-
batto IIIA, IIIB, and IIIC coecients (L = 3) of the Lobatto IIIA-B-C-C

methods
(M = 4). We thus obtain a block-tridiagonal matrix (8) with n  n blocks given by
(9) for J :=
P
L
m=1
J
m
except for the following blocks
E
s
= d
s
I
n
 
L
X
m=1

(m)
ss
hJ
m
; F
s 1
=  
L
X
m=1

(m)
s 1;s
hJ
m
; G
s 1
=  
L
X
m=1

(m)
s;s 1
hJ
m
:
As described in sections 3 for IRK methods we can solve (7) using an iterative method
on the left-preconditioned linear system (11) with the preconditioner P given by (12).
The blocks
e
H
i
can be chosen as in (13) except for the last block
e
H
s
:= d
s
I
n
 
L
X
m=1

(m)
s
hJ
m
with 
(m)
s
= 
(m)
ss
 

(m)
s;s 1

(m)
s 1;s

s 1
:
As mentioned before, note that we have 
(m)
s
= 0 for the Lobatto IIIA and IIIB
coecients.
6. Numerical results. The linear system (7) can be solved by an iterative
method applied to the left-preconditioned system (11) using the preconditioner de-
scribed in section 3. Starting from x
0
:= 0, the simplest iterative method is given by
the Richardson iterations (PRI)
x
k+1
:= x
k
+ (P
 1
b  P
 1
Kx
k
) = (I   P
 1
K)x
k
+ P
 1
b for k = 0; 1; 2; : : :(18)
If (I   P
 1
K) < 1 where  denotes the spectral radius of a matrix then the method
converges linearly, otherwise the method diverges [15]. Another possibility is to use
iterative Krylov-type methods such as the GMRES method [26]. Note that for such
methods, convergence is ensured and the convergence behaviour greatly depends on
the spectral distribution of the matrix K or P
 1
K depending on whether the pre-
conditioner is applied or not. In this section we illustrate the good quality of the
A preconditioner for the solution of IRK methods 11
preconditioner. All experiments in this section are done for Lobatto IIIC methods.
The matrix J is set as follows
J :=
0
B
B
B
B
B
@
  1 : : : 1
 2
.
.
.
.
.
.
.
.
.
1
O  n
1
C
C
C
C
C
A
;
where the parameter  allows to tune the size of the eigenvalues of J , hence to increase
stiness. Note that the eigenvalues of the matrix I
n
  hJ are all comprised in the
interval [
min
; 
max
] = [1 + h; 1 +nh]. All computations have been done on an SGI
IRIX 5.3 workstation.
For a block size n = 25 and s = 4 blocks, Table 2 shows some results for h = 10
 2
and increasing values of . When  increases, the eigenvalues of I
n
  hJ become
larger and far apart from 1. Hence, the approximate submatrices
e
H
i
become closer
to the exact submatrices H
i
, and therefore P
 1
becomes a better approximation to
K
 1
. This is rst illustrated in the columns labelled 
2
(K) and 
2
(P
 1
K) showing
that the condition number of the preconditioned matrix P
 1
K becomes closer to 1
as the parameter  increases, whereas the condition number of the original matrix K
remains large. In the column labelled kK
 1
  P
 1
k
2
we see that P
 1
tends to K
 1
as  increases. In the last column labelled k (PRI) we give the number k of PRI
iterations (18) to solve the system (7) for x = (1; : : : ; 1)
T
, the right-hand side being
given by b = Kx. The error tolerance is set to 100    kbk where  is the machine
precision. The error is measured by k
e
x  xk
1
where
e
x is the computed solution. We
observe that the number k of PRI iterations decreases as the stiness parameter 
increases.
Table 2
Some measures of the quality of the preconditioner.
 
2
(K) 
2
(P
 1
K) kK
 1
  P
 1
k
2
k (PRI)
10
1
5:40 3:40 9:5  10
 1
83
10
2
28:90 10:62 9:8  10
 1
77
10
3
156:80 9:53 6:1  10
 1
44
10
4
191:35 2:49 3:0  10
 2
13
10
6
194:38 1:59 1:6  10
 4
5
10
8
194:41 1:58 1:6  10
 6
3
To illustrate the improvement that the use of the preconditioner can provide,
we have run the non-preconditioned GMRES(m) method and the preconditioned
PGMRES(m) method wherem is the size of the Krylov subspace for which the method
is restarted. As before the exact solution is chosen to be x = (1; : : : ; 1)
T
. We have
used the same type of matrix J of size n = 200, with s = 10 and parameters h = 10
 4
and  = 10
3
. As shown in Table 3 for this example, using the GMRES(5) method
with the preconditioner P
 1
roughly divides the total running time T
tot
by a third
and takes about ten times less iterations (see column labelled k) than the unprecon-
ditioned method. In Table 3 T
dec
corresponds to the time in seconds to compute the
12 L. O. Jay and T. Braconnier
decomposition of the preconditioner P , T
sol
corresponds to the time in seconds for the
resolution by the (P)GMRES(5) methods, and T
tot
is the total computational time.
Table 3
GMRES versus PGMRES.
T
dec
T
sol
T
tot
k k
e
x  xk
1
GMRES(5)   147:77 147:77 325 2:1  10
 13
PGMRES(5) 15:13 37:01 52:14 30 3:9  10
 14
Finally, we have applied the direct block-LU method, PRI, and PGMRES(2) with
the matrix J of size n = 500, with s = 4 and parameters h = 10
 4
and  = 10
9
,
in order to compare the eciency of the preconditioned iterative methods toward the
direct block-LU method. Some results are shown in Table 4. T
dec
corresponds to
Table 4
Comparison between block-LU, PRI, and PGMRES.
T
dec
T
sol
T
tot
k k
e
x  xk
1
block-LU 1391:64 3:92 1395:56   2:9  10
 14
PRI 156:00 11:96 167:96 3 2:3  10
 13
PGMRES(2) 156:00 17:32 173:32 4 2:9  10
 13
the time in seconds of the factorization for the direct block-LU method whereas for
PRI and PGMRES(2) this corresponds to the time to decompose the preconditioner
P . T
sol
corresponds to the time in seconds for the resolution by the three dierent
methods. The preconditioner is good since the eigenvalues of I
n
  hJ are large. Thus
only a few iterations (see column labelled k) are needed by the two preconditioned
iterative methods to solve the linear system. We can see that for the same level of
accuracy, the preconditioned iterative methods take much less time than the direct
method (see column labelled T
tot
), this is simply because the computational eort
needed to decompose P is much smaller than for the block-LU factorization of K.
Obviously, the block-LU decomposition of the block-tridiagonal matrix K is not the
optimal way to solve the system, since by diagonalizing the RK coecients matrix
we could improve the cost of the decomposition by a factor close to 4. Nevertheless,
this is an interesting measure in our context since for the implementation of SPARK
methods this diagonalization procedure cannot be applied. It is interesting to note
that the simple PRI method is as ecient as the PGMRES(2) method. Since the direct
block-LU method provides a residual kK
e
x   bk close to C    kbk for a constant C,
for comparison reasons we have set the stopping criterion for PRI and PGMRES(2)
to a similar level. For the numerical solution of (sti) ODE's such an accuracy is
not needed, because the stopping criterion within the simplied Newton iterations
can be relaxed and be based on the preconditioned residual error kP
 1
(K
e
x  b)k [4].
Moreover, nonsti components need not be solved very accurately since the Newton
iterations on top of the iterative linear solver make these components to converge
suciently rapidly.
A preconditioner for the solution of IRK methods 13
REFERENCES
[1] R. Alexander, Diagonally implicit Runge-Kutta methods for sti ODE's, SIAM J. Numer.
Anal., 14 (1977), pp. 1006{1021.
[2] C. Bendtsen, Highly stable parallel Runge-Kutta methods, Appl. Numer. Math., 21 (1996),
pp. 1{8.
[3] P. N. Brown and A. C. Hindmarsh, Matrix-free methods for sti systems of ODE's, SIAM
J. Numer. Anal., 23 (1986), pp. 610{638.
[4] P. N. Brown, A. C. Hindmarsh, and L. R. Petzold, Using Krylov methods in the solution of
large-scale dierential-algebraic systems, SIAM J. Sci. Comput., 15 (1994), pp. 1467{1488.
[5] K. Burrage, A special family of Runge-Kutta methods for solving sti dierential equations,
BIT, 18 (1978), pp. 22{41.
[6] R. P. K. Chan, On symmetric Runge-Kutta methods of high order, Computing, 45 (1990),
pp. 301{309.
[7] T. F. Chan and K. R. Jackson, The use of iterative linear equation solvers in codes for large
systems of sti IVPs for ODEs, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 378{417.
[8] G. J. Cooper and J. C. Butcher, An iteration scheme for implicit Runge-Kutta methods,
IMA J. Numer. Anal., 3 (1983), pp. 127{140.
[9] G. J. Cooper and R. Vignesvaran, A scheme for the implementation of implicit Runge-Kutta
methods, Computing, 45 (1990), pp. 321{332.
[10] M. Crouzeix, Sur l'approximation des equations dierentielles operationnelles lineaires par des
methodes de Runge-Kutta, PhD thesis, Univ. Paris VI, France, 1975.
[11] R. S. Dembo, S. C. Eisenstat, and T. Steihaug, Inexact Newton methods, SIAM J. Numer.
Anal., 19 (1982), pp. 400{408.
[12] C. W. Gear and Y. Saad, Iterative solution of linear equations in ODE codes, SIAM J. Sci.
Stat. Comput., 4 (1983), pp. 583{601.
[13] S. Geng, Construction of high order symplectic Runge-Kutta methods, J. Comput. Math., 11
(1993), pp. 250{260.
[14] J. A. George, On block elimination for sparse linear systems, SIAM J. Numer. Anal., 11 (1974),
pp. 585{603.
[15] G. Golub and C. F. van Loan, Matrix computations, Johns Hopkins Studies in the Math.
Sci., The Johns Hopkins Univ. Press, Baltimore and London, Third Edition, 1996.
[16] S. Gonz

alez-Pinto, C. Gonz

alez-Concepci

on, and J. I. Montijano, Iterative schemes for
Gauss methods, Computers Math. Applic., 27 (1994), pp. 67{81.
[17] S. Gonz

alez-Pinto, J. I. Montijano, and L. R

andez, Iterative schemes for three-stage
implicit Runge-Kutta methods, Appl. Numer. Math., 17 (1995), pp. 363{382.
[18] E. Hairer and G. Wanner, Algebraically stable and implementable Runge-Kutta methods of
high order, SIAM J. Numer. Anal., 18 (1981), pp. 1098{1108.
[19] , Characterization of non-linearly stable implicit Runge-Kutta methods, in Numerical inte-
gration of dierential equations, vol. 968 of Lect. Notes in Math., 1982, pp. 207{219.
[20] , Solving ordinary dierential equations II. Sti and dierential-algebraic problems, vol. 14
of Comput. Math., Springer, Berlin, Second Revised Edition, 1996.
[21] W. Hoffmann and J. J. B. de Swart, Approximating Runge-Kutta matrices by triangular
matrices, BIT, 37 (1997), pp. 346{354.
[22] L. O. Jay, Structure-preservation for constrained dynamics with super partitioned additive
Runge-Kutta methods, SIAM J. Sci. Comput., (1995). To appear.
[23] S. P. Nrsett, Semi-explicit Runge-Kutta methods, Tech. Report 6/74, Dept. of Math., Univ.
of Trondheim, Norway, 1974.
[24] B. Orel, Parallel Runge-Kutta methods with real eigenvalues, APNUM, 11 (1993), pp. 241{250.
[25] W. Rheinboldt, Methods for solving systems of nonlinear equations, SIAM Classics in Appl.
Math., SIAM, Philadelphia, Second Revised and Expanded Edition, 1998. To appear.
[26] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving
nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856{869.
[27] P. J. van der Houwen and J. J. B. de Swart, Triangularly implicit iteration methods for
ODE-IVP solvers, SIAM J. Sci. Comput., 18 (1997), pp. 41{55.