MONOTONIC PARAREAL CONTROL FOR QUANTUM SYSTEMS

Following encouraging experimental results in quantum control, numerical simulations have seen significant improvements through the introduction of efficient optimization algorithms. Yet the computational cost still prevents using these procedures for the high-dimensional systems often encountered in quantum chemistry. Using the parareal framework, we present here a time parallelization of these schemes which allows us to reduce their computational cost significantly while still finding suitable controls.


Introduction.
In the last decade, quantum control has witnessed significant developments, including encouraging experimental results [5,6,9,15,16,23,24,29,36]. At the computational level [7,25], the introduction of the monotonic algorithms of Krotov (by Tannor [31]), Zhu and Rabitz [37], or the unified form of Maday and Turinici [21] allows us to design efficient methods to obtain laser fields controlling the molecular dynamics. On the other hand, the parareal scheme (whose name stands for parallelization in real time) has shown convenient efficiency in the case of the Schrödinger equation; see, e.g., [2,33]. In what follows, we combine these two approaches by using monotonic algorithms as the inner loop of a time-parallelization procedure.
Let us first briefly present the model and the corresponding optimal control framework used in this paper. Consider a quantum system described by its wavefunction ψ(x, t), also called the state in what follows. Here x ∈ Ω denotes the relevant spatial coordinates (the symbol x will often be omitted in what follows for reasons of simplicity). The dynamics of this system is characterized by its internal Hamiltonian

H(x) = H_0 + V(x),

where V(x) is the potential part and H_0, the kinetic part, could be

H_0 = − Σ_{n=1}^{p} (1/(2 m_n)) Δ_{r_n},

where p is the number of particles considered, m_n their masses, and Δ_{r_n} the Laplace operator with respect to their coordinates.
A way to control such a system is to illuminate it with a laser pulse. We denote by ε(t) the intensity of this control field. The contribution of this parameter is taken into account by introducing a perturbative term in the Hamiltonian, which then reads H(x) − μ(x)ε(t). The evolution of ψ_ε(x, t) is governed by the Schrödinger equation (we work in atomic units, i.e., ℏ = 1):

(1.1)  i ∂_t ψ_ε(x, t) = (H(x) − μ(x)ε(t)) ψ_ε(x, t),   ψ_ε(x, 0) = ψ_init(x),

where ψ_init is the initial condition for ψ_ε, subject to the constraint ‖ψ_init‖_{L²(Ω)} = 1. Since H is self-adjoint, it follows from (1.1) that the norm of the state is constant in time. In the numerical simulations, the ground state, i.e., a unitary eigenvector of H associated with the lowest eigenvalue, is generally taken as the initial condition.
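Since the Hamiltonian H − μ(x)ε(t) is self-adjoint at every time, the propagator is unitary and the L²-norm is indeed conserved. This can be checked numerically on a toy matrix discretization (the Hermitian matrices H and μ below are arbitrary stand-ins, not the paper's operators):

```python
import numpy as np

rng = np.random.default_rng(0)

def hermitian(n):
    # random Hermitian matrix standing in for a discretized operator
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

def unitary_step(h, dt):
    # exp(-i*h*dt) via eigendecomposition; unitary since h is Hermitian
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

n = 16
H = hermitian(n)      # stand-in for the internal Hamiltonian H_0 + V
mu = hermitian(n)     # stand-in for the dipole operator
eps, dt = 0.3, 0.01   # constant field amplitude and time step

psi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
psi /= np.linalg.norm(psi)   # normalized initial state

U = unitary_step(H - eps * mu, dt)
for _ in range(1000):
    psi = U @ psi

print(abs(np.linalg.norm(psi) - 1.0) < 1e-12)  # True: the norm is conserved
```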
The optimal control framework can then be applied to this bilinear control system to design relevant control fields. The quality of the pulse is evaluated via a cost functional. A general example of such a functional is

(1.2)  J(ε) = ‖ψ_ε(T) − ψ_target‖²_{L²(Ω)} + ∫_0^T α(t) ε²(t) dt,

where T is the total time of control, α a positive function, and ψ_target a target state which has to be reached. At a minimum of the cost functional J, the Euler–Lagrange critical point equations are satisfied; a standard way to write these equations is to use a Lagrange multiplier χ_ε(x, t), called the adjoint state. The following critical point equations are thus obtained [37]:

(1.3)  i ∂_t ψ_ε = (H − μ ε(t)) ψ_ε,   ψ_ε(x, 0) = ψ_init(x),
(1.4)  i ∂_t χ_ε = (H − μ ε(t)) χ_ε,   χ_ε(x, T) = A ψ_ε(x, T),
(1.5)  α(t) ε(t) = −Im⟨χ_ε(t), μ ψ_ε(t)⟩,

where A is an operator on L²(Ω); for the cost functional (1.2), the final condition reduces to χ_ε(x, T) = ψ_target(x). Numerous optimization procedures exist to compute iteratively sequences (ε^k)_{k∈N} that approximate the solution of (1.3)–(1.5). The common feature of these algorithms is that they involve repeated resolutions of the Schrödinger equations (1.3) and (1.4), which induces a heavy computational cost. Depending on their order, the mere computation of ε^k can also be time consuming because of the high nonlinearity of the cost functional. In order to reduce the computation time, some time parallelization of the resolution of (1.3)–(1.5) can be done. A standard method consists of subdividing the interval [0, T] into subintervals and computing iteratively the adequate initial conditions for parallel resolution on each subinterval. This approach is also the basis of the multiple shooting methods (see [22, sect. 17.1] and, e.g., [8]). In [10], a comparison between these methods shows that the parareal algorithms can be recast as a multiple shooting algorithm where the Jacobian matrix is approximated by a finite difference method on a coarse grid.
Another body of related literature originates in the pioneering works of Hackbusch; see [11,12] and also [14,19,35]. The parareal method corresponds to a two-level multigrid with a larger coarsening rate and an unusual smoother which corresponds to a single phase of a bicolor relaxation scheme.
Such time parallelization procedures have already been combined with optimization procedures to tackle control problems, e.g., in the case of ordinary differential equations [13], or the linear control of hyperbolic [17] or parabolic [4,34] evolution equations. In contrast, we present here a new method to treat the bilinear optimal control of the Schrödinger equation (1.1), and consider a particular decomposition of J into sub-cost functionals corresponding to the time subdivision. In this framework, an iterative optimization procedure is designed that converges to a critical point of J.
The paper is organized as follows: the parareal optimal control setting corresponding to our quantum control problem is presented in section 2. In section 3, we give an algorithm achieving a parallel-in-time optimization. We prove the convergence of this procedure towards a critical point of a discrete version of the cost functional J in section 4, and we finally give some numerical results in section 5.

Parareal control setting.
Throughout this section the control field ε is either a function of L 2 ([0, T ]) or its corresponding time discretization.

Main features of the parallelization.
Following Lions, Maday, and Turinici [18], we now introduce the necessary concepts and tools involved in the time parallelization proposed by the parareal approach.

Subdivision of [0, T] and virtual controls.
Let N ≥ 1 be an integer. In order to design the time-parallelized optimization procedure, we introduce a subdivision of [0, T]:

0 = T_0 < T_1 < ... < T_N = T.

Consider also a set Λ = (λ_ℓ)_{ℓ=1,...,N−1} ∈ (L²(Ω))^{N−1}. In what follows, Λ will be called the set of virtual controls. For notational simplicity, we will also denote by λ_0 the initial state ψ_init and by λ_N the target state ψ_target. The resolution of (1.1) is now substituted by the resolution of the N problems

(2.1)  i ∂_t ψ_{ε,ℓ} = (H − μ ε_ℓ(t)) ψ_{ε,ℓ} on [T_ℓ, T_{ℓ+1}],   ψ_{ε,ℓ}(T_ℓ) = λ_ℓ,

where ε_ℓ is the restriction of ε to [T_ℓ, T_{ℓ+1}] (with ℓ = 0, ..., N − 1). By (2.1), ψ_{ε,ℓ} also depends on λ_ℓ; in order to simplify the notation, we omit this dependence. The solution of (2.1) coincides with that of (1.1) if and only if

(2.2)  λ_{ℓ+1} = ψ_{ε,ℓ}(T_{ℓ+1}),   ℓ = 0, ..., N − 2.

The parareal framework provides different methods to iteratively compute a solution Λ* of (2.1)–(2.2).
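The equivalence between (1.1) and the subdivided problems (2.1)–(2.2) can be illustrated on a toy discretization: when each virtual control is set to the endpoint of the previous subinterval, the concatenated solution matches a single global propagation (the Hamiltonian below is an arbitrary Hermitian stand-in):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
H = (a + a.conj().T) / 2              # arbitrary Hermitian stand-in

def prop(psi, dt, steps):
    # exact-exponential propagation over `steps` uniform time steps
    w, v = np.linalg.eigh(H)
    u = (v * np.exp(-1j * w * dt)) @ v.conj().T
    for _ in range(steps):
        psi = u @ psi
    return psi

T, n, N = 1.0, 100, 10                # horizon, fine steps, subintervals
dt = T / n
psi0 = np.zeros(8, complex)
psi0[0] = 1.0

ref = prop(psi0, dt, n)               # one global propagation of (1.1)

# subdivided resolution: lam plays the role of the virtual control lambda_l,
# updated through the matching condition (2.2)
lam = psi0
for l in range(N):
    lam = prop(lam, dt, n // N)       # problem (2.1) on [T_l, T_{l+1}]

print(np.linalg.norm(lam - ref))      # ~0: the two resolutions coincide
```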

Coarse and fine propagators.
Suppose that the numerical simulation of (1.1) can be realized both by a coarse propagator and by a fine propagator. Because the use of the coarse propagator is considered to be cheap, it can be used for the resolution of (1.1) over the whole interval of control [0, T]. On the contrary, the fine propagator will only be used for parallel resolutions on the subintervals [T_ℓ, T_{ℓ+1}].
The analysis of the algorithm presented below requires that the L²-norm of the finely approximated solution be constant in time (as is the case for the exact solution of (1.1)). Because of its numerical accuracy, we choose to consider the Strang second-order split-operator solver [3,30], which fulfills this property. Let us present it in some detail.
Remark 1. Though we focus on the Strang second-order split-operator scheme, the methodology presented in this paper can be adapted to other time discretizations. Indeed, the analysis done below requires only that the norm of the wavefunction be preserved during the propagation.
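A minimal sketch of the Strang second-order split-operator step, here for a 1D periodic grid with an arbitrary toy potential and dipole (not the paper's model); every factor is a unitary multiplication, so the L²-norm is preserved to machine precision:

```python
import numpy as np

nx = 128
x = np.linspace(-10.0, 10.0, nx, endpoint=False)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(nx, d=dx)   # Fourier wavenumbers
V = 0.5 * x**2                             # toy potential
mu = x                                     # toy dipole moment
dt = 0.01

def strang_step(psi, eps):
    # half step with the potential part, full kinetic step in Fourier
    # space, then the second potential half step
    psi = np.exp(-0.5j * dt * (V - mu * eps)) * psi
    psi = np.fft.ifft(np.exp(-0.5j * dt * k**2) * np.fft.fft(psi))
    psi = np.exp(-0.5j * dt * (V - mu * eps)) * psi
    return psi

psi = np.exp(-x**2).astype(complex)        # Gaussian initial state
psi /= np.linalg.norm(psi) * np.sqrt(dx)   # unit L2 norm

for j in range(500):
    psi = strang_step(psi, eps=0.1 * np.sin(0.05 * j))

norm = np.linalg.norm(psi) * np.sqrt(dx)
print(abs(norm - 1.0) < 1e-12)             # True: every factor is unitary
```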

Parareal strategy.
Parareal algorithms aim at computing in parallel, on each subinterval, the solution of evolution equations such as (1.1). To do this, they propose various iterative methods to update the virtual controls after each parallel propagation. The purpose of what follows is to define an update algorithm that allows one to couple the parareal approach with an optimization procedure of quantum control.
Note that the optimization problems defined on [T_ℓ, T_{ℓ+1}] via the sub-cost functionals J_ℓ are similar to the initial control problem on [0, T] corresponding to (1.2).

Monotonic parareal algorithm.
Our aim is to optimize J(ε, Λ) with respect to its two variables. We first present the main features of our algorithm and then give further details on each step.

Structure of the algorithm.
To couple the parareal framework and the control optimization, we propose the following methodology. Given ν > 0, consider a termination criterion of the form c(ε^k) ≤ ν. Given an initial control field ε^k and a tolerance tol ≥ 0, the computation of ε^{k+1} is done as follows:
1. If the termination criterion is satisfied, stop.
2. Compute the state ψ^k associated with ε^k by a forward propagation of (1.3) on [0, T].
3. Compute the adjoint state χ^k associated with ε^k by a backward propagation of (1.4) on [0, T].
4. Using ψ^k and χ^k, compute Λ^k, which optimizes J(ε^k, Λ) with respect to Λ.
5. On each interval [T_ℓ, T_{ℓ+1}], compute in parallel a control field ε_ℓ^{k+1} which optimizes the cost functional J_ℓ(ε_ℓ, λ_ℓ^k, λ_{ℓ+1}^k) with respect to ε_ℓ.
6. Define ε^{k+1} as the concatenation of the control fields ε_ℓ^{k+1}.
7. Assign k ← k + 1 and return to step 1.
This algorithm is similar to an alternate-direction descent algorithm, in the sense that it alternately optimizes J(ε, Λ) with respect to Λ and with respect to ε.
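The steps above can be organized as in the following schematic sketch, where the propagators and the per-interval optimizer are placeholder callables (the halving "optimizer" and all names are illustrative, not the paper's schemes) and the parallel loop of step 5 is written sequentially:

```python
import numpy as np

def monotonic_parareal(eps0, coarse_forward, coarse_backward,
                       virtual_controls, optimize_interval, max_iter=10):
    # eps is stored per subinterval: eps[l] is the field on [T_l, T_{l+1}];
    # the termination test (step 1) is folded into the iteration bound
    eps = [e.copy() for e in eps0]
    N = len(eps)
    for _ in range(max_iter):
        psi = coarse_forward(eps)          # step 2: coarse forward sweep
        chi = coarse_backward(eps)         # step 3: coarse backward sweep
        lam = virtual_controls(psi, chi)   # step 4: lam[0..N]
        eps = [optimize_interval(l, eps[l], lam[l], lam[l + 1])
               for l in range(N)]          # step 5 (parallelizable)
    return np.concatenate(eps)             # step 6: concatenated field

# toy instantiation: propagators/controls are dummies, and the per-interval
# "optimizer" simply damps the field, just to exercise the control flow
N, m = 4, 5
eps0 = [np.ones(m) for _ in range(N)]
dummy = lambda eps: [None] * (N + 1)
halve = lambda l, e, lam_l, lam_r: 0.5 * e
out = monotonic_parareal(eps0, dummy, dummy,
                         lambda psi, chi: [None] * (N + 1), halve)
print(out.shape, out[0])                   # (20,) 0.0009765625
```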
Of course, to take advantage of the time parallelization, steps 2 and 3 of the previous algorithm are to be computed using the coarse propagator, whereas step 5 can be done simultaneously and by fine propagations.
Remark 2. Solving (1.4) exploits in an essential manner the time-reversibility of the Schrödinger equation. Further work is required to extend this algorithm to nonreversible cases.

Virtual controls definition.
We present in this section some results which will enable us to achieve step 4 of the monotonic parareal algorithm efficiently. As we do not intend to deal with the coarse propagator properties, we will consider in this section that steps 2 and 3 are done with the split-operator method presented in section 2.1.2, with the (small) time step δt. Even though this is not what we want to do ultimately, the results below keep a practical interest, since the most expensive part of the algorithm, the update of the control field, will be done in parallel. Thus, given a control field ε = (ε_j)_{j=0,...,n−1}, the states ψ^ε = (ψ_j^ε)_{j=0,...,n} and the adjoint states χ^ε = (χ_j^ε)_{j=0,...,n} are defined by the discrete propagations (3.1) and (3.2). For reasons of simplicity, we use in what follows the notation F_{j→j'}^ε for the discrete propagator mapping the state at time index j to the state at time index j', where 0 ≤ j, j' ≤ n are two (time) indices and ψ^ε = (ψ_j^ε)_{j=0,...,n} is the solution of (3.1). We still denote by J the time-discretized cost functional corresponding to (1.2). The following theorem provides the optimal choice of virtual controls Λ.

Theorem 3.1. With the previous notations, let us define Λ^ε = (λ_ℓ^ε)_{ℓ=1,...,N−1} by

λ_ℓ^ε = (1 − n_ℓ/n) ψ_{n_ℓ}^ε + (n_ℓ/n) χ_{n_ℓ}^ε,

where n_ℓ is the time index corresponding to T_ℓ. Then Λ^ε minimizes Λ ↦ J(ε, Λ); moreover, we have J(ε, Λ^ε) = J(ε).

Proof. For a given Λ = (λ_ℓ)_{ℓ=1,...,N−1}, let us first prove that J(ε) is a lower bound for J(ε, Λ). The Cauchy–Schwarz inequality, together with the fact that F^ε is a unitary operator, gives, by combining (3.9) and (3.6), J(ε) ≤ J(ε, Λ). By (3.4), combined with (2.4), equality holds for the choice Λ = Λ^ε, and the theorem follows.
Remark 3. The trajectory ((1 − j/n) ψ_j^ε + (j/n) χ_j^ε)_{j=0,...,n} is an ideal trajectory that reaches exactly the target ψ_target. The choice Λ = Λ^ε is equivalent to defining the virtual controls on this trajectory. This interpretation is closely related to the concept of reference trajectory tracking: through the introduction of the parareal cost functionals, the initial problem is transformed into N − 1 optimization problems that aim to minimize, on each subinterval, the distance between the current trajectory and this ideal (unknown) trajectory.
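Assuming the interpolation weights j/n suggested by this remark (so that the ideal trajectory starts at ψ_init and ends at the adjoint's final condition ψ_target), the construction can be sketched as follows, with random unit vectors standing in for the discrete trajectories:

```python
import numpy as np

n, dim, N = 100, 8, 10
rng = np.random.default_rng(2)

def unit(v):
    return v / np.linalg.norm(v)

# random unit vectors standing in for the discrete trajectories
psi = np.array([unit(rng.standard_normal(dim) + 1j * rng.standard_normal(dim))
                for _ in range(n + 1)])
chi = np.array([unit(rng.standard_normal(dim) + 1j * rng.standard_normal(dim))
                for _ in range(n + 1)])
psi_init, psi_target = psi[0], chi[-1]

j = np.arange(n + 1)[:, None] / n
ideal = (1 - j) * psi + j * chi        # ideal trajectory of Remark 3

# virtual controls: the ideal trajectory sampled at the interface indices
n_l = (np.arange(1, N) * n) // N
lam = ideal[n_l]

assert np.allclose(ideal[0], psi_init)     # starts at the initial state
assert np.allclose(ideal[-1], psi_target)  # ends at the target
print(lam.shape)                           # (9, 8): the N-1 virtual controls
```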
Remark 4. An alternative definition for Λ can be λ_ℓ = λ_ℓ^ε / ‖λ_ℓ^ε‖_{L²(Ω)}, which has the advantage that the norms of the virtual controls are preserved. Furthermore, it can be proved that λ_ℓ corresponds to a critical point of J(ε, Λ) under the constraint ‖λ_ℓ‖_{L²(Ω)} = 1.

Monotonic schemes.
Let us now describe a practical implementation of step 5 of the monotonic parareal algorithm. Given a set of virtual controls Λ = (λ_ℓ)_{ℓ=1,...,N−1} (recall that λ_0 = ψ_init and λ_N = ψ_target), the parareal cost functional reads

(3.13)  J_ℓ(ε_ℓ, λ_ℓ, λ_{ℓ+1}) = ‖λ_ℓ‖² + ‖λ_{ℓ+1}‖² − 2 Re⟨ψ_{ε,ℓ}(T_{ℓ+1}), λ_{ℓ+1}⟩ + ∫_{T_ℓ}^{T_{ℓ+1}} α(t) ε_ℓ²(t) dt,

where we have used that ‖ψ_{ε,ℓ}(T_{ℓ+1})‖ = ‖λ_ℓ‖. The first two terms of J_ℓ will not change during the optimization of ε_ℓ. An efficient way to minimize cost functionals of the form J(ε) = −2 Re⟨ψ(T), ψ_target⟩ + ∫_0^T α(t) ε²(t) dt associated with the Schrödinger equation is to use monotonic schemes [21,37].
In our case, the time-discretized monotonic scheme corresponding to (3.13) can be defined by the following procedure.
Let us consider (δ, η) ∈ [0, 2[ × [0, 2[ and introduce μ*(h), an approximation of μ in whose definition Id denotes the identity operator. Given a control field ε^k, its restriction ε_ℓ^k to the interval [T_ℓ, T_{ℓ+1}] and the corresponding states ψ_ℓ^k = (ψ_{ℓ,j}^k)_{j=n_ℓ,...,n_{ℓ+1}} are defined iteratively by the split-operator propagation (3.15). The computation of ε^{k+1} is performed as follows:
1. For ℓ = 0, ..., N − 1, compute iteratively an intermediate control field ε̃_ℓ^k and its corresponding adjoint state χ_ℓ^k by the backward propagation (3.16), where ε̃_{ℓ,j}^k solves the implicit equation (3.17), with the final condition χ_{ℓ,n_{ℓ+1}}^k = λ_{ℓ+1}^k.
2. For ℓ = 0, ..., N − 1, compute iteratively the control field ε_ℓ^{k+1} and its corresponding state ψ_ℓ^{k+1} by the forward propagation (3.19), where ε_{ℓ,j}^{k+1} solves the implicit equation (3.18), with the initial condition ψ_{ℓ,n_ℓ}^{k+1} = λ_ℓ^k.
The implicit equations (3.17) and (3.18) are solved independently for ε̃_{ℓ,j}^k and ε_{ℓ,j}^{k+1}, respectively, at each time step by a Newton method (all the other parameters involved are known). We refer the reader to [20] for a proof of the existence of solutions and for further details on this scheme.
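To convey the flavor of a monotonic scheme (in a simplified explicit form, not the implicit (δ, η)-discretization above), the following sketch runs a Krotov-type update on an arbitrary two-level system: a backward sweep for the adjoint, then a forward sweep where each ε_j is taken as −(1/α) Im⟨χ_j, μψ_j⟩; the cost typically decreases across the sweeps:

```python
import numpy as np

# arbitrary two-level system (not the paper's model)
H0 = np.diag([0.0, 1.0])
mu = np.array([[0.0, 1.0], [1.0, 0.0]])
psi_init = np.array([1.0, 0.0], complex)
psi_target = np.array([0.0, 1.0], complex)
alpha, T, n = 1.0, 5.0, 500
dt = T / n

def step(v, eps, sign=-1.0):
    # exact-exponential step of i dv/dt = (H0 - eps*mu) v; sign=+1 is backward
    w, p = np.linalg.eigh(H0 - eps * mu)
    return (p * np.exp(sign * 1j * w * dt)) @ (p.conj().T @ v)

def cost(field):
    v = psi_init
    for e in field:
        v = step(v, e)
    return np.linalg.norm(v - psi_target) ** 2 + alpha * dt * np.sum(field**2)

eps = np.zeros(n)
costs = [cost(eps)]
for _ in range(5):
    # backward sweep: adjoint with final condition chi(T) = psi_target
    chi = np.empty((n + 1, 2), complex)
    chi[n] = psi_target
    for j in range(n - 1, -1, -1):
        chi[j] = step(chi[j + 1], eps[j], sign=+1.0)
    # forward sweep with immediate update of the field
    psi = psi_init
    for j in range(n):
        eps[j] = -np.imag(np.vdot(chi[j], mu @ psi)) / alpha
        psi = step(psi, eps[j])
    costs.append(cost(eps))

print(costs[0], costs[-1])   # the cost decreases across the sweeps
```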
In what follows, the initial value ε^0 of the monotonic scheme is considered fixed. An important property of this algorithm, its monotonicity, is given in the following theorem [21].
This optimization procedure can be done in parallel on each interval [T_ℓ, T_{ℓ+1}]. Consequently, the computations can be carried out with the fine propagators F^{ε^{k+1}}. Remark 5. As was the case for step 4 (see Remark 3), this step of the algorithm is also linked to the concept of reference trajectory tracking: in the monotonic schemes, the control field ε̃_{ℓ,j}^k (resp., ε_{ℓ,j}^{k+1}) is chosen such that the distance between the current state and the adjoint state decreases at each time step. We refer the reader to [28,32] for details about the relationship between the monotonic schemes and local tracking algorithms.
Remark 6. Note that several iterations of this scheme can be done during step 5 of the monotonic parareal algorithm, especially in the case of slow convergence (see Table 5.1 in section 5.4 for numerical results on this point).
The algorithm is schematically depicted in Figure 3.1.

Monotonicity of the algorithm.
The combination of the two previous strategies allows us to define Λ^k := Λ^{ε^k} and ε^{k+1} as the concatenation of the sequence (ε_ℓ^{k+1})_{ℓ=0,...,N−1}. We have thus obtained a global monotonic algorithm, since

J(ε^{k+1}) ≤ J(ε^{k+1}, Λ^k) ≤ J(ε^k, Λ^k) = J(ε^k),

where the first inequality follows from Theorem 3.1, the second from the monotonicity of the schemes of step 5, and the last equality from the optimality of Λ^k = Λ^{ε^k}.

Convergence of the algorithm.
The convergence of the sequence (ε^k)_{k∈N} defined by the previous algorithm is described in the next theorem.
Theorem 4.1. Given an initial control field ε^0, consider the sequence (ε^k)_{k∈N} obtained by the algorithm (3.16)–(3.19), where the coarse propagator in steps 2–3 is taken to be the same as the fine propagator. The sequence (ε^k)_{k∈N} converges towards a critical point of J.
Proof. Since the proof is very similar to that presented in [26], we give only a brief overview.
We are now in the position to reproduce the analysis of Theorem 4.5 of [26], which shows that the increments ε^{k+1} − ε^k are summable with respect to ‖·‖_2, the usual Euclidean norm on R^n. Hence, the sequence (ε^k)_{k∈N} is a Cauchy sequence, and by (4.6) the theorem follows.

Model.
In order to test the performance of the algorithm, a case already treated in the literature has been considered. The system is a molecule of HCN modelled as a rigid rotator. We refer the reader to [1,27] for numerical details concerning this system. The goal is to control the orientation of the system; this is expressed through the requirement that ‖ψ(T) − ψ_target‖_{L²(Ω)} be minimized, where the target function ψ_target corresponds to an oriented state.

Propagators.
All the propagations are done with a Strang second-order split-operator scheme as, e.g., in (3.15). The coarse propagator, corresponding to (3.1) and (3.2), is used with a large time step, whereas the fine propagator, as it appears in (3.17) and (3.18), is used with a small time step. The ratio of these two time steps is 10. We have observed that the renormalization (3.12) slightly reduces the speed of convergence, but has no effect on the converged control fields.
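The effect of the time-step ratio can be illustrated with a second-order splitting on a toy matrix problem (arbitrary Hermitian stand-ins for the kinetic and potential parts): a step 10 times larger yields an error roughly 100 times larger:

```python
import numpy as np

rng = np.random.default_rng(3)

def hermitian(n):
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

A, B = hermitian(6), hermitian(6)   # stand-ins for kinetic/potential parts

def expm_h(h, t):
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

def strang(psi, dt, steps):
    # second-order splitting: exp(-iB dt/2) exp(-iA dt) exp(-iB dt/2)
    ua, ub = expm_h(A, dt), expm_h(B, dt / 2)
    for _ in range(steps):
        psi = ub @ (ua @ (ub @ psi))
    return psi

T = 1.0
psi0 = np.zeros(6, complex)
psi0[0] = 1.0
exact = expm_h(A + B, T) @ psi0

fine = strang(psi0, T / 1000, 1000)   # fine time step
coarse = strang(psi0, T / 100, 100)   # time step 10x larger

e_fine = np.linalg.norm(fine - exact)
e_coarse = np.linalg.norm(coarse - exact)
print(e_coarse / e_fine)              # ~100 for a second-order scheme
```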

Numerical convergence.
Let us present some results concerning the convergence of the monotonic parareal algorithm.

Evolution of (ε^k)_{k∈N}.
In a first test, N = 10 identical subintervals are considered to parallelize the algorithm. Figures 5.1–5.3 show a typical evolution of the sequence (ε^k)_{k∈N} computed by our algorithm. Figure 5.4 represents the control field obtained without parallelization by a monotonic algorithm applied to J. Note that the discontinuities that are visible in Figure 5.1, and even in Figure 5.2, disappear as the number of iterations increases.

Variation of the number of subintervals.
The control field obtained at numerical convergence appears to be independent of the number N of subintervals. This is consistent with Theorem 4.1, which states that the limit depends only on the initial cost functional J. In order to evaluate the effect of N on the convergence of the algorithm, we have run the algorithm for different values of this parameter. Figure 5.5 shows the evolution of the cost functional values for N = 1, 2, 5, 10, 50, 100. The convergence speed decreases when N becomes larger. One thus has to find an optimum between the acceleration obtained by parallelization and the reduction of the convergence speed induced by it. This compromise depends on the choice of the coarse and fine propagators. In our case, the parallel optimizations and the coarse propagations allow us to reach a satisfactory cost functional value with an actual gain in "wall-clock" time roughly equal to 7. We have not sought to optimize the coarse and the fine propagators for these numerical tests; in particular, a coarser propagator should improve this result. To approach full efficiency, a strategy would be to increase the parallel computations (step 5) with respect to the coarse propagations (steps 2 and 3 of the algorithm).
In this perspective, the influence on the convergence of the number of iterations (denoted by m) of the monotonic algorithm (3.17)–(3.18) done during step 5 has been tested (see Remark 6). Let us denote by k the number of iterations of the monotonic parareal algorithm, by T_f the computational time of one iteration of the monotonic algorithm on a subinterval, and by T_C the computational time of both step 2 and step 3. Table 5.1 summarizes the cases that have been tested. Case 0 corresponds to the nonparallelized monotonic algorithm (i.e., with N = 1) computed with the fine propagator. Figure 5.6 shows that the best strategy corresponds to case 1. Further work is needed to reach full efficiency (corresponding in our case to a computational time close to 100 T_f).
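A back-of-the-envelope sketch of the wall-clock accounting used here (all timings illustrative, not measured, and assuming equal iteration counts for both algorithms): one parareal iteration costs about T_C + m·T_f, since the m fine monotonic updates run simultaneously on the N subintervals, against N·m·T_f per iteration for the serial algorithm:

```python
def wall_clock_gain(N, m, Tf, Tc):
    # serial monotonic algorithm: each iteration sweeps all N subintervals
    # with the fine propagator (cost N*m*Tf)
    serial = N * m * Tf
    # monotonic parareal: coarse sweeps (Tc) plus m fine monotonic updates
    # carried out simultaneously on the N subintervals (cost Tc + m*Tf)
    parallel = Tc + m * Tf
    return serial / parallel

# illustrative values: N = 10 subintervals, one fine update per iteration,
# coarse sweeps at 40% of one fine update (hypothetical numbers)
gain = wall_clock_gain(N=10, m=1, Tf=1.0, Tc=0.4)
print(round(gain, 2))   # 7.14, the order of magnitude reported above
```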

Fig. 3.1. Symbolic representation of one iteration of the monotonic parareal algorithm. The optimization is achieved in parallel on each subinterval [T_ℓ, T_{ℓ+1}]. The virtual controls λ_ℓ^k are updated at each iteration.
Table 5.1. Number of updates of ε per iteration.

Fig. 5.6. Evolution of the cost functional values for different values of m.