KAM, α-Gevrey regularity and the α-Bruno-Rüssmann condition

We prove a new invariant torus theorem, for α-Gevrey smooth Hamiltonian systems, under an arithmetic assumption which we call the α-Bruno-Rüssmann condition, and which reduces to the classical Bruno-Rüssmann condition in the analytic category. Our proof is direct in the sense that, for analytic Hamiltonians, we avoid the use of complex extensions and, for non-analytic Hamiltonians, we do not use analytic approximation nor smoothing operators. Following Bessi, we also show that if a slightly weaker arithmetic condition is not satisfied, the invariant torus may be destroyed. Crucial to this work are new functional estimates in the Gevrey class.


The general question
We consider small perturbations of an integrable Hamiltonian system, defined by $\dot q = \nabla_p H(q,p)$, $\dot p = -\nabla_q H(q,p)$, where $H$ is a Hamiltonian of the form $H(q,p) = h(p) + \epsilon f(q,p)$, $(q,p) \in \mathbb{T}^n \times \mathbb{R}^n$, $0 \le \epsilon < 1$, where $n \ge 2$, $\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$, $\omega_0 = \nabla h(0) \in \mathbb{R}^n$, and $\nabla^2 h(0) \in M_n(\mathbb{R})$ is non-degenerate. When $\epsilon = 0$, the torus $T_0$ of equation $p = 0$ is invariant and quasi-periodic of frequency $\omega_0$. The general question we are interested in is the persistence of this torus for $\epsilon \neq 0$ sufficiently small: does there exist a torus $T_\epsilon$ which is invariant and quasi-periodic of frequency $\omega_0$ and which converges (in a suitable sense) to $T_0$ as $\epsilon$ goes to zero?
This question was answered positively by Kolmogorov in his foundational paper [Kol54] under the assumption that $H$ is real-analytic and $\omega_0$ is a $\tau$-Diophantine vector ($\tau \ge n-1$): there exists $\gamma > 0$ such that for all $k \in \mathbb{Z}^n \setminus \{0\}$, $|k \cdot \omega_0| \ge \gamma |k|^{-\tau}$. The conclusion includes that the perturbed torus is real-analytic. It became clear that a regularity assumption on $H$ and an arithmetic condition on $\omega_0$ were necessary, and further works then investigated the interplay between the analysis and the arithmetic.
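As a purely numerical illustration of the Diophantine inequality above, the following Python sketch scans all integer vectors $k$ with $0 < |k| \le K$ and returns the smallest value of $|k \cdot \omega_0|\,|k|^{\tau}$, which bounds from above any admissible constant $\gamma$. The function name `diophantine_constant_upper_bound`, the cutoff `k_max` and the use of the $\ell^1$-norm for $|k|$ are assumptions made for this example; a finite scan can suggest, but never prove, that a vector is Diophantine.

```python
import math
from itertools import product


def diophantine_constant_upper_bound(omega, tau, k_max):
    """Return min of |k.omega| * |k|_1**tau over integer k with 0 < |k|_1 <= k_max.

    Hypothetical helper: any gamma in the tau-Diophantine inequality must be
    at most this value. A value of 0.0 signals an exact resonance k.omega = 0.
    """
    n = len(omega)
    best = math.inf
    for k in product(range(-k_max, k_max + 1), repeat=n):
        norm1 = sum(abs(c) for c in k)
        if norm1 == 0 or norm1 > k_max:
            continue  # skip k = 0 and vectors outside the l^1 ball
        small_divisor = abs(sum(c * w for c, w in zip(k, omega)))
        best = min(best, small_divisor * norm1 ** tau)
    return best


if __name__ == "__main__":
    golden = (math.sqrt(5) - 1) / 2
    # Frequency (1, golden mean) in dimension n = 2, with tau = n - 1 = 1.
    print(diophantine_constant_upper_bound((1.0, golden), tau=1.0, k_max=30))
    # A resonant frequency: k = (1, -2) gives k.omega = 0.
    print(diophantine_constant_upper_bound((1.0, 0.5), tau=1.0, k_max=5))
```

For a resonant vector the scan returns zero as soon as the cutoff reaches a resonance, whereas for a badly approximable vector the returned value stays bounded away from zero as `k_max` grows.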
It was certainly a remarkable contribution of Moser (see [Mos62]) to realize that the question can also be answered for Hamiltonians which are only finitely differentiable. More precisely (see [Sal04]), if $\omega_0$ is $\tau$-Diophantine and if $H$ is of class $C^r$, with $r > 2(\tau+1)$, then the torus persists and it is of class $C^{r'+\tau+1}$ for any $r' < r - 2(\tau+1)$. If $H$ is smooth, that is $C^\infty$, there is no restriction on $\tau$ and the perturbed torus is smooth. It follows from a recent result of Cheng-Wang [CW13] (which uses an idea of Bessi [Bes00]) that the result is false if $H$ is of class $C^r$, with $r < 2(\tau+1)$. Thus in the finitely differentiable or smooth case, one may consider this Diophantine condition as essentially optimal.
In the real-analytic setting, the Diophantine condition is not necessary. Indeed, it is sufficient to assume that ω 0 satisfies the weaker Bruno-Rüssmann condition (see §2.2 for a definition), as was first proved by Rüssmann in [Rüs01]; an equivalent condition was actually introduced earlier by Bruno [Bru71], [Bru72] in a different but related small divisors problem, the Siegel linearization problem. The necessity of this condition turns out to be a more subtle problem. In the Siegel problem, it is optimal in dimension one (this is a celebrated result of Yoccoz [Yoc88], [Yoc95]) but in higher dimension it is unknown.
In the Hamiltonian problem we are considering here, the only general result we are aware of is due to Bessi [Bes00] (extending an earlier result of Forni [For94] for twist maps of the annulus), in which a torus with a frequency not satisfying a slightly weaker condition can be destroyed by an arbitrarily small analytic perturbation. This leaves open the possibility of slightly improving the Bruno-Rüssmann condition.

Main results of the paper
Real-analytic functions are characterized by a growth of their derivatives of order $s^{-|k|} k!$ for some analyticity width $s > 0$; in the periodic case, this is equivalent to a decay of Fourier coefficients of order $e^{-s|k|}$. Given a real parameter $\alpha \ge 1$, allowing a growth of the derivatives of order $s^{-|k|} k!^{\alpha}$ or, equivalently, a decay of Fourier coefficients of order $e^{-\alpha s |k|^{1/\alpha}}$, one is led to consider $\alpha$-Gevrey functions, which thus correspond to real-analytic functions when $\alpha = 1$. Since the introduction by Gevrey of the class of functions now bearing his name ([Gev18]), there has been a large number of works on Gevrey functions, mainly for PDEs, but also more recently in other fields, including dynamical systems (see §1.3 for some related works in dynamical systems dealing with Gevrey regularity).
In this paper, we study the persistence when $\epsilon$ is small of the torus $T_0$, as a Gevrey quasi-periodic invariant torus $T_\epsilon$, under the assumption that $H$ itself has Gevrey regularity. The only general result so far is due to Popov [Pop04], who proved that the torus persists if $\omega_0$ satisfies a Diophantine condition. This result, the proof of which uses analytic approximation, extends the result of Kolmogorov when $\alpha = 1$ but not that of Rüssmann: clearly one would expect an arithmetic condition which depends on $\alpha$ and reduces to the Bruno-Rüssmann condition when $\alpha = 1$.
The main result of the paper is to solve this persistence problem, assuming that the frequency $\omega_0$ satisfies an arithmetic condition which we call the $\alpha$-Bruno-Rüssmann condition, which is weaker than the Diophantine condition and agrees with the Bruno-Rüssmann condition when $\alpha = 1$. This is the content of Theorem A; Theorem B and Theorem C deal respectively with the iso-energetic and time-periodic versions. We will also state and prove a Gevrey analogue of Arnold's normal form theorem for vector fields on the torus (Theorem E). Theorem H is a more precise, quantitative statement, with parameters, which does not require non-degeneracy, and from which Theorem A and Theorem E follow. We also notice that Bessi's ideas [Bes00] may be adapted to the Gevrey setting, to provide a necessary arithmetic condition for the invariant torus to persist (Theorem D). The so-obtained condition fails to agree with the sufficient condition of Theorem A and, as in the analytic case, it remains open to determine the optimal condition. Finally, we will also give discrete versions of Theorem A and Theorem E, which are, respectively, Theorem F and Theorem G.
When a Hamiltonian is not real-analytic, it is often the case that there is still some control on its derivatives and that it has Gevrey regularity. This may happen, for example, for the restriction of an analytic Hamiltonian to a Gevrey, symplectic, central manifold. Technically, Gevrey regularity fortunately extends the well-behaved analytic regularity in KAM theory: the effect of small denominators in Fourier series reduces to decreasing the "Gevrey width" $s$, the analogue of the analyticity width. This makes it possible to adapt Kolmogorov's proof of his invariant torus theorem without using analytic approximations or smoothing operators as in the smooth setting. Yet there are two issues one needs to solve.
The first and main issue is that the estimates needed in the general problem of perturbation theory were missing. This is why we provide an appendix with an adequate choice of norms and spaces, together with the estimates needed in our proof. In particular, Proposition 20 provides a "geometric" estimate of the composition of two Gevrey functions, in which the loss of Gevrey width is arbitrarily small when composing a function to the right by a diffeomorphism close to the identity, in continuity with the real-analytic setting. Starting with the work of Gevrey himself [Gev18], there have been many results concerning the composition of Gevrey functions (see, for instance, Yamanaka [Yam89], Marco-Sauzin [MS02], Cadeddu-Gramchev [CG03], Popov [Pop04]) but none of them allowed an arbitrarily small loss of width except in some particular cases (the one-dimensional case and the analytic case). To our knowledge, our composition result is new and may be of independent interest.
The second and minor issue is that to reach a weak arithmetic condition, it is usually better not to solve exactly the cohomological equation but an approximate version of it, and hence one cannot proceed as in Kolmogorov's proof. The strategy of Rüssmann, that we could have tried to pursue here, consists in solving this equation not for the original perturbation but for a polynomial approximation of it. We will rather adopt the strategy of [BF13], [BF14] in which periodic approximations of the frequency are used and only cohomological equations associated to periodic vectors need to be solved: estimates on the solution are straightforward in this case, unlike the cohomological equation associated to a non-resonant vector.
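To make the role of small divisors in the cohomological equation concrete, here is a minimal Python sketch that solves $\omega \cdot \nabla \chi = g - \langle g \rangle$ on $\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$ for a trigonometric polynomial $g$, one Fourier coefficient at a time. It illustrates the generic Fourier-side formula only, not the periodic-approximation scheme of [BF13], [BF14]; the dictionary representation of Fourier coefficients and the name `solve_cohomological` are assumptions of this example.

```python
import math


def solve_cohomological(g_hat, omega):
    """Solve omega . grad(chi) = g - mean(g) for a trigonometric polynomial g.

    Assumed convention: g(theta) = sum_k g_hat[k] * exp(2*pi*i*k.theta),
    with g_hat a dict mapping integer tuples k to complex coefficients.
    Returns the coefficients of the zero-mean solution chi.
    """
    chi_hat = {}
    for k, g_k in g_hat.items():
        if all(c == 0 for c in k):
            continue  # the mean of g is removed on the right-hand side
        k_dot_omega = sum(c * w for c, w in zip(k, omega))
        if k_dot_omega == 0.0:
            raise ValueError(f"resonance: k.omega = 0 for k = {k}")
        # Each mode is divided by the small divisor 2*pi*i*(k.omega).
        chi_hat[k] = g_k / (2j * math.pi * k_dot_omega)
    return chi_hat


if __name__ == "__main__":
    omega = (1.0, math.sqrt(2))
    g_hat = {(0, 0): 3.0, (1, -1): 1 + 0j, (2, 1): 0.5j}
    for k, c in solve_cohomological(g_hat, omega).items():
        print(k, c)
```

If $\omega$ is replaced by a periodic vector $v$ with $qv \in \mathbb{Z}^n$, every non-zero value of $k \cdot v$ satisfies $|k \cdot v| \ge 1/q$, because $k \cdot (qv)$ is a non-zero integer. The divisors are then bounded below uniformly, while the resonant modes with $k \cdot v = 0$ (rejected by the sketch above) have to be treated separately.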
As a further remark concerning the proof, in invariant tori problems derivatives in the angle and action directions do not play the same role: in the analytic case it is customary to introduce anisotropic norms. However, as Theorem H and its proof show, we can still get good estimates if we keep track separately of the sizes of various terms in the expansion of the Hamiltonian with respect to the actions: this turns out to be simpler than using anisotropic Gevrey norms. Such a feature is not present when dealing with linearization problems such as in Theorem E; a direct proof of the latter result would have been much simpler.

Related results
Apart from the work of Popov that we have already mentioned, there have been several works dealing with Gevrey regularity in a related context. The first setting is the so-called Siegel-Sternberg linearization problem. Under a non-resonance condition, a formal solution to the conjugacy problem always exists and Sternberg proved that the solution is in fact smooth. In the analytic case, under the Bruno-Rüssmann condition the conjugacy is analytic; this arithmetic condition is thus sufficient but also necessary in (complex) dimension one (a result of Yoccoz we already mentioned). In the Gevrey setting, still under the Bruno-Rüssmann condition, Carletti-Marmi [CM00] and Carletti [Car03] have shown that the formal solution still has Gevrey growth (with the same Gevrey exponent); an interesting feature of their result is that, allowing a worse Gevrey exponent for the formal solution, one can relax the arithmetic condition accordingly. All these results are actually valid for a class of ultra-differentiable functions that includes analytic and Gevrey functions. It was then proved by Stolovitch [Sto13] that this formal Gevrey solution actually gives rise to a Gevrey smooth solution, and recently, Pöschel [Pös17] gave a very general version of the Siegel-Sternberg theorem for ultra-differentiable functions that contains all the previous results (the smooth, analytic, Gevrey and ultra-differentiable cases). Let us mention that all these results do use stability by composition, but a precise composition result is not needed as they do not require keeping track of the width.
In the analytic setting, the Siegel problem and the problem of the linearization of circle diffeomorphisms are solved under the same arithmetic condition [PM97]. But this may well be incidental, and, to our knowledge, it may well not be true in the Gevrey setting. The only result concerning Gevrey circle diffeomorphisms we are aware of is due to Gramchev-Yoshino [GY99]: they proved the linearization theorem under a condition which is weaker than the Diophantine condition but stronger than the $\alpha$-Bruno-Rüssmann condition (they actually introduce a condition equivalent to our $\alpha$-Bruno-Rüssmann condition and conjecture that the result should hold under this condition). To prove such a result, they use a composition result but in one dimension only; in this special case, as we already pointed out above, good composition estimates are known (see, for instance, [MS02]). As a matter of fact, Theorem G (the discrete version of Theorem E) gives linearization of Gevrey torus diffeomorphisms close to a translation under the $\alpha$-Bruno-Rüssmann condition, extending the result in [GY99] (and giving a positive answer to their conjecture).

Further results
Let us describe some further results that could be achieved using the techniques of this paper. The literature on KAM theory is enormous and so there are many potential applications; we will only describe here some of those that may have some interest.
First, and more importantly, the technical estimates we derive in Appendix B for Gevrey functions actually hold true for a larger class of ultra-differentiable functions that includes Gevrey (and thus analytic) functions as a particular case. This not only leads to a further extension of the KAM theorems we state and prove here, but also allows us to generalize other perturbative results such as the Nekhoroshev theorem (extending the result of [MS02] in the convex case and [Bou11] in the steep case). To keep this paper to a reasonable length, all these results will be derived in a subsequent article [BF17].
Then, our main result, Theorem A, deals with the persistence of Lagrangian tori; KAM theory also deals with lower-dimensional tori (see, for instance, [Rüs01] for a comprehensive treatment in the analytic case), and one may expect that our result extends to such a setting.
Finally, one may consider the problem of reducibility of quasi-periodic cocycles close to constant. In the analytic case, the Bruno-Rüssmann condition is sufficient, as was shown in [CM12]; in the $\alpha$-Gevrey case, the $\alpha$-Bruno-Rüssmann condition is sufficient. In fact, this setting is simpler from a technical point of view and our Gevrey estimates are not necessary to obtain such a result; one simply needs to go through the proof of [CM12]. A possible explanation for this is that for quasi-periodic cocycles, composition occurs in a linear Lie group, thus only estimates for linear composition (product of matrices) are necessary and so everything boils down to good estimates for the product of two functions.

Plan of the paper
The plan of the paper is as follows.
In Section 2, we describe precisely the setting, namely we properly define the Gevrey norms we will use and the α-Bruno-Rüssmann condition. In Section 3 we state our main results:
• Theorem A, about the persistence of a torus in a non-degenerate Hamiltonian system under the α-Bruno-Rüssmann condition;
• Theorem B, the iso-energetic version of Theorem A;
• Theorem C, the non-autonomous time-periodic version;
• Theorem D, about the destruction of a torus in the same context when a condition weaker than the α-Bruno-Rüssmann condition is not satisfied;
• Theorem E, about linearization of vector fields on the torus close to constant (we will also discuss necessary arithmetic conditions here, albeit in a restricted context);
• Theorem F, the discrete version of Theorem A, about the persistence of a torus in a non-degenerate exact-symplectic map;
• Theorem G, the discrete version of Theorem E, about the linearization of diffeomorphisms of the torus close to a translation.
In Section 4 we state Theorem H, the main technical result of this paper, which is a KAM theorem that does not require non-degeneracy but depends on parameters. In Section 5, we give the proof of Theorems A and E, assuming Theorem H. Section 6 contains the proof of Theorem H. Section 7 contains the proof of Theorem D, a straightforward extension of the work of Bessi [Bes00]. Finally, two appendices contain technical results. Appendix A provides various characterizations of the α-Bruno-Rüssmann condition. Appendix B, which is absolutely crucial in this work, provides estimates on Gevrey functions (and in particular our composition result, Proposition 20) which are used throughout the paper.

Gevrey Hamiltonians
Recall that $n \ge 1$ is an integer, $\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$ and let $B \subseteq \mathbb{R}^n$ be a bounded open domain containing the origin. For a small parameter $\epsilon \ge 0$, we consider a Hamiltonian function

$$H(q,p) = h(p) + \epsilon f(q,p). \tag{$*$}$$

The Hamiltonian $h$ is non-degenerate at the origin if the matrix $\nabla^2 h(0)$ itself is non-degenerate. We shall assume that the Hamiltonian $H$ is $\alpha$-Gevrey on $\mathbb{T}^n \times \bar B$, with $\alpha \ge 1$ and where $\bar B$ denotes the closure of $B$ in $\mathbb{R}^n$: $H$ is smooth on an open neighborhood of $\mathbb{T}^n \times \bar B$ in $\mathbb{T}^n \times \mathbb{R}^n$ and there exists $s_0 > 0$ such that, using multi-index notation (see Appendix B) and |k| = 2n This definition can be extended to vector-valued functions $X : \mathbb{T}^n \times \bar B \to \mathbb{R}^p$ by setting where $|\cdot|_1$ is the $\ell^1$-norm of vectors in $\mathbb{R}^p$, that is, the sum of the absolute values of the components. As a rule, we will use the $\ell^1$-norm for vectors, so for simplicity we shall write $|\cdot|_1 = |\cdot|$. To emphasize the role of the "Gevrey width" $s_0$, we shall also say that $H$ is $(\alpha, s_0)$-Gevrey if (1) holds. Observe that a function is 1-Gevrey if and only if it is real-analytic, in which case the parameter $s_0 > 0$ is the width of analyticity. Properties of these Gevrey norms are described in Appendix B; in particular we explain there the (inessential) role of the factor $(|k|+1)^2$ and the normalizing constant $c > 0$ in (1).

The α-Bruno-Rüssmann condition
Given $\omega_0 \in \mathbb{R}^n$, define the function

$$\Psi_{\omega_0}(Q) := \max\left\{ |k \cdot \omega_0|^{-1} \,:\, k \in \mathbb{Z}^n,\ 0 < |k| \le Q \right\}, \qquad Q \ge 1.$$

This function $\Psi_{\omega_0}$ measures the size of the so-called small denominators which will come into play in our computations. Call $\mathrm{BR}$ the set of vectors $\omega_0$ satisfying the so-called Bruno-Rüssmann condition, and, given $\alpha \ge 1$, call $\mathrm{BR}_\alpha$ the set of vectors $\omega_0$ satisfying the $\alpha$-Bruno-Rüssmann condition, which we define as

$$\int_1^{+\infty} \frac{\ln \Psi_{\omega_0}(Q)}{Q^{1+1/\alpha}}\, dQ < \infty. \tag{BR$_\alpha$}$$

These conditions prevent $\Psi_{\omega_0}$ from growing too fast at infinity. If $\omega_0 \in \mathrm{BR} = \mathrm{BR}_1$, in particular $\Psi_{\omega_0}(Q)$ is finite for all $Q$, i.e. $\omega_0$ is non-resonant. Besides, the set $\mathrm{BR}_\alpha$ decreases with respect to $\alpha$. For example, if $\Psi_{\omega_0}(Q) = \exp(Q^\beta)$ then $\omega_0 \in \mathrm{BR}_\alpha$ if and only if $\beta < 1/\alpha$ (we let the reader check, using continued fractions if $n = 2$, that the set of vectors $\omega_0$ having such a function $\Psi_{\omega_0}$ is not empty).

Let $D_\tau$ be the set of $\tau$-Diophantine vectors ($\tau \ge n-1$), i.e. those for which there exists $\gamma > 0$ such that $\Psi_{\omega_0}(Q) \le Q^\tau/\gamma$ for all $Q \ge 1$. $D_\tau$ is non-empty and has full measure if $\tau > n-1$ [Rüs75]. As definitions show, for all $\alpha \ge 1$, we have $D_\tau \subset \mathrm{BR}_\alpha$. Thus, as Example 10 shows, $\bigcap_{\alpha \ge 1} \mathrm{BR}_\alpha \setminus \bigcup_{\tau \ge n-1} D_\tau$ has zero measure but is non-empty.

Now assume that $\omega_0$ is non-resonant. The function $\Psi_{\omega_0}$ is non-decreasing, piecewise constant, and has a countable number of discontinuities. In the sequel, it will be more convenient to work with a continuous version of $\Psi_{\omega_0}$: it is not hard to prove (see, for instance, Appendix A of [BF13]) that one can find a continuous non-decreasing function $\Psi : [1, \infty) \to [\Psi(1), +\infty)$ such that $\Psi(1) = \Psi_{\omega_0}(1)$ and For all $k \in \mathbb{Z}^n \setminus \{0\}$, we still have

$$|k \cdot \omega_0| \ge \frac{1}{\Psi(|k|)},$$

and in the condition $(\mathrm{BR}_\alpha)$ (which defines $\omega_0 \in \mathrm{BR}_\alpha$), one may use $\Psi$ instead of $\Psi_{\omega_0}$. Let us now define the function It is continuous and increasing, and thus is a homeomorphism, whose functional inverse is In Appendix A we show that the set $\mathrm{BR}_\alpha$ agrees with the set $A_\alpha$ defined by the condition

Main results

KAM theorem for non-degenerate integrable Hamiltonians
The image of the map $\Theta_0 : \mathbb{T}^n \to \mathbb{T}^n \times B$, $q \mapsto (q, 0)$, is an embedded torus invariant by the flow of $h$ carrying a quasi-periodic flow with frequency $\omega_0$. We shall prove that this quasi-periodic invariant Gevrey-smooth embedded torus is preserved by an arbitrarily small perturbation, provided $h$ is non-degenerate, $H$ is $\alpha$-Gevrey and $\omega_0$ satisfies the $\alpha$-Bruno-Rüssmann condition.
Theorem A. Let H be as in ( * ), where H is (α, s 0 )-Gevrey, ω 0 ∈ BR α and h is nondegenerate. Then there exists 0 < s ′ 0 < s 0 such that for ǫ small enough, there exists an (α, s ′ 0 )-Gevrey torus embedding Θ ω 0 : T n → T n × B such that Θ ω 0 (T n ) is invariant by the Hamiltonian flow of H and quasi-periodic with frequency ω 0 . Moreover, Θ ω 0 is close to Θ 0 in the sense that for some constant c > 0 independent of ǫ.
Theorem A will be deduced from a KAM theorem for a Hamiltonian with parameters, for which a quantitative statement is given in §4. Let us also state the corresponding iso-energetic and non-autonomous time-periodic versions.
We say that the integrable Hamiltonian h is iso-energetically non-degenerate at 0 if the so-called bordered Hessian of h, has a non-zero determinant. Under this assumption, the unperturbed torus p = 0, with energy h(0), can be continued to a torus with the same energy but with a frequency of the form λω 0 for λ close to one.
Theorem B. Let H be as in ( * ), where H is (α, s 0 )-Gevrey, ω 0 ∈ BR α and h is isoenergetically non-degenerate. Then there exists 0 < s ′ 0 < s 0 such that for ǫ small enough, there exist λ ∈ R * and an (α, s ′ 0 )-Gevrey torus embedding Θ ω 0 : T n → T n × B such that Θ ω 0 (T n ) is invariant by the Hamiltonian flow of H, contained in H −1 (h(0)) and quasiperiodic with frequency λω 0 . Moreover, λ is close to one and Θ ω 0 is close to Θ 0 in the sense that for some constant c > 0 independent of ǫ.
We can also look at the non-autonomous time-periodic version; we consider a slightly different setting by looking at a Hamiltonian function $\tilde H$ : It is better to consider the unperturbed torus $p = 0$ as an invariant torus for the integrable Hamiltonian $\tilde h : B \times \mathbb{R} \to \mathbb{R}$ defined by $\tilde h(p, e) := h(p) + e$: it is then quasi-periodic with frequency $\tilde\omega_0 := (\omega_0, 1)$, has dimension $n + 1$ and is the image of the trivial embedding

Theorem C. Let $\tilde H$ be as in $(*)$, where $\tilde H$ is $(\alpha, s_0)$-Gevrey, $\omega_0 \in \mathrm{BR}_\alpha$ and $h$ is non-degenerate. Then there exists $0 < s'_0 < s_0$ such that for $\epsilon$ small enough, there exists an $(\alpha, s'_0)$-Gevrey torus embedding $\tilde\Theta_{\omega_0} : \mathbb{T}^n \times \mathbb{T} \to \mathbb{T}^n \times B \times \mathbb{T}$ such that $\tilde\Theta_{\omega_0}(\mathbb{T}^n \times \mathbb{T})$ is invariant by the Hamiltonian flow of $\tilde H$ and quasi-periodic with frequency $\tilde\omega_0$. Moreover, $\tilde\Theta_{\omega_0}$ is close to $\tilde\Theta_0$ in the sense that for some constant $c > 0$ independent of $\epsilon$.
Theorem B and Theorem C are essentially equivalent statements and can be easily deduced from Theorem A; in the analytic case details are given in [TZ10], Chapter 2, but it is plain to observe that the arguments still work in the Gevrey case.

Destruction of invariant tori
According to Theorem A, the α-Bruno-Rüssmann condition is sufficient for the preservation of an invariant torus under an α-Gevrey perturbation. A natural question is: is it necessary? To this question, here we only bring a partial answer, which circumscribes the optimal arithmetic condition, if any. Following Bessi [Bes00], one can show that if ω = ω 0 satisfies a condition (the condition (B α ) defined below), the torus can be destroyed. In particular, this shows that the exponent 1 + 1/α in (BR α ) cannot be replaced by a strictly larger exponent. As a matter of fact, the example of Bessi already shows this in the analytic case α = 1; our observation here is that Bessi's example gives a similar result for any α ≥ 1.
Theorem D. Given α ≥ 1, assume that the vector ω ∈ R n satisfies the following condition: Then an invariant torus with frequency ω can be destroyed by an arbitrarily small α-Gevrey perturbation.
Thus the condition that ω 0 does not satisfy (B α ), namely is a necessary condition for the conclusion of Theorem A to hold true. For α = 1, this condition (R α ) is actually a sufficient (and most probably necessary) condition to solve the cohomological equation associated to ω (see [Rüs75]); in the general case α ≥ 1 this should also be true but we couldn't find a reference. Let us also note that (R α ) is implied by (but clearly not equivalent to) the condition that ω ∈ BR α , see Remark 1 in Appendix A.
For a more precise statement and how this follows from [Bes00], we refer to Theorem 7 in Section 7. It is likely that one could improve this result for α > 1 by using perturbations with compact support as in [CW13].
Observe that for any $\alpha \ge 1$ and any $0 < \beta < \alpha$, vectors $\omega \in \mathbb{R}^n$ for which $\Psi_\omega(Q) \sim e^{Q^{1/\alpha}}$ satisfy $(B_\alpha)$ but also the $\beta$-Bruno-Rüssmann condition. (That such vectors do exist is a classical matter in number theory.) The following corollary is then obvious.
Corollary 1. For any $\alpha \ge 1$ and any $0 < \beta < \alpha$, there exist invariant tori with frequency vectors $\omega \in \mathrm{BR}_\beta$ which can be destroyed by an arbitrarily small $\alpha$-Gevrey perturbation. In particular, there exist invariant tori with frequency vectors $\omega \in \mathrm{BR}$ which can be destroyed by an arbitrarily small Gevrey non-analytic perturbation.
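The arithmetic behind Corollary 1 can be checked by hand. The computation below is a sketch under two assumptions: that $\omega \in \mathrm{BR}_\beta$ means $\int_1^{+\infty} \ln \Psi_\omega(Q)\, Q^{-(1+1/\beta)}\, dQ < \infty$ (the integral form suggested by the exponent $1 + 1/\alpha$ mentioned above), and that the model growth $\ln \Psi_\omega(Q) = Q^{1/\alpha}$ holds exactly rather than only asymptotically.

```latex
% beta-Bruno-Ruessmann integral for ln Psi(Q) = Q^{1/alpha}, with 0 < beta < alpha:
\int_1^{+\infty} \frac{Q^{1/\alpha}}{Q^{1+1/\beta}}\, dQ
  = \int_1^{+\infty} Q^{-1-\left(\frac{1}{\beta}-\frac{1}{\alpha}\right)}\, dQ
  = \frac{1}{\frac{1}{\beta}-\frac{1}{\alpha}}
  = \frac{\alpha\beta}{\alpha-\beta} < \infty,
% whereas at the borderline beta = alpha the integral diverges:
\qquad
\int_1^{+\infty} \frac{Q^{1/\alpha}}{Q^{1+1/\alpha}}\, dQ
  = \int_1^{+\infty} \frac{dQ}{Q} = +\infty .
```

Under these assumptions such a frequency belongs to $\mathrm{BR}_\beta$ for every $\beta < \alpha$ while failing the $\alpha$-Bruno-Rüssmann condition itself, which is the borderline situation exploited in the corollary.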

KAM theorem for constant vector fields on the torus
Now we state a Gevrey version of Arnold's normal form theorem for vector fields on the torus.
Theorem E. Let ω 0 ∈ BR α and X ∈ G α,s (T n , R n ) a vector field on T n of the form Then, for µ sufficiently small, there exist a vector ω * 0 ∈ R n and an (α, s/2)-Gevrey diffeomorphism Ξ : T n → T n such that X + ω * 0 − ω 0 is conjugate to ω 0 via Ξ: Moreover, we have the estimate Observe that because of the shift of frequency ω * 0 − ω 0 , in general this result does not give any information on the vector field X. Under some further assumption (for instance, if ω 0 belongs to the rotation set of X, see [Kar16]), then this shift vanishes and Theorem E implies that X is conjugated to ω 0 .
An even more restricted setting is when $X$ is proportional to $\omega_0$ (so that the flow of $X$ is a re-parametrization of the linear flow of frequency $\omega_0$ and thus $\omega_0$ is the unique rotation vector of $X$); Theorem E applies in this case to give a conjugacy to $\omega_0$, assuming that $\omega_0 \in \mathrm{BR}_\alpha$, but the proof is actually much simpler in this case (it boils down to solving the cohomological equation only once) and should require only the weaker condition that $\omega_0$ satisfies $(R_\alpha)$, as is stated in the case $\alpha = 1$ in [Fay02]. Still in [Fay02], it is proved that for $\alpha = 1$ (there are also versions in the $C^r$ case), if $\omega_0$ satisfies $(B_\alpha)$, then there is a dense set of reparametrized linear flows which are weak-mixing (and so cannot be conjugated to the linear flow); thus a necessary condition for Theorem E to hold true is that $\omega_0$ satisfies $(R_\alpha)$ (and this is also a sufficient condition if we impose that $X$ is proportional to $\omega_0$). Clearly, this should extend to the general case $\alpha \ge 1$ and thus the condition that $\omega_0$ does not satisfy $(B_\alpha)$ is a necessary condition for Theorem E to hold true, as in Theorem A.

KAM theorem for maps
In this section, we give the statement of discrete versions of Theorem A and Theorem E.
Let us start with the discrete analogue of Theorem A. Given a function h :B → R, we define the exact-symplectic map As before, let us fix α ≥ 1 and s 0 > 0.
Theorem F. Let F : T n ×B → T n ×B be an (α, s 0 )-Gevrey exact symplectic map with Assume that ω 0 = ∇h(0) ∈ BR α and that h is non-degenerate. Then there exists 0 < s ′ 0 < s 0 such that for ǫ small enough, there exists an (α, s ′ 0 )-Gevrey torus embedding Θ ω 0 : T n → T n × B such that Θ ω 0 (T n ) is invariant by F and Θ ω 0 gives a conjugacy between the translation of vector ω 0 on T n and the restriction of F to Θ ω 0 (T n ). Moreover, Θ ω 0 is close to Θ 0 in the sense that for some constant c > 0 independent of ǫ.
Theorem F follows at once from Theorem B (or Theorem C) provided one has a suitable quantitative "suspension" result; in the analytic case α = 1 this was proved in [KP94] and in the Gevrey non-analytic case α > 1 this is contained in [LMS16].
In the same way, we have the following discrete analogue of Theorem E. Given ω 0 ∈ R n , let T ω 0 be the translation of T n of vector ω 0 : Let α ≥ 1 and s > 0.
Theorem G. Let ω 0 ∈ BR α and T ∈ G α,s (T n , T n ) a diffeomorphism of T n of the form Then, for µ sufficiently small, there exist a vector ω * 0 ∈ R n and an (α, s/2)-Gevrey diffeomorphism Ξ : Moreover, we have the estimate

Statement of the KAM theorem with parameters
Let us now consider the following setting. Fix ω 0 ∈ R n \ {0}. Re-ordering the components of ω 0 and re-scaling the Hamiltonian allow us to assume without loss of generality that Given real numbers r > 0 and h > 0, we let Our Hamiltonians will be defined on Let α ≥ 1, s > 0, η ≥ 0 a fixed parameter, ε ≥ 0 and µ ≥ 0 two small parameters. We consider a function H ∈ G α,s (T n × D r,h ) of the form where the notation M(θ, I, ω) · I 2 stands for the vector I given twice as an argument to the symmetric bilinear form M(θ, I, ω). Observe that A : the ring of real square matrices of size n. Observe that we do not assume ε = µ because these two small parameters play different roles in applications (in Theorem A we will have µ = √ ε while in Theorem E, ε = 0 and µ will be the only small parameter).
The function H in ( * * ) should be considered as a Gevrey Hamiltonian on T n × D r , depending on a parameter ω ∈ D ω 0 h ; for a fixed parameter ω ∈ D ω 0 h , when convenient, we will write The image of the map Φ 0 : T n → T n × D r , θ → (θ, 0) is a smooth embedded torus in T n × D r , invariant by the Hamiltonian flow of N ω 0 + R ω 0 and quasi-periodic with frequency ω 0 . The next theorem asserts that this quasi-periodic torus will persist, being only slightly deformed, as an invariant torus not for the Hamiltonian flow of H ω 0 but for the Hamiltonian flow of H ω * 0 , where ω * 0 is a parameter close to ω 0 , provided ε and µ are sufficiently small and ω 0 satisfies the α-Bruno-Rüssmann condition. Here is the precise statement.
Theorem H. Let $H$ be as in $(**)$, with $\omega_0 \in \mathrm{BR}_\alpha$. Then there exist positive constants $c_1 \le 1$, $c_2 \le 1$ and $c_3 \ge 1$ depending only on $n$ and $\alpha$ such that if where $Q_0 \ge n + 2$ is sufficiently large so that there exist a vector $\omega_0^* \in \mathbb{R}^n$ and an $(\alpha, s/2)$-Gevrey embedding with the estimates

Theorem A follows quite directly from Theorem H, introducing the frequencies $\omega = \nabla h(p)$ as independent parameters, taking $\mu = \sqrt{\varepsilon}$, and tuning the shift of frequency $\omega_0^* - \omega_0$ using the non-degeneracy assumption on the unperturbed Hamiltonian. Theorem E also follows from Theorem H by realizing $X$ as the restriction of a Hamiltonian vector field to an invariant torus, setting $\varepsilon = \eta = 0$ and letting $\mu$ be the only small parameter. These arguments are made precise in Section 5.

Proof of Theorem A
In this section, we assume Theorem H and we show how it implies Theorem A, following [Pös01] (in the analytic case) and [Pop04] (in the Gevrey case).
Proof of Theorem A. As noticed at the beginning of Section 4, we may assume that ω 0 is of the form For p 0 ∈ B, we expand h in a small neighborhood of p 0 : writing p = p 0 + I for I close to zero, we get Similarly, we expand ǫf with respect to p, in a small neighborhood of p 0 : Since ∇ p h : B → Ω is a diffeomorphism, instead of p 0 we can use ω = ∇ p h(p 0 ) as a new variable, and letting ∇ ω g := (∇h) −1 , we write and also, letting θ = q, and Finally, we can set so that h + ǫf can be written as and we have By assumption, h and f are (α, s 0 )-Gevrey on T n ×B, and since the space of Gevrey functions is closed under taking derivatives, products, composition and inversion (up to restricting the parameter s 0 , see Appendix B for the relevant estimates), we claim that we can find 2 s > 0, r > 0, h > 0 andc > 0 which are independent of ǫ such that H is (α, s)-Gevrey on the domain T n × D r,h with the estimates |A| α,s ≤cǫ, |B| α,s ≤cǫ, |∇ 2 I R| α,s ≤c.
We may set ε :=cǫ, µ := √ ε, η :=c and assuming ǫ small enough, we havecǫ ≤ µ = √ ε. Thus we have Having fixed s > 0 and r > 0, we may choose Q 0 sufficiently large so that (6) holds true, and then by further restricting first h, and then ǫ, we may assume that the condition (5) is satisfied. Theorem H applies: there exist an (α, s/2)-Gevrey embedding Υ ω 0 : where Φ ω 0 is given by Theorem H, and a vector ω * 0 ∈ R n such that Υ ω 0 (T n ) is invariant by the Hamiltonian flow of H ω * 0 and quasi-periodic with frequency ω 0 . Moreover, ω * 0 and Υ ω 0 satisfy the estimates for some large constant c > 1. Since h is non-degenerate, there exists p * 0 such that ∇h(p * 0 ) = ω * 0 and, up to taking c > 1 larger and recalling that µ = √ ε, the above estimates imply Now observe that an orbit (θ(t), I(t)) for the Hamiltonian H ω * 0 corresponds to an orbit (q(t), p(t)) = (θ(t), I(t)+p * 0 ) for our original Hamiltonian. Hence, if we define T : T n ×R n → T n × R n by T (θ, I) = (θ, I + p * 0 ) and then Θ ω 0 is an (α, s/2)-Gevrey torus embedding such that Θ ω 0 (T n ) is invariant by the Hamiltonian flow of H and quasi-periodic with frequency ω 0 . The estimates on the distance between Θ ω 0 and the trivial embedding Θ 0 follows directly from (8), which finishes the proof.

Proof of Theorem E
Now we show how Theorem E follows from Theorem H.
Proof of Theorem E. Consider the vector field X = ω 0 + B ∈ G α,s (T n , R n ) as in the statement. It can be trivially included into a parameter-dependent vector field: given Clearly, for any parameter ω, the torus T n × {0} is invariant by the Hamiltonian vector field X Hω , and, upon identifying T n × {0} with T n , the restriction of X Hω to this torus coincides withX ω . Now the Hamiltonian H defined in (9) is of the form ( * * ) with ε = η = 0 (and e = 0) and therefore for µ sufficiently small, Theorem H applies: there exist a vector ω * 0 ∈ R n and an (α, s/2)-Gevrey embedding The embedding Φ * ω 0 clearly leaves invariant the torus T n ×{0} and induces a diffeomorphism of this torus that can be identified with Ξ := Id + E * . Writing the equality (10) in terms of Hamiltonian vector fields, we have, upon restriction to the invariant torus and recalling that the restriction of X Hω coincides withX ω , together with the estimates on ω * 0 and Ξ − Id = E * , was the statement we wanted to prove.

Proof of Theorem H
This section is devoted to the proof of Theorem H, in which we will construct, by an iterative procedure, a vector ω * 0 close to ω 0 and a Gevrey-smooth torus embedding Φ * ω 0 whose image is invariant by the Hamiltonian flow of H ω * 0 . We start, in §6.1, by recalling the Diophantine result of [BF13] which will be used in our approach. Then, in §6.2, we describe an elementary step of our iterative procedure, and finally, in §6.3, we will show that infinitely many steps may be carried out, to converge towards a solution.
In this paper, we do not pay attention to how constants depend on the dimension n or the Gevrey exponent α, both being fixed. Hence in this section, we shall write u <· v (respectively u ·< v) if, for some constant C ≥ 1 depending only on n and α, we have u ≤ Cv (respectively Cu ≤ v). In particular, u ·< v is stronger than u <· v.

Approximation by rational vectors
Recall that we have written ω 0 = (1,ω 0 ) ∈ R n withω 0 ∈ [−1, 1]^{n−1} . For a given Q ≥ 1, it is always possible to find a rational vector v = (1, p/q) ∈ Q n , with p ∈ Z^{n−1} and q ∈ N, which is a Q-approximation in the sense that |qω 0 − qv| ≤ Q^{−1} , and for which the denominator q satisfies the upper bound q ≤ Q^{n−1} : this is essentially the content of Dirichlet's theorem on simultaneous rational approximations, and it holds true without any assumption on ω 0 . In our situation, since we have assumed that ω 0 is non-resonant, there exist not only one, but n linearly independent rational vectors in Q n which are Q-approximations. Moreover, one can obtain not only linearly independent vectors, but rational vectors v 1 , . . . , v n of denominators q 1 , . . . , q n such that the associated integer vectors q 1 v 1 , . . . , q n v n form a Z-basis of Z n . However, the upper bound on the corresponding denominators q 1 , . . . , q n is necessarily larger than Q^{n−1} , and is given by a function of Q that we can call here Ψ ′ ω 0 (see [BF13] for more precise and general information, but note that in this reference, Ψ ′ ω 0 was denoted by Ψ ω 0 and Ψ ω 0 , which we defined in (3), was denoted by Ψ ′ ω 0 ). A consequence of the main Diophantine result of [BF13] is that this function Ψ ′ ω 0 is in fact essentially equivalent to the function Ψ ω 0 .
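As an illustration of the Dirichlet-type statement above, the following brute-force search looks for a denominator q ≤ Q^{n−1} realizing a Q-approximation. It is only a sketch under stated assumptions: the function name `dirichlet_approximation` is ours, the norm is taken to be the supremum norm, only the last n − 1 components of ω 0 are passed (the first one being equal to 1), and floating-point arithmetic is used, so the search is meaningful for moderate Q only.

```python
import math

def dirichlet_approximation(omega_tail, Q):
    """Search for 1 <= q <= Q**d and an integer vector p such that
    max_i |q * omega_tail[i] - p[i]| <= 1/Q, where d = len(omega_tail).

    omega_tail holds the last n-1 components of omega_0 = (1, omega_tail).
    Returns (q, p, error) for the smallest such q, or None if none is found.
    """
    d = len(omega_tail)
    for q in range(1, Q ** d + 1):
        # Nearest integer vector to q * omega_tail.
        p = [round(q * x) for x in omega_tail]
        error = max(abs(q * x - pi) for x, pi in zip(omega_tail, p))
        if error <= 1.0 / Q:
            return q, p, error
    return None

# Hypothetical example with n = 3: omega_0 = (1, sqrt(2) - 1, sqrt(3) - 1).
omega_tail = (math.sqrt(2) - 1, math.sqrt(3) - 1)
result = dirichlet_approximation(omega_tail, 10)
```

The rational vector of the text is then v = (1, p/q). The exhaustive loop is chosen for transparency, not efficiency: its cost grows like Q^{n−1}.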
Proposition 2. Let ω 0 = (1,ω 0 ) ∈ R n be a non-resonant vector, withω 0 ∈ [−1, 1] n−1 . For any Q ≥ n + 2, there exist n rational vectors v 1 , . . . , v n , of denominators q 1 , . . . , q n , such that q 1 v 1 , . . . , q n v n form a Z-basis of Z n and for j ∈ {1, . . . , n}, For a proof of the above proposition with Ψ ω 0 instead of Ψ, we refer to [BF13], Theorem 2.1 and Proposition 2.3; now by (4), Ψ ω 0 ≤ Ψ and so one may replace Ψ ω 0 by Ψ. Now given a q-rational vector v and a smooth function H defined on T n × D r,h , we define Given n rational vectors v 1 , . . . , v n , we let The following proposition is a consequence of the fact that the vectors q 1 v 1 , . . . , q n v n form a Z-basis of Z n .
Proposition 3 ([Bou13, Corollary 6]). Let v 1 , . . . , v n be rational vectors, of denominators q 1 , . . . , q n , such that q 1 v 1 , . . . , q n v n form a Z-basis of Z n , and H a function defined on T n × D r,h . Then Φ is a parameter-dependent change of coordinates and ϕ a change of parameters. Moreover, our change of coordinates will be of the form , G : T n × D ω 0 h → R n and for each fixed parameter ω, Φ ω will be symplectic. For simplicity, we shall write Φ = (E, F, G); the composition of such transformations F = (Φ, ϕ) = (E, F, G, ϕ) is again a transformation of the same form, and we shall denote by G the groupoid of such transformations.
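The mechanism behind Proposition 3 can be seen on Fourier coefficients. We assume here that the averaging [ . ] v of (11) is the time average over one period of the linear flow θ ↦ θ + t qv, t ∈ [0, 1] (the display (11) is not reproduced above, so this form is an assumption). Under this assumption, a Fourier mode of frequency k ∈ Z n survives the averaging exactly when k · (qv) = 0, and averaging successively along a Z-basis leaves only the mode k = 0, that is, the full space average. The sketch below checks this on a trigonometric polynomial; the function names and the sample coefficients are hypothetical.

```python
def average_along(coeffs, w):
    """Average a trigonometric polynomial along theta -> theta + t*w, t in [0, 1].

    coeffs maps integer frequency vectors k (tuples) to Fourier coefficients and
    w = q*v is an integer vector. Since k.w is an integer, the time average of
    exp(2*pi*i*t*k.w) over [0, 1] is 1 if k.w == 0 and 0 otherwise.
    """
    return {k: c for k, c in coeffs.items()
            if sum(ki * wi for ki, wi in zip(k, w)) == 0}

def successive_average(coeffs, vectors):
    """Apply average_along for each integer vector in turn."""
    for w in vectors:
        coeffs = average_along(coeffs, w)
    return coeffs

# Sample trigonometric polynomial on T^2 (coefficients chosen arbitrarily).
H = {(0, 0): 1.5, (1, 0): 0.3, (1, -2): -0.7, (1, 1): 0.2, (-3, 5): 0.1}

basis = [(2, 1), (1, 1)]        # determinant 1: a Z-basis of Z^2
dependent = [(2, 1), (4, 2)]    # linearly dependent: not a basis

full_average = successive_average(H, basis)
partial_average = successive_average(H, dependent)
```

With the Z-basis only the constant term remains, whereas with the dependent pair the mode k = (1, −2), which is orthogonal to (2, 1), survives both averagings.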
Proposition 4. Let H be as in ( * * ), with ω 0 = (1,ω 0 ) ∈ R n non-resonant, consider 0 < σ < s, 0 < δ < r, Q ≥ n + 2, and assume that √ Then there exists an (α, s − σ)-Gevrey symplectic transformation with the estimates such that , with the estimates Proof. We divide the proof of the KAM step into five small steps. Except for the last one, the parameter ω ∈ D ω 0 h will be fixed, so for simplicity, in the first four steps we will drop the dependence on the parameter ω ∈ D ω 0 h . Let us first notice that (13) clearly implies the following seven inequalities: It is also important to notice that the implicit constant appearing in (21) is independent of the other implicit constants; we may choose it as large as we want without affecting the other implicit constants. In the first three steps, the term R which contains terms of order at least 2 in I will be ignored, that is we will only considerĤ = H − R = N + P . Since ω 0 is non-resonant, given Q ≥ n + 2, we can apply Proposition 2: there exist n rational vectors v 1 , . . . , v n , of denominators q 1 , . . . , q n , such that q 1 v 1 , . . . , q n v n form a Z-basis of Z n and for j ∈ {1, . . . , n}, For any ω ∈ D ω 0 h , using (16) and q j <· Ψ(Q), we have

Successive rational averagings
Let us set A 1 := A, B 1 := B so that P 1 (θ, I) := A 1 (θ) + B 1 (θ) · I satisfies P 1 = P . Recalling that [ . ] v denotes the averaging along the periodic flow associated to a periodic vector v ∈ R n (see (11)), we define inductively, for 1 ≤ j ≤ n, If we further define N j by N j (I) = e(ω) + v j · I, it is then easy to check, by a simple integration by parts, that the equations and then are satisfied, where { . , . } denotes the usual Poisson bracket. Moreover, we have the estimates and then Next, for any 0 ≤ j ≤ n, define r j = r − jδ/n and s j = s − jσ/(2n). We have r n = r − δ ≤ r j ≤ r 0 = r while s n = s − σ/2 ≤ s j ≤ s 0 = s. Let X t j be the time-t map of the Hamiltonian flow of X j . Using (27), together with inequalities (17), (18) and (19), conditions (80) and (82) of Proposition 22, Appendix B, are satisfied, so the latter proposition can be applied: for 1 ≤ j ≤ n, X t j maps T n × B r j into T n × B r j−1 for all t ∈ [0, 1] and it is of the form Now we define Φ 0 := Id to be the identity and inductively Φ j := Φ j−1 • X 1 j for 1 ≤ j ≤ n. Then Φ j maps T n × B r j into T n × B r and one easily checks, by induction using the estimates (28), that Φ j is still of the form with the estimates, for j = 1, ..., n,

New Hamiltonian
Let us come back to the Hamiltonian Ĥ = H − R = N + P = N + P 1 . We claim that for all 0 ≤ j ≤ n, we have Let us prove the claim by induction on 0 ≤ j ≤ n. For j = 0, we may set P + 1 := 0 and there is nothing to prove. So let us assume that the claim is true for some j − 1 ≥ 0, and we need to show it is still true for j ≥ 1. By this inductive assumption, we have Let S j = ω · I − v j · I so that N = N j + S j and thus Let us consider the first summand of the last sum. Using the equality (25), a standard computation based on Taylor's formula with integral remainder gives Clearly, U t j+1 is still of the form U t j+1 (θ, I) = U t j+1 (θ, 0) + ∇ I U t j+1 (θ, 0) · I since this is true for P j , S j , X j and this form is preserved under the Poisson bracket. Using the estimates for P j (θ, 0), ∇ I P j (θ, 0), X j (θ, 0), ∇ I X j (θ, 0) (given respectively in (26) and in (27)), the fact that S j (θ, 0) = 0, ∇ I S j (θ, 0) = ω − v j with the inequality (23), and the estimates for the derivatives and the product of Gevrey functions (given respectively in Proposition 14, Corollary 15 and Proposition 16, Corollary 18, Appendix B), one finds, for all t ∈ [0, 1] Since q j <· Ψ(Q), using (20) the latter estimate reduces to Similarly, one obtains |∇ I U t j+1 (θ, 0)| α,s j <· (Qσ α ) −1 µ. Then, using the expression of X t j and the associated estimates (28), a direct computation, still using (20), gives |P j+1 (θ, 0)| α,s j <· (Qσ α ) −1 ε and |∇ IPj+1 (θ, 0)| α,s j <· (Qσ α ) −1 µ.
It is important to recall here that we may choose the implicit constant in (21) as large as we want (in order to achieve (35)) without affecting any of the other implicit constants. Then observe also that H • Φ andR differ only by terms of order at most one in I, so and therefore using the formula for the Hessian of a composition, (34) and the fact that ∇ 2 I Φ is identically zero, one finds We also set ẽ := e + [A] and observe that |ẽ − e| α,s−σ/2 ≤ |[A]| α,s ≤ |A| α,s .

Iterations and convergence
We now define, for i ∈ N, the following decreasing geometric sequences: Next, for a constant Q 0 ≥ n + 2 to be chosen below, we define ∆ i and Q i , i ∈ N, by and then we define σ i , i ∈ N, by where C ≥ 1 is a sufficiently large constant so that the last condition of (13) is satisfied for σ = σ 0 and Q = Q 0 (and thus for σ = σ i and Q = Q i , for any i ∈ N); clearly, this constant is of the form C =· (1 + η) 1/α . Finally, we define s i and r i , i ∈ N, by Obviously, we have lim We claim that, assuming ∆ −1 satisfies (A α ), which is equivalent to ω 0 ∈ BR α , we can choose Q 0 sufficiently large so that where we made a change of variables x := 2 y ∆(Q 0 ), and the last integral converges since provided we choose Q 0 sufficiently large in order to have Applying inductively Proposition 4 we will easily obtain the following proposition.
Proposition 5. Let H be as in ( * * ), with ω 0 ∈ BR α , and fix Q 0 ≥ n + 2 sufficiently large so that (43) is satisfied. Assume that Then, for each i ∈ N, there exists an (α, s i )-Gevrey smooth transformation satisfying the following estimates and such that Let us emphasize that the implicit constants in the above proposition depend only on n and α and are thus independent of i ∈ N.
Proof. For i = 0, we let F 0 be the identity, e 0 := e, A 0 := A, B 0 := B, R 0 := R, M 0 := M and there is nothing to prove. The general case follows by an easy induction. Indeed, assume that the statement holds true for some i ∈ N so that H • F i is (α, s i )-Gevrey on the domain T n × D r i ,h i . We want to apply Proposition 4 to this Hamiltonian, with ε = ε i , µ = µ i , r = r i , s = s i , h = h i , σ = σ i and Q = Q i . First, by our choice of Q 0 and δ 0 it is clear that 0 < σ i < s i , 0 < δ i < r i , and Q i ≥ n + 2. Then we need to check that the conditions The last condition of (47) is satisfied, for all i ∈ N, simply by the choice of the constant C in the definition of σ i . As for the other four conditions of (47), using the fact that the sequences ε i , µ i , h i , ∆ −1 i and δ i decrease at a geometric rate with respective ratios 1/16, 1/4, 1/2, 1/2 and 1/2, it is clear that they are satisfied for any i ∈ N if and only if they are satisfied for i = 0. The first three conditions of (47) for i = 0 are nothing but (44). Moreover, using our choice of δ 0 = r/4, the fourth condition of (47) for i = 0 reads µ ·< ∆ −1 0 and this also follows from (44).
Hence Proposition 4 can be applied, and all the conclusions of the statement follow at once from the conclusions of Proposition 4.
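The bookkeeping used in this induction can be checked numerically. The sketch below generates the geometric sequences with the ratios 1/16, 1/4, 1/2 and 1/2 quoted in the proof and the choice δ 0 = r/4; the recursion r i+1 = r i − δ i for the radii is an assumption on our part (the display (42) is not reproduced above), and the function name and sample values are hypothetical. It shows that the relation between µ i and ε i is preserved along the iteration and that the radii stay above r/2.

```python
def kam_sequences(eps, mu, h, r, steps):
    """Return the first `steps` terms of the sequences (eps_i, mu_i, h_i, delta_i, r_i).

    Ratios 1/16, 1/4, 1/2, 1/2 and delta_0 = r/4 are taken from the text;
    the recursion r_{i+1} = r_i - delta_i is an assumption of this sketch.
    """
    rows = []
    r_i = r
    for i in range(steps):
        delta_i = (r / 4.0) * 2.0 ** (-i)
        rows.append({
            "eps": eps * 16.0 ** (-i),
            "mu": mu * 4.0 ** (-i),
            "h": h * 2.0 ** (-i),
            "delta": delta_i,
            "r": r_i,
        })
        r_i -= delta_i
    return rows

# Hypothetical starting values with mu = sqrt(eps), as in the proof of Theorem A.
rows = kam_sequences(eps=1e-4, mu=1e-2, h=0.5, r=1.0, steps=30)
ratios = [row["mu"] ** 2 / row["eps"] for row in rows]
radii = [row["r"] for row in rows]
```

Since the ratio of µ i is the square root of the ratio of ε i , the quotient µ i ² / ε i is independent of i, which is why a condition relating µ and ε only needs to be checked at i = 0.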
We can finally conclude the proof of Theorem H, by showing that one can pass to the limit i → +∞ in Proposition 5.
Proof of Theorem H. Recall that we are given ε > 0, µ > 0, r > 0, s > 0, h > 0 and that we defined the sequences ε i , µ i , δ i , h i in (39), and then we chose Q 0 ≥ n + 2 satisfying (43) to define the sequences ∆ i , Q i in (40) and σ i in (41) and finally, s i and r i were defined in (42). Moreover, we have and for later use, let us observe that the following series are convergent and can be made as small as one wishes thanks to condition (5) of Theorem H: Now the condition (5) of Theorem H implies that the condition (44) of Proposition 5 is satisfied; what we need to prove is that the sequences given by Proposition 5 do converge in the Banach space of (α, s/2)-Gevrey functions. Recall that F 0 = (E 0 , F 0 , G 0 , ϕ 0 ) is the identity, while for i ≥ 0, from which one easily obtains the following inductive expressions: Let us first prove that the sequence ϕ i converges. We claim that for all i ∈ N, we have where the fact that the last product is bounded uniformly in i ∈ N follows from (49). For i = 0, ϕ 0 = Id and there is nothing to prove; for i ∈ N since ϕ i+1 = ϕ • + ϕ i+1 we have so that using the estimate for ϕ i+1 and ∇ϕ i+1 given in (45), Proposition 5, the claim follows by induction. Using this claim, and writing Using the convergence of (48) and (50), one finds that the domain of definition of ϕ i shrinks to a point and the sequence ϕ i converges to a trivial map and observe that since Ψ(Q i ) ≥ 1, the estimates for E i+1 , ∇E i+1 , ϕ i+1 and ∇ϕ i+1 given in Proposition 5 imply that Using these estimates, and the fact that E i+1 can be written as we can proceed as before, using the convergence of (51) to show first that Using the convergence of (48) and (52), this shows that E i converges to a map For the F i , we do have the expression As before, using the estimates on F i+1 and ∇F i+1 given in Proposition 5, one shows that however, here, the sum above is not convergent.
Yet we do have from (52) and using the fact that the estimate for V i+1 can be written as

By induction, one shows that
and as a consequence, Using the convergence of (48) and (51), this shows that F i converges to a map For G i , we have the expression Proceeding exactly as we did for E i and F i , using the convergence of (48), (51) and (53), one finds that G i converges to a map In summary, the map F i converges to a map which belongs to G, of the form F * (θ, I, ω 0 ) = (Φ * ω 0 (θ, I), ω * 0 ), Φ * ω 0 (θ, I) = (θ + E * (θ), I + F * (θ) · I + G * (θ)) with the estimates Then from the estimates given in (46), Proposition 5, and the convergence (48), it follows that both A i and B i converge to zero. Next from the estimates still given in (46), Proposition 5, one can prove in the same way as we did before, that e i converges to a trivial map whereas M i converges to a map such that, setting R * (θ, I) = M * (θ, I)I · I, Therefore we have which, together with the previous estimates (55), (56) and (57), is what we wanted to prove.

Proof of Theorem D, following Bessi
The goal of this short section is to show how Theorem D follows directly from the work of Bessi in [Bes00]. In [Bes00], one starts with a non-resonant vector ω ∈ R n which is assumed to be "exponentially Liouville" in the following sense: there exist s 0 > 0 and a sequence k j ∈ Z n with |k j | → +∞ as j → +∞ for which 0 < |k j · ω| ≤ e^{−s 0 |k j |} . (C 1,s 0 )

(H 1,j,s )
Observe that the only role of the sequences ν 1,j,s andν 1,j,s is to ensure that the sequence of perturbations F 1,j ε,µ satisfies, for all j ∈ N and all 0 ≤ µ ≤ 1: In [Bes00], Bessi proved the following theorem.
Theorem 6 (Bessi). Assume that ω ∈ R n satisfies (C 1,s 0 ). Then, if s 0 > s, for any 0 ≤ ε ≤ 1, there exist µ ε > 0 and j ε ∈ N such that for any 0 < µ ≤ µ ε and any j ≥ j ε , the Hamiltonian system defined in (H 1,j,s ) does not have any invariant torus T satisfying (i) T projects diffeomorphically to T n ; (ii) There is a C 1 diffeomorphism between T n and T which conjugates the flow on T to the linear flow on T n of frequency ω.
It is clear that it is the regularity of the perturbation, here the analyticity which causes the exponential decay of the Fourier coefficients, that forces the condition (C 1,s 0 ). If the perturbation is assumed to be only of class C r for some r ∈ N, then (C 1,s 0 ) can be weakened to cover frequencies ω which are Diophantine with an exponent τ which is related to r (this can be obtained from Bessi's work; one can find a better quantitative result in [CW13], which also uses ideas of [Bes00]).
Here we would like to consider the case where the perturbation is α-Gevrey; we will consider a slight modification of the family of Hamiltonians (H 1,j,s ) to a family of Hamiltonians (H α,j,s ) depending on α ≥ 1, which are still analytic but for which the perturbations are bounded and small only in an α-Gevrey norm.
(H α,j,s ) With these choices of ν α,j,s andν α,j,s we have that, for all j ∈ N and all 0 ≤ µ ≤ 1: for some constant C > 1 independent of ε and µ. The argument of Bessi goes through in exactly the same way for this family of Hamiltonians (H α,j,s ) under the condition (C α,s 0 ), and thus we have the following statement.
Theorem 7. Assume that ω ∈ R n satisfies (C α,s 0 ). Then, if s 0 > s, for any 0 ≤ ε ≤ 1, there exist µ ε > 0 and j ε ∈ N such that for any 0 < µ ≤ µ ε and any j ≥ j ε , the Hamiltonian system defined in (H α,j,s ) does not have any invariant torus T satisfying (i) T projects diffeomorphically to T n ; (ii) There is a C 1 diffeomorphism between T n and T which conjugates the flow on T to the linear flow on T n of frequency ω. Now Theorem 7 implies Theorem D, as if ω satisfies (B α ), then it satisfies (C α,s 0 ) for some s 0 > 0 and it is sufficient to consider a Hamiltonian system as in (H α,j,s ) with s < s 0 .
Also recall that, by definition, if α ≥ 1, A α consists of those vectors ω for which whereas BR α consists of vectors ω satisfying the α-Bruno-Rüssmann condition: In the proof of Theorem H, we use the following fact.
Proof. We aim at showing that the two integrals A (Riemann-Stieltjes) integration by parts shows that, if T > 1, On the one hand, the two integrals in this equality have a (possibly infinite) limit as T tends to +∞, and ln ∆(T )/T^{1/α} ≥ 0, thus (58) yields, as T tends to infinity, On the other hand, since ∆ is increasing, So, a α = bα α , whence the conclusion. Remark 1. From the proof above, one easily sees that if ω ∈ BR α , then lim Q→+∞ ln(∆(Q))/Q^{1/α} = 0.
We refer to [BF13, Proposition 2.2] for related, more precise results. In the next lemma, we give alternative characterizations of the α-BR condition, so as to facilitate comparisons with other arithmetic conditions. Lemma 9. Let α ≥ 1, β = 1 + 1/α and ω ∈ R n non-resonant. The following conditions are equivalent to each other: In the case α = 1, the equivalence (1 ⇔ 2) is proved in [Rüs01], whereas (2 ⇔ 3 ⇔ 4) is proved in [GM10].
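As a simple worked example, suppose that ∆ has at most polynomial growth, say 1 ≤ ∆(Q) ≤ C Q^a for Q ≥ 1 with constants C ≥ 1 and a > 0 (this is what one expects for a Diophantine vector, but the bound is a hypothesis of the example). We also assume that the α-Bruno-Rüssmann condition is the convergence of the integral written in the first line below; the corresponding display is not reproduced above, so this form is an assumption.

```latex
\begin{aligned}
\int_1^{+\infty} \frac{\ln \Delta(Q)}{Q^{1+\frac{1}{\alpha}}}\, dQ
  &\le \ln C \int_1^{+\infty} Q^{-1-\frac{1}{\alpha}}\, dQ
     + a \int_1^{+\infty} \frac{\ln Q}{Q^{1+\frac{1}{\alpha}}}\, dQ \\
  &= \alpha \ln C + a\,\alpha^2 < +\infty ,
\end{aligned}
\qquad
0 \le \frac{\ln \Delta(Q)}{Q^{1/\alpha}}
  \le \frac{\ln C + a \ln Q}{Q^{1/\alpha}} \xrightarrow[Q \to +\infty]{} 0 .
```

Here we used the elementary identities \int_1^{+\infty} Q^{-1-b} dQ = 1/b and \int_1^{+\infty} Q^{-1-b} \ln Q \, dQ = 1/b^2 with b = 1/α. Under these assumptions the condition holds for every α ≥ 1, and the limit is consistent with Remark 1.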

B Gevrey estimates
Let us start by recalling some notations and definitions. Given an integer m ≥ 1 and k = (k 1 , . . . , k m ) ∈ N m , we define Given x ∈ R m , we set Let K be a compact set of the form where B̄ m 2 is the closure of an open subset B m 2 of R m 2 . Let f : K → R be a smooth function, meaning that f extends smoothly to an open neighborhood of K. Such an extension is by no means unique, but note that, by continuity, the partial derivatives of f over K, at any order, do not depend on the extension. For a ∈ K and k ∈ N m we set Given real numbers α ≥ 1 and s > 0, the function f is said to be (α, s)-Gevrey if The space of such functions will be denoted by G α,s (K), and equipped with the above norm, it is a Banach space. Our definition of Gevrey norm is not quite standard, but up to decreasing or increasing the parameter s, it is comparable to the Gevrey norms that have been used in Hamiltonian perturbation theory (as for instance in [MS02] or in [Pop04]). On the one hand, the role of the factor (|k|+1)^2 is to simplify the estimates for the product and

B.1 Majorant series and Gevrey functions
The definition of Gevrey functions can be conveniently reformulated in terms of majorant series with one variable (see [Kom79], [Kom80] and also [SCK03]). But first let us consider a formal power series in m variables X = (X 1 , . . . , X m ) with coefficients in a normed real vector space (E, | . | E ), which is a formal sum of the form Such a formal series is said to be majorized by another formal power series with real non-negative coefficients Next, following [SCK03], we introduce a notion of a smooth function being majorized by a formal power series in one variable. So let f : K → R p be a smooth function, and F be a formal power series in one variable with non-negative coefficients, that we shall write as We will say that f is majorized by F on K, and we will write f ≪ K F (or f (x) ≪ K F (X)), if for all a ∈ K and all k ∈ N m , we have To better understand this definition, recall that given f : K → R p and a ∈ K, we can define its formal Taylor series at a by which is a formal power series in m variables that takes values in R p . To a formal series F in one variable, one can associate a formal seriesF in m variables simply by settinĝ F (X 1 , . . . , X m ) := F (X 1 + · · · + X m ).
It is then easy to check that f ≪ K F , in the sense of (62), if and only if for all a ∈ K, T a f ≪F , in the sense of (61) (with E = R p and | . | E the norm given by the sum of the absolute values of the components). Now, given α ≥ 1 and s > 0, let us define the following formal power series in one variable (63) The following characterization of Gevrey functions is evident from the definitions (60) and (62).
Proposition 11. If f : K → R p is a smooth function, Henceforth α ≥ 1 will be fixed, so in the sequel we will simply write M α,s = M s .
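To get a feel for the size of a Gevrey norm, one can evaluate it on a single trigonometric mode. The sketch below assumes that the norm has the form sup over a ∈ K and k of (|k|+1)^2 s^{α|k|} |∂^k f(a)| / (|k|!)^α; the displayed definition is not reproduced above, so this form is an assumption, and the normalizing constant mentioned in the text is omitted. For f(θ) = cos(2πjθ) on the circle, the supremum of the k-th derivative is (2πj)^k, and the supremum over k is computed on logarithms to avoid overflow. The function name is hypothetical.

```python
import math

def gevrey_size_of_cosine(j, alpha, s, kmax=400):
    """Return max over 0 <= k <= kmax of (k+1)^2 * s^(alpha*k) * (2*pi*j)^k / (k!)^alpha.

    This is the assumed (alpha, s)-Gevrey size of theta -> cos(2*pi*j*theta),
    without normalizing constant. Terms decay factorially in k, so the
    truncation at kmax is harmless for moderate j and s.
    """
    best_log = -math.inf
    for k in range(kmax + 1):
        log_term = (2.0 * math.log(k + 1)
                    + k * (alpha * math.log(s) + math.log(2.0 * math.pi * j))
                    - alpha * math.lgamma(k + 1))
        best_log = max(best_log, log_term)
    return math.exp(best_log)

analytic_size = gevrey_size_of_cosine(j=1, alpha=1.0, s=1.0)   # alpha = 1: analytic case
gevrey2_size = gevrey_size_of_cosine(j=1, alpha=2.0, s=1.0)    # weaker constraint on derivatives
smaller_s = gevrey_size_of_cosine(j=1, alpha=1.0, s=0.5)
```

With this form of the norm, decreasing s or increasing α can only decrease the value when s ≤ 1, which reflects the inclusion of the corresponding function spaces.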

B.2 Properties of majorant series
We collect here some properties of majorant series that will be used later on. It is clear how to define the derivatives of a formal power series in one variable, and also a linear combination and the product of two such formal power series. We then have the following lemma.
Lemma 12. Let f, g : K → R p be smooth functions, F, G be two formal power series in one variable, and assume that f ≪ K F, g ≪ K G. Then Moreover, if we define f · g : K → R by For a proof, we refer to [SCK03], Lemma 2.2, in which the case p = 1 is considered; but the general case p ≥ 1 is entirely similar.
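The product rule for majorant series can be tested on an explicit pair. We assume here the convention F(X) = Σ F_l X^l / l!, under which f ≪ K F means that every derivative of order l of f is bounded by F_l on K (the displays defining F and (62) are not reproduced above, so this convention is an assumption). With f = sin and g = cos on a compact interval, both are majorized by exp(X), that is F_l = G_l = 1, and the product series exp(2X) has coefficients 2^l, while the derivatives of sin · cos = sin(2x)/2 have supremum 2^{l−1}. The function name is hypothetical.

```python
from math import comb

def majorant_product(F, G):
    """Coefficients of the product of two series written as sum_l c_l X^l / l!.

    In this exponential normalization the product has coefficients given by
    the binomial convolution sum_j C(l, j) F[j] G[l-j], mirroring Leibniz' rule.
    """
    n = min(len(F), len(G))
    return [sum(comb(l, j) * F[j] * G[l - j] for j in range(l + 1))
            for l in range(n)]

order = 8
F = [1] * order   # majorant of sin: all derivatives bounded by 1
G = [1] * order   # majorant of cos: all derivatives bounded by 1
FG = majorant_product(F, G)

# sup |d^l/dx^l (sin x cos x)| = 2^(l-1), since sin x cos x = sin(2x)/2.
true_bounds = [2.0 ** (l - 1) for l in range(order)]
is_majorized = all(b <= m for b, m in zip(true_bounds, FG))
```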
Given two formal power series in one variable F and G, we define the composition F ⊙ G of F and G by Lemma 13. Let f : K → R p be a smooth function, g : L → T m 1 × R m 2 another smooth function such that g(L) ⊆ K, and assume that Once again, for a proof we refer to [SCK03], Lemma 2.3.

B.3 Derivatives
In this section, we will show that the derivatives of a Gevrey function are still Gevrey, at the expense of reducing the parameter s > 0; these are analogues of Cauchy estimates for analytic functions.
For f : K → R, let ∇f : K → R m be the vector-valued function formed by the partial derivatives of f of order one, and more generally, for f : K → R p , we let ∇f : K → M m,p (R) ≃ R mp be the matrix-valued function whose columns are given by ∇f i where f = (f i ) 1≤i≤p . Then we have the following obvious corollary of Proposition 14.

B.4 Products
In this section, we shall prove that the product of Gevrey functions is still a Gevrey function.
Once again, in view of Proposition 11 and (66) of Lemma 12, Proposition 16 is a direct consequence of the following lemma.
The proof given below follows [Lax53]. It is this lemma that motivates the introduction of the normalizing constant in M s (and thus in the Gevrey norm); without this constant one would have M 2 s ≪ c 2 M s . Let us point out that the proof given below is elementary thanks to the factor (|k| + 1) 2 in the definition of M s ; without this factor, the statement is true (with a different normalizing constant) but the proof is more involved (see Lemma 2.7 of [SCK03]).

B.6 Flows
In this section and the next one, we shall state and prove some estimates adapted to the situation considered in §6: that is we consider functions H = H(θ, I, ω) which are defined and Gevrey smooth on a domain of the form where D r is the ball of radius r > 0 centered at the origin and D ω 0 h is an arbitrary ball of radius h > 0. In the lemma and proposition below, the variables ω ∈ D ω 0 h play the role of a fixed parameter, hence to simplify the notations we will explicitly suppress the dependence on ω ∈ D ω 0 h . Moreover, throughout this section and the next one, for simplicity we shall write u <· v (respectively u ·< v), if, for some constant C ≥ 1 which depends only on n and α and could be made explicit, we have u ≤ Cv (respectively Cu ≤ v).
Let us first start with a vector-valued function D : T n → R n which depends only on θ ∈ T n , and that we shall consider as a vector field on T n .
The proof of the above lemma is a variant of the proof of Lemma B.3 in [LMS16].
Proof. The fact that D t is smooth and defined for all t ∈ [0, 1] (in fact, for all t ∈ R) follows from the compactness of T n and the classical result on the existence and uniqueness of solutions of differential equations (even though this will essentially be re-proved below); the only thing we need to prove is the estimate (78). So let us consider the space V := C([0, 1], G α,s−σ (T n , T n )) of continuous maps from [0, 1] to G α,s−σ (T n , T n ): given an element Φ ∈ V and t ∈ [0, 1], we shall write Φ t := Φ(t) and consequently Φ = (Φ t ) t∈[0,1] . We equip V with the following norm: ||Φ|| := max t∈[0,1] |Φ t | α,s−σ which makes it a Banach space, and if we set ρ := |D| α,s , we define We can eventually define a Picard operator P associated to D by where P (Φ) = (P (Φ) t ) t∈[0,1] is defined by To prove the lemma, it is sufficient to prove that P has a unique fixed point Φ * ∈ B ρ V , as necessarily (Φ t * ) t∈[0,1] = (D t ) t∈[0,1] . Therefore it is sufficient to prove that P induces a well-defined contraction on B ρ V , as the latter is a complete subset of the Banach space V .
Using (77), we can then ensure that P is a contraction, which concludes the proof. Now let us consider a Hamiltonian function X on T n × D r , of the form X(θ, I) := C(θ) + D(θ) · I, C : T n → R, D : T n → R n .
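The Picard scheme can be tried numerically on a single trajectory. The following sketch is not the operator of the proof, which acts on Gevrey maps and whose displayed definition is not reproduced above; it only iterates the integral equation θ(t) = θ 0 + ∫ 0 t D(θ(τ)) dτ on a time grid, in dimension one, with the trapezoidal rule. The vector field, the initial point and the function name are hypothetical choices.

```python
import math

def picard_flow(D, theta0, n_steps=200, n_iter=25):
    """Picard iteration for d(theta)/dt = D(theta), theta(0) = theta0, t in [0, 1].

    Returns the last iterate (values on a uniform time grid) and the list of
    sup-distances between successive iterates.
    """
    dt = 1.0 / n_steps
    path = [theta0] * (n_steps + 1)          # initial guess: constant path
    history = []
    for _ in range(n_iter):
        values = [D(x) for x in path]
        new_path = [theta0]
        integral = 0.0
        for i in range(n_steps):
            # Trapezoidal approximation of the integral of D along the current path.
            integral += 0.5 * dt * (values[i] + values[i + 1])
            new_path.append(theta0 + integral)
        history.append(max(abs(a - b) for a, b in zip(new_path, path)))
        path = new_path
    return path, history

# Small vector field on the circle: the iteration contracts quickly.
D = lambda theta: 0.1 * math.sin(2.0 * math.pi * theta)
path, history = picard_flow(D, theta0=0.2)
```

The distances between successive iterates decay rapidly because the vector field has a small Lipschitz constant, which is the discrete counterpart of the smallness condition used to obtain a contraction.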
The equations for θ are uncoupled from the equations of I (and hence can be integrated independently), while the equations for I are affine in I; it is well-known that these facts lead to a simple form of the Hamiltonian flow associated to X (see, for instance, [Vil08]).
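For the reader's convenience, the structure just described can be written out. With the convention θ̇ = ∇ I X, İ = −∇ θ X and with ∇D denoting the matrix whose columns are the gradients of the components of D (as in Appendix B.3), Hamilton's equations for X(θ, I) = C(θ) + D(θ) · I read as follows; the notation M_t, b_t in the last line is ours and only records the affine dependence on I.

```latex
\begin{aligned}
\dot\theta &= \nabla_I X(\theta, I) = D(\theta), \\
\dot I &= -\nabla_\theta X(\theta, I) = -\nabla C(\theta) - \nabla D(\theta)\, I, \\
X^t(\theta, I) &= \bigl( D^t(\theta),\; M_t(\theta)\, I + b_t(\theta) \bigr).
\end{aligned}
```

The first equation does not involve I, so θ(t) = D^t(θ) is the flow of the vector field D on T n; substituting it into the second equation gives a linear non-autonomous equation in I, whose solution is affine in the initial condition.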

B.7 Inverse functions
In this last section, we shall prove that if a Gevrey map is sufficiently close to the identity, then its local inverse is still Gevrey. To prove this in a setting adapted to §6, let us consider a map φ which depends only on ω ∈ D h , that is φ : D h → R n .
Proof. Let us define V := G α,s−σ (D h/2 , R n ), which is a Banach space with the norm || . || = | . | α,s−σ , and for ρ := |φ − Id| α,s , we set Let us define the following Picard operator P associated to φ: It is clear that φ • ϕ = Id if and only if ϕ is a fixed point of P , and therefore the proposition will be proved once we have shown that P has a unique fixed point in B ρ V , and to do this it is enough to prove that P is a well-defined contraction of B ρ V .
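The fixed-point argument can be illustrated in dimension one. The relation φ • ϕ = Id is equivalent to ϕ = Id − (φ − Id) • ϕ, and the sketch below iterates this identity at a single point; it is a pointwise numerical analogue, not the operator P of the proof (whose displayed definition is not reproduced above). The map φ, the target value and the function name are hypothetical.

```python
import math

def invert_near_identity(phi, y, n_iter=60):
    """Solve phi(x) = y by iterating x <- y - (phi(x) - x), starting from x = y.

    The iteration contracts when phi - Id has Lipschitz constant < 1.
    """
    x = y
    for _ in range(n_iter):
        x = y - (phi(x) - x)
    return x

# A map close to the identity: phi - Id has Lipschitz constant 0.1.
phi = lambda w: w + 0.1 * math.sin(w)
x = invert_near_identity(phi, 0.7)
residual = abs(phi(x) - 0.7)
```

Each iteration reduces the error by a factor of at most 0.1 here, so a few dozen iterations reach machine precision.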
Comment. After this work was made public on arXiv, an independent and interesting proof of a special case of Theorem E appeared in the preprint [LDP17].