Optimal Strategies in Zero-Sum Repeated Games with Incomplete Information: The Dependent Case

Using the duality techniques introduced by De Meyer (Math Oper Res 21:209–236, 1996a, Math Oper Res 21:237–251, 1996b), Rosenberg (Int J Game Theory 27:577–597, 1998) and De Meyer and Marino (Cahiers de la MSE 27, 2005) provided an explicit construction for optimal strategies in repeated games with incomplete information on both sides, in the independent case. In this note, we extend both the duality techniques and the construction of optimal strategies to the dependent case.


Introduction
We consider here a zero-sum repeated game with incomplete information on both sides, in the spirit of Aumann and Maschler [1]. Let K (resp. L) be the finite set of types of Player 1 (resp. 2), and let π be a probability distribution over K × L. To any pair (k, ℓ) corresponds a matrix game G^{kℓ}: I × J → R, where I (resp. J) is the finite set of actions of Player 1 (resp. 2). The game is played as follows. First, a pair (k, ℓ) ∈ K × L is drawn with the probability distribution π. Player 1 (resp. 2) is informed only of k (resp. ℓ). Then, the game G^{kℓ} is played repeatedly. At each stage m ≥ 1, the players choose actions (i_m, j_m) ∈ I × J, which produces a stage-payoff G^{kℓ}(i_m, j_m). Actions are publicly observed after each stage. For any initial distribution π and any sequence of nonnegative weights θ = (θ_m)_m, we consider the game G_θ(π) in which the overall payoff is the expected θ-weighted sum of the stage-payoffs ∑_{m≥1} θ_m G^{kℓ}(i_m, j_m), and where π stands for the probability distribution of (k, ℓ). This game has a value, denoted by v_θ(π). The particular case where, for some n ∈ N*, one has θ_m = (1/n) 1_{m≤n} for all m ≥ 1 corresponds to the classical n-stage repeated games. Similarly, the case where θ_m = λ(1 − λ)^{m−1} for all m ≥ 1 and some λ ∈ (0, 1] corresponds to λ-discounted repeated games. We then use the notation G_n(π) and v_n(π), and G_λ(π) and v_λ(π), respectively. This model was analyzed by Mertens and Zamir in [7]. Their main result was the existence of v_n(π) and v_λ(π) and their convergence (respectively, as n goes to +∞ and as λ vanishes) to the unique solution of a system of functional equations. The proof of this result was based on the introduction of the specific notion of I-concavity for the value function π → v_n(π), which can be described as follows.
Any probability π over the product set K × L can be decomposed as a pair (p, Q), where p is the marginal probability on K and Q is a matrix of conditional probabilities on L given k ∈ K. This decomposition can be expressed as π = p ⊗ Q, where ⊗ denotes the direct product. One may then consider v_n as a function of (p, Q) and show that p → v_n(p, Q) := v_n(p ⊗ Q) is a concave function for any fixed Q. A dual notion of II-convexity was also introduced, and the notions of I-concave and II-convex envelopes were the building blocks of the system of functional equations characterizing the limit value. Based on this characterization, a construction of asymptotically optimal strategies (i.e., strategies being almost optimal in G_n(π), with an error term vanishing as the number of stages tends to +∞) was obtained by Heuer [6]. The convergence of the values v_θ(π) for a general evaluation, as max_{m≥1} θ_m tends to 0, and the construction of asymptotically optimal strategies in this case were obtained by Oliu-Barton [10,11].
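For concreteness, the decomposition π = p ⊗ Q can be computed numerically; the sketch below uses a hypothetical 2 × 2 joint distribution, not taken from any particular game:

```python
import numpy as np

# Hypothetical example with K = L = {0, 1}: pi[k, l] is the joint law of the types.
pi = np.array([[0.1, 0.3],
               [0.4, 0.2]])

p = pi.sum(axis=1)       # marginal distribution of k on K
Q = pi / p[:, None]      # Q[k, l] = pi(l | k), the conditional law of l given k

# Each row of Q is a probability on L, and (p, Q) recovers pi: pi = p ⊗ Q.
assert np.allclose(Q.sum(axis=1), 1.0)
assert np.allclose(p[:, None] * Q, pi)
```

The decomposition is unique whenever every p_k is positive; if p_k = 0, the row Q(· | k) can be chosen arbitrarily.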
In addition to their main result, Mertens and Zamir [7] also established a recursive formula for v_n(π) and v_λ(π) in terms of the conditional probabilities on K × L induced by the players' strategies at the first stage, and the extension to a general evaluation θ is straightforward. Though very useful for studying the values, the formula cannot be used by the players, for the simple reason that neither of them can actually compute these conditional probabilities. There is, however, one important exception: games with incomplete information on one side. Indeed, when Player 2 has no private information, Player 1 controls and observes the conditional probabilities while Player 2 does not. As a consequence, the former, and not the latter, can use the recursive formula satisfied by the values to construct an optimal strategy. This game is denoted by G_θ(p), where p is the probability distribution of k. The dual game was introduced by De Meyer in [3,4] in order to construct an optimal strategy for Player 2. The idea of the dual game is to consider a game with vector payoffs: for each realized pair of actions (i, j), the uninformed player knows the vector (G^k(i, j))_{k∈K}. As in approachability theory, an optimal strategy for Player 2 is one that ensures that the θ-weighted sum of payoffs lies in an appropriate subset of R^K. This set depends on a well-chosen dual variable x ∈ R^K, which replaces the unknown type of Player 1 in the following sense: in the dual game, Player 1 can choose his own type to be k at a cost x_k. De Meyer [3] proves that the values of the dual game w_θ(x) satisfy a recursive formula in terms of the dual variable and that Player 2 can use this formula to construct an optimal strategy in the dual game.
More importantly, this strategy is an optimal strategy in G_θ(p) provided that x belongs to the sub-differential of the concave function p → v_θ(p) at p.
The duality techniques were extended by Rosenberg [12], Sorin [14] and De Meyer and Marino [5] to repeated games with incomplete information on both sides, in the special case of independent initial probabilities, i.e., π = p ⊗ q for some probabilities p over K and q over L. As both players are uninformed about their opponent's type, one needs to consider two dual games, one for each player. The first dual game is related to the Fenchel conjugate of the function p' → v_θ(p', q) := v_θ(p' ⊗ q), where q is a fixed probability over L. Rosenberg [12] proved that its value w_θ(x, q) satisfies a recursive formula in terms of the dual variable x and the conditional probabilities over L induced by the strategy of Player 2. As these two variables are accessible to Player 2, the latter can use this formula to construct an optimal strategy in the dual game, and this strategy is an optimal strategy in the game G_θ(π) provided that x belongs to the sub-differential of the concave function p → v_θ(p ⊗ q) at p, where p and q are such that π = p ⊗ q. An alternative formula with similar properties was obtained more recently by De Meyer and Marino [5], who also considered the case of infinite action spaces. The second dual game is constructed in a symmetric manner and provides an optimal strategy for Player 1. It is worth mentioning that, unlike the asymptotic results of Heuer [6] and Oliu-Barton [11], the constructions of Rosenberg [12] and De Meyer and Marino [5] provide optimal strategies for repeated games with a fixed evaluation (namely, n-stage and λ-discounted games).
In the present paper, we extend the results from Rosenberg [12] to the so-called dependent case. That is, we provide a recursive formula for the values of the dual games, from which we deduce the computation of explicit optimal strategies for the players in the repeated game with incomplete information G θ (π), for any evaluation θ and any probability π on K × L. Our construction can be extended, word for word, to stochastic games with incomplete information, as long as the incomplete information concerns the payoff function, but not the transition function. Extending the duality techniques to the dependent case was never done before; albeit not technically difficult, the extension requires some new ideas, such as considering an intermediate step: first, the type of one player is drawn according to the corresponding marginal law, and then, the other type is drawn according to the conditional law. These considerations are crucial in the proof of our main result in order to prove the convexity of some auxiliary functions (see Remark 4.2). Besides, contrary to [12], who considered n-stage games and λ-discounted games separately, and games with incomplete information on one and two sides separately too, we present here a unified approach. Let us also point out that the approach proposed by De Meyer and Marino [5], which was designed to handle games with infinite action spaces, does not seem well-suited to analyze the dependent case. Indeed, when applied to this case, their method requires the introduction of an additional dual variable and thus results in a substantially more complicated dual recursive formula (see [8,Chapter 4]).

Main Results
As both players have symmetric roles, we will only state our results on one side. Namely, we will focus on the optimal strategies of Player 2.
In order to state our main results, we need the following notation:
• For a non-empty finite set X, Δ(X) denotes the set of probabilities over X and is identified with the canonical simplex in R^X.
• N* denotes the set of positive integers, and Δ(N*) is the set of nonnegative sequences (a_m)_{m∈N*} satisfying ∑_{m≥1} a_m = 1.
• For any θ ∈ Δ(N*) satisfying θ_1 < 1, we denote by θ+ ∈ Δ(N*) the sequence (θ_{m+1}/(1 − θ_1))_{m≥1}.
• For any p ∈ Δ(K) and Q ∈ Δ(L)^K, we denote by p ⊗ Q the probability on K × L induced by p and Q, i.e., (p ⊗ Q)(k, ℓ) = p_k Q(ℓ | k) for all (k, ℓ). For any τ ∈ Δ(J)^L, we denote by P^{pQ}_τ the probability over K × L × J induced by (p, Q, τ). For every j ∈ J, P^{pQ}_τ(j) denotes the marginal probability of j, and Q_j the matrix of conditional probabilities over L given k ∈ K, conditionally on j.
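As an illustration of the shifted evaluation θ+, here is a small sketch (with hypothetical data) checking that the n-stage evaluation shifts to the (n − 1)-stage one, and that the λ-discounted evaluation is shift-invariant:

```python
from fractions import Fraction

def theta_plus(theta):
    """Shifted evaluation: theta_plus_m = theta_{m+1} / (1 - theta_1)."""
    return [t / (1 - theta[0]) for t in theta[1:]]

# n-stage evaluation theta_m = 1/n for m <= n: its shift is the (n-1)-stage one.
n = 4
assert theta_plus([Fraction(1, n)] * n) == [Fraction(1, n - 1)] * (n - 1)

# lambda-discounted evaluation theta_m = lam * (1 - lam)^(m-1): shift-invariant.
lam = Fraction(3, 10)
theta = [lam * (1 - lam) ** m for m in range(50)]   # truncated to 50 terms
assert theta_plus(theta) == theta[:-1]
```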
For any evaluation θ, any dual variable x and any matrix of conditional probabilities Q, we denote by w_θ(x, Q) the value of the dual game corresponding to the game G_θ(p, Q) from the perspective of Player 2. We can now state our main result.

Theorem 1.1 (Dual recursive formula) For all (x, Q) ∈ R^K × Δ(L)^K and θ ∈ Δ(N*) one has:

Corollary 1.2 Player 2 can construct an optimal strategy in G_θ(π) by using the dual recursive formula, starting from an appropriate pair (x, Q): namely, Q is the matrix of conditionals corresponding to π, and x belongs to the sub-differential of p → v_θ(p ⊗ Q) at the point p such that π = p ⊗ Q.

Outline of the Paper
Section 2 is devoted to introducing the duality techniques in full generality. In particular, we show how to deal with the dependent case. In Sect. 3, we introduce repeated games with incomplete information on both sides. Section 4 is devoted to proving our main results. In Sect. 5, we provide some comments on the extensions of our results to two classes of dynamic games: stochastic games and differential games.

Duality Techniques
For any pair of sets S and T and any function g: S × T → R, we denote by (S, T, g) the zero-sum game where S is the set of strategies of Player 1, T is the set of strategies of Player 2 and g is the payoff. The maxmin and minmax of (S, T, g) are given, respectively, by:

sup_{s∈S} inf_{t∈T} g(s, t)   and   inf_{t∈T} sup_{s∈S} g(s, t).

Note that ε-optimal strategies exist for all ε > 0 but not necessarily for ε = 0.
The aim of this section is to recall some properties of the dual game, introduced by De Meyer in [3,4] to study repeated games with incomplete information on one side. We follow the presentation given in [14, Chapter 2]. Throughout this section, S and T denote two convex sets, K and L are two finite sets, and G^{kℓ}: S × T → R is a payoff function for each (k, ℓ) ∈ K × L that is bi-linear and bounded, i.e., sup_{s,t} |G^{kℓ}(s, t)| < +∞.

Incomplete Information on One Side
Let us start by considering the case where L = {ℓ} is a singleton, and set G^k := G^{kℓ} for all k ∈ K to simplify the notation. To the collection of zero-sum games (henceforth, games) {(S, T, G^k), k ∈ K}, we associate two families of games, the so-called primal and dual games, parameterized in terms of a probability p ∈ Δ(K) and a vector x ∈ R^K, respectively.

The primal game G( p)
To every probability distribution p ∈ Δ(K) corresponds a game with incomplete information on one side, defined as follows:
• Before the play, k ∈ K is chosen according to p and told to Player 1.
• Then, the game (S, T, G^k) is played, i.e., Player 1 chooses s ∈ S, Player 2 chooses t ∈ T (both choices being independent and simultaneous), and the payoff is G^k(s, t).
The set of strategies of Player 1 is S^K, the set of strategies of Player 2 is T, and the payoff function is given by:

γ(p, ŝ, t) := ∑_{k∈K} p_k G^k(ŝ(k), t).

The game (S^K, T, γ(p, ·)) is denoted by G(p) and will be referred to as the primal game. The maxmin and minmax of G(p) are given, respectively, by:

v^−(p) := sup_{ŝ∈S^K} inf_{t∈T} γ(p, ŝ, t)   and   v^+(p) := inf_{t∈T} sup_{ŝ∈S^K} γ(p, ŝ, t).

Concavity and continuity The maps p → v^±(p) are concave and continuous on Δ(K).

The dual game D[G](x) To every vector x ∈ R^K corresponds the dual game D[G](x), a modified version of the primal game G(p) where Player 1 can choose the parameter k ∈ K at a cost x_k. Formally, the set of strategies of Player 1 in the dual game is Δ(K) × S^K, the set of strategies of Player 2 is T, and the payoff function is given by:

γ(p, ŝ, t) − ⟨p, x⟩, for ((p, ŝ), t) ∈ (Δ(K) × S^K) × T.

Let w^−(x) and w^+(x) denote, respectively, the maxmin and the minmax of D[G](x).

Convexity and continuity
The maps x → w^±(x) are convex and continuous on R^K. The values of the primal game and the values of the dual game are essentially Fenchel conjugates of each other. To be more precise about the link between the functions v^± and w^±, one needs to introduce two closely related convex transforms: the lower and the upper conjugates.
Recall that the Fenchel conjugate of f is given by f*(x) = sup_{y∈R^K} {⟨y, x⟩ − f(y)}. Recall also the usual definition of the superdifferential of a concave function f at p: the set of x ∈ R^K such that f(p') ≤ f(p) + ⟨x, p' − p⟩ for all p'. The key consequence of the duality (Theorem 2.2 and Corollary 2.3) is that, whenever x belongs to the superdifferential of the concave value map at p, any ε-optimal strategy of Player 2 in the dual game D[G](x) is also an ε-optimal strategy of Player 2 in G(p).
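Numerically, this conjugacy can be illustrated as follows. The sketch below replaces an actual game value by a hypothetical concave function on Δ(K) with K = {0, 1}, computes its conjugate over the simplex, and spot-checks that the latter is convex:

```python
import numpy as np

# Hypothetical concave "value function" on Delta(K), K = {0, 1}, written in
# terms of p0 = p(k = 0).  Not the value of an actual game.
def v(p0):
    return np.minimum(p0, 1 - p0)        # concave, piecewise linear

p_grid = np.linspace(0.0, 1.0, 1001)     # discretization of the simplex

# Conjugate over the simplex: w(x) = max_p { v(p) - <p, x> }.
def w(x):
    return (v(p_grid) - (p_grid * x[0] + (1 - p_grid) * x[1])).max()

# As a supremum of affine functions of x, w is convex: spot-check midpoint convexity.
rng = np.random.default_rng(0)
for _ in range(200):
    x, y = rng.normal(size=2), rng.normal(size=2)
    assert w((x + y) / 2) <= (w(x) + w(y)) / 2 + 1e-9
```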

Incomplete Information on Both Sides
Consider now the general case where K and L may contain more than one element. To the collection of games {(S, T, G^{kℓ}), (k, ℓ) ∈ K × L}, one can associate a family of games with incomplete information on both sides, as before.
The primal game G(π). For any π ∈ Δ(K × L), consider the following primal game, denoted by G(π):
• Before the play, a pair (k, ℓ) ∈ K × L is chosen according to π, Player 1 is informed of k and Player 2 is informed of ℓ.
• Then, the game (S, T, G^{kℓ}) is played, i.e., Player 1 chooses s ∈ S, Player 2 chooses t ∈ T (both choices being independent and simultaneous), and the payoff is G^{kℓ}(s, t).
In order to apply the duality techniques described above, one needs to reformulate the primal game G(π) in a slightly different way. Let p ∈ Δ(K) and Q ∈ Δ(L)^K denote, respectively, the marginal of π on K and its matrix of conditional probabilities, so that:

π(k, ℓ) = p_k Q(ℓ | k) for all (k, ℓ) ∈ K × L.

From the perspective of Player 2, the game with incomplete information on both sides G(π) can be seen as a game with incomplete information on one side, where Player 1 is the informed player.
The first primal game G_Q(p). Let Q ∈ Δ(L)^K be fixed. For any p ∈ Δ(K), consider the following game:
• Before the play, k ∈ K is chosen according to p and told to Player 1.
• Then, the game (S, T^L, G^k_Q) is played, i.e., Player 1 chooses s ∈ S, Player 2 chooses t = (t^ℓ)_{ℓ∈L} ∈ T^L (both choices being independent and simultaneous), and the payoff is:

G^k_Q(s, t) := ∑_{ℓ∈L} Q(ℓ | k) G^{kℓ}(s, t^ℓ).

The sets of strategies are thus S^K and T^L, and the payoff function is given by:

γ_Q(p, ŝ, t) := ∑_{k∈K} p_k G^k_Q(ŝ(k), t).

The maxmin and minmax are denoted, respectively, by v^−_Q(p) and v^+_Q(p), and the maps p → v^±_Q(p) are concave and continuous.
The sets S and T^L are convex and G^k_Q is bi-linear and bounded for all k ∈ K, so that a dual game can be defined, as in Sect. 2.1.
The first dual game D[G_Q](x). As before, to every x ∈ R^K corresponds a dual game. The sets of strategies are Δ(K) × S^K and T^L, and the payoff function is given by the payoff of the primal game G_Q(p) minus the cost ⟨p, x⟩. The maxmin and minmax functions x → w^±_Q(x) are convex and continuous. Theorem 2.2 and Corollary 2.3 can thus be restated accordingly.
The second primal game G_P(q) and the second dual game D[G_P](y) In games with incomplete information on both sides, the two players have similar roles. Thus, by expressing the primal game G(π) from the perspective of Player 1, one similarly defines a primal game G_P(q) for Player 1 and the corresponding dual game D[G_P](y), for all (q, P) ∈ Δ(L) × Δ(K)^L and y ∈ R^L. Analogous versions of Theorem 2.4 and Corollary 2.5 can thus be obtained.

Preliminaries
Let K, L, I, J be finite sets. For any (k, ℓ) ∈ K × L, let G^{kℓ} = (G^{kℓ}(i, j))_{(i,j)∈I×J} be an I × J matrix. A repeated game with incomplete information is described by the finite collection of matrix games {G^{kℓ}, (k, ℓ) ∈ K × L} and a probability π ∈ Δ(K × L). It is played as follows:
• A pair of parameters (k, ℓ) ∈ K × L is drawn according to π ∈ Δ(K × L). Player 1 is informed of k, Player 2 is informed of ℓ.
• Then, the game G^{kℓ} is played repeatedly: at each stage m ≥ 1, knowing the past actions, the players choose (i_m, j_m) ∈ I × J and a stage-payoff G^{kℓ}(i_m, j_m) is produced (though not observed).
The payoff of Player 1 is the expectation of ∑_{m≥1} θ_m G^{kℓ}(i_m, j_m), for some given θ ∈ Δ(N*) that is known to both players. The payoff of Player 2 is the opposite amount. We denote this game by G_θ(π).
Payoff and values A pair of strategies (ŝ, t̂) ∈ S^K × T^L and an initial probability π ∈ Δ(K × L) induce a unique probability over K × L × (I × J)^{N*} on the σ-algebra generated by the cylinders, denoted by P^π_{ŝ,t̂}. One can then write the game G_θ(π) in normal form, i.e., G_θ(π) = (S^K, T^L, γ_θ(π, ·)), where:

γ_θ(π, ŝ, t̂) := E^π_{ŝ,t̂}[ ∑_{m≥1} θ_m G^{kℓ}(i_m, j_m) ],

and where E^π_{ŝ,t̂} is the expectation with respect to P^π_{ŝ,t̂}. The following result is well-known,¹ and we omit its proof.

Lemma 3.1 For any θ ∈ Δ(N*) and any π ∈ Δ(K × L), the game G_θ(π) has a value, denoted by v_θ(π). Moreover, both players have 0-optimal strategies.
The aim of this paper is to provide an explicit construction of a pair of 0-optimal strategies. Recall that, as the two players have similar roles, we will focus on the construction for Player 2 only. For this reason, from now on the function v_θ: Δ(K × L) → R will be expressed in the following equivalent manner:

v_θ(p, Q) := v_θ(p ⊗ Q), for all (p, Q) ∈ Δ(K) × Δ(L)^K,

which is more convenient for studying the game from the perspective of Player 2. Let us start by recalling an important result, the so-called primal recursive formula, which expresses the value v_θ(p, Q) of the repeated game with incomplete information in terms of the values of the continuation game, that is, the sub-game that the players are facing after the first stage.

Primal Recursive Formula
As in Sect. 2.2, for each (p, Q) ∈ Δ(K) × Δ(L)^K, let G_θ(p, Q) denote the repeated game with incomplete information on both sides G_θ(π), expressed from the point of view of Player 2.
The aim of this section is to provide a recursive formula satisfied by the values v_θ(p, Q). The following specific notation will be used to express this result.
• (σ, τ) ∈ Δ(I)^K × Δ(J)^L denotes a pair of strategies for the first stage of the game.
• For any (p, Q) ∈ Δ(K) × Δ(L)^K, (σ, τ) ∈ Δ(I)^K × Δ(J)^L and (i, j) ∈ I × J, define p_ij ∈ Δ(K) and Q_j ∈ Δ(L)^K by setting, for all (k, ℓ):

p_ij(k) := P^{pQ}_{στ}(k | i, j)   and   Q_j(ℓ | k) := P^{pQ}_{στ}(ℓ | k, j),   (3.2)

where P^{pQ}_{στ} is the probability over K × L × I × J induced by (σ, τ, p, Q).

¹ One may apply Sion's minmax theorem to the game in mixed strategies when pure strategies are endowed with the product topology, and then apply Kuhn's theorem to deduce the result. See, e.g., Chapter 3 and Appendix A in [14], where the same method is applied in the discounted case.
The following easy result is important: in particular, it shows that Player 2 can compute (and, in fact, controls) the matrix of conditional probabilities Q_j ∈ Δ(L)^K for all j ∈ J.

Lemma 3.2 For all (k, ℓ, i, j) such that the conditional probabilities below are well defined, one has:

Q_j(ℓ | k) = Q(ℓ | k) τ^ℓ(j) / ∑_{ℓ'∈L} Q(ℓ' | k) τ^{ℓ'}(j)   and   P^{pQ}_{στ}(k, ℓ | i, j) = p_ij(k) Q_j(ℓ | k).
Proof For any (k, j) such that P^{pQ}_{στ}(k, j) > 0, a direct computation gives:

Q_j(ℓ | k) = P^{pQ}_{στ}(k, ℓ, j) / P^{pQ}_{στ}(k, j) = p_k Q(ℓ | k) τ^ℓ(j) / ( p_k ∑_{ℓ'} Q(ℓ' | k) τ^{ℓ'}(j) ) = Q(ℓ | k) τ^ℓ(j) / ∑_{ℓ'} Q(ℓ' | k) τ^{ℓ'}(j),

so that the first relation holds. Similarly, for any (k, i, j) such that P^{pQ}_{στ}(k, i, j) > 0, one obtains P^{pQ}_{στ}(ℓ | k, i, j) = Q_j(ℓ | k), since, conditionally on k, the action i carries no information about ℓ. For any (i, j) such that P^{pQ}_{στ}(i, j) > 0, disintegration then gives the second relation:

P^{pQ}_{στ}(k, ℓ | i, j) = P^{pQ}_{στ}(k | i, j) P^{pQ}_{στ}(ℓ | k, i, j) = p_ij(k) Q_j(ℓ | k).

We are now ready to state the so-called primal recursive formula, due to Mertens and Zamir [7, Section 3]. For convenience, we provide a direct and shorter proof here.

Proposition 3.3 (Primal recursive formula) For all (p, Q) ∈ Δ(K) × Δ(L)^K and θ ∈ Δ(N*):

v_θ(p, Q) = max_{σ∈Δ(I)^K} min_{τ∈Δ(J)^L} { θ_1 ∑_{k,ℓ,i,j} p_k Q(ℓ | k) σ^k(i) τ^ℓ(j) G^{kℓ}(i, j) + (1 − θ_1) ∑_{i,j} P^{pQ}_{στ}(i, j) v_{θ+}(p_ij, Q_j) },

and the same holds with the maximum and the minimum exchanged.
Proof Consider the maxmin. Let ŝ be a strategy of Player 1. At the first stage, the information available to Player 1 is k, so that σ := (ŝ(k))_k ∈ Δ(I)^K represents the strategy of Player 1 at the first stage. Similarly, the information available to Player 1 at the second stage is (k, i, j) for some pair of actions (i, j) played at the first stage, so that ŝ+ = (ŝ(k, i, j))_{k,i,j} represents the strategy of Player 1 at the second stage. One can then write ŝ = (σ, ŝ+).
Similarly, a strategy t̂ of Player 2 can be written as t̂ = (τ, t̂+), where τ ∈ Δ(J)^L and t̂+ = (t̂(ℓ, i, j))_{ℓ,i,j}. For each (i, j), let t̂+_ij be a best reply of Player 2 to the strategy ŝ+_ij := (ŝ(k, i, j))_k in the so-called continuation game, i.e., G_{θ+}(p_ij, Q_j). Then, for all (σ, τ) ∈ Δ(I)^K × Δ(J)^L one has: Player 1 can still maximize over his own first-stage strategy, and Player 2 can again play a best reply. Hence: Reversing the roles of the players one obtains, symmetrically: The result then follows from the inequality "maxmin ≤ minmax."
Comments Proposition 3.3 provides a recursive formula satisfied by the values, from the perspective of Player 2. Similarly, one can obtain a recursive formula satisfied by the values from the perspective of Player 1, expressing v_θ(q, P) := v_θ(q ⊗ P) in terms of v_{θ+}(q_ij, P_i) for any (q, P) ∈ Δ(L) × Δ(K)^L and suitably defined conditional probabilities (q_ij, P_i) ∈ Δ(L) × Δ(K)^L for all (i, j). However, neither of these recursive formulae can be used by the players, as both probability distributions p_ij and q_ij depend on the first-stage strategies of the two players. The situation contrasts with the case of repeated games with incomplete information on one side (that is, when L is a singleton), where Player 1 observes and controls the conditional probability p_i ∈ Δ(K), so that the primal recursive formula provides an explicit and recursive manner to obtain an optimal strategy for Player 1 (see [14, Section 3]).
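The updating rules above can be checked numerically. The sketch below draws hypothetical random parameters, computes Q_j in the closed form obtained in the proof of Lemma 3.2 (proportional to Q(ℓ|k) τ^ℓ(j), with no dependence on p or σ), and verifies that it coincides with the conditional law of ℓ given (k, j) under the joint distribution, whatever Player 1's strategy σ:

```python
import numpy as np

rng = np.random.default_rng(1)

def simplex(shape):
    """Random point(s) of a simplex (normalized along the last axis)."""
    a = rng.random(shape)
    return a / a.sum(axis=-1, keepdims=True)

K, L, I, J = 2, 3, 2, 2
p, Q = simplex(K), simplex((K, L))              # prior on K, conditionals on L given k
sigma, tau = simplex((K, I)), simplex((L, J))   # first-stage strategies

# Joint law P(k, l, i, j) = p_k Q(l|k) sigma^k(i) tau^l(j).
P = np.einsum('k,kl,ki,lj->klij', p, Q, sigma, tau)

j = 0
# Closed form: Q_j(l|k) proportional to Q(l|k) tau^l(j) -- no p or sigma involved.
Qj = Q * tau[:, j]
Qj /= Qj.sum(axis=1, keepdims=True)

# Cross-check against the joint law: Q_j(l|k) = P(l | k, j).
Pklj = P.sum(axis=2)                            # sum out i
assert np.allclose(Pklj[:, :, j] / Pklj[:, :, j].sum(axis=1, keepdims=True), Qj)

# Changing Player 1's strategy sigma leaves Q_j unchanged.
P2klj = np.einsum('k,kl,ki,lj->klij', p, Q, simplex((K, I)), tau).sum(axis=2)
assert np.allclose(P2klj[:, :, j] / P2klj[:, :, j].sum(axis=1, keepdims=True), Qj)
```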

Remark 3.4
The sequence of weights θ+ is not defined when θ_1 = 1, and hence neither is the value function v_{θ+}. This, however, does not matter, as the primal formula contains the term (1 − θ_1)v_{θ+}, which is then 0.

Dual Recursive Formula
The aim of this section is to prove the dual recursive formula, stated in Theorem 1.1, and to deduce an explicit construction of an optimal strategy for Player 2 in G_θ(π).

The Dual Game
Consider the game G_θ(π) described in Sect. 3 from the point of view of Player 2. For any fixed matrix of conditionals Q ∈ Δ(L)^K, consider the collection of games {G_θ(δ_k ⊗ Q), k ∈ K}, where δ_k ∈ Δ(K) is the Dirac mass at k, i.e., δ_k(k) = 1 and δ_k(k') = 0 for all k' ≠ k. Let S and T^L denote, respectively, the common sets of strategies of Players 1 and 2 in each of these games. These sets are convex, and the payoff functions (s, t̂) → γ_θ(δ_k ⊗ Q, s, t̂) are bi-linear and bounded. Hence, as in Sect. 2, one can define the corresponding dual game D[G_θ](x, Q) for Player 2, for any x ∈ R^K.
The dual game D[G_θ](x, Q). By construction, the sets of strategies of this game are Δ(K) × S^K and T^L, and the payoff function is given by:

γ_θ(p ⊗ Q, ŝ, t̂) − ⟨p, x⟩, for ((p, ŝ), t̂) ∈ (Δ(K) × S^K) × T^L.

By Lemma 3.1 and Theorem 2.4, this game has a value w_θ(x, Q), and the mappings x → w_θ(x, Q) are convex and continuous.
The following notation will be used in the proof of the dual recursive formula:

Let us recall the statement of the theorem:
For all (x, Q) ∈ R^K × Δ(L)^K and θ ∈ Δ(N*) one has: Furthermore, we will prove that the minimum in (x_ij)_{i,j} is reached in the set B^{I×J}.

Remark 4.1
Again, there is no need to define the value function w_{θ+} when θ_1 = 1, since in this case the term (1 − θ_1)w_{θ+} is equal to 0 by convention.
Proof On the one hand, by Proposition 3.3 one has the primal recursive formula: where p_ij ∈ Δ(K) and Q_j ∈ Δ(L)^K are defined in (3.2) and where, by Lemma 3.2, Q_j does not depend on (p, σ). On the other hand, by the duality results presented in Sect. 2, namely Theorem 2.4, one has: Replacing v_θ(p, Q) by its expression in the primal recursive formula, one obtains: where μ_K is the marginal of μ on K and μ_ij is the conditional on K given (i, j), i.e., Consider the one-shot game with action sets Δ(K × I) and Δ(J)^L and payoff function: Clearly, F[θ, x, Q] is continuous on the compact set Δ(K × I) × Δ(J)^L, its first and last terms are linear in μ and τ and, as we have already shown, one has: Therefore, one can apply Sion's minmax theorem [13] as soon as we prove that the following function is concave-convex: First, let us recall that, by Theorem 2.4, the following relation holds for all (i, j): Recall that any concave function ϕ on Δ(K) which is ‖G‖-Lipschitz with respect to the norm ‖·‖₁ can be extended to a concave Lipschitz function φ̂ on the whole space R^K having the same Lipschitz constant, by defining φ̂(x) = sup_{y∈R^K} {ϕ(y) − ‖G‖ ‖y − x‖₁}. By construction, φ̂ admits super-gradients at every point, and these belong to the compact convex set B := {x ∈ R^K : ‖x‖_∞ ≤ ‖G‖}. Since ϕ = φ̂ on Δ(K), for every p ∈ Δ(K) the super-gradients of φ̂ at p are super-gradients of ϕ at p, and therefore ϕ admits super-gradients in B at every point of Δ(K). According to Fenchel's lemma, the set of minimizers of the right-hand side of (4.1) is exactly the set of super-gradients of the concave mapping p' → v_{θ+}(p', Q_j) at p' = μ_ij. Since this mapping is ‖G‖-Lipschitz with respect to the norm ‖·‖₁ on its domain Δ(K), it admits a super-gradient in the compact convex set B at μ_ij, which is therefore a minimizer of the right-hand side of (4.1).
We deduce that the minimum in (4.1) may equivalently be taken over the compact set B. Hence, replacing this expression, one can write: Since μ → P^Q_{μτ} is affine, we deduce from the above expression that μ → f(μ, τ) is concave, as an infimum of affine functions.
To prove the convexity of τ → f(μ, τ), we will consider the primal game G_θ(π) from the point of view of Player 1. For any q ∈ Δ(L) and P ∈ Δ(K)^L, denote this game by G_θ(q, P) and let v_θ(q, P) := v_θ(q ⊗ P) denote its value. Using this notation, for each (i, j) one can write:

v_{θ+}(μ_ij, Q_j) = v_{θ+}(q_ij, P_i),

where q_ij ∈ Δ(L) and P_i ∈ Δ(K)^L. In particular, P_i does not depend on j or τ (just as Q_j depends on neither i nor σ). Explicitly, for all (k, ℓ, i, j) one has:
Use the duality techniques of Sect. 2 to define, for each y ∈ R^L, the dual game D[G_θ](y, P), and denote its value by w_θ(y, P) := sup_{q∈Δ(L)} {v_θ(q, P) − ⟨q, y⟩}. For each (i, j) one then has: so that: The mappings τ → P^Q_{μτ}(i, j) and τ → P^Q_{μτ}(i, j, ℓ) being affine, the previous expression shows that τ → f(μ, τ) is convex, as a supremum of affine functions. Therefore, one can indeed apply Sion's minmax theorem. Exchanging the maximum and the minimum one obtains:
Again, in order to apply Sion's minmax theorem to exchange the order of the maximum and the infimum, one needs to check that the mapping (μ, x) → g(μ, x) is concave-convex, where x := (x_ij)_{i,j} ∈ B^{I×J} and: This property follows from the fact that the mappings μ → P^Q_{μτ}(i, j) and μ → P^Q_{μτ}(i, j, k) are affine, and that the map x → w_{θ+}(x, Q_j) is convex. We thus obtain: Since the expression above is affine with respect to μ, we can consider, without loss of generality, the maxima at extreme points: To conclude, note that the minimum over B^{I×J} can be replaced by a minimum over (R^K)^{I×J}, since the above proof remains valid if we replace v_{θ+}(μ_ij, Q_j) by the expression given in (4.1) instead of the one given in (4.2).

Remark 4.2
The proof of Theorem 1.1 follows the main lines of [12], but there is a crucial point where an obstacle arises, namely in proving that the function f is concave-convex. Unlike the independent case, where the proof relies deeply on the fact that (p, q) → v_{θ+}(p, q) is a concave-convex function of independent probabilities p ∈ Δ(K) and q ∈ Δ(L), the arguments of v_{θ+}(μ_ij, Q_j) are not independent of each other, and one thus needs to use the duality techniques for the dependent case, which are more sophisticated. It is on this point that our proof diverges from the one in [12].

Construction of an Optimal Markovian Strategy
In this section, we deduce from Theorem 1.1 and Corollary 2.5 the construction of an optimal strategy for Player 2 in the game G_θ(π). The strategy is Markovian, in the sense that it depends on the past history only through the (updated) variable (x, Q, θ).
For all (θ, x, Q) ∈ Δ(N*) × R^K × Δ(L)^K, let us denote by S_θ(x, Q) the set of minimizers (τ, x) in the dual recursive formula, where x = (x_ij)_{i,j} and where we use the notation of the previous sections. An optimal strategy for Player 2 in G_θ(π) can then be constructed, recursively, as follows.
Case 1 If θ_1 = 1, play τ ∈ Δ(J)^L which is optimal in the one-shot dual game D[G_θ](x, Q).
Case 2 If θ_1 < 1, play as follows:
-Compute (τ, x) ∈ S_θ(x, Q) optimal in the dual recursive formula at (x, Q, θ).
-Choose j ∈ J with probability τ^ℓ, where ℓ ∈ L is Player 2's private type.
-Observe i ∈ I and update the triplet (x, Q, θ) to (x_ij, Q_j, θ+).
This strategy is optimal in the dual game D[G_θ](x, Q), thanks to Theorem 1.1. By Corollary 2.5 and the choice of (x, Q), the strategy is also optimal in the game G_θ(π). Furthermore, it is Markovian. Indeed, at every stage m ≥ 1, the mixed action τ_m ∈ Δ(J)^L of Player 2 depends only on a triplet of variables (x^(m−1), Q^(m−1), θ^(m−1)) ∈ R^K × Δ(L)^K × Δ(N*) constructed recursively as follows. For m = 0 set (x^(0), Q^(0), θ^(0)) := (x, Q, θ), and for m ≥ 1 set (x^(m), Q^(m), θ^(m)) := ((x^(m−1))_{i_m j_m}, (Q^(m−1))_{j_m}, (θ^(m−1))+), where (i_m, j_m) is the pair of actions played at stage m.
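The state-update step of this Markovian strategy can be sketched as follows; the minimization defining S_θ(x, Q) is not implemented here, so the stage strategy τ and the dual variables x_ij are placeholders standing for a solver's output:

```python
import numpy as np

def update_state(x_table, Q, theta, tau, i, j):
    """After actions (i, j), update (x, Q, theta) -> (x_ij, Q_j, theta_plus).

    `tau` and `x_table[i][j]` are assumed to come from a solver of the dual
    recursive formula (hypothetical placeholders in this sketch)."""
    # Q_j(l|k) proportional to Q(l|k) tau^l(j): computable by Player 2 alone.
    Qj = Q * tau[:, j]
    Qj /= Qj.sum(axis=1, keepdims=True)
    # Shifted evaluation: theta_plus_m = theta_{m+1} / (1 - theta_1).
    theta_plus = [t / (1 - theta[0]) for t in theta[1:]]
    return x_table[i][j], Qj, theta_plus

# Demo with placeholder data (K = L = I = J = 2, 4-stage evaluation).
Q = np.array([[0.5, 0.5], [0.2, 0.8]])
tau = np.array([[0.6, 0.4], [0.3, 0.7]])
x_table = [[np.zeros(2) for _ in range(2)] for _ in range(2)]
x1, Q1, theta1 = update_state(x_table, Q, [0.25] * 4, tau, i=0, j=1)
```

Iterating this update along the observed actions yields exactly the trajectory (x^(m), Q^(m), θ^(m)) described above.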

Stochastic Games with Incomplete Information
Consider a game with incomplete information over a finite set of states S, where each state represents a different state of the world. Formally, let G^{kℓ}: S × I × J → R denote a payoff function for each pair of types (k, ℓ) ∈ K × L, depending not only on the players' actions but also on the state, and let ρ: S × I × J → Δ(S) denote a transition kernel. For any θ ∈ Δ(N*), any π ∈ Δ(K × L) and any initial state s_1 ∈ S, the stochastic game with incomplete information on both sides, denoted by G_θ(π; s_1), is played as follows:
• First, a pair of parameters (k, ℓ) ∈ K × L is drawn according to π ∈ Δ(K × L). Player 1 is informed of k, Player 2 is informed of ℓ.
• At each stage m ≥ 1, knowing the current state s_m ∈ S and the past actions, the players choose actions (i_m, j_m) ∈ I × J. A stage-payoff G^{kℓ}(s_m, i_m, j_m) is produced (though not observed) and a new state s_{m+1} ∈ S is drawn with the probability distribution ρ(s_m, i_m, j_m).
The payoff of Player 1 is the expectation of ∑_{m≥1} θ_m G^{kℓ}(s_m, i_m, j_m), while the payoff of Player 2 is the opposite amount.
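For a fixed pair of types (k, ℓ), the play of such a stochastic game can be simulated directly; the sketch below uses hypothetical random payoffs and transitions, and a fixed sequence of actions in place of actual strategies:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data for one fixed pair of types (k, l): 2 states, 2x2 actions.
S, I, J = 2, 2, 2
G = rng.normal(size=(S, I, J))             # stage payoffs G^{kl}(s, i, j)
rho = rng.random((S, I, J, S))
rho /= rho.sum(axis=-1, keepdims=True)     # transition kernel rho(s, i, j) in Delta(S)

def weighted_payoff(theta, s1, actions):
    """theta-weighted payoff along a play (i_m, j_m)_m from the initial state s1."""
    s, total = s1, 0.0
    for w, (i, j) in zip(theta, actions):
        total += w * G[s, i, j]
        s = rng.choice(S, p=rho[s, i, j])  # draw the next state
    return total

payoff = weighted_payoff([0.25] * 4, s1=0, actions=[(0, 1), (1, 0), (0, 0), (1, 1)])
assert abs(payoff) <= np.abs(G).max()      # theta sums to 1, so the payoff is bounded
```

Note that the state evolves through the transition kernel but, unlike the types, it is publicly observed, which is why the incomplete information may bear on G but not on ρ.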

Remark 5.1
The case where the set of states S = {s} is a singleton corresponds to repeated games with incomplete information on both sides. In this sense, stochastic games with incomplete information extend our previous model.
Let p ∈ Δ(K) and Q ∈ Δ(L)^K be such that π = p ⊗ Q, and let s ∈ S be an initial state. The existence of the value for stochastic games with incomplete information is well known, and we omit its proof. Let v_θ(π, s) and w_θ(x, Q, s) denote, respectively, the values of G_θ(π; s) and of the corresponding first dual game. The dual recursive formula obtained in Theorem 1.1 can be extended, word for word, to stochastic games with incomplete information on both sides in the dependent case.
Notation In the following result, we use the notation introduced earlier. Moreover, for each (Q, τ) ∈ Δ(L)^K × Δ(J)^L and (s, k, i) ∈ S × K × I, we set:

Corollary 5.3
Player 2 can construct an optimal strategy in G_θ(π; s_1) by using the dual recursive formula, starting from an appropriate pair (x, Q): namely, Q is the matrix of conditionals corresponding to π, and x belongs to the sub-differential of p → v_θ(p ⊗ Q, s_1) at the point p such that π = p ⊗ Q.

Differential Games with Incomplete Information
Differential games with incomplete information were introduced by Cardaliaguet [2]. As in repeated games with incomplete information, before the game starts, a pair of parameters (k, ℓ) is drawn according to some commonly known probability distribution π on K × L. Player 1 is informed of k and Player 2 of ℓ. Then, a differential game is played in which the dynamics and the payoff function depend on both types: each player is thus only partially informed about the differential game that is played. The existence and characterization of the value function were established by Cardaliaguet [2] in the independent case, and extended to the general case by Oliu-Barton [9]. The proof relies on the geometry of the value function (I-concavity and II-convexity) and on a sub-dynamic programming principle satisfied by its Fenchel conjugates (i.e., the values of the first and the second dual games). Though useful for establishing the existence of the value for differential games with incomplete information, the sub-dynamic programming principles satisfied by the values of the dual games do not yield a construction of optimal strategies for these games, which remains an open problem. Establishing a continuous-time analogue of the dual recursive formula (i.e., Theorem 1.1) would be a natural way to solve it.