Acyclic Gambling Games

We consider 2-player zero-sum stochastic games where each player controls his own state variable living in a compact metric space. The terminology comes from gambling problems where the state of a player represents his wealth in a casino. Under natural assumptions (such as continuous running payoff and non expansive transitions), we consider for each discount factor the value $v_\lambda$ of the $\lambda$-discounted stochastic game and investigate its limit when $\lambda$ goes to 0. We show that under a strong acyclicity condition, the limit exists and is characterized as the unique solution of a system of functional equations: the limit is the unique continuous excessive and depressive function such that each player, if his opponent does not move, can reach the zone where the current payoff is at least as good as the limit value, without degrading the limit value. The approach generalizes and provides a new viewpoint on the Mertens-Zamir system coming from the study of zero-sum repeated games with lack of information on both sides. A counterexample shows that under a slightly weaker notion of acyclicity, convergence of $(v_\lambda)$ may fail.


Introduction
The model of zero-sum stochastic games was introduced by [Shapley, 1953]. A state variable ω ∈ Ω follows a controlled Markov chain with transitions Q(ω'|i, j, ω) controlled by the actions of two competing players (i ∈ I for player 1 and j ∈ J for player 2). Shapley assumed the action and state spaces (I, J and Ω) to be finite and proved the existence of the value v λ of the λ-discounted game using a dynamic programming principle, and characterized v λ as the unique fixed point of what has been called the Shapley operator [Rosenberg and Sorin, 2001] and [Sorin, 2002]. For a recent survey, see [Laraki and Sorin, 2014]. [Bewley and Kohlberg, 1976], using algebraic tools, proved the existence of the asymptotic value v = lim λ→0 v λ . Actually, when action and state spaces are finite, the equations that define v λ may be described by finitely many polynomial inequalities, implying that v λ is semi-algebraic and hence converges. The extension of this result to infinite stochastic games is a central question in mathematical game theory (see [Mertens et al., 2015], [Sorin and Neyman, 2003] and [Sorin, 2002]).
Recently, several important conjectures (see [Mertens, 1986], [Mertens et al., 2015]) were proved to be false. [Vigeral, 2013] and [Ziliotto, 2016a] provided examples where the family {v λ } diverges as λ approaches zero. In Vigeral, the state space Ω is finite and the action sets I and J are semi-algebraic. In Ziliotto, the set of actions is finite but the state space Ω is compact, and can be seen as the space of common beliefs on a finite state variable, controlled but not observed by the players.
On the other hand, there are many classes of stochastic games with general state space and action sets where {v λ } converges. Many have in common some irreversibility in the transitions. In recursive games [Everett, 1957, Sorin and Vigeral, 2013] the current payoff is zero until the game is absorbed. In absorbing games ([Kohlberg, 1974], [Mertens et al., 2009] and [Rosenberg and Sorin, 2001]) there is only one non-absorbing state. In repeated games with incomplete information (see [Aumann et al., 1995], [Mertens and Zamir, 1971] and [Rosenberg and Sorin, 2001]) once a player reveals some information, he cannot withdraw it. Similarly in splitting games ([Laraki, 2001b], [Laraki, 2001a] and [Sorin, 2002]) the state follows a martingale which eventually converges. Interestingly, in all those classes of "irreversible" stochastic games, we have not only convergence but also an explicit characterization of the asymptotic value. This leads many to anticipate that irreversibility has to do with convergence.
Our paper provides a weak and a strong definition of irreversibility (which we call acyclicity) and proves that they constitute the frontier between convergence and divergence of {v λ }: strong acyclicity guarantees convergence while the closely related weak acyclicity does not. To do so, we restrict ourselves to a new class which embeds any product stochastic game [Flesch et al., 2008] and naturally extends gambling houses from one to two players.
A classical gambling house problem has three ingredients: a metric state space S, a Borel-measurable utility function u : S → IR, and a gambling house Φ, where Φ is a set-valued function that assigns to each s ∈ S a subset Φ(s) of ∆(S) (the set of Borel probability distributions over S). At each stage t, given the state s_t, the decision maker gets the reward u(s_t), chooses p_t ∈ Φ(s_t), and the state moves to s_{t+1} according to the probability distribution p_t. The gambling house is called leavable if for every s ∈ S, δ_s (the Dirac mass at s) belongs to Φ(s). This model was introduced in [Dubins and Savage, 1965], and was studied extensively by several authors, for instance [Maitra and Sudderth, 1996].
In a gambling game, each player controls his own gambling house: Γ : X ⇒ ∆(X) for Player 1 and Λ : Y ⇒ ∆(Y) for Player 2, and the utility function is now u : X × Y → IR with the convention that Player 1 wants to maximize u whereas Player 2 wants to minimize u. At each stage t, both players know the state ω_t = (x_t, y_t); simultaneously, Player 1 chooses p_t in Γ(x_t) and Player 2 chooses q_t in Λ(y_t), the stage payoff is u(x_t, y_t), and a new state (x_{t+1}, y_{t+1}) is selected according to the probability distribution p_t ⊗ q_t.
It is well known that any MDP (Markov Decision Process) can be mapped to a gambling house (by encoding actions in the state space) and any positive MDP to a leavable gambling house, see for instance [Dubins et al., 2002], [Maitra and Sudderth, 1996] and [Schal, 1989]. Thus, any product stochastic game [Flesch et al., 2008] can be mapped to a gambling game.
For each λ ∈ (0, 1], one can define the λ-discounted game where the stream of payoffs is evaluated according to Σ_{t≥1} λ(1 − λ)^{t−1} u(x_t, y_t). In economics, λ is usually called the discount rate, 1 − λ = 1/(1 + r) = δ is called the discount factor and r is the interest rate. Hence, λ small means the player is patient and optimizes over the long term.
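The normalisation by λ makes the stage weights sum to one, so the discounted payoff is a weighted average of stage payoffs. A minimal sketch of this evaluation on a finite stream of payoffs (the function name and the truncation horizon are illustrative assumptions, not part of the model):

```python
def discounted_payoff(stage_payoffs, lam):
    """Normalised lambda-discounted sum of a finite stream u_1, u_2, ...:
    sum over t >= 1 of lam * (1 - lam)**(t - 1) * u_t."""
    # enumerate starts at 0, which plays the role of the exponent t - 1
    return sum(lam * (1 - lam) ** k * u for k, u in enumerate(stage_payoffs))

lam = 0.1
horizon = 500  # illustrative truncation of the infinite stream

# With constant payoff 1 the weights add up to 1 - (1 - lam)**horizon, close to 1.
weights_total = discounted_payoff([1.0] * horizon, lam)

# Interest rate r solving 1 - lam = 1 / (1 + r).
interest_rate = lam / (1 - lam)
```

A smaller λ shifts weight towards later stages, which is the sense in which a small λ describes a patient player.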
Under some classical regularity assumptions, the λ-discounted game will have a value v λ and the family {v λ } will be equi-continuous. The central questions this paper is concerned with are: when does lim λ→0 v λ exist and, if so, how can it be characterized from the data of the game? A characterization is important if one wants to compute the limiting value. Computing it is a difficult question that this paper will not address. Let us just mention that computing the convex envelope of a given function, a very special case of our characterization, is already an NP-hard problem and there are, to our knowledge, only a few algorithms that approximate it (see for instance [Laraki and Lasserre, 2008]).
Our first main result shows that if at least one of the gambling houses Γ or Λ is strongly acyclic (definition 2.13), {v λ } uniformly converges to a function v as λ goes to 0. Moreover, we provide several characterizations of the asymptotic value v that extend the well known Mertens-Zamir system of functional equations [Mertens and Zamir, 1971]. Our second result proves that under a slightly weaker notion of acyclicity, {v λ } may diverge (even if both houses Γ and Λ are weakly acyclic and both state spaces X and Y are finite). The example has similarities with an example in [Ziliotto, 2016a] for stochastic games where both players control the same state variable. Our example is the first in the class of product stochastic games, and is somewhat simpler than the recent counterexamples.
Finally, in the appendix, under an idempotent assumption combined with a bounded variation hypothesis on the transitions, we prove existence of the uniform value and provide simple uniform optimal strategies, extending a recent result in [Oliu-Barton, 2017] for the splitting game. The convergence of the discounted values in splitting games and the link with the Mertens-Zamir system of equations was proved in [Laraki, 2001b] and [Laraki, 2001a].

Notations
Given a compact metric space S, we denote by B(S), resp. by C(S), the set of bounded measurable, resp. continuous, functions from S to the reals, and by ∆(S) the set of Borel probabilities over S. For s in S, we denote by δ_s ∈ ∆(S) the Dirac measure on s, and whenever possible we assimilate s and δ_s. For v in B(S), we denote by ṽ its affine extension to ∆(S): ṽ(p) = IE_p(v) for all p in ∆(S), where IE_p(v) := ∫_S v(s) dp(s) is the expectation of v with respect to p. ∆(S) is endowed with the weak-* topology, a compatible distance being the Kantorovich-Rubinstein (or Wasserstein of order 1) metric: d_KR(p, p') = sup_{v∈E_1} |ṽ(p) − ṽ(p')|, where E_1 is the set of 1-Lipschitz functions on S. When there is no confusion, ṽ(p) will also be denoted by v(p).
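To make the Kantorovich-Rubinstein metric concrete, here is a small sketch for the special case where S is a finite subset of the real line with the usual distance. It relies on the classical one-dimensional identity that the Wasserstein distance of order 1 equals the integral of the absolute difference of the two distribution functions. The function name and the dictionary representation are illustrative assumptions.

```python
from fractions import Fraction

def kr_distance_on_line(p, q):
    """Kantorovich-Rubinstein (Wasserstein-1) distance between two finitely
    supported probabilities on the real line, given as {point: mass} dicts.
    Uses d_KR(p, q) = integral of |F_p(t) - F_q(t)| dt, valid in dimension one."""
    points = sorted(set(p) | set(q))
    total = 0
    cdf_gap = 0  # F_p - F_q on the interval currently being integrated
    for left, right in zip(points, points[1:]):
        cdf_gap += p.get(left, 0) - q.get(left, 0)
        total += abs(cdf_gap) * (right - left)
    return total

dirac_0 = {0: Fraction(1)}
dirac_1 = {1: Fraction(1)}
half_half = {0: Fraction(1, 2), 1: Fraction(1, 2)}
```

On Dirac masses the distance reduces to the distance between the points, which is consistent with identifying s and δ_s: `kr_distance_on_line(dirac_0, dirac_1)` is 1, while moving half of the mass from 0 to 1 costs 1/2.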

Model and examples
A gambling game is a zero-sum stochastic game where each player controls his own state variable. We will always assume in this paper that the state spaces are non empty metric and compact, and denote by X and Y the respective sets of states controlled by Player 1 and by Player 2. The transitions of Player 1 are given by a continuous multifunction Γ : X ⇒ ∆(X) with non empty convex compact values: if the state of Player 1 is at x, he can select his new state according to any probability in Γ(x). Similarly, a continuous multifunction Λ : Y ⇒ ∆(Y) with non empty convex compact values gives the transitions of Player 2. The players independently control their own state, and only interact through payoffs: the running payoff of Player 1 is given by a continuous mapping u : X × Y → IR, and the payoff to Player 2 is given by −u.
Gambling games extend the model of gambling houses [Dubins and Savage, 1965], which correspond to the single player case when Y is a singleton and Player 2 plays no role. It is well known that, by an adequate increase of the state space in order to encompass actions, any MDP can be mapped into a gambling house (see [Dubins et al., 2002], [Maitra and Sudderth, 1996] and [Schal, 1989]).
A standard gambling house is the red-and-black casino where X = [0, 1] is a fortune space. Suppose that at each fortune x ≥ 0, the gambler can stake any amount s in her possession. The gambler loses the stake with probability 1 − w where w ∈ (0, 1) is fixed and given, and wins back the stake and an additional equal amount with probability w. The corresponding transition multifunction reads: Γ(x) = {w δ_{min(x+s, 1)} + (1 − w) δ_{x−s} : 0 ≤ s ≤ x}.
More generally, a casino is a gambling house Γ on X = [0, +∞) in which "a rich gambler can do whatever a poor one can do" and a "poor gambler can, on a small scale, imitate a rich one." Formally, for x ∈ X, let Θ(x) := {θ ∈ ∆(R) : ∃γ ∈ Γ(x) such that γ = [θ + x]}, where [aθ + b], for reals a and b, is the probability measure such that for all u ∈ B(X), ∫_X u(x) d[aθ + b](x) = ∫_X u(ax + b) dθ(x). With those notations, the gambling house Γ is a casino if for all x ≥ 0 and 0 ≤ t ≤ 1, [tΘ(x)] ⊂ Θ(tx) ⊂ Θ(x).
A fundamental result of Dubins and Savage (1965) is the classification of casinos into four types: trivial, subfair, fair, and superfair. Here we need only to distinguish superfair casinos from those that are not superfair. A casino is superfair if there is θ ∈ Θ(x) for some x > 0 such that ∫_X x dθ(x) > 0. The red-and-black casino is superfair if and only if w > 1/2. Observe that a casino is superfair if the player has a strategy that wins in expectation against the casino for at least one x > 0 (and so for every x > 0 given the definition of a casino).
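As a quick check of the superfair criterion for red-and-black, one can compute the mean of the fortune increment. The sketch below assumes that staking s corresponds to the increment law θ_s = w δ_s + (1 − w) δ_{−s}, and ignores any truncation at the upper end of the fortune space; the symbol θ_s is introduced here for illustration only.

```latex
\int x \, d\theta_s(x) \;=\; w\,s + (1-w)\,(-s) \;=\; (2w-1)\,s ,
\qquad\text{hence}\qquad
\exists\, s>0:\ \int x \, d\theta_s(x) > 0 \iff w > \tfrac12 .
```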
Another class of gambling houses are splitting problems where X = ∆(K) is a simplex (K is a finite set) and Γ(x) is the set of Borel probabilities σ on X centered at x. The idea of splitting was introduced by Aumann and Maschler [Aumann et al., 1995] in the context of repeated games with incomplete information on one side. This "gambling house" type of problem is now very popular in the persuasion and information design literature, see [Kamenica and Gentzkow, 2011].
The above two gambling houses naturally extend to gambling games. One can consider a casino game where each player i controls a red-and-black house with parameter w i , and the running payoff depends on the current pair of fortunes. Another example is a splitting game introduced in [Laraki, 2001b], [Laraki, 2001a], and [Sorin, 2002] where X = ∆(K) and Y = ∆(L) are simplexes, Γ(x) is the set of Borel probability measures on X that are centered at x and Λ(y) is the set of Borel probability measures on Y that are centered at y.

Discounted Evaluations
Given a discount factor λ ∈ (0, 1] and an initial state (x_1, y_1) in X × Y, the game G_λ(x_1, y_1) is played as follows: at any stage t ≥ 1, the payoff to Player 1 is u(x_t, y_t) and, both players knowing (x_t, y_t), simultaneously Player 1 chooses p_{t+1} in Γ(x_t) and Player 2 chooses q_{t+1} in Λ(y_t). Then, x_{t+1} and y_{t+1} are independently selected according to p_{t+1} and q_{t+1}, the new states x_{t+1} and y_{t+1} are publicly announced, and the play goes to stage t + 1. Under our assumptions of compact state spaces, continuous transitions with convex compact values and continuous running payoff, it is easy to describe the value of such a dynamic game.
This is the standard characterization of the value of a discounted game by means of the Shapley operator (a sort of dynamic programming principle). Existence and uniqueness of v λ follow from standard fixed-point arguments (see for instance [Mertens et al., 2015], [Rosenberg and Sorin, 2001]). We refer to v λ (x, y) as the value of the game G λ (x, y).
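For reference, the fixed-point characterization alluded to here takes, in its standard form for this model, the shape below. This is a sketch written from the surrounding definitions, with ṽ_λ(p, q) denoting the expectation of v_λ under p ⊗ q; the exact wording of definition 2.1 is an assumption.

```latex
v_\lambda(x,y)
 \;=\; \max_{p\in\Gamma(x)}\ \min_{q\in\Lambda(y)} \Bigl(\lambda\,u(x,y) + (1-\lambda)\,\tilde v_\lambda(p,q)\Bigr)
 \;=\; \min_{q\in\Lambda(y)}\ \max_{p\in\Gamma(x)} \Bigl(\lambda\,u(x,y) + (1-\lambda)\,\tilde v_\lambda(p,q)\Bigr),
 \qquad (x,y)\in X\times Y .
```

Letting λ go to 0 along a uniformly convergent subsequence removes the payoff term and leaves a max-min equality on the limit alone.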
The goal of the paper is to study the convergence of (v λ ) λ when λ goes to 0, i.e. when players become more and more patient.
Remark 2.2. Cesàro Evaluations. It is also standard to define the value of the n-stage games by: v_1 = u, and for n ≥ 1 and (x, y) ∈ X × Y, v_{n+1}(x, y) = (1/(n+1)) max_{p∈Γ(x)} min_{q∈Λ(y)} (u(x, y) + n ṽ_n(p, q)) = (1/(n+1)) min_{q∈Λ(y)} max_{p∈Γ(x)} (u(x, y) + n ṽ_n(p, q)). It is known that the uniform convergence of (v_n)_n when n goes to infinity is equivalent to the uniform convergence of (v_λ)_λ when λ goes to 0, and in case of convergence both limits are the same (Theorem 2.2 in [Ziliotto, 2016b] applies here).

Non expansive transitions
Without further assumptions, convergence of (v λ ) may fail even in the simple case where Γ and Λ are single-valued ("0 player case", players have no choice), so we will assume throughout the paper that the gambling game is non expansive, i.e. has non expansive transitions.
The gambling game has non expansive transitions if the transitions, viewed as mappings from X to 2^{∆(X)}, and from Y to 2^{∆(Y)}, are 1-Lipschitz for the Hausdorff distance on compact subsets of ∆(X) and ∆(Y). Note that the transitions are always non expansive when X and Y are finite. Moreover splitting games are non expansive [Laraki, 2001b], and red-and-black casino houses with parameter w are non expansive if and only if w ≤ 1/2. More generally, a casino is non expansive if and only if it is not superfair [Laraki and Sudderth, 2004].
Let us mention also Markov chain repeated games with incomplete information where each player observes a private and exogenous Markov chain. These repeated games lead to gambling houses with transitions of the form: X is a simplex ∆(K), and Γ(x) = {pM, p ∈ ∆(X) centered at x} with M a fixed stochastic matrix. Here again, transitions are non expansive, see [Gensbittel and Renault, 2015].
Let us mention immediately an important consequence of the non expansive assumption. The proof is in the Appendix.
This proposition extends to two players a similar result in [Laraki and Sudderth, 2004] on gambling houses where it is proved that non-expansivity is necessary and sufficient to guarantee equi-continuity of the values. As a consequence, pointwise and uniform convergence of {v λ } are equivalent, and since X × Y is compact, to prove convergence of {v λ } it is enough to prove uniqueness of a limit point.
Remark 2.5. It is not difficult to see that without non-expansivity, {v λ } may not be equicontinuous and the convergence may not be uniform. For instance in a red-and-black casino with a single player, if the parameter w > 1/2 and u(x) = x, v λ is continuous for every λ but v = lim λ→0 v λ is not: v(x) = 0 for x = 0 and v(x) = 1 for x > 0.

Excessive, depressive and balanced functions
Observe that any uniform limit v of (v λ ) λ∈(0,1] is necessarily continuous and balanced (by passing to the limit in definition 2.1).
In a splitting game, excessive means concave with respect to the first variable, and depressive means convex with respect to the second variable.
The gambling game is leavable if each player can remain in any given state. This is a standard assumption [Dubins and Savage, 1965]. This is the case in casinos and splitting games. In the persuasion literature and in repeated games with incomplete information, not moving means revealing no information.
Remark 2.8. If the game is leavable, any excessive and depressive function v in B(X × Y) is balanced (the converse is not true, as example 2.9 shows). Indeed, since the game is leavable, δ_y ∈ Λ(y) and so max_{p∈Γ(x)} min_{q∈Λ(y)} ṽ(p, q) ≤ max_{p∈Γ(x)} ṽ(p, y) = v(x, y), the equality being due to excessivity. By symmetry, min_{q∈Λ(y)} max_{p∈Γ(x)} ṽ(p, q) ≥ v(x, y). Since maxmin ≤ minmax in every game, the value exists and (δ_x, δ_y) is a saddle point, implying in particular that v is balanced.
Example 2.9. Consider a gambling game where players 1 and 2 move on the same finite grid of a circle containing 6 nodes in equidistant positions. Any player can move one step to the left, one step to the right, or not move (and choose randomly between these 3 options, so that transitions have convex values). This game is leavable. It is here possible for a player to go from any state to any other state in at most 3 stages (the game may be called cyclic), so any excessive and depressive function is necessarily constant. Suppose that Player 1's payoff is 1 if he is at most one step away from Player 2, and Player 1's payoff is 0 otherwise. If the players start at a distance at most 1, Player 1 can guarantee this property will hold forever by not moving or moving one step in the direction of Player 2, and so in this case we have v λ = 1 for every λ. On the other hand, if the players start at a distance at least 2, Player 2 can ensure that this property will hold forever, by not moving or moving one step in the opposite direction of Player 1. For these initial states, v λ = 0 for every λ. Here, v = lim v λ is continuous and balanced, but neither excessive nor depressive.
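The two invariance claims of this example can be checked by brute force. The sketch below enumerates all states of the 6-node circle and all pure moves of the opponent; since mixed moves are convex combinations of pure ones, checking pure moves is enough. The particular pure strategies coded here (`step_towards`, `step_away`) are one possible choice among those described in the example, and the names are illustrative.

```python
from itertools import product

N = 6                 # number of nodes on the circle
MOVES = (-1, 0, 1)    # one step left, stay, one step right

def dist(x, y):
    """Graph distance between two nodes of the 6-cycle."""
    d = (x - y) % N
    return min(d, N - d)

def step_towards(x, y):
    """Player 1's move when dist(x, y) <= 1: land on Player 2's current node."""
    return next(m for m in MOVES if (x + m) % N == y)

def step_away(y, x):
    """Player 2's move when dist(x, y) >= 2: stay if opposite, else step away."""
    if dist(x, y) == 3:
        return 0
    return max(MOVES, key=lambda m: dist(x, (y + m) % N))

# Starting at distance <= 1, Player 1 keeps the distance <= 1 whatever Player 2 does.
close_is_invariant = all(
    dist((x + step_towards(x, y)) % N, (y + m2) % N) <= 1
    for x, y, m2 in product(range(N), range(N), MOVES)
    if dist(x, y) <= 1
)

# Starting at distance >= 2, Player 2 keeps the distance >= 2 whatever Player 1 does.
far_is_invariant = all(
    dist((x + m1) % N, (y + step_away(y, x)) % N) >= 2
    for x, y, m1 in product(range(N), range(N), MOVES)
    if dist(x, y) >= 2
)
```

Both flags evaluate to `True`, which is the content of the statements v λ = 1 on the first set of initial states and v λ = 0 on the second.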
Definition 2.10. A gambling game is standard if both state spaces X and Y are compact metric, the running payoff u is continuous, and the transitions Γ and Λ have non empty convex compact values and are leavable and non expansive.
Throughout the paper, we consider only standard gambling games.

Acyclicity
We now come to the main new conditions of the paper.
The gambling house Γ of player 1 is weakly acyclic if there exists ϕ in B(X) lower semi-continuous such that: Similarly, the gambling house Λ of player 2 is weakly acyclic if there exists ψ in B(Y ) upper semi-continuous such that: The gambling game is weakly acyclic if both gambling houses are weakly acyclic.
Observe that any weakly acyclic gambling game is necessarily leavable. Weak acyclicity is, to our knowledge, a new condition in the gambling house literature. If the house Γ is weakly acyclic, the "potential" ϕ decreases in expectation along non stationary trajectories, hence the irreversibility of the process in the space of probabilities over X.
Example 2.12. When w ≤ 1/2, a red-and-black casino is weakly acyclic, and similarly for any casino which is not superfair (take ϕ to be strictly increasing and strictly concave). A splitting game is weakly acyclic (take ϕ to be any strictly concave function).
We will now define strong acyclicity, our central condition. For this, we need to consider transitions for several stages. We first extend linearly the transitions to ∆(X) and ∆(Y) by defining Γ̃ : ∆(X) ⇒ ∆(X) and Λ̃ : ∆(Y) ⇒ ∆(Y). More precisely, the graph of Γ̃ is defined as the closure of the convex hull of the graph of Γ (viewed as the subset {(δ_x, p), x ∈ X, p ∈ Γ(x)} of ∆(X) × ∆(X)), and similarly the graph of Λ̃ is defined as the closed convex hull of the graph of Λ. Because Dirac measures are extreme points of ∆(X) and ∆(Y), we have Γ̃(δ_x) = Γ(x) and Λ̃(δ_y) = Λ(y) for each x in X and y in Y. Be careful that in general, for p in ∆(X) and q in ∆(Y):
We now define inductively a sequence of transitions (Γ̃^n)_n from ∆(X) to ∆(X), by Γ̃^0(p) = {p} for every state p in ∆(X), and for each n ≥ 0, Γ̃^{n+1} = Γ̃^n ∘ Γ̃. Γ̃^n(δ_x) represents the set of probabilities over states that Player 1 can reach in n stages from the initial state x in X. Similarly we define Λ̃^n for each n.
1) The reachable set of Player 1 from state x in X is the closure of ∪_{n≥0} Γ̃^n(δ_x) in ∆(X), and is denoted Γ^∞(x). Similarly, the reachable set of Player 2 from state y in Y is the subset Λ^∞(y) of ∆(Y) defined as the closure of ∪_{n≥0} Λ̃^n(δ_y).
2) The gambling house Γ of player 1 is strongly acyclic (or simply, acyclic) if there exists ϕ in B(X) lower semi-continuous such that: Similarly, the gambling house Λ of player 2 is strongly acyclic (or simply, acyclic) if there exists ψ in B(Y) upper semi-continuous such that: The gambling game is strongly acyclic (or simply, acyclic) if both gambling houses are strongly acyclic.
Thus, Γ is strongly acyclic if and only if Γ ∞ is weakly acyclic. Also, strong acyclicity of Γ implies weak acyclicity of Γ because Γ(x) ⊂ Γ ∞ (x) for every x ∈ X. The difference between weak and strong acyclicity is sharp as the following lemma shows.
Lemma 2.14. If Γ is standard and weakly acyclic, then Γ^n is also standard and weakly acyclic for every 2 ≤ n < ∞, where Γ^n(x) = ∪_{k=1}^{n} Γ̃^k(δ_x).
Note that in splitting games, Γ 2 = Γ and so Γ n = Γ for every n and thus Γ ∞ = Γ and so weak and strong acyclicity coincide. For other examples of strongly acyclic houses, see appendix (section 8.5).
Proof: For a standard gambling house Γ on X, a measurable function f ∈ B(X), and a state x ∈ X, define the so-called one-day operator as: G_Γ(f)(x) = sup_{p∈Γ(x)} ∫ f dp. From [Laraki and Sudderth, 2004], since Γ is standard, Γ^n is standard for every n < ∞ and one can compute G_{Γ^n} recursively by backward induction: G_{Γ^n}(f) = G_Γ(G_{Γ^{n−1}}(f)). Hence, f is Γ-excessive if and only if G_Γ(f) = f, and if so, by induction, we also have G_{Γ^n}(f) = f for every n.
Call ϕ strictly excessive if Γ is weakly acyclic with ϕ as potential, i.e. if for every We want to show that whenever ϕ is Γ-strictly excessive, it is Γ n -strictly excessive for every n.
Recall that G Γ n (f ) = G Γ (G Γ n−1 (f )). Let σ n ∈ Argmax τ n ∈Γ n ϕ(y)dτ n (y). Then, σ n 1 (the strategy at the first stage) must be optimal in By induction we deduce that σ n (x) = δ x for every n and every x.
Hence, a gambling house Γ is weakly acyclic, if and only if the gambling houses Γ n are weakly acyclic for every n < ∞ and is strongly acyclic if the limit of this increasing sequence is weakly acyclic. As we will see in section 4.2, there are non-expansive and standard gambling houses that are weakly but not strongly acyclic.

Main Results
Our main result is the following.
Theorem 3.1. Consider a standard gambling game. 1. If at least one of the players has a strongly acyclic gambling house, (v λ ) uniformly converges to the unique function v in C(X × Y ) satisfying: Moreover: v is the largest excessive-depressive continuous function satisfying P 1, and is the smallest excessive-depressive continuous function satisfying P 2.
2. Even if both gambling houses are weakly acyclic, convergence of (v λ ) may fail.
The conditions of the positive result 1) may be interpreted as follows: • a) and b) : It is always safe not to move. For each player, not moving ensures not to degrade the limit value.
• c) and d) : Each player can reach, if his opponent does not move, the zone where the current payoff is at least as good as the limit value, without degrading the limit value.
These interpretations will lead later to the construction of simple uniformly optimal strategies under some additional assumptions, see section 8.
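In formulas, the verbal description of a) to d) can be written as follows. This is a reconstruction from the interpretation above and from the way P1 and P2 are used elsewhere in the text (P1 is an upper bound by the payoff, P2 a lower bound); the precise form of the original display is an assumption.

```latex
\begin{aligned}
&\text{a) } v \text{ is excessive:}  && v(x,y) = \max_{p\in\Gamma(x)} \tilde v(p,y) \quad \forall (x,y),\\
&\text{b) } v \text{ is depressive:} && v(x,y) = \min_{q\in\Lambda(y)} \tilde v(x,q) \quad \forall (x,y),\\
&\text{c) } P1:                      && \forall (x,y)\ \ \exists\, p\in\Gamma^{\infty}(x):\quad v(x,y)=\tilde v(p,y)\le \tilde u(p,y),\\
&\text{d) } P2:                      && \forall (x,y)\ \ \exists\, q\in\Lambda^{\infty}(y):\quad v(x,y)=\tilde v(x,q)\ge \tilde u(x,q).
\end{aligned}
```

In c), the equality expresses that Player 1 does not degrade the limit value while moving to p, and the inequality that the current payoff at p is at least as good for him as that value; d) is the mirror statement for Player 2.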
The positive result of theorem 3.1 relies on the following three propositions (proved in the appendix). Recall that, thanks to proposition 2.4, to get convergence of the values it is enough to show uniqueness of a limit point of (v λ ) λ .
Proposition 3.2. Assume one of the players has a weakly acyclic gambling house. If v in C(X × Y ) is balanced, then v is excessive and depressive.
Without weak acyclicity on one side, a balanced function may not be excessive and depressive, as example 2.9 shows.
Proposition 3.3. Let v be a limit point of (v λ ) for the uniform convergence. Then v is balanced, and satisfies P 1 and P 2.
This proposition provides some properties that all limit points of (v λ ) should satisfy. The next proposition shows that strong acyclicity on one side implies that at most one function will satisfy those properties.
Proposition 3.4. Assume one of the players has a strongly acyclic gambling house. Then, any balanced continuous function satisfying P 1 is smaller than any balanced continuous function satisfying P 2. Consequently, there is at most one balanced continuous function satisfying P 1 and P 2.
On the other hand, if neither player has a strongly acyclic gambling house, one can prove that there may be infinitely many balanced continuous functions satisfying P 1 and P 2. This will be the case in our counter-examples of section 4.4, where both gambling houses are weakly acyclic. Moreover, one of the counter-examples shows that the family of discounted values may not converge as λ goes to zero. This shows that our results are tight and that the strong acyclicity condition (on one side) has very strong consequences both on the convergence of the discounted values and on the characterization of the limit.

A strongly acyclic gambling house
Let us first illustrate our characterization on a simple example. Consider the following Markov decision process with 3 states, X = {a, b, c}, from [Sorin, 2002]. States b and c are absorbing with respective payoffs 1 and 0. Start at a, choose α ∈ I = [0, 1/2], and move to b with probability α and to c with probability α².
Here formally Y is a singleton (there is only one player, so we can omit the variable y), Γ has compact convex values, the transitions are 1-Lipschitz, and the game is leavable. The gambling game is strongly acyclic: just consider ϕ such that ϕ(a) = 1, and Player 1 can go from state a to state b in infinitely many stages with arbitrarily high probability, by repeating a choice of α > 0 small (so that α² is much smaller than α), and the limit value v clearly satisfies: v(a) = v(b) = 1 and v(c) = 0. This is the unique function w : X → IR satisfying the conditions a), b), c), d) of Theorem 3.1: P 1 and P 2 imply u ≤ w ≤ 1, and because b and c are absorbing states, w(b) = 1 and w(c) = 0. Finally, w excessive gives w(a) = 1. Notice that δ_b ∈ Γ^∞(a) but for each n, δ_b ∉ Γ̃^n(a).
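The claim that b can be reached from a with arbitrarily high probability, but never with probability one in finitely many stages, can be made quantitative. The sketch below assumes the same α > 0 is played at every stage while in a (a stationary strategy chosen for illustration; the function name is hypothetical).

```python
def prob_reach_b(alpha, stages=None):
    """Probability of absorption in b when the same alpha > 0 is played at a
    at every stage: eventually (stages=None) or within `stages` stages."""
    leave = alpha + alpha ** 2      # probability of leaving a at a given stage
    share_b = alpha / leave         # conditional on leaving, share going to b: 1 / (1 + alpha)
    if stages is None:
        return share_b
    return share_b * (1 - (1 - leave) ** stages)

# Eventual absorption probabilities 1 / (1 + alpha) for decreasing alpha.
eventual = [prob_reach_b(a) for a in (0.5, 0.1, 0.01)]

# With alpha = 0.01, about 2000 stages are needed to get close to 1 / 1.01.
within_n = prob_reach_b(0.01, stages=2000)
```

The eventual probability 1/(1 + α) tends to 1 as α tends to 0 while staying strictly below 1, which matches δ_b being in the closure defining the reachable set without being reachable in any fixed number of stages.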

A weakly acyclic gambling house
Let us modify the gambling house of the previous section 4.1. We still have a unique player and a state space X = {a, b, c}. The only difference is that state b is no longer absorbing: in state b the player also has to choose some α ∈ I = [0, 1/2], and then moves to a with probability α, to c with probability α² and remains in b with probability 1 − α − α². States a and b are now symmetric. This gambling house is weakly acyclic, with ϕ(a) = ϕ(b) = 1, ϕ(c) = 0, but it is not strongly acyclic since a ∈ Γ^∞(b) and b ∈ Γ^∞(a). We will later use this gambling house to construct our counter-example of theorem 3.1, 2).
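A short exact computation illustrates why the potential ϕ given above works for one-stage transitions. The sketch reads weak acyclicity as "the expected potential strictly decreases under any move other than staying put", in line with the earlier remark on the potential; this reading, like the helper names, is an assumption.

```python
from fractions import Fraction

PHI = {"a": 1, "b": 1, "c": 0}   # potential given in the text

def transition(state, alpha):
    """Law of the next state in the modified house when alpha is chosen."""
    if state == "c":                       # c is absorbing
        return {"c": Fraction(1)}
    other = "b" if state == "a" else "a"
    return {other: alpha, "c": alpha ** 2, state: 1 - alpha - alpha ** 2}

def expected_potential(state, alpha):
    return sum(mass * PHI[s] for s, mass in transition(state, alpha).items())

alphas = [Fraction(k, 20) for k in range(1, 11)]   # 1/20, 2/20, ..., 1/2

# Drop of the expected potential for every non-stationary move from a or b.
drops = {(s, al): PHI[s] - expected_potential(s, al) for s in "ab" for al in alphas}

# Probability of the two-stage round trip a -> b -> a with alpha = 1/2 twice.
round_trip = transition("a", Fraction(1, 2))["b"] * transition("b", Fraction(1, 2))["a"]
```

Every drop equals α² > 0, so ϕ strictly decreases in expectation at each stage, even though a and b remain mutually reachable (the round trip has probability 1/4 here), which is the feature that rules out strong acyclicity.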

An example with countable state spaces
We present here a (strongly) acyclic gambling game with countable state spaces, and illustrate the proof of proposition 3.2, that under weak acyclicity any continuous balanced function is also excessive and depressive. Consider the state space X = {x_n : n ∈ {1, 2, ...} ∪ {∞}}, where x_n = 1 − 1/n if n is finite, and x_∞ = 1. We use d(x, x') = |x − x'|, so that X is countable and compact. The transition is given by: Γ(x_n) = {t δ_{x_n} + (1 − t) δ_{x_{n+1}} : t ∈ [0, 1]} for n finite, and Γ(x_∞) = {δ_{x_∞}}. The intuition is clear: Player 1 can stay at his location, or move 1 to the right. The gambling house (Y, Λ) of Player 2 is a copy of the gambling house of Player 1. Transitions are non expansive (since |1/(n+1) − 1/(n'+1)| ≤ |1/n − 1/n'|), and the game is strongly acyclic. The payoff u is any continuous function X × Y → IR, so that theorem 3.1 applies.
Consider v : X × Y → IR, and for simplicity we use w(n, m) = v(x_n, x_m). Here v excessive means that w(n, m) is weakly decreasing in n, and v depressive means that w(n, m) is weakly increasing in m. The meaning of v balanced is the following: for each n and m, w(n, m) is the value of the matrix game ("local game" at (n, m)) in which Player 1 chooses a row (stay or move) and Player 2 a column (stay or move):
    w(n, m)      w(n, m+1)
    w(n+1, m)    w(n+1, m+1)
Clearly, if v is excessive and depressive it is balanced, but proposition 3.2 tells us that if v is continuous, then the converse also holds: balancedness implies excessiveness and depressiveness. The idea of the proof of proposition 3.2 can be seen here as follows.
To conclude with this example, consider the simple case where the running payoff is given by u(x, y) = |x − y|. Player 1 wants to be far from Player 2, and Player 2 wants to be close to Player 1. If initially n < m, it is optimal for each player not to move, so w(n, m) = |x_n − x_m|. Suppose on the contrary that initially n ≥ m, so that Player 1 is more to the right than Player 2. Then Player 2 has a simple optimal strategy, which is to move to the right if the current positions satisfy x > y, and to stay at y if x = y. No matter how large the initial difference n − m is, Player 2 will succeed in being close to Player 1, so that w(n, m) = 0 if n ≥ m.
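The function w obtained here can be checked against the notions of the previous paragraphs with exact arithmetic. The sketch below assumes the local game at (n, m) is the 2×2 matrix game over the moves "stay" and "move right" of each player, and verifies on a finite grid that (stay, stay) is a saddle point, which amounts to w being weakly decreasing in n and weakly increasing in m. The grid size and the function names are illustrative.

```python
from fractions import Fraction

def x(n):
    """Position x_n = 1 - 1/n for finite n >= 1."""
    return 1 - Fraction(1, n)

def w(n, m):
    """Candidate limit value for u(x, y) = |x - y|, as derived in the text."""
    return x(m) - x(n) if n < m else Fraction(0)

def stay_stay_is_saddle(n, m):
    """(stay, stay) is a saddle point of the local game at (n, m)."""
    no_gain_for_p1 = w(n, m) >= w(n + 1, m)   # moving right does not help Player 1
    no_gain_for_p2 = w(n, m) <= w(n, m + 1)   # moving right does not help Player 2
    return no_gain_for_p1 and no_gain_for_p2

GRID = range(1, 40)
balanced_on_grid = all(stay_stay_is_saddle(n, m) for n in GRID for m in GRID)
matches_payoff_when_left = all(
    w(n, m) == abs(x(n) - x(m)) for n in GRID for m in GRID if n < m
)
```

Both checks succeed: w is excessive and depressive (hence balanced) on the grid, and it coincides with the running payoff exactly on the region n < m where neither player needs to move.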

A weakly acyclic game without limit value
Here we prove the second part of theorem 3.1 by providing a counterexample to the convergence of (v λ ) in a weakly acyclic non expansive gambling game.
The states and transitions for Player 1 are as in example 4.2: the set of states of Player 1 is X = {a, b, c}. The difference with example 4.2 is that the set of possible choices for α_a and α_b may be smaller than [0, 1/2]. Here α_a and α_b belong to some fixed compact set I ⊂ [0, 1/2] such that 0 is in the closure of I\{0}. Then 0 ∈ I, and the transitions are leavable and non expansive. States a and b are symmetric; this gambling house is weakly acyclic, with ϕ(a) = ϕ(b) = 1, ϕ(c) = 0, but not strongly acyclic since a ∈ Γ^∞(b) and b ∈ Γ^∞(a). The gambling house of Player 2 is a copy of the gambling house of Player 1, with state space Y = {a', b', c'} and a compact set of choices J ⊂ [0, 1/2] such that 0 is in the closure of J\{0}. The unique difference between the gambling houses of the players is that I and J may be different. Payoffs are simple. The function u can be written as follows (rows: state of Player 1, columns: state of Player 2):
         a'   b'   c'
    a    0    1    1
    b    1    0    1
    c    1    1    0
with a clear interpretation: Player 1 and Player 2 both move on a space with 3 points, Player 2 wants to be at the same location as Player 1, and Player 1 wants the opposite.
Here the gambling game is weakly acyclic but not strongly acyclic, and the following lemma shows that the uniqueness property of proposition 3.4 fails.
Consider now v(c, a'). v being depressive, for any fixed β* > 0 in J we have: It only remains to prove that v(a, a') = v(a, b'). v being depressive, for any β > 0 in J, v(a, a') ≤ β v(a, b') + β² v(a, c') + (1 − β − β²) v(a, a'). Dividing by β and letting β go to 0 in J, which is possible by assumption on J, we get v(a, a') ≤ v(a, b'). By symmetry of the transitions between b and b', v(a, b') = v(a, a'), and v satisfies B).
One can easily check that B) implies A), and the proof of lemma 4.1 is complete.
The second part of theorem 3.1 is a direct consequence of point 3 of the following theorem.

Gambling houses (or Markov Decision Processes)
We assume here that there is a unique player, i.e. that Y is a singleton. Then non expansiveness is enough to guarantee the uniform convergence of (v λ ) λ (as well as the uniform value, see [Renault, 2011]) and the limit v can be characterized as follows [Renault and Venel, 2016] where R = {p ∈ ∆(X), (p, p) ∈ GraphΓ} is interpreted as the set of invariant measures for the gambling house (which is not necessarily leavable here). If we moreover assume that the gambling house is leavable, then R = ∆(X) and we recover the fundamental theorem of gambling ([Dubins and Savage, 1965], [Maitra and Sudderth, 1996]), namely, (v λ ) uniformly converges to: v = min{w ∈ C(X), w excessive , w ≥ u} = min{w ∈ B(X), w excessive , w ≥ u}.
It is also easy to see that v(x) = sup p∈Γ ∞ (x) u(p) for each x.
Our approach will lead to other characterizations. We don't assume any acyclicity condition in the following theorem.
Proof: From proposition 3.3 any accumulation point of (v λ ) is excessive and satisfies P 1 and P 2. Thus, we just need to show uniqueness, which is a direct consequence of the following lemma.
Lemma 5.2. If v 1 ∈ B(X) satisfies P 1 and v 2 ∈ C(X) is excessive and satisfies P 2, Because v 2 is excessive and continuous, by lemma 7.
Using the gambling fundamental theorem, we obtain new viewpoints on the characterization of the limit value in leavable gambling houses.
Corollary 5.3. Consider a one player standard gambling house. Then the asymptotic value exists and is: (1) the smallest excessive function v in B(X) satisfying P 2; (2) the largest excessive function v in B(X) satisfying P 1; (3) the unique excessive function v in B(X) satisfying P 1 and P 2. Moreover, v is continuous.

Other characterizations and link with the Mertens Zamir system
Definition 5.4. Given g in B(X × Y ), Exc Γ (g) is the smallest excessive (w.r.t. X) function not lower than g, and Dep Λ (g) is the largest depressive (w.r.t. Y ) function not greater than g.
Exc Γ is usually called the réduite operator, and Dep Λ (g) = −Exc Λ (−g). In splitting games, Exc Γ = Cav X is the concavification operator on X and Dep Λ = Vex Y is the convexification operator on Y. We introduce the following definition by analogy with the Mertens-Zamir characterization.
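In the splitting-game case just mentioned, the réduite operator reduces to concavification. The following minimal Python sketch computes the upper concave envelope, at the grid points, of a function sampled on a strictly increasing grid of a one-dimensional state space. The one-dimensional setting and the name `concavify` are assumptions made for illustration only.

```python
def _cross(o, a, b):
    """Cross product of vectors OA and OB (positive for a counter-clockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def concavify(xs, fs):
    """Values at xs of the smallest concave function above the points (xs[i], fs[i]).

    xs must be strictly increasing.
    """
    hull = []
    for p in zip(xs, fs):
        # Keep only clockwise turns: this builds the upper convex hull.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    if len(hull) == 1:
        return [hull[0][1]]
    out, k = [], 0
    for x in xs:
        # Find the hull segment containing x, then interpolate linearly on it.
        while hull[k + 1][0] < x:
            k += 1
        (x0, y0), (x1, y1) = hull[k], hull[k + 1]
        out.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return out
```

The convexification can be obtained in the same way through the identity Vex(g) = −Cav(−g).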
We now introduce other properties, by analogy with the ones established for splitting games and repeated games with incomplete information, see for instance [Laraki, 2001b], [Laraki, 2001a], and [Rosenberg and Sorin, 2001].

2) v satisfies the E-characterization if:
E1: for all (x, y) ∈ X × Y, if x is extreme for v(·, y) then v(x, y) ≤ u(x, y), and E2: for all (x, y) ∈ X × Y, if y is extreme for v(x, ·) then v(x, y) ≥ u(x, y).
Proposition 5.7. Consider a standard gambling game and let v in C(X × Y) be excessive and depressive. Then: v satisfies MZ1 =⇒ v satisfies P 1 =⇒ v satisfies E1, and v satisfies MZ2 =⇒ v satisfies P 2 =⇒ v satisfies E2.
Proof. Let v be a continuous excessive function that satisfies MZ1. Fix y and define for each x, f(x) = min(v(x, y), u(x, y)). Then, for every x, v(x, y) = Exc Γ (f)(x). We consider the gambling house for Player 1 where the state of Player 2 is fixed to y and the payoff is given by f. From corollary 5.3, there is p ∈ Γ ∞ (x) such that v(x, y) = v(p, y) ≤ f(p). Since f(p) ≤ u(p, y), v satisfies P 1. Now, let v be an excessive continuous function that satisfies P 1. Take any x and y and suppose that x is extreme for v(·, y). By P 1, there is p * ∈ Γ ∞ (x) such that v(x, y) = v(p * , y) ≤ u(p * , y). Because v is excessive and continuous, by lemma 7.3 we have p * ∈ arg max p∈Γ ∞ (x) v(p, y). Because x is extreme for v(·, y), p * = δ x and so, v(x, y) ≤ u(x, y). Consequently, E1 is satisfied.
Remark 5.8. It is easy to find examples where E1 is satisfied but MZ1 is not. For instance, assume that Y is a singleton, and that Γ(x) = ∆(X) for each x in X. Consider the constant, hence excessive, functions u = 0 and v = 1. v has no extreme points hence trivially satisfies E1, but Exc Γ min(u, v) = u and so, v does not satisfy MZ1.
Proposition 5.9. Consider a standard gambling game and let v be an excessive-depressive function in C(X × Y). Then: (Γ strongly acyclic) and (v satisfies E1) =⇒ (v satisfies MZ1), and (Λ strongly acyclic) and (v satisfies E2) =⇒ (v satisfies MZ2). Consequently, if the gambling game is strongly acyclic, characterizations MZ, P and E are equivalent.
Proof. Let v be excessive-depressive that satisfies E1. Fix y ∈ Y. We want to show that v(x, y) = g(x, y) where g = Exc Γ (f) and f = min(u, v). g is continuous by corollary 5.3. Since v is excessive and v ≥ f, we have v ≥ g. Let Z = arg max x∈X v(x, y) − g(x, y) and let x 0 ∈ arg min x∈Z ϕ(x), where ϕ comes from the definition of acyclicity. It is enough to prove that v(x 0 , y) ≤ g(x 0 , y).

Open problems and future directions
We introduce the class of gambling games. It is a sub-class of stochastic games which includes MDP problems, splitting games and product stochastic games. We define a strong notion of acyclicity under which we prove existence of the asymptotic value v and we establish several characterizations of v which are linked to the Mertens-Zamir system of functional equations (re-formulated in our more general set-up). We also show that our condition is tight: under a slightly weaker notion of acyclicity, the asymptotic value may fail to exist. Our example is the first in the class of product stochastic games and is probably the simplest known counterexample to convergence for finite state spaces and compact action sets (the first counterexample in this class was established in [Vigeral, 2013]). Many questions merit investigation in future research:
• In standard gambling games, is it possible to characterize the asymptotic value in models where we know it exists (for example when X and Y are finite, the transition function is polynomial and Γ and Λ are definable [Bolte et al., 2015] in an o-minimal structure)? We know that the asymptotic value is balanced and satisfies P1 and P2, but there may be infinitely many functions satisfying those properties.
• Is there an asymptotic value if one house is strongly acyclic and the other not necessarily leavable? As seen, even when both houses are weakly acyclic, we may have divergence: strong acyclicity of one of the two houses is necessary.
• It would be interesting to study the non-zero-sum analogue of this model. Actually, a static version of non-zero-sum splitting games, with a discontinuous payoff function, has been recently explored by [Koessler et al., 2018] and one natural extension is the dynamic model. Observe that for each discount factor λ, under some regularity assumptions, one can prove existence of subgame perfect equilibrium payoffs E λ , and establish a standard recursive structure. The interesting question is: does E λ converge as λ goes to zero (if all players have a strongly acyclic gambling house) and if so, how can this limit be characterized?

ω is non decreasing, concave and lim 0 ω = 0. Denote by C the set of functions v in We start with a lemma.
Proof of lemma 7.1: By the Kantorovich duality theorem, there exists µ in ∆(X × X) with first marginal p and second marginal p′ satisfying: d KR (p, p′) = ∫ X×X d(x, x′) dµ(x, x′). Similarly there exists ν in ∆(Y × Y) with first marginal q and second marginal q′ satisfying: d KR (q, q′) = ∫ Y×Y d(y, y′) dν(y, y′). We have for all x, x′, y, y′: v(x, y) ≥ v(x′, y′) − ω(d(x, x′) + d(y, y′)).
We integrate the above inequality with respect to the probability µ ⊗ ν, and obtain using the concavity of ω: We now return to the proof of proposition 2.4. Fix λ in (0, 1]. Given Consider the zero-sum game with strategy spaces Γ(x) and Λ(y) and payoff function (p, q) → λ u(x, y) + (1 − λ)ṽ(p, q). The strategy spaces are convex compact and the payoff function is continuous and affine in each variable, hence by Sion's theorem we have: Consider (x, y) and (x′, y′) in X × Y, and let p in Γ(x) be an optimal strategy of Player 1 in the zero-sum game corresponding to (x, y). The gambling game has non expansive transitions, so there exists p′ ∈ Γ(x′) such that d KR (p, p′) ≤ d(x, x′). Consider any q′ in Λ(y′), there exists q in Λ(y) with d KR (q, q′) ≤ d(y, y′). Now, using lemma 7.1 we write: λu(x′, y′) y′)), and Φ(v) belongs to C.
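The integration step in the proof of lemma 7.1 can be written out as follows. This is a sketch under the assumption that ṽ denotes the bilinear extension of v to ∆(X) × ∆(Y), so that integrating v against µ ⊗ ν only involves the marginals; the concavity of ω enters through Jensen's inequality.

```latex
\begin{aligned}
\tilde v(p,q) &= \int v(x,y)\, d(\mu\otimes\nu) \\
 &\ge \int v(x',y')\, d(\mu\otimes\nu) - \int \omega\big(d(x,x')+d(y,y')\big)\, d(\mu\otimes\nu) \\
 &\ge \tilde v(p',q') - \omega\Big(\int_{X\times X} d(x,x')\, d\mu(x,x') + \int_{Y\times Y} d(y,y')\, d\nu(y,y')\Big) \\
 &= \tilde v(p',q') - \omega\big(d_{KR}(p,p') + d_{KR}(q,q')\big).
\end{aligned}
```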
The rest of the proof is very standard. C is a complete metric space for Hence Φ has a unique fixed point which is v λ . Each v λ is in C, and we obtain that the family (v λ ) λ∈(0,1] is equicontinuous, ending the proof of proposition 2.4.

Proof of proposition 3.2
By symmetry, suppose that Γ is weakly acyclic and Λ is leavable. Let us prove that any balanced continuous function v is excessive-depressive. Let us first prove that v is excessive.
Fix any (x 0 , y 0 ) in X × Y, and p 1 ∈ Γ(x 0 ). A direct consequence of balancedness is the existence of q 1 in Λ(y 0 ) such that v(p 1 , q 1 ) ≤ v(x 0 , y 0 ). Now, p 1 is in ∆(X) and q 1 is in ∆(Y). One has to be careful that there may not exist p 2 ∈ Γ̃(p 1 ) such that for all q 2 ∈ Λ̃(q 1 ), v(p 2 , q 2 ) ≥ v(p 1 , q 1 ). This is because, ṽ being affine in each variable, v(p 1 , q 1 ) can be interpreted as the value of the auxiliary game where first x and y are chosen according to p 1 ⊗ q 1 and observed by the players, then players respectively choose p ∈ Γ(x) and q ∈ Λ(y) and finally Player 1's payoff is v(p, q). To play well in this game, Player 1 has to know the realization of q 1 before choosing p. However, since y 0 is a Dirac measure, balancedness implies that there exists p 2 ∈ Γ̃(p 1 ) such that v(p 2 , q 1 ) ≥ v(p 1 , y 0 ). We have obtained the following lemma: Lemma 7.2. Given (x 0 , y 0 ) in X × Y, and p 1 ∈ Γ(x 0 ), there exists q 1 in Λ(y 0 ) and p 2 ∈ Γ̃(p 1 ) such that: v(p 1 , q 1 ) ≤ v(x 0 , y 0 ) and v(p 2 , q 1 ) ≥ v(p 1 , y 0 ).
We now prove the proposition. Define, for x in X, h is continuous. We put Z = Argmax x∈X h(x), and consider x 0 ∈ Argmin x∈Z ϕ(x), where ϕ comes from the definition of Γ weakly acyclic.
Let us now prove that when v is balanced and excessive then it is depressive. For every (x, y), and every p ∈ Γ(x), we have v(x, y) ≥ v(p, y). Thus, for every (x, y), p ∈ Γ(x) and every q ∈ Λ(y), v(x, q) ≥ v(p, q) and consequently, min q∈Λ(y) v(x, q) ≥ min q∈Λ(y) v(p, q). Taking the maximum in p ∈ Γ(x) and using that v is balanced implies that min q∈Λ(y) v(x, q) ≥ v(x, y). Since Λ is leavable, we have equality and so v is depressive with respect to Y .

Proof of proposition 3.3
Let (λ n ) n be a vanishing sequence of discount factors such that ‖v λn − v‖ → 0 as n → ∞. Fix (x, y) in X × Y, by symmetry it is enough to show that there exists p ∈ Γ ∞ (x) such that v(x, y) ≤ v(p, y) ≤ u(p, y). If v(x, y) ≤ u(x, y), it is enough to consider p = δ x , so we assume v(x, y) > u(x, y). For n large enough, v λn (x, y) > u(x, y) + λ n .
We have max p∈Γ(p n

Proof of proposition 3.4
We start with a lemma.
Proof: p = lim n p n , with p n ∈ Γ̃ n (x 0 ) for each n. It is enough to prove that v(p n , y 0 ) ≤ v(x 0 , y 0 ) for each n, and we do the proof by induction on n. The case n = 1 is clear by definition of v excessive. Since p n+1 ∈ Γ̃(p n ), it is enough to prove that for p in ∆(X) and p′ ∈ Γ̃(p), we have v(p′, y 0 ) ≤ v(p, y 0 ). By definition of Γ̃, (p, p′) is in the closure of conv(Graph Γ). y 0 is fixed, and the function h : p −→ v(p, y 0 ) is affine and continuous on ∆(X). The set D = {(p, p′) ∈ ∆(X) × ∆(X), h(p′) ≤ h(p)} is convex and compact, and we want to show that Graph(Γ̃), which is the closure of conv(Graph Γ), is included in D.
It is enough to prove that Graph(Γ) ⊂ D, and this is implied by the fact that v is excessive. This concludes the proof of lemma 7.3.
We now prove the proposition. Assume one of the gambling houses is strongly acyclic, and let v 1 and v 2 satisfy the conditions of proposition 3.4 (they are continuous and balanced, v 1 satisfies P 1 and v 2 satisfies P 2). We will show that v 1 ≤ v 2 .
By symmetry, suppose that Γ is strongly acyclic. From Proposition 3.2, v 1 and v 2 are excessive (in X) and depressive (in Y ). v 1 − v 2 being continuous on X × Y , define the compact set: Consider now ϕ u.s.c. given by the strong acyclicity condition of Γ. The set Z being compact, there exists (x 0 , y 0 ) minimizing ϕ(x) for (x, y) in Z.

Proof of theorem 4.2
We start with considerations valid for the 3 cases of the theorem. We fix J = [0, 1/4] throughout the proof, and only assume for the moment that I is a compact subset of [0, 1/4] containing 0 and 1/4. Consider λ ∈ (0, 1). It is clear that v λ (c, c′) = 0, and v λ (a, c′) = v λ (b, c′) = 1. By symmetry of the payoffs and transitions, we have x λ := v λ (a, a′) = v λ (b, b′), y λ := v λ (a, b′) = v λ (b, a′) and z λ := v λ (c, a′) = v λ (c, b′). z λ is indeed easy to compute. If the game is at (c, a′), Player 1 cannot move, and Player 2 wants to reach c′ as fast as possible, so he will choose β = 1/4 and we have (see definition 2.11): z λ = λ · 1 + (1 − λ)((1/16) · 0 + (15/16) z λ ), so that: z λ = 16λ/(1 + 15λ).
Proposition 7.4. Assume J = [0, 1/4], min I = 0 and max I = 1/4. Then for λ small enough,
(6) expresses the fact that at (a, a′) or (b, b′), it is optimal for Player 2 to play the pure strategy β = 0 (stay at the same location and wait until Player 1 has moved), and Player 1 can play a pure strategy α there. Similarly, (7) expresses the fact that at (a, b′) or (b, a′), it is optimal for Player 1 not to move. In spite of these simple intuitions, the proof of the proposition is rather technical, and is given separately below.
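The computation of z λ can be spelled out step by step. The display below is a sketch that only uses the recursion stated above (payoff 1 at (c, a′), Player 2 playing β = 1/4, hence reaching c′ with probability 1/16); it also makes explicit the bound z λ ≤ 16λ that is recalled in the proof of proposition 7.4.

```latex
\begin{aligned}
z_\lambda &= \lambda + \tfrac{15}{16}(1-\lambda)\, z_\lambda
\;\Longrightarrow\;
z_\lambda \Big(1 - \tfrac{15}{16}(1-\lambda)\Big) = \lambda
\;\Longrightarrow\;
z_\lambda\, \frac{1 + 15\lambda}{16} = \lambda, \\
z_\lambda &= \frac{16\lambda}{1 + 15\lambda} \;\le\; 16\lambda .
\end{aligned}
```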
Taking proposition 7.4 for granted, we now proceed to the proof of theorem 4.2. The optimization problem of Player 2 given by (7) is easy to study: it amounts to minimizing a convex quadratic polynomial on the interval J = [0, 1/4].
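A sketch of this minimization, under an assumption: equation (7) is not reproduced in this section, so its form is taken here to be y λ = λ + (1 − λ)(y λ + min β∈J g(β)), with g(β) = β(x λ − y λ ) + β²(1 − y λ ) as in the proof of proposition 7.4. With x λ ≤ y λ < 1, the unconstrained minimizer and the resulting relation read:

```latex
\begin{aligned}
\hat\beta &= \frac{y_\lambda - x_\lambda}{2(1 - y_\lambda)}, \qquad
g(\hat\beta) = -\frac{(y_\lambda - x_\lambda)^2}{4(1 - y_\lambda)}, \\
\lambda (1 - y_\lambda) &= (1-\lambda)\, \frac{(y_\lambda - x_\lambda)^2}{4(1 - y_\lambda)}
\;\Longrightarrow\;
y_\lambda - x_\lambda = 2(1 - y_\lambda)\sqrt{\frac{\lambda}{1-\lambda}} .
\end{aligned}
```

Under this assumed form, the minimizer equals √(λ/(1 − λ)), which belongs to J = [0, 1/4] as soon as λ ≤ 1/17.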
The following lemma implies part 2) of theorem 4.2.
Lemma 7.5. If 0 is an isolated point in I, then y λ and x λ converge to 0.
Proof: In this case there exists α * > 0 such that α λ ≥ α * for all λ. Passing to the limit in (11) gives the result.
We will now prove parts 1) and 3) of the theorem. The fact that I ⊂ J gives an advantage to Player 2, which can be quantified as follows.
Consider again the concave optimization problem of Player 1 given by equation (6), and denote by α * (λ) = (y λ − x λ )/(2(x λ − z λ )) > 0 the argmax of the unconstrained problem if Player 1 could choose any α ≥ 0. If y λ and x λ converge to v > 0, then α * (λ) ∼ √λ (1 − v)/v, and Player 1 would like to play in the λ-discounted game at (a, a′) some α close to α * (λ).
Lemma 7.7. Let λ n be a vanishing sequence of discount factors such that √λ n ∈ I for each n. Then y λn and x λn converge to 1/2.
Proof: By considering a converging subsequence we can assume that y λn and x λn converge to some v in [0, 1]. By the previous lemma, v ≤ 1/2, and we have to show that v ≥ 1/2. We have for each λ in the subsequence, since Player 1 can choose to play α = √ λ: By passing to the limit, we get 2v ≥ 2(1 − v), and v ≥ 1/2.
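The limit computation in this proof can be sketched as follows. It uses equation (6) as stated in the proof of proposition 7.4, together with the equivalent y λ − x λ ∼ 2√λ (1 − v), which is what the relation α * (λ) ∼ √λ (1 − v)/v amounts to since z λ → 0; reading the omitted display in this way is an assumption.

```latex
\begin{aligned}
x_\lambda &\ge (1-\lambda)\Big(x_\lambda + \sqrt{\lambda}\,(y_\lambda - x_\lambda) - \lambda\,(x_\lambda - z_\lambda)\Big)
&&\text{(play } \alpha = \sqrt{\lambda} \text{ in (6))} \\
x_\lambda &\ge (1-\lambda)\Big(\frac{y_\lambda - x_\lambda}{\sqrt{\lambda}} - (x_\lambda - z_\lambda)\Big)
&&\text{(rearrange and divide by } \lambda) \\
v &\ge 2(1 - v) - v
&&\text{(let } \lambda \to 0),
\end{aligned}
```

that is, 2v ≥ 2(1 − v).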
Lemma 7.8. Let λ n be a vanishing sequence of discount factors such that for each n, the open interval ((1/2)√λ n , 2√λ n ) does not intersect I. Then lim sup n y λn ≤ 4/9.
Proof: Suppose that (up to a subsequence) x λn and y λn converge to some v ≥ 4/9. It is enough to show that v = 4/9. We know that v ≤ 1/2 by lemma 7.6, and since λ for λ small in the sequence. By assumption ((1/2)√λ, 2√λ) contains no point in I and the objective function of Player 1 is increasing from 0 to α * (λ) and decreasing after α * (λ). There are 2 possible cases: If α λ ≤ (1/2)√λ we have: Dividing by λ and passing to the limit gives: v ≤ 1 − v − (1/4)v, i.e. v ≤ 4/9. Otherwise, α λ > 2√λ and we have: Again, dividing by λ and passing to the limit gives:
Finally, lemma 7.7 proves part 1) of theorem 4.2, whereas lemmas 7.7 and 7.8 together imply part 3), concluding the proof of theorem 4.2.
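The two cases of the proof of lemma 7.8 can be sketched as follows, writing φ λ (α) = α(y λ − x λ ) − α²(x λ − z λ ) so that equation (6) reads λ x λ = (1 − λ) φ λ (α λ ). The displays are a reconstruction under the assumption y λ − x λ ∼ 2√λ (1 − v), consistent with the equivalent of α * (λ) given before lemma 7.7; they are not the omitted original formulas.

```latex
\begin{aligned}
\alpha_\lambda \le \tfrac12\sqrt{\lambda}:&\quad
\lambda x_\lambda \le (1-\lambda)\Big(\tfrac12\sqrt{\lambda}\,(y_\lambda - x_\lambda) - \tfrac14\lambda\,(x_\lambda - z_\lambda)\Big)
\;\Longrightarrow\; v \le (1 - v) - \tfrac14 v, \\
\alpha_\lambda \ge 2\sqrt{\lambda}:&\quad
\lambda x_\lambda \le (1-\lambda)\Big(2\sqrt{\lambda}\,(y_\lambda - x_\lambda) - 4\lambda\,(x_\lambda - z_\lambda)\Big)
\;\Longrightarrow\; v \le 4(1 - v) - 4v .
\end{aligned}
```

Both limit inequalities reduce to 9v ≤ 4, i.e. v ≤ 4/9.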
• It is not difficult to adapt lemma 7.8 to show the divergence of (v λ ) as soon as J = [0, 1/4] and I satisfies: a) there exists a sequence (λ n ) converging to 0 such that √λ n ∈ I for each n, and b) there exist η > 0 and a sequence (λ n ) converging to 0 such that for each n, I does not intersect the interval [√λ n (1 − η), √λ n (1 + η)].
• It is important for the counterexample that I = {1/2^(2n), n ∈ ℕ*} ∪ {0} is not semi-algebraic. Indeed, it has been shown that if we assume X and Y finite, and the transitions Γ, Λ and the payoff u to be definable in some o-minimal structure, then (v λ ) λ converges [Bolte et al., 2015].
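For readers who want to experiment numerically with the quantities x λ , y λ and z λ of this proof, here is a minimal Python sketch. It iterates equation (6) as stated in the proof of proposition 7.4, together with an assumed form of equation (7), namely y λ = λ + (1 − λ) min β∈J (βx λ + β² + (1 − β − β²)y λ ), which is not reproduced in this section. The function name `solve_xy` and the use of a finite list in place of the compact set I are also assumptions. The iteration is a (1 − λ)-contraction, so it is slow for very small λ.

```python
import math

def solve_xy(lam, I, tol=1e-12, max_iter=1_000_000):
    """Iterate the two fixed-point equations for (x, y) at discount factor lam.

    I is a finite list of admissible alphas for Player 1 (a stand-in for the
    compact set I); Player 2 minimises over the whole interval J = [0, 1/4].
    """
    z = 16.0 * lam / (1.0 + 15.0 * lam)  # value at (c, a'), closed form
    x, y = 0.0, 0.0
    for _ in range(max_iter):
        # Equation (6): Player 1 maximises over alpha while Player 2 stays put.
        new_x = (1.0 - lam) * max(a * y + (1.0 - a - a * a) * x + a * a * z for a in I)
        # Assumed form of (7): Player 2 minimises a quadratic in beta on [0, 1/4].
        betas = [0.0, 0.25]
        if y < 1.0:
            betas.append(min(0.25, max(0.0, (y - x) / (2.0 * (1.0 - y)))))
        new_y = lam + (1.0 - lam) * min(b * x + b * b + (1.0 - b - b * b) * y for b in betas)
        if max(abs(new_x - x), abs(new_y - y)) < tol:
            return new_x, new_y, z
        x, y = new_x, new_y
    return x, y, z
```

For instance, `solve_xy(0.01, [0.0, math.sqrt(0.01), 0.25])` can be compared with `solve_xy(0.01, [0.0, 0.25])` to see how removing the action √λ from I lowers the value obtained by Player 1.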

Proof of proposition 7.4
We proceed in 4 steps.
1. The game at (b, a′): It is intuitively clear that y λ ≥ x λ since Player 1 is better off when the players have different locations. We now formalize this idea. Consider the game at (b, a′). The current payoff is 1, and Player 1 has the option not to move, so we obtain by definition 2.11: and since y λ ≤ 1, we obtain: x λ ≤ y λ . Now, min β∈J (β(x λ − y λ ) + β²(1 − y λ )) ≥ min β∈J β(x λ − y λ ) + min β∈J β²(1 − y λ ) = (1/4)(x λ − y λ ), hence: In the same spirit, in the game at (a, a′), Player 2 has the option not to move, so we have:
2. The game at (a, a′): Consider now the game at (a, a′). By definition 2.11, x λ is the value of the game (possibly played with mixed strategies), where Player 1 chooses α in I, Player 2 chooses β in J and the payoff to Player 1 is: We want to prove that in this game, it is a dominant strategy for Player 2 not to move, that is to choose β = 0. We need to show that for all α and β, g λ (α, 0) ≤ g λ (α, β). As a function of β, g λ (α, β) can be written as a constant plus: So we want to show that for all α in I, β in J: Since the expression is decreasing in α, it is enough to prove it with α = 1/4: This is true for β = 0, and will be true for all β in J if and only if it is true for β = 1/4, so we are left with proving: Consider λ ≤ 1/32, and recall that z λ ≤ 16λ. If x λ ≤ 1/2, then clearly (14) holds. Assume on the contrary that x λ ≥ 1/2, then z λ ≤ x λ , and (13) gives: (14) holds as well. We have shown that in the λ-discounted game at (a, a′) with λ ≤ 1/32, Player 2 has a pure dominant strategy which is β = 0. Considering a pure best reply of Player 1 against this strategy implies that the game at (a, a′) has a value in pure strategies satisfying x λ = (1 − λ) max α∈I (αy λ + (1 − α − α²)x λ + α²z λ ), i.e. equation (6) is proved.
Since z λn converges to 0, so do x λn and y λn , and moreover (y λn − x λn )/λ n converges to 0. This is in contradiction with equation (12). We have shown (5).
4. The game at (b, a′) again: We proceed as for the game at (a, a′) and will show that in the game at (b, a′), it is a dominant strategy for Player 1 not to move. By definition, y λ is the value of the game where Player 1 chooses α in I, Player 2 chooses β in J and the payoff is λ We want to show that h λ (0, β) ≥ h λ (α, β) for all α and β. That is, for all α and β, For λ small enough, we have z λ ≤ x λ ≤ y λ , and the above property is satisfied. Hence in the game at (b, a′) it is dominant for Player 1 to choose α = 0. Consequently, Player 2 has a pure optimal strategy and we can write: proving equation (7). The proof of proposition 7.4 is complete.

Appendix B: uniform analysis
To study the uniform value, we restrict the analysis to idempotent games. The extension to the general case is open. In that case, clearly Γ = Γ ∞ and Λ = Λ ∞ . Any state that could be reached in several stages can be reached immediately in a single stage. This holds true for instance in splitting games. Notice also that for any Γ, the multifunction Γ ∞ is idempotent. If Γ • Γ = Γ, then Γ(x) = Γ̃ n (δ x ) = Γ ∞ (x) for all n and x, so if the gambling game is idempotent the notions of weak and strong acyclicity coincide.

Definition of idempotent gambling games
An immediate corollary of theorem 3.1 is the following.
Corollary 8.2. Consider a standard idempotent gambling game where a player has an acyclic gambling house. Then {v λ } converges uniformly to the unique function v in C(X × Y ) which is excessive, depressive and satisfies:

Definition of uniform value and optimal strategies
In repeated and stochastic games, a stronger notion of limit value is given by the uniform value. As usual, a strategy of Player 1, resp. Player 2, is a measurable rule giving at every stage t, as a function of past and current states, an element in Γ(x t ), resp. of Λ(y t ), where x t and y t are the states of stage t. A pair (x 1 , y 1 ) of initial states and a pair of strategies (σ, τ ) naturally define a probability on the set of plays (X × Y ) ∞ (with the product σ-algebra, X and Y being endowed with their Borel σ-algebra), whose expectation is written IE (x 1 ,y 1 ),σ,τ .
Definition 8.3. w ∈ B(X × Y ) is the uniform value of the gambling game and both players have optimal uniform strategies if: There exists a strategy σ of Player 1 that uniformly guarantees w: for any ε > 0, there is N such that for any n ≥ N and initial states (x 1 , y 1 ), for any strategy τ of Player 2, IE (x 1 ,y 1 ),σ,τ [(1/n) Σ t=1..n u(x t , y t )] ≥ w(x 1 , y 1 ) − ε. And similarly, there exists a strategy τ of Player 2 that uniformly guarantees w: for any ε > 0, there is N such that for any n ≥ N and initial states (x 1 , y 1 ), for any strategy σ of Player 1, IE (x 1 ,y 1 ),σ,τ [(1/n) Σ t=1..n u(x t , y t )] ≤ w(x 1 , y 1 ) + ε.

Definition of adapted strategies
Our main theorem 3.1 suggests particularly interesting strategies for the players. Consider again conditions P 1 and P 2, and fix a pair of states (x, y). If u(x, y) ≥ v(x, y), the running payoff of Player 1 is at least as good as the payoff he should expect in the long run, so we may consider that Player 1 is "quite happy" with the current situation and in order to satisfy P 1 it is enough for him not to move, i.e. to choose p = δ x . If on the contrary u(x, y) < v(x, y), Player 2 is happy with the current situation and can choose q = δ y to satisfy P 2, whereas Player 1 should do something, and a possibility is to move towards a p satisfying P 1. This looks interesting for Player 1 because if Player 2 does not react, eventually the distribution on states will approach (p, y) and (in expectation) Player 1 will be happy again with the current situation since u(p, y) ≥ v(p, y).
Definition 8.4. Let w in B(X × Y ).
A strategy of Player 1 is adapted to w if whenever the current state is (x, y), it plays p ∈ Γ(x) such that w(x, y) ≤ w(p, y) ≤ u(p, y).
A strategy of Player 2 is adapted to w if whenever the current state is (x, y), it plays q ∈ Λ(y) such that w(x, y) ≥ w(x, q) ≥ u(x, q).
If w satisfies Q1, resp. Q2, Player 1, resp. Player 2, has a strategy adapted to w (using a measurable selection theorem [Aliprantis and Border, 2006]). If moreover w is excessive, we have w(x, y) = w(p, y) ≤ u(p, y). Mertens and Zamir [Mertens and Zamir, 1971] in repeated games with incomplete information and Oliu-Barton [Oliu-Barton, 2017] in splitting games used similar strategies derived from the MZ-characterization instead of the Q-characterization.
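To make the notion of adapted strategy concrete, here is a minimal Python sketch for finite state spaces. It assumes that each Γ(x) is represented by a finite list of candidate distributions (in the paper Γ(x) is a general set of distributions and a measurable selection is used), and that w and u are extended to distributions by taking expectations; the names `bilinear` and `adapted_choice` are hypothetical.

```python
def bilinear(f, p, y):
    """Expectation of f(., y) under a distribution p given as {state: probability}."""
    return sum(prob * f[x][y] for x, prob in p.items())

def adapted_choice(w, u, gamma, x, y, tol=1e-12):
    """Return some p in gamma[x] with w(x, y) <= w(p, y) <= u(p, y), or None if there is none."""
    for p in gamma[x]:
        wp, up = bilinear(w, p, y), bilinear(u, p, y)
        if w[x][y] <= wp + tol and wp <= up + tol:
            return p
    return None
```

A strategy of Player 2 adapted to w can be sketched symmetrically by exchanging the roles of the two state variables and reversing the inequalities.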

Bounded variation and existence of uniform adapted strategy
In repeated games with incomplete information or in splitting games, an important property is that any martingale on a simplex has bounded variations. This suggests the following.
Definition 8.5. A gambling house Γ has vanishing L 1 -variation if for every ε > 0, there is N such that for all n ≥ N and any sequence (p t ) s.t. p t+1 ∈ Γ(p t ), one has (1/n) Σ t=1..n d KR (p t+1 , p t ) ≤ ε.
The proof of the next result is inspired by Oliu-Barton [Oliu-Barton, 2017] in the framework of splitting games. He shows that Player 1 (resp. Player 2) can uniformly guarantee any excessive-depressive function satisfying MZ1 (resp. MZ2). Our proof is much shorter because it uses the new Q-characterization.
Proposition 8.6. In a standard gambling game where Γ has vanishing L 1 -variation, if w in B(X × Y ) is excessive-depressive and satisfies Q1, then a strategy of Player 1 adapted to w uniformly guarantees w.
Proof: Fix ε > 0. Because Lipschitz functions are dense in the set of continuous functions, there exist K > 0 and a K-Lipschitz function u ε that is uniformly ε-close to u. Let w in B(X × Y ) be an excessive-depressive function satisfying Q1, let σ be a strategy of Player 1 adapted to w and let τ be any strategy of Player 2. We fix the initial states and write IE = IE (x 1 ,y 1 ),σ,τ .
Then, the average payoff of the n-stage game is: Because the gambling game is of vanishing variation, there is N such that when n ≥ N one has (K/n) Σ t=1..n d KR (p t+1 , p t ) ≤ ε. Because of Q1, IE(u(x t+1 , y t )) ≥ IE(w(x t+1 , y t )) = IE(w(x t , y t )). Since w is depressive, IE(w(x t , y t )) ≥ IE(w(x t , y t−1 )), so that IE(w(x t+1 , y t )) ≥ IE(w(x t , y t−1 )), and this property holds for every t. Consequently, IE(u(x t+1 , y t )) ≥ IE(w(x 2 , y 1 )) = w(x 1 , y 1 ). We obtain finally (1/n) IE(Σ t=1..n u(x t , y t )) ≥ w(x 1 , y 1 ) − 3ε, ending the proof.
Corollary 8.7. If Γ and Λ have vanishing L 1 -variations then there is at most one excessive-depressive function in B(X × Y ) satisfying Q1 and Q2.
Proof: If such a function exists, it can be guaranteed by both players. So it must be the uniform value of the game, which is unique whenever it exists.
Combining proposition 3.3, proposition 8.6 and corollary 8.7, we obtain the existence of the uniform value in a class of gambling games: Theorem 8.8. In a standard and idempotent gambling game where Γ and Λ have vanishing L 1 -variation, the uniform value v exists, and strategies adapted to v are uniformly optimal. Moreover, v is the unique excessive-depressive function in B(X ×Y ) satisfying Q1 and Q2.
Proof: From proposition 3.3, any accumulation point of v λ satisfies P 1 and P 2, i.e. Q1 and Q2, so is unique and is the uniform value, as shown above.
Observe that acyclicity is not assumed in theorem 8.8. But vanishing L 1 -variation is a form of acyclicity (it rules out for example non-constant periodic orbits). A more formal link between acyclicity and vanishing variation is given in section 8.5 of the Appendix.

8.5 Bounded L 2 -variation and acyclicity
Definition 8.9. A gambling house Γ is of bounded L 2 -variation if there is C > 0 such that for every sequence {p t } satisfying p t+1 ∈ Γ(p t ), one has Σ t=1..∞ d KR (p t+1 , p t )² ≤ C < +∞.
For example, splitting games have bounded L 2 -variation.
Proposition 8.10. If Γ is idempotent, non-expansive, leavable and of bounded L 2 -variation, then it is strongly acyclic and has vanishing L 1 -variation.
Proof: Bounded L 2 -variation =⇒ weak acyclicity because the real valued function: is strictly decreasing along non-constant orbits of Γ (i.e. arg max p∈Γ(x) ϕ(p) = δ x ). But for idempotent Γ, weak acyclicity and strong acyclicity coincide because Γ ∞ = Γ. Continuity of ϕ is a consequence of the non-expansivity of Γ and the fact that the bound C is uniform over the sequences {p t } satisfying p t+1 ∈ Γ(p t ). Finally, that bounded L 2 -variation implies vanishing L 1 -variation is a consequence of the Cauchy-Schwarz inequality: (1/n) Σ t=1..n d KR (p t+1 , p t ) ≤ (1/√n) (Σ t=1..n d KR (p t+1 , p t )²)^(1/2).
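The last implication can be made quantitative. Writing d t for the distance between p t+1 and p t , and C for the bound of definition 8.9, the Cauchy-Schwarz inequality gives the following estimate (a short worked version of the argument, with the explicit threshold N added for illustration):

```latex
\frac{1}{n}\sum_{t=1}^{n} d_t
\;\le\; \frac{1}{n}\,\sqrt{n}\,\Big(\sum_{t=1}^{n} d_t^{2}\Big)^{1/2}
\;=\; \frac{1}{\sqrt{n}}\Big(\sum_{t=1}^{n} d_t^{2}\Big)^{1/2}
\;\le\; \sqrt{\frac{C}{n}} ,
```

so the average variation is at most ε as soon as n ≥ N = ⌈C/ε²⌉, uniformly over all admissible sequences.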