Live fast, die young

Irrational agents are driven out of the market. This should favor learning: Irrational agents observing that rational agents are being more successful should adopt rational beliefs. We show that the threat of elimination is not sufficient to push agents toward rationality: A shorter “life” might be more rewarding than a longer one. Even if they are eliminated in the long run, irrational agents might rationally stay irrational in the sense that their ex-ante and ex-post welfare levels over their whole life are higher than (1) the welfare level that they would reach if they adopted rational expectations, (2) the welfare level reached by the otherwise identical (same initial wealth and same risk aversion) rational agents, (3) the welfare level that they would have if they were given the optimal allocation of the rational agent. Threat of elimination is not sufficient to push irrational agents toward rationality, and rational and surviving agents’ performances are not sufficiently high to generate learning through an adaptive process based on imitation of successful behaviors. A numerical illustration is provided.

"Passionate lived, reasonable lasted", Chamfort In a complete market framework, 1 it has been shown that agents who make inaccurate predictions are driven out of the market by those who are more accurate (see e.g., Sandroni 2000). Intuitively, this should provide two kinds of explanations for the rational expectations hypothesis. First, an asymptotic argument. Since irrationals are eliminated, prices are determined by the rational agents at least in the long run. This "natural selection" argument has a long tradition in economic analysis (see e.g., Alchian 1950;Friedman 1953;Cootner 1964 andFama 1965). Yan (2008) provided a first limit to the first argument: even though irrational agents are eliminated, it might take hundreds of years to eliminate them. Furthermore irrational agents might survive and even dominate in the long run. In such a situation, there is obviously no reason for them to adopt a rational behavior.
Second, a pragmatic argument: Irrational agents should see that rational agents are being more successful and they should adopt the same beliefs (or the same heuristics in order to construct their beliefs) as the most successful ones. This argument is similar to the pragmatic beliefs concept of Hvide (2002) that refers to the philosophical school known as Pragmatism. Russell (1945) interprets one of its main ideas as follows: Agents should (or do) hold beliefs that have good consequences.
In this note, we show that even when the threat of elimination is effective, it is not sufficient to push agents toward (informational) rationality: A shorter "life" might be more rewarding than a longer one. Therefore, pragmatism does not necessarily lead to rationality. More precisely, we show that there are situations where irrational agents might rationally stay irrational in the sense that their ex-ante and ex-post welfare levels over their whole life are higher than
• the welfare level that they would reach if they adopted rational expectations,
• the welfare level reached by the otherwise identical (same initial endowment and same risk aversion) rational agents,
• the welfare level that they would have if they were given the optimal allocation of the rational agent.
Our results shed some light on a debate initiated by Grossman and Stiglitz (1980) around the following simple question: Is there an economic rationale for learning? They show that when acquiring information is costly (in terms of money, but it could also be in terms of effort), markets cannot be informationally efficient. Kyle (1989) solves the paradox by introducing imperfect competition. In both models, agents learn from price observation through a Bayesian learning process. However, both models take learning as an intrinsic element of agents' behavior. Therefore, the question remains open: Is it rational for agents to spend effort on learning? From the economic rationality point of view, a learning process is sustainable only if it leads to a situation where the learners are better off than the non-learners. An irrational agent will engage in a learning process only if he observes that learners or informed/rational agents are being more successful (from their subjective point of view and/or from his own point of view) or if he evaluates that adopting rational beliefs would lead him to be more successful than he currently is. This is the basis of adaptive or evolutionary learning (as opposed to Bayesian learning, which assumes that learning and the search for truth have an intrinsic economic value).
As done by Routledge (1999) with the Grossman and Stiglitz model, our model can be embedded in a repeated setting with non-overlapping generations where each generation learns from the previous ones. [3] No wealth is transferred between generations. In such a setting, a given agent might have the choice between different heuristics in order to construct his beliefs and will choose whichever one happened to be working well in the past. Let us assume that one of these heuristics leads to rational beliefs and is adopted by a given proportion of the agents. Another heuristic leads to overly optimistic [4] (pessimistic) beliefs. The agents who adopt this heuristic may know that they are overoptimistic (overpessimistic), and they may even be told (by the rational agents and with rational arguments) that they are overoptimistic [5] (overpessimistic). However, if they compare the average (ex-post or ex-ante) well-being of the optimistic agents over the past generations with that of the rational agents, they will find no reason to adopt the rational heuristic.
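The imitation rule described above (each generation picks whichever heuristic "happened to be working well in the past") can be sketched as follows. This is only an illustrative sketch: the function name `choose_heuristic`, the dictionary layout and the welfare figures are hypothetical and are not taken from the paper.

```python
from statistics import mean


def choose_heuristic(past_welfare):
    """Return the heuristic whose adopters had the highest average welfare.

    past_welfare maps a heuristic name to the list of average welfare
    levels observed for its adopters in each past generation.
    """
    return max(past_welfare, key=lambda name: mean(past_welfare[name]))


# Hypothetical welfare records (CARA welfare levels are negative numbers).
# These values are made up for illustration only.
past_welfare = {
    "rational": [-1.00, -1.10, -0.95],
    "optimistic": [-0.90, -0.85, -0.92],
}

chosen = choose_heuristic(past_welfare)
print(chosen)
```

With records of this kind, an agent who imitates the most successful behavior keeps the optimistic heuristic even though he may know that it is biased.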
We emphasize that our aim is neither to explain how beliefs are constructed nor to explain how rationality and irrationality emerge. Our aim is just to show that pragmatism and economic rationality do not necessarily eliminate informational irrationality and that belief heterogeneity is sustainable even in a dynamic setting.

The model
The filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ describes uncertainty and $W$ is a standard unidimensional Brownian motion. We consider a continuous-time pure exchange Arrow-Debreu economy with a single consumption good. The aggregate endowment process $e^*$ satisfies the stochastic differential equation $de^*_t = \mu\,dt + \sigma\,dW_t$, where $\mu$ and $\sigma$ are given constants. There are two groups of agents. The first (resp. second) group consists of rational (resp. irrational) agents whose common belief is given by $P$ (resp. $Q$). From the point of view of the rational (resp. irrational) agents, the total endowment is a Brownian motion with drift parameter (mean per unit of time) $\mu_1 \equiv \mu$ (resp. $\mu_2 \equiv \mu + \delta_2 \sigma$) and scale parameter $\sigma$ (which corresponds to a variance per unit of time given by $\sigma^2$), where $\delta_2$ is a given constant. The parameter $\delta_2$ measures the irrational agents' deviation in their perception of economic growth (normalized by the level of risk). For symmetry of notation, we introduce $\delta_1 = \frac{\mu_1 - \mu}{\sigma}$. Irrational agents are optimistic if $\delta_2 > 0$ and pessimistic if $\delta_2 < 0$, while $\delta_1 = 0$.
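As a minimal sketch of the setup above, the following code computes the drift perceived by a biased agent and simulates one path of the aggregate endowment under the objective drift. The function names, the starting value `e0`, the horizon and the step count are illustrative assumptions; the values of μ, σ and δ_2 are those of the paper's numerical example.

```python
import math
import random


def perceived_drift(mu, sigma, delta):
    """Drift of the aggregate endowment as perceived with belief bias delta."""
    return mu + delta * sigma


def belief_bias(mu_i, mu, sigma):
    """Normalized deviation delta_i = (mu_i - mu) / sigma."""
    return (mu_i - mu) / sigma


def simulate_endowment(e0, mu, sigma, horizon, steps, seed=0):
    """One path of de = mu dt + sigma dW, sampled on a regular time grid."""
    rng = random.Random(seed)
    dt = horizon / steps
    path = [e0]
    for _ in range(steps):
        shock = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # Brownian increment
        path.append(path[-1] + mu * dt + sigma * shock)
    return path


mu, sigma, delta_2 = 4200.0, 3000.0, 0.05  # values of the numerical example
mu_2 = perceived_drift(mu, sigma, delta_2)  # drift believed by Agent 2

# e0, horizon and steps are arbitrary choices made for this illustration.
path = simulate_endowment(e0=100_000.0, mu=mu, sigma=sigma, horizon=10.0, steps=120)
print(mu_2, belief_bias(mu_2, mu, sigma), path[-1])
```

Since the endowment is an arithmetic Brownian motion, the discrete increments used here are exact in distribution rather than an approximation.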
We assume that all agents have CARA utility functions. A classical aggregation approach makes it possible to aggregate all the agents in the first (resp. second) group into a unique agent denoted by Agent 1 (resp. Agent 2) whose belief is given by $P$ (resp. $Q$), whose initial endowment is given by the total endowment within the group and whose risk tolerance level is the sum of the risk tolerance levels across the group and is denoted by $\theta_1$ (resp. $\theta_2$). Therefore, Agent i maximizes […] In the rational setting ( […] Therefore, irrationality acts asymptotically against the irrational agent in terms of consumption shares. In particular, when $\frac{1}{2}\theta_1 \delta_2^2$ approaches $\mu$, the irrational agent is asymptotically eliminated by the rational one in the sense that the ratio of expected consumptions converges to 0. The second comparison makes sense if we consider our model as a log-linearization of a model with power utility functions and where the total endowment follows a diffusion process with drift $\mu$ and volatility $\sigma$. When the irrational agent is risk averse enough (with respect to the rational one), he is asymptotically eliminated in terms of expected exponential consumption.
[3] It might seem puzzling to have a repeated model where each generation lives infinitely. The infinite horizon has been adopted only for the sake of simplicity in the computations. All results remain valid with a finite horizon $T$ for each generation (see the "Appendix").
[4] For instance, a procedure that overweighs good news with respect to bad news.
[5] As chartists are regularly told by fundamentalists that markets are efficient and that there is no rationality behind their heuristics. Nevertheless, we still have a non-negligible proportion of chartists in the economy.
Finally, the irrational agent is also asymptotically eliminated by the rational one in terms of date-$t$ expected utility as long as $\delta_2 \frac{\sigma + \theta_1 \delta_2}{\theta_1 + \theta_2} > 0$ and, in particular, when $\delta_2 > 0$.
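The sign of this condition can be worked out explicitly. The step below reads the condition as $\delta_2$ multiplied by the fraction $(\sigma + \theta_1\delta_2)/(\theta_1 + \theta_2)$, which is an assumption about how the expression groups, and takes $\sigma > 0$ and $\theta_1, \theta_2 > 0$.

```latex
\[
\delta_2\,\frac{\sigma+\theta_1\delta_2}{\theta_1+\theta_2} > 0
\iff \delta_2\,(\sigma+\theta_1\delta_2) > 0
\iff \delta_2 > 0 \ \text{ or } \ \delta_2 < -\frac{\sigma}{\theta_1}.
\]
```

Under this reading, every optimistic agent satisfies the condition, while a pessimistic agent satisfies it only if his pessimism is strong enough.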

Welfare comparisons
Our aim now is to compare the welfare levels of our two agents. The welfare level of Agent i is given by […] In order to fully specify $S_1$ and $S_2$, we have to specify the probabilities under which each expectation is taken.
• For an ex-ante comparison, it is natural to take the subjective probabilities with $S_1 =$ […]
• For an ex-post comparison, since the states of the world the agents face during their life are governed by the objective probability, we take […]
• It is quite natural for Agent 2 to compare his own consumption plan with the consumption plan of Agent 1 and to dream about how happy he would be with this alternative consumption plan, even though it is not clear how (with which beliefs) he could obtain that consumption plan at the equilibrium. For an ex-ante comparison, […] $(c_2^*)$.
• Let $c_2^*$ be the optimal allocation of Agent 2 when being rational. It is not observable and can be computed by Agent 2 only if he knows all the characteristics of the economy. Under this assumption, he might compare his welfare with the welfare when being rational. For an ex-ante comparison, we take […] $(c_2^*)$.
We then have six possible comparisons corresponding to the six possible pairs of functions $S_1$ and $S_2$.
Note that for the last four pairs, only $u_2$ is at stake, while the first two pairs involve comparisons between $u_1$ and $u_2$. It is typically taken as a matter of faith in economics that interpersonal comparisons of utility have no meaning. On the other hand, it seems natural to consider dynamics in which a given agent takes into account the degree of happiness of the other agent in order to evaluate his own success in relative terms. This might make sense if he considers that the other agent has comparable happiness criteria. In order to make the welfare functions comparable, we may normalize them by replacing $u_1$ and $u_2$ respectively by $\eta_1 u_1$ and $\eta_2 u_2$, where $\eta_1$ and $\eta_2$ are
• either such that the welfare functions are equal along the 0-path. This is the case for $\eta_1 = \eta_2 = 1$,
• or such that the welfare functions are equal along the initial allocation, i.e., […],
• or such that the ex-post welfare functions are equal along the initial allocation. This only makes sense when comparing ex-post realized welfare levels and would lead to […].
We then have three additional alternative comparisons.

Welfare evaluations
In this section, we provide evaluations for all the functions introduced above.

Proposition 3 Agent i equilibrium welfare level is the risk-tolerance-weighted geometric average of the individual welfare levels within Group i.
Therefore, comparing $U_1(c_1^*)$ and $U_2(c_2^*)$ amounts to comparing the average utility levels, respectively, among the rational and the irrational groups of agents.
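To illustrate the kind of average that Proposition 3 refers to, here is a small sketch of a risk-tolerance-weighted geometric average. It assumes, as a labeled assumption, that welfare levels are negative (as with CARA utility) so that the geometric average is taken on their absolute values with weights $\theta_{i,\gamma}/\theta_i$; the function name and the numbers are hypothetical.

```python
import math


def group_welfare(individual_welfare, risk_tolerances):
    """Risk-tolerance-weighted geometric average of negative welfare levels.

    Assumption (not from the paper): each welfare level u is negative and
    the average is -prod(|u| ** w), with w the share of total risk tolerance.
    """
    total_tolerance = sum(risk_tolerances)
    log_magnitude = 0.0
    for u, theta in zip(individual_welfare, risk_tolerances):
        if u >= 0:
            raise ValueError("welfare levels are expected to be negative")
        log_magnitude += (theta / total_tolerance) * math.log(-u)
    return -math.exp(log_magnitude)


# Hypothetical group of two members with equal risk tolerance.
example = group_welfare([-2.0, -8.0], [1.0, 1.0])
print(example)
```

With equal weights, the result is minus the square root of the product of the two magnitudes.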

Claim 4 Under Condition (C), Agent i equilibrium welfare level $U_i(c_i^*)$ is given by […]
and $a_1 = a_2$. If Agent 2 is optimistic enough ($\delta_2 > 0$ and large enough) and more risk averse ($\theta_2 < \theta_1$), then $U_2(c_2^*) > U_1(c_1^*)$.
By construction, we have $U_1^{\text{ex-post}}(c_1^*) = U_1(c_1^*)$. In order to compute $U_2^{\text{ex-post}}(c_2^*)$, we need the following additional condition.

Claim 5 Under Conditions (C) and (C1), Agent i ex-post welfare level is given by […]
In order to determine $U_2(c_1^*)$ and $U_2^{\text{ex-post}}(c_1^*)$, let us introduce the following additional conditions.

Claim 7 Under Conditions (C4) and (C5), we have […]
We are now in a position to analyze specific examples.

A numerical example
The wealth per capita in the whole economy is $50,000, and we assume that there is no difference between the two groups in terms of size nor in terms of wealth distribution.
In the first group, the average level of relative risk aversion is equal to $\alpha_1 = 1.7$, while it is equal to $\alpha_2 = 3.6$ in the second group. The objective growth level in the economy is equal to $\mu$ = 4.2% with a volatility $\sigma$ = 3%. These objective parameters characterize the beliefs of the rational agents (Group 1), while the agents in Group 2 believe that the growth rate is equal to 4.35% (with a 3% volatility). The reduced form of this economy consists of a two-agent economy where both agents have a $50,000 initial endowment; Agent 1 has an absolute risk tolerance level given by $\theta_1 = 29{,}412$, while Agent 2 has an absolute risk tolerance level given by $\theta_2 = 13{,}889$. With the notations of our arithmetic growth model, we have $\mu = 4200$, $\sigma = 3000$ and $\delta_2 = 0.05$.
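The reduced-form parameters above can be reproduced with a few lines of arithmetic. The sketch below assumes that absolute risk tolerance equals wealth divided by relative risk aversion and that the arithmetic drift and volatility are the percentage figures applied to the combined endowment of the two representative agents; both readings are inferred from the reported numbers, and the function name is hypothetical.

```python
def reduced_form(wealth, rra_1, rra_2, growth, vol, perceived_growth):
    """Map the percentage parameters to the arithmetic two-agent model."""
    total_endowment = 2 * wealth              # two representative agents
    theta_1 = wealth / rra_1                  # absolute risk tolerance, Agent 1
    theta_2 = wealth / rra_2                  # absolute risk tolerance, Agent 2
    mu = growth * total_endowment             # arithmetic drift
    sigma = vol * total_endowment             # arithmetic volatility
    delta_2 = (perceived_growth - growth) / vol  # normalized belief bias
    return theta_1, theta_2, mu, sigma, delta_2


theta_1, theta_2, mu, sigma, delta_2 = reduced_form(
    wealth=50_000.0,
    rra_1=1.7,
    rra_2=3.6,
    growth=0.042,
    vol=0.03,
    perceived_growth=0.0435,
)
print(round(theta_1), round(theta_2), round(mu), round(sigma), round(delta_2, 4))
```

Rounded to the nearest unit, these values match the figures quoted in the text.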
We finally have the following inequalities […] and
• the ex-ante (subjective) welfare of Agent 2 is higher than the ex-ante (objective/subjective) welfare of Agent 1,
• the ex-post (objective) welfare of Agent 2 is higher than the ex-ante/ex-post welfare of Agent 1,
• the ex-ante welfare level of Agent 2 is higher than the welfare level he would reach in a model where all agents would be rational,
• both ex-ante and ex-post welfare levels of Agent 2 are higher than those that he would reach if endowed with the consumption plan of Agent 1.
As far as the U s or the U s are concerned, we have […]. Since $c_2^*$ is not observable, comparisons involving it are less natural from a learning point of view. However, from a theoretical point of view, it would be interesting to know whether there are situations where all the inequalities above are satisfied and where we further have $U_2^{\text{ex-post}}$ […]. With $e_0 = 0$, $\mu = 1$, $\sigma = 1$, $\theta_1 = 1$, $\theta_2 = 0.5$ and $\delta_2 = 0.05$, Conditions (C) to (C4) are all satisfied and Agent 2 is asymptotically eliminated, but his welfare level is not dominated under any of the dominance concepts or any of the normalizations we considered.
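As a quick check of the elimination statement for these parameter values, the sketch below evaluates the date-t expected utility criterion stated earlier in the model section, read as $\delta_2(\sigma + \theta_1\delta_2)/(\theta_1 + \theta_2) > 0$. The grouping of that expression is an assumption, and the function name is hypothetical.

```python
def utility_elimination_margin(delta_2, sigma, theta_1, theta_2):
    """Left-hand side of the expected utility elimination condition.

    Assumed reading: delta_2 * (sigma + theta_1 * delta_2) / (theta_1 + theta_2).
    A positive value means Agent 2 is asymptotically eliminated in terms
    of date-t expected utility.
    """
    return delta_2 * (sigma + theta_1 * delta_2) / (theta_1 + theta_2)


# Parameter values of the second example in the text.
margin = utility_elimination_margin(delta_2=0.05, sigma=1.0, theta_1=1.0, theta_2=0.5)
print(margin, margin > 0)
```

The margin is positive, which is consistent with Agent 2 being asymptotically eliminated under this criterion.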

Conclusion
In this note, we provided an example where the threat of elimination exists but
• is not sufficient to push irrational agents toward rationality, and
• rational and surviving agents' performances are not sufficiently high to generate learning through an adaptive process: imitating successful behavior leads neither to rationality nor to longevity.
This result is illuminated by the analysis conducted in Jouini et al. (2013). In their static setting where the agents choose their beliefs strategically and for a very risk-averse agent, the demand for the risky asset is negative, so that the expected utility is increasing in the price of the risky asset. An optimistic belief is associated with a higher demand, hence with a higher price. The "optimal" belief balances this benefit of optimism against the costs of worse decision-making, and optimism rather than rationality is the best response to rationality. This is exactly what happens in our dynamic example, where irrational agents are better off being irrationally optimistic instead of being rational. In fact, we obtain much more, since it appears that irrationality dominates rationality for many other possible criteria.

Appendix
[…] where $M_2 = \frac{dQ}{dP}$ and $M_1 = 1$ is introduced for symmetry in the formulas, and where $\lambda_1$ and $\lambda_2$ are Lagrange multipliers such that the budget constraint is saturated for each agent, i.e., $E$ […] with $A$ and $B$ as above. Condition (C) gives us $-A + \frac{1}{2}B^2 < 0$ and the integral above converges, which gives […] Equation (2) then gives […] with $F_1$ and $G_1$ as above. From the expressions of $E_1$ and $E_2$ we also have […] The second solution is in contradiction with Condition (C).
[…] where $C_i = -\frac{C_i}{\theta_i}$, $D_i = -\frac{D_i}{\theta_i}$ and $E_i = -\frac{E_i}{\theta_i}$ for $i = 1, 2$. We have $\left(C_2 + \frac{1}{2}D_2^2\right) - \left(C_1 + \frac{1}{2}D_1^2\right) = \delta_2 \frac{\sigma + \theta_1 \delta_2}{\theta_1 + \theta_2}$, which ends the proof.
Proof of Proposition 3 Let us index by $\gamma$ the individual members of Group i and let us denote respectively by $c^*_{i,\gamma}$, $\theta_{i,\gamma}$ and $w_{i,\gamma}$ their optimal consumption, risk tolerance level and endowment share within Group i. By construction, summing over the members $\gamma$ of Group i, we have $\sum_{\gamma} c^*_{i,\gamma} = c^*_i$, $\sum_{\gamma} \theta_{i,\gamma} = \theta_i$ and $\sum_{\gamma} w_{i,\gamma} = 1$. It is easy to check that, at the equilibrium, we have […]
Proof of Claim 4 From the first-order conditions, we have […] We know that $E_i = \frac{\theta_i}{\theta_1 + \theta_2} e^*_0 + \frac{\theta_1 \theta_2}{\theta_1 + \theta_2} \ln \frac{\lambda_j}{\lambda_i}$, which gives $U_2$ […]
[…] as long as $b_1$ and $b_2$ are positive. These conditions, respectively, correspond to Conditions (C) and (C1). Furthermore, $b_1 - b_2 = \delta_2 \frac{\sigma + \theta_1 \delta_2}{\theta_1 + \theta_2}$ and we then have $b_1 > b_2$ for $\delta_2 > 0$, which means that $U_2^{\text{ex-post}}(c_2^*) > U_1^{\text{ex-post}}(c_1^*)$.