Do People Always Pay Less than They Say? Testbed Laboratory Experiments with IV and HG Values

Hypothetical bias is a long-standing issue in stated preference and contingent valuation studies: people tend to overstate their preferences when they do not experience the real monetary consequences of their decision. This view, however, has been challenged by recent evidence based on the elicitation of induced values (IV) in the lab and homegrown (HG) demand functions from different countries. This paper uses an experimental design to assess the extent and relevance of hypothetical bias in demand elicitation exercises for both IV and HG values. For testbed purposes, we use a classic second-price auction to elicit preferences. Comparing the demand curves elicited in both settings, we find that hypothetical bias unambiguously (i) vanishes in an induced-value, private-good context, and (ii) persists in a homegrown-value elicitation context. This suggests that hypothetical bias is driven by "preference formation" rather than "preference elicitation". In addition, companion treatments highlight two sources of the discrepancy observed in the HG setting: the hypothetical context leads bidders to underestimate the constraints imposed by their budget limitations, whereas the real context creates pressure leading them to bid "zero" to opt out from the elicitation mechanism. As a result, there is a need for a demand elicitation procedure that helps subjects take the valuation exercise sincerely, but without putting extra pressure on them.


Introduction
Hypothetical Bias (HB) arises whenever elicited preferences differ depending on whether the elicitation method has real monetary consequences or not. A gap between stated intentions and real economic commitments undercuts the basic foundations of popular stated preference valuation methods used in cost-benefit analyses. The accumulated evidence, from the lab and the field, leads Harrison and Rutström (2008) to state: "the evidence strongly favours the conclusion that hypothetical bias exists"; a view confirmed by meta-analyses (List and Gallet, 2001; Murphy, Stevens, and Weatherhead, 2005). These studies conclude that the size (and possibly the relevance) of HB depends heavily on both the nature of the good and the elicitation mechanism.
Until recently, it was a challenge to relate hypothetical bias results to demand revelation because of the lack of control over the true underlying preferences. Researchers have since addressed this question in a series of studies using induced-value settings. While Cherry, Frykblom, Shogren, List, and Sullivan (2004) do find a strong discrepancy between real and hypothetical settings in a Vickrey auction, 1 most IV studies challenge the idea that HB arises in eliciting induced values, both in a wide range of mechanisms (referenda in Taylor, McKee, Laury, and Cummings (2001) and Burton, Carson, Chilton, and Hutchinson (2007); dichotomous choice and payment cards in Vossler and McKee (2006); BDM and referenda in Murphy, Stevens, and Yadav (2010); VCM with provision point in Mitani and Flores (2009a); discrete choice with different provision points in Collins and Vossler (2009)) and for various types of goods (private, public and/or publicly produced private goods in Murphy, Stevens, and Yadav, 2010). This evidence points to some good-specific process, leading one to conclude that hypothetical bias is more likely to be observed in homegrown settings. But it remains difficult to draw clear-cut conclusions because of the lack of a clean comparison between induced-value and homegrown-value elicitation.
To our knowledge, only one study contrasts the discrepancy observed according to whether preferences are induced (IV) or homegrown (HG), holding all else constant, including the elicitation mechanism and the experimental procedures. Whatever the nature of the good and the mechanism (private good valued through a Becker-DeGroot-Marschak (BDM) mechanism; public or privately produced private good both elicited through referenda), Murphy, Stevens, and Yadav (2010) find clear evidence in favor of an HB in (and only in) HG settings. This result may not be a universal phenomenon. Ehmke, Lusk, and List (2008) implemented the same referendum experiment, asking subjects in China, France, Niger and the United States to reveal their homegrown preferences for bottles of water. They find that US subjects (Indiana and Kansas) exhibit a significant hypothetical bias, that subjects in China and Niger are likely to exhibit a 'negative' bias, and that French subjects (from Grenoble) are the least prone to overstating bids. While they are easy to implement, the major drawback of dichotomous choice mechanisms (such as referendum or BDM) is that one cannot observe bidding behavior along the entire demand curve.
In this paper, we complement our previous evidence on eliciting preferences in Vickrey auctions (Jacquemet, Joule, Luchini, and Shogren, 2009a,b) with new experiments designed to better understand the discrepancy between values elicited with and without monetary incentives. First, we apply the second-price auction to private induced values in France. We find no evidence of hypothetical bias. Second, we assess the robustness of this result by moving from Paris to a second French city, Lyon, and confirm the result. Third, we apply the same mechanism to the elicitation of HG values for a desirable public good, protecting dolphins. We observe a substantial difference in revealed demand given the incentive context. Our companion treatment highlights two main reasons why this difference arises: (a) a hypothetical setting leads some bidders to disregard the upper bound imposed by their disposable income, i.e., they violate their budget constraint; and (b) the real environment can make them feel trapped within the valuation exercise, so that they choose a zero bid as a means to opt out from the mechanism, i.e., the mechanism violates a bidder's participation constraint. These results shed light on the ability to elicit true preferences in experimental or survey contexts. Revealing true preferences for a socially desirable good requires a mechanism that commits bidders to take their budget constraint seriously, but without putting undue pressure on them.

Experiments
We use two related experiments: induced value (IV) and homegrown value (HG) experiments. The IV experiment elicits preferences based on an induced demand function; the HG experiment elicits each bidder's own homegrown preferences for a real-world good. Each experiment is split into two treatments: the monetary incentives are either real or hypothetical. In both experiments, we induce people to reveal their preferences using a Vickrey auction. The focus on the Vickrey (1961) auction stems from its revelation property: without an outside option, a rational bidder's weakly dominant strategy is to bid his or her induced value. In addition, experimental evidence confirms that the second-price auction performs reasonably well in revealing preferences on average in both induced-value (Kagel, 1995) and non-induced-value (e.g., Rutström, 1998) auctions. The Vickrey mechanism is well suited for a testbed analysis such as ours, since it allows us to observe the whole demand curve instead of only the mass points revealed through dichotomous choice settings.
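The revelation property mentioned above can be checked numerically. The following is a minimal sketch, not part of the experimental software: it assumes integer bids on a 0-100 grid, treats ties as losses, and uses three illustrative values; the function names are hypothetical.

```python
def second_price_payoff(value, bid, rival_bids):
    """Payoff of one bidder in a sealed-bid second-price auction.

    The bidder wins only with a bid strictly above every rival bid
    (ties count as losses, an assumption of this sketch) and then
    pays the highest rival bid.
    """
    highest_rival = max(rival_bids)
    if bid > highest_rival:
        return value - highest_rival
    return 0


def truthful_is_weakly_dominant(value, bid_grid):
    """Check on a grid that bidding `value` is never worse than any other bid."""
    for highest_rival in bid_grid:
        truthful = second_price_payoff(value, value, [highest_rival])
        for other_bid in bid_grid:
            deviation = second_price_payoff(value, other_bid, [highest_rival])
            if deviation > truthful:
                return False
    return True


bid_grid = range(0, 101)            # illustrative grid of admissible bids
example_values = [24, 53, 84]       # illustrative private values
print(all(truthful_is_weakly_dominant(v, bid_grid) for v in example_values))
```

Overbidding can only add wins at a price above one's value, and underbidding can only forgo profitable wins, which is why no deviation ever beats the truthful bid in this check.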

Design of the IV experiment
The IV experiment assesses hypothetical bias in two locations in France: Paris and Lyon. Following standard procedures, an unspecified "good" is sold in a Vickrey second-price auction: the highest bidder wins and pays the second-highest bid. An auction has 9 bidders, each endowed with a unique induced value, i.e. the price at which the bidder can sell the good to the monitor after the auction (see, e.g., Kagel, 1995). All monetary values are expressed in ecu (Experimental Currency Unit). The induced demand curve is identical in all auctions and is defined by: {84; 76; 71; 68; 65; 63; 53; 38; 24}. The auction is repeated over 9 periods, permuting the induced values across bidders so that each participant experiences each private value exactly once and the whole demand curve is induced in every period. Although the number of periods is fixed in advance, we avoid end-game effects by telling subjects only that the auction is repeated, with no further information on that point. The bidders do not know the other bidders' induced values or the induced demand curve. A bidding period ends when every bidder has chosen a bid between 0 and 100. At the end of the period, subjects are privately informed about whether they won the auction (along with the price paid in this case), their gain for the period and, lastly, whether a new auction period is about to start.
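One way to satisfy the two properties of this design (each bidder holds each induced value exactly once, and the whole demand curve is induced in every period) is a cyclic rotation of the induced values. The sketch below is only an illustration under that assumption; the actual ordering used in the sessions is not stated in the text, and `rotation_schedule` is a hypothetical helper.

```python
INDUCED_VALUES = [84, 76, 71, 68, 65, 63, 53, 38, 24]  # ecu, from the design


def rotation_schedule(values):
    """Return schedule[period][bidder] as a cyclic shift of `values` per period."""
    n = len(values)
    return [[values[(bidder + period) % n] for bidder in range(n)]
            for period in range(n)]


schedule = rotation_schedule(INDUCED_VALUES)

# Every period induces the whole demand curve ...
assert all(sorted(row) == sorted(INDUCED_VALUES) for row in schedule)
# ... and every bidder holds each induced value exactly once over the 9 periods.
for bidder in range(len(INDUCED_VALUES)):
    assert sorted(row[bidder] for row in schedule) == sorted(INDUCED_VALUES)

print(sum(INDUCED_VALUES))  # induced aggregate demand per period, in ecu
```

Any Latin square over the nine values would do equally well; the cyclic shift is simply the shortest one to write down.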
Each subject receives a 10€ show-up/participation fee. 2 In the IV-Real treatment, the ecu accumulated across all auction periods are added to this fee; should total earnings be negative, they would reduce the show-up fee, down to a minimum of 5€. 3 In contrast, only the 10€ fee is paid under the IV-Hypothetical treatment, i.e. gains or losses are hypothetical. This is made common knowledge by stating explicitly in the written instructions that payments are either constant (hypothetical) or depend on decisions made in each period (real). Details about the nature of the monetary earnings are the only difference between the instructions used in the two conditions. 4

Design of the HG experiment
The HG experiment examines preference elicitation of homegrown values for a real-world non-market good: adopting a dolphin. Subjects' homegrown values are elicited using the same elicitation mechanism as before, a second-price auction. The price for improved parallelism with decisions in the real world is the lack of control over true preferences: subjects enter the lab with their own private value, unknown to the experimenter, for the good.
The good sold in the HG auction is provided by the World Wide Fund for Nature (hereafter WWF), a well-known non-governmental organization devoted to "protecting the future of nature". 5 Among a wide range of individual actions, the WWF offers the opportunity to "adopt" endangered animal species. This takes the form of an individual donation to a program aimed at fighting threats like habitat loss and poaching faced by endangered animals. Depending on the amount of the donation (among three pre-determined values), donors are sent gifts such as an adoption certificate, a photograph of the animal, a cuddly stuffed toy dolphin, a gift box, and so on. For the purpose of our experiment, this procedure has the attractive feature of ensuring the credibility of the donation, thanks both to the WWF label and to the documentation associated with the donation. We chose the entry-level offer, i.e., an adoption certificate and photograph are sent for each 25 USD (18.50 Euros when the experiments took place) donation to the WWF. Since the photograph and the adoption certificate are essentially symbolic in nature, this reduces the risk of valuations being influenced by "by-product" goods, such as a cuddly stuffed toy or a gift box. As in the IV experiment, individual valuations for the good are elicited with a Vickrey (second-price) auction: each bidder privately posts a bid, the highest bid determines the winner of the auction and the market price is equal to the second-highest bid. Subjects are grouped into markets of 9 bidders. We try to reduce noisy observations by repeating homegrown auctions five times. No information is provided before the end of the whole sequence, and one of the five periods is randomly drawn at the end of the game. The winner of the randomly drawn auction is entitled to adopt a dolphin, and the market price of this auction is the amount of the donation.
2 Minimum hourly wage was 6.50 Euros at the time of the experiment (source: http://www.urssaf.fr).
3 This lower bound stems from how we recruited participants: we contractually commit ourselves to minimum earnings of 5€.
4 For replication purposes, the instructions we use are as close as possible to those of Cherry, Frykblom, Shogren, List, and Sullivan (2004).
Our focus on donation behavior requires the bidders to enter the auctions with some positive experimental earnings, which may then be spent on the donation. This would mean giving bidders a large show-up fee for participating in the experiment. But it is an increasing concern in laboratory experiments that behavior can differ according to whether one has to decide on the allocation of either windfall or earned wealth (sometimes called endowment effect, see, among others, Rutström, 1998; Cherry, Frykblom, and Shogren, 2002). In the specific context of demand revelation using Vickrey auctions, Jacquemet, Joule, Luchini, and Shogren (2009a) show that earned money does make a difference to bidding behavior as compared to windfall wealth. In line with these results, and to be as close as possible to actual stated preference surveys in the field, we complement the 10€ show-up fee distributed to the subjects with an earned-wealth design. This also replicates a common feature of homegrown valuation experiments focusing on hypothetical bias (e.g., Cummings and Taylor, 1999; Cummings, Elliott, Harrison, and Murphy, 1997). Earned wealth is implemented through a preliminary stage during which the subjects are asked to answer 20 general knowledge questions. The set of questions was taken from the annals of the "Concours de Catégorie B de la fonction publique", a civil service entry test for those who hold at least the French baccalaureate. 6 This pre-test is appropriate to discriminate between undergraduate students. Accompanying each question is a list of four possible answers. Subjects are explicitly told that one of the four answers is correct, and that monetary earnings labeled in ecu are proportional to the number of correct answers. The position of the correct answer is randomized between questions and the ordering of questions is kept the same for all subjects in all treatments. To ease comparison with the IV experiment, subjects are granted a 10€ show-up fee in addition to the wealth earned in the quiz.
The adoption procedure is described to the subjects using a French-language, slightly modified version of the official web page set up by the WWF. 7 The page provides a short description of a dolphin's life and of the WWF and, more importantly, a detailed presentation of the donation program and the documentation (gifts) sent should a subject adopt a dolphin. The scroll bar used to choose a donation amount between 0 and 30 Euros, along with an "OK" button, appears directly on the page and the bidders see the good description until they confirm their choice. Note that the upper bound imposed on the bid is the same for all bidders and does not depend on experimental earnings. We clearly stated in the instructions that any bid above experimental earnings would have to be completed by out-of-pocket money. Nor do we impose a lower bound or reservation price in the provision rule: the minimum bid is zero. The good sold in the experiment is potentially cheaper in the lab than in the market, so we subsidize the winning donation to reach the market price when monetary incentives are binding. Subjects are not told anything about this subsidy. 8
The two main treatments differ only by the consequences of the adoption auction. The adoption is hypothetical in the HG-Hypothetical treatment, whereas the donations are subtracted from subjects' earnings in the HG-Real treatment. This implies donations are declarative in the hypothetical auction; no funds are actually transferred to the WWF and no adoption certificate is sent to the adopter. Those features are stressed within the instructions read to the participants. All other experimental features are identical in these two treatments: earnings from the quiz are always actually paid to subjects to avoid unwarranted wealth differences between our treatments.
6 Our source is http://pagesperso-orange.fr/bac-es/qcm/annales_c02_r01.html. The full list of questions is available from the authors upon request.
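The resolution of an HG-Real market described above (one binding period drawn at random, second-price rule, subsidy up to the market price) can be sketched as follows. This is an illustrative sketch only: the bids are made-up numbers, ties are broken in favour of the lowest bidder index, and `resolve_hg_auction` is a hypothetical name, not the experimental software.

```python
import random

MARKET_PRICE = 18.50  # euros, entry-level WWF donation when the experiments took place


def resolve_hg_auction(bids_by_period, rng):
    """Draw one binding period, then apply the second-price rule.

    bids_by_period[t][i] is bidder i's bid in period t (euros).
    Ties go to the lowest bidder index (an assumption of this sketch).
    """
    period = rng.randrange(len(bids_by_period))
    bids = bids_by_period[period]
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = sorted(bids, reverse=True)[1]        # second-highest bid
    subsidy = max(0.0, MARKET_PRICE - price)     # topped up by the experimenters
    return period, winner, price, subsidy


# Made-up bids for two periods of a 9-bidder market.
example_bids = [
    [0, 1, 5, 2.5, 0, 3, 0.5, 0, 1],
    [0, 2, 4, 3.0, 0, 6, 0.5, 0, 1],
]
print(resolve_hg_auction(example_bids, random.Random(0)))
```

The winner pays only the second-highest bid, so the subsidy grows as the auction price falls below the 18.50 Euros donation.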

Experimental procedures
We conducted six sessions of the IV experiment in the two largest French cities: Paris and Lyon. In Paris, we ran three Hypothetical sessions and one Real session at the University Paris 1; in Lyon, we ran one Hypothetical session and one Real session at the GATE laboratory, University of Lyon. 9 All sessions of the HG experiment (one for each treatment) were run at the University Paris 1. For both experiments, each session used 18 subjects separated into two independent 9-bidder auctions. Overall, 108 subjects participated in the IV experiment sessions, and 36 in the HG experiment. Participants were mostly first- to third-year undergraduate students in law, economics or chemistry. Both experiments were computerized using software developed under Regate (Zeiliger, 2000).
All practical conditions were kept constant in the two locations and across experiments. In both Paris and Lyon, recruitment was internet-based (using Orsee; Greiner, 2004) and all email messages were harmonized. The two experimental labs are set up identically, with wooden partitions between computers, organized in rows. One monitor ran all sessions and used identical procedures and words in welcoming the participants and describing the experiment.
Whatever the experiment, a typical session proceeds as follows. First, each subject signs an individual consent form before entering the lab and is assigned randomly to a computer. Next, the written instructions are distributed and read aloud. The monitor uses both a non-numerical example and a quiz to highlight the most salient features of the design. Finally, participants are encouraged to ask clarifying questions before starting the experiment.
Both experiments begin by asking the subjects to fill out a computerized questionnaire about socio-economic characteristics (gender, age, ...). In the HG experiment, the first part of the instructions, describing the quiz, is then distributed and read aloud. Subjects are provided information on their score only at the end of the quiz, along with their corresponding earnings in ecu. The payment rate is 2 ecu per correct answer and the common knowledge exchange rate is 3 ecu for 1 €. Once all 20 questions are answered by all subjects (and immediately after the socio-economic questionnaire in the IV experiment), the auction is introduced. To improve understanding of the game, a non-numerical example is developed covering all the instructions. The instructions do not, however, indicate that bidding one's induced value is the weakly dominant strategy. Participants are also asked to answer a short questionnaire highlighting the most salient features of the game. Before the game begins, bidders are encouraged to ask clarifying questions, which are privately answered by the monitor.
In the IV experiment, the winning bidder's profit in each round equals the difference between his or her induced value and the price he or she pays for the good (the second-highest bid). The 8 non-winning bidders earn zero profit for that round. Only the winner sees the two highest bids at the end of the round. The only common knowledge difference between the two treatments is that ecu accumulated across rounds are not converted into Euros in IV-Hypothetical, while they are in IV-Real.
8 As shown in Section 3.2, this feature implies that most offers elicited in the real context are below the market price. The observed values are independent of field opportunities, which protects our data from the censoring issue raised by, e.g., Harrison, Harstad, and Rutström (2004). The discrepancy between in-the-lab and market prices may nonetheless be influential ex ante on bidding behavior if subjects are actually aware of the donation procedure and the market price of the donation. Questions to assess subjects' knowledge are included in a debriefing questionnaire (see Section 2.3 below).
9 For more information, see http://leep.univ-paris1.fr/ and http://www.gate.cnrs.fr/.
The instructions for the HG auction describe in detail the WWF, the adoption procedure, and how the collected funds will be used. The auction is then described using the same instructions as in the IV experiment (same non-numerical example and same questionnaire to check subjects' understanding at the end of the instructions). The only difference is the good and its description. The wording of the instructions is slightly modified between HG-Real and HG-Hypothetical. We follow Cummings and Taylor (1999) in replacing the affirmative language used in real auctions ("you will participate in the adoption procedure", "you will adopt a dolphin", "we commit ourselves to sending your donation to the WWF") with hypothetical language in the hypothetical auctions: "we want you to suppose you were to participate in the adoption procedure", "you would adopt a dolphin", "we would commit ourselves to sending your donation to the WWF" (italics added). The experimental earnings are adjusted accordingly: the two subjects entitled to adopt a dolphin in each session (one per 9-bidder group) actually lose the amount of the donation in (and only in) HG-Real, and we buy a donation from the WWF for each of them. Before the end of the HG experiment, subjects answer a computerized debriefing questionnaire. The questions assess subjects' prior knowledge of the WWF and their level of agreement with its actions, their knowledge of the WWF adoption procedure, their degree of familiarity with the auction mechanism through online auction websites, and whether they have participated in other experiments.
At the end of both experiments, subjects are privately paid their monetary payoff in cash: 10€ in the hypothetical conditions, plus the result from the quiz in the HG experiment only; or, in the real conditions, this total plus the profits or losses (in ecu) accumulated during the auction. The experiment lasted between one hour and one hour and a half.
Note to Table 1. A group (G) is a set of 9 subjects; a pair of two successive groups constitutes an experimental session. For each group in row (P-: Paris, L-: Lyon), the upper part displays the aggregate revealed demand (i.e. the observed bids) in each round (in column) and summed over rounds (last column). The bottom part reports the ratio of this revealed demand to the aggregate induced demand, in %.

Results
Our experimental design allows us to explore the nature of hypothetical bias based on strict preference elicitation (IV treatments) and both preference formation and elicitation (HG treatments).

No hypothetical bias in IV auctions
We first consider aggregate behavior by round and induced value in each treatment, real and hypothetical. Two results emerge. First, at the aggregate level, we find no hypothetical bias: no difference arises between elicited demands in the real and hypothetical contexts. Second, at the individual level, we find no evidence of a significant difference in bidding behavior. Econometric results show that the slope and the constant of the bidding regression line are not significantly different whether bids are elicited with or without monetary incentives.
First, consider the aggregate results. Table 1 summarizes aggregate bidding behavior by round for each market we observe. Recall that a session is split into two independent markets. This provides two groups (denoted G in the Table) per session, from either Paris (P) or Lyon (L). The pattern of bids over time suggests subjects learned rather quickly: for all groups, the revelation ratio is low in the first round, and then quickly converges to more stable values in later rounds. Comparing the results between treatments, there appears to be no hypothetical bias in our data. Strictly rational bidding in real and hypothetical treatments would result in the elicitation of 542×9 = 4878 ECU over all rounds of each market. Adding up all bids posted in each auction in a real context, we elicit a total of 5241 (107.4%) and 5087 (104.3%) ECU in Paris and 4989 (102.3%) and 5049 (103.5%) ECU in Lyon. In the hypothetical context, elicited aggregate demands range from 4763 (97.6%) to 5475 ECU (112.2%). Considering each round separately, 75.0% of the elicited aggregate demands in the real context are in the 90%-110% interval, centered on the true induced value. This percentage increases to 91% when examining the 80%-120% interval. In the hypothetical context, 52.8% of elicited aggregate demands are in the 90%-110% interval, rising to 86.1% for the 80%-120% interval.
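The aggregate benchmark and the revelation ratios quoted above follow directly from the induced demand curve. As a minimal sketch of the arithmetic (the market labels are placeholders, since the text does not name the groups; only the totals come from the text):

```python
INDUCED_VALUES = [84, 76, 71, 68, 65, 63, 53, 38, 24]  # ecu
ROUNDS = 9

# Induced aggregate demand over all rounds of one market: 542 * 9 = 4878 ecu.
induced_total = sum(INDUCED_VALUES) * ROUNDS

# Aggregate bids in the real treatment as reported in the text (ecu).
# The keys are placeholder labels introduced for this sketch.
real_totals = {"Paris A": 5241, "Paris B": 5087, "Lyon A": 4989, "Lyon B": 5049}


def revelation_ratio(elicited, induced=induced_total):
    """Elicited aggregate demand as a percentage of induced aggregate demand."""
    return round(100 * elicited / induced, 1)


for market, total in real_totals.items():
    print(market, revelation_ratio(total))

# Range reported for the hypothetical treatment.
print(revelation_ratio(4763), revelation_ratio(5475))
```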
We now consider aggregate bidding behavior by induced value and treatment condition. Table 2 reorganizes the same data, sorted by induced value in each column. Recall that each induced value is assigned once in each round for each market. The aggregate demand revealed for each value comes from all 9 rounds of each market. Table 2 suggests that real and hypothetical treatments perform equally well in the aggregate when considering the induced value level. Elicited demands match the induced demand in the aggregate for almost all induced values in both treatments. We do, however, observe some overbidding for the lowest induced value (24 ECU): these off-the-margin bidders are more likely to submit bids that exceed induced demand. This holds for both real and hypothetical treatments: at the lower end of the induced values distribution, revealed demand is between 106.0% and 164% of the induced demand in the real condition, and between 94.9% and 176.4% in the hypothetical condition.
We now focus on bidding behavior at the individual level. We test for hypothetical bias using a panel Tobit model censored at 0 and 100 (Cherry, Frykblom, Shogren, List, and Sullivan, 2004), in which $b_{it}$ denotes subject $i$'s ECU bid in trial $t$ and $\nu_{it}$ denotes subject $i$'s induced value in trial $t$. The term $\alpha_i$ represents subject-specific characteristics and is decomposed into a constant term $\alpha$ and a subject-specific random effect of mean zero and variance $\sigma^2_\alpha$ standing for individual heterogeneity. Trial-specific effects $\phi_t$ are introduced as dummies in the regression. $HYP_i$ is a dummy variable which equals one for bids elicited in the hypothetical context. The parameter $\alpha_H$ associated with $HYP_i$ accounts for the effect of the hypothetical condition on the constant term of the bidding regression line, while $\beta_H$ accounts for its effect on the slope of the regression line. Finally, $\varepsilon_{it}$ is the bid error, with mean zero and variance $\sigma^2$.
Note to Table 2. The first row reports the induced values attributed to buyers. The second row reports the corresponding Aggregate Demand (AD) induced in each group, i.e. induced values × number of subjects. A group (G) is a set of 9 subjects; a pair of two successive groups constitutes a session. For each group in row (P-: Paris, L-: Lyon), the upper part of the row displays the aggregate revealed demand (i.e. the observed bids posted by the buyers whose induced value is reported in the column). The bottom part reports the ratio of this revealed demand to the induced Aggregate Demand (AD), in %.
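The displayed equation did not survive in this version of the text. The definitions above are consistent with a specification of the following form, offered here as a reconstruction rather than the authors' exact equation: the slope coefficient $\beta$ on the induced value and the latent-bid notation $b^{*}_{it}$ are assumptions of this sketch.

```latex
\begin{align*}
b^{*}_{it} &= \alpha_i + \beta\,\nu_{it} + \alpha_H\,HYP_i
              + \beta_H\,(HYP_i \times \nu_{it}) + \phi_t + \varepsilon_{it},\\
b_{it}     &= \min\{100,\ \max\{0,\ b^{*}_{it}\}\}.
\end{align*}
```

Under this form, the absence of hypothetical bias corresponds to the joint restriction $\alpha_H = \beta_H = 0$, and perfect demand revelation in the real treatment would correspond to a zero constant and $\beta = 1$.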
Estimation results are presented in Table 3. The parameters on round dummies confirm our unconditional results: bidding behavior in the first round differs from bidding in subsequent rounds, but learning sharply declines: all dummies are significantly different from zero, but quickly converge towards the same parameter value. We statistically assess hypothetical bias by testing the joint null hypothesis that coefficients $\alpha_H$ and $\beta_H$ equal zero. As shown in Table 3, parameters associated with the hypothetical treatment are not different from zero when tested separately: p = 0.273 for $\alpha_H$ and p = 0.490 for $\beta_H$. Moreover, an LR test of the joint hypothesis on $\alpha_H$ and $\beta_H$ cannot reject the null of no difference in bidding behavior between real and hypothetical treatments (LR = 2.00 with p = 0.368).
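The reported p-value of the joint test can be checked by hand. With two restrictions, the LR statistic is compared to a chi-square distribution with 2 degrees of freedom, whose survival function has the closed form exp(-x/2). A minimal sketch (the helper name is hypothetical):

```python
import math


def chi2_sf_2df(x):
    """Survival function of a chi-square with 2 degrees of freedom: exp(-x/2)."""
    return math.exp(-x / 2.0)


lr_statistic = 2.00                  # LR statistic reported in the text
p_value = chi2_sf_2df(lr_statistic)
print(round(p_value, 3))             # 0.368
```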
Econometric results based on individual data confirm our discussion of the unconditional results at the aggregate level, i.e., we do not find a difference between hypothetical and real behavior in our data.

A substantial hypothetical bias in HG auctions
In HG auctions, we cannot contrast observed demand with true preferences since subjects enter the lab with their own private valuation. We focus our discussion on comparisons of bidding behavior with and without monetary incentives. Figure 1 presents the empirical distribution functions (EDF) of bids in HG-Hypothetical and HG-Real. Bids in HG-Hypothetical dominate bids elicited in HG-Real: the distribution of hypothetical bids first-order stochastically dominates the distribution of bids elicited using actual monetary incentives, i.e. the EDF of hypothetical bids lies at or below the EDF of real bids at every bid level. This means that the data exhibit a hypothetical-real gap for both low bids and high bids. A closer look at the data is provided in Table 4, in which we compute the average bid, the median bid, the number of zero bids and the number of bids above experimental earnings. The first two rows for each treatment highlight a substantial difference in the elicited preferences according to whether incentives are binding or not: mean and median bids in HG-Hypothetical are €17.43 and €19.5, as compared to €2.98 and €1 when monetary incentives are binding. This leads to an average hypothetical-real ratio of 584.9%. This means that bids in HG-Real are on average about six times lower than in HG-Hypothetical, indicating a substantial hypothetical bias. We statistically test the difference in mean bids using a two-sample mean difference test based on a non-parametric bootstrap procedure that accounts for potential correlation between the five bids of the same subject and for asymmetry in the empirical distribution of bids. The procedure is based on bootstrapping subjects and their five bids in the sample (999 times), instead of considering independent bids. To account for asymmetry in the empirical distribution, we compute an equal-tail bootstrap p-value (see Davidson and MacKinnon, 2006). The test finds a significant difference (p < 0.001).
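The resampling scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the statistic is the raw (unstudentized) difference in mean bids, the bootstrap differences are recentred on the observed one to impose the null, and the data are synthetic draws that only mimic the design (18 subjects per treatment, five bids each).

```python
import random


def mean_bid(subjects):
    """Mean over all bids, where `subjects` is a list of per-subject bid lists."""
    bids = [b for subject in subjects for b in subject]
    return sum(bids) / len(bids)


def cluster_bootstrap_pvalue(hyp, real, n_boot=999, seed=0):
    """Equal-tail bootstrap p-value for a difference in mean bids.

    Subjects (with all their bids) are resampled with replacement within
    each treatment, which preserves within-subject correlation.
    """
    rng = random.Random(seed)
    observed = mean_bid(hyp) - mean_bid(real)
    below = 0
    for _ in range(n_boot):
        hyp_star = [rng.choice(hyp) for _ in hyp]
        real_star = [rng.choice(real) for _ in real]
        centred = (mean_bid(hyp_star) - mean_bid(real_star)) - observed
        if centred <= observed:
            below += 1
    share_below = below / n_boot
    # Equal-tail p-value: twice the smaller of the two tail shares.
    return 2 * min(share_below, 1 - share_below)


# Synthetic data for illustration only (not the experimental bids).
data_rng = random.Random(1)
hyp_bids = [[data_rng.uniform(10, 25) for _ in range(5)] for _ in range(18)]
real_bids = [[data_rng.uniform(0, 6) for _ in range(5)] for _ in range(18)]
print(cluster_bootstrap_pvalue(hyp_bids, real_bids))
```

Resampling whole subjects rather than individual bids is what keeps the five repeated bids of a subject together in every bootstrap sample.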
The gap between bids in HG-Hypothetical and bids in HG-Real is not likely to be explained by differences in total experimental earnings (earnings from the quiz + show-up fee of 10 euros) between the two treatments: subjects earned on average €18.9 (s.d. 0.21) in the hypothetical treatment, given an average of 13.3 correct answers out of 20 in the earned-money phase, and €18.6 (s.d. 0.25) with actual monetary incentives, given 12.88 correct answers on average. A two-sample mean difference test leads to p = 0.364. In addition, the gap cannot be explained by differences in respondents' characteristics. Unconditional mean and proportion tests show that there are no significant differences in socio-demographic and debriefing questions: gender (p = 0.738), previous knowledge of the WWF (p = 0.614), having already adopted dolphins (only one individual out of 18 had already adopted a dolphin, p = 0.320), level of agreement with WWF actions (p = 0.508), and past experience with the auction mechanism proxied by the stated number of purchases on auction websites (p = 0.400).
Note to Table 4. For each treatment (in row) and by round (in column), the table provides bidding behavior in the homegrown (adopt a dolphin) experiment: mean and median bid (first two rows for each treatment); number of zero bids (third row) and bids above subject's experimental earnings (fourth row). The last row of the table gives the ratio between average hypothetical bids and average real bids.
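The average earnings reported above can be recovered from the payment rule given in the experimental procedures (2 ecu per correct answer, 3 ecu per euro, 10€ show-up fee). A short sketch of that arithmetic, with hypothetical constant and function names, which also reproduces the hypothetical-real ratio of mean bids:

```python
ECU_PER_CORRECT_ANSWER = 2
ECU_PER_EURO = 3
SHOW_UP_FEE_EUR = 10


def experimental_earnings(correct_answers):
    """Show-up fee plus quiz earnings, in euros."""
    quiz_ecu = ECU_PER_CORRECT_ANSWER * correct_answers
    return SHOW_UP_FEE_EUR + quiz_ecu / ECU_PER_EURO


print(round(experimental_earnings(13.3), 1))    # HG-Hypothetical average: 18.9
print(round(experimental_earnings(12.88), 1))   # HG-Real average: 18.6
print(round(100 * 17.43 / 2.98, 1))             # ratio of mean bids, in %: 584.9
```

Because earnings are linear in the number of correct answers, applying the rule to the average score gives the average earnings.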
The last two rows of Table 4 for each treatment provide further insight into why such a discrepancy occurs. First, we observe that nearly 50% of subjects bid higher than their experimental earnings in the hypothetical treatment, whereas no subject bid that high in the real treatment. Recall that subjects know that bidding above their experimental earnings means they would pay out of their own pocket to adopt the dolphin, hypothetically in HG-Hypothetical and effectively in HG-Real. This occurs, with hypothetical or actual monetary consequences, as soon as the bid is higher than experimental earnings. In our experiment, subjects take the chance of paying out of their own pocket to adopt a dolphin only when bidding is hypothetical. One explanation for the bias is that some subjects violate their budget constraint, here proxied by their show-up fee and the money earned in the quiz, in a hypothetical context. This result reinforces the longstanding explanation that the bias arises because the budget constraint is not binding in hypothetical valuation exercises (see Cummings, Brookshire, and Schulze, 1986; Cummings and Taylor, 1999; Harrison and Rutström, 2008, for a review).
Second, we observe that nearly 27% of bids are zero in the real treatment, whereas no bidder bid zero in the hypothetical treatment. This large number of zero bids with monetary incentives is concentrated on a few subjects: among the 7 (of 18) who bid zero at least once, three subjects bid zero in every auction round (16% of total elicited bids) and two subjects bid zero most of the time, and fifty cents otherwise. Recall that the participation constraint associated with any mechanism requires subjects to be no worse off by participating than otherwise (see for instance Laffont and Martimort, 2002).
As we do not provide an explicit opt-out alternative in the auction, the only way to withdraw from the valuation exercise is to bid zero. This behavior suggests there is a second driving force behind the hypothetical bias: the real condition violates the participation constraint implicit in the mechanism, and subjects bid zero to opt out of the auction environment. We expand and test this second explanation in the next section.
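The summary rows discussed above (mean and median bid, number of zero bids, number of bids above experimental earnings, and the ratio of hypothetical to real mean bids) can be sketched as follows. This is a minimal illustration only: the function name `summarize_round` and all numbers are hypothetical and are not the experimental data.

```python
from statistics import mean, median


def summarize_round(bids, earnings):
    """Summary rows for one treatment and one round.

    bids[i] and earnings[i] belong to the same subject; earnings stand in
    for the budget constraint (show-up fee plus quiz earnings).
    """
    return {
        "mean_bid": mean(bids),
        "median_bid": median(bids),
        # Zero bids: candidate opt-out responses in the real treatment.
        "zero_bids": sum(1 for b in bids if b == 0),
        # Bids above earnings: the subject would pay out of pocket.
        "bids_above_earnings": sum(1 for b, e in zip(bids, earnings) if b > e),
    }


# Illustrative numbers only (assumption), four subjects in one round.
earnings = [18.0, 19.0, 20.0, 18.5]
hypothetical_bids = [25.0, 10.0, 30.0, 5.0]
real_bids = [0.0, 8.0, 12.0, 0.0]

hyp = summarize_round(hypothetical_bids, earnings)
real = summarize_round(real_bids, earnings)

# Ratio between average hypothetical bids and average real bids.
ratio = hyp["mean_bid"] / real["mean_bid"]
```

In this toy round, two hypothetical bids exceed earnings and two real bids are zero, which mirrors the two patterns described in the text.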

Opting-out in real HG auctions
In summary, the HG experiment exhibits two phenomena that produce a significant gap between preferences elicited with or without monetary incentives: the unreliable willingness to pay out of one's own pocket when incentives are dropped (suggesting a violation of the budget constraint when bidding is hypothetical) and the opting-out of the auction through zero bids when incentives are binding (possibly revealing a violation of the constraint to participate in the mechanism). This second idea is reminiscent of the early literature on contingent valuation surveys (see Cummings, Brookshire, and Schulze, 1986), in which some bids are interpreted as protest responses to the survey rather than true indifference towards the good submitted to valuation.
The opting-out interpretation is supported by the psychological theory of reactance, which states that people try to remove themselves from situations that restrict their freedom in an "unfair or unreasonable" way (Brehm, 1966). Reactance theory works in three steps. First, a person perceives an unreasonable or unfair restriction on his or her action; he or she fails to see why it is being applied, or judges the context to be too harsh, or believes the restriction is unfairly limited to a few people. Second, the restriction induces some reactance, an intense motivational state that arises because people perceive themselves as wronged or misled and want out of the situation. Third, the person acts to remove reactance. People with reactance try to get the unreasonable or unfair restriction removed, or else they try to subvert the restriction. For our experimental mechanisms, this suggests the experimental design pressured subjects by forcing them to state a bid in an HG auction they do not want to participate in. A (real) HG setting puts subjects in a position to spend money, whereas they come to the lab hoping to earn some, as they do in IV settings. This phenomenon occurs because most real bidding experimental designs do not provide people with an opt-out mechanism to exit the market.
Based on our evidence, this view of bidding motives is somewhat speculative, since those subjects bidding zero in our HG experiment may be revealing their true preference for adopting a dolphin. We close our experimental design with a companion treatment to test further the hypothesis that subjects opt out of the auction because they react to a pressing auction environment. This HG-Opt-out treatment provides subjects with an explicit opting-out alternative: subjects are allowed to choose whether or not to participate in the HG auction with monetary incentives while still staying in the lab.
The instructions explicitly announced the opting-out alternative, and made clear that subjects can offer any amount they want should they enter the auction. The design is similar to that of HG-Real, but at the beginning of each auction the subject is asked on the screen whether or not she/he would be willing to participate in the auction. Subjects who agree to participate enter a standard HG auction. Subjects who decline to participate wait for the auction to end. Before the next auction starts, all subjects are asked again whether or not they want to participate in the auction. The same procedure is implemented for each subsequent auction. Subjects are told that their earnings would not be affected should they choose not to participate in the auction. The results from one session, involving 18 subjects, are presented in Table 5. Interestingly, 3 of 18 subjects declined to participate in all five auctions. Over all 5 rounds, 41.1% of subjects refused to participate in one round or more. Moreover, only one zero bid over the five rounds (1.1%) remains when subjects actively choose to participate in the mechanism, which has to be compared to 26.7% of zero bids in the standard HG-Real treatment.
Note. For each round (in column), the table provides bidding behavior in the homegrown (adopt a dolphin) experiment for the real treatment with an explicit opt-out device: number of subjects who decide not to enter the auction (first row); mean and median bid (two subsequent rows); number of zero bids (third row) and bids above subject's experimental earnings (fourth row).
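One round of the opt-out procedure described above can be sketched as a second-price auction restricted to the subjects who choose to enter. This is a minimal sketch, not the software used in the experiment: the function name `second_price_round`, the subject labels and the bids are hypothetical, and two rules the text does not specify are assumed here (no sale when fewer than two subjects enter; ties broken in favor of the first entrant listed).

```python
from typing import Dict, Optional, Tuple


def second_price_round(
    bids: Dict[str, Optional[float]]
) -> Tuple[Optional[str], Optional[float]]:
    """Run one second-price round with an explicit opt-out.

    bids maps a subject id to a bid, or to None when the subject declined
    to enter the auction for this round.
    """
    entrants = {s: b for s, b in bids.items() if b is not None}
    # Assumption: with fewer than two entrants there is no second price,
    # so no sale takes place.
    if len(entrants) < 2:
        return None, None
    # sorted() is stable, so ties go to the entrant listed first (assumption).
    ranked = sorted(entrants.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


# Illustrative round: s2 opts out, s4 enters and bids zero.
round_bids = {"s1": 12.0, "s2": None, "s3": 7.5, "s4": 0.0}
winner, price = second_price_round(round_bids)
opted_out = [s for s, b in round_bids.items() if b is None]
```

Keeping an explicit `None` for non-entry separates "no to the participation itself" from a zero bid, which is exactly the distinction the standard HG-Real design cannot record.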
This substantiates our interpretation that zero bids in the real treatment more likely reflect reactance to the participation constraint implied by the auction. That is, zero bids suggest a 'no to the participation itself' rather than a 'no to the good'. Without an explicit opt-out device, the bidder can use the zero bid to exit the mechanism. The bids observed in the HG-Real treatment are lower than the true underlying preferences because bidders shave bids downward, some subjects bidding zero to exit the auction altogether. Note, however, that the bids we observe in this opt-out treatment do not elicit the true underlying preferences: we are still unable to induce those subjects who did not participate to express their preferences truthfully (these may be anything, zero or more). These results support the idea that subjects are under-stating their preferences in the HG-Real auctions. Welfare estimates are driven downwards if opting-out responses are not identified.
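The direction of this effect on welfare estimates can be illustrated with a short calculation. The notation is ours and purely illustrative: $\pi$ is the share of subjects who bid zero to opt out, $\bar{v}_p$ the mean true value among the remaining subjects, and $\bar{v}_o \geq 0$ the (unobserved) mean true value among opt-out subjects. We assume, for simplicity only, that the remaining subjects bid their true value.

```latex
\begin{align*}
\bar{b} &= (1-\pi)\,\bar{v}_p + \pi \cdot 0
  && \text{(observed mean bid, opt-outs recorded as zeros)} \\
\bar{v} &= (1-\pi)\,\bar{v}_p + \pi\,\bar{v}_o
  && \text{(true mean value)} \\
\bar{v} - \bar{b} &= \pi\,\bar{v}_o \;\geq\; 0
\end{align*}
```

Under these assumptions the observed mean bid understates the mean value whenever opt-out subjects hold a positive value, and the gap grows with both the opt-out share and the value those subjects would have expressed.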
This phenomenon goes beyond our current results. To illustrate, Figure 2 re-expresses as cumulative distribution functions (cdf) the original data generated by the experimental design in List and Shogren (1998). In this field experiment on bidding for baseball cards at a sports show in Denver (CO), List and Shogren observed that 50 to 55 percent of all bids dropped to zero in the real auction (from a positive amount in the hypothetical auction), which translated into over one-third of the valuation gap between real and hypothetical bidding. At the time, hypothetical bidding was seen as the culprit. In retrospect, the experimental design most likely generated the large number of zero real bids observed, given that it created an environment that would promote reactance. The monitors first asked people to state a hypothetical bid for a baseball card, and then immediately asked each person for a bid with actual monetary consequences. A person first bid hypothetically and then was told the auction was now "for real". Given that this experiment was run in the field at a sports card show, many people could have seen this design as a "bait and switch" or "entrapment", and reacted to this by opting out with a zero bid. People can use the zero bid option to exit a contrived market within which they are otherwise trapped. Many otherwise positive value bidders seemed to use the zero bid as a sure-fire way to exit the auction without playing: no pay, no play.
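Re-expressing bids as a cdf, as described above, amounts to computing for each bid level the share of bids at or below it. The sketch below is a generic illustration with a hypothetical helper `ecdf` and made-up bids; it does not use the List and Shogren (1998) data.

```python
def ecdf(bids):
    """Empirical cdf: (bid level, share of bids at or below that level)."""
    n = len(bids)
    levels = sorted(set(bids))
    return [(x, sum(1 for b in bids if b <= x) / n) for x in levels]


# Illustrative bids only (assumption): half of the real bids drop to zero.
hypothetical_bids = [2.0, 5.0, 5.0, 10.0]
real_bids = [0.0, 0.0, 5.0, 10.0]

cdf_hypothetical = ecdf(hypothetical_bids)
cdf_real = ecdf(real_bids)

# The height of the real cdf at zero is the share of zero bids.
share_zero_real = cdf_real[0][1] if cdf_real[0][0] == 0 else 0.0
```

A mass point at zero shows up as a jump of the real cdf at the origin, which makes opting-out behavior visible in a way that a comparison of means alone does not.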

Discussion
This paper uses a two-experiment design, one revealing induced values (IV) and the other revealing homegrown values (HG), both with the same elicitation mechanism, to better understand why hypothetical bias appears in non-market valuation. Our first set of results provides new insights into the discrepancy that arises when eliciting preferences for a good with homegrown values. We use a second-price Vickrey auction to reveal subjects' willingness to pay for adopting a dolphin. The real and hypothetical contexts appear to perform differently on two dimensions. First, the hypothetical context fails to implement the constraint induced by disposable income. This result is in line with earlier work in CV studies. The NOAA panel (Arrow, Solow, Portney, Leamer, Radner, and Schuman, 1993), for instance, recommended reminding respondents of their budget constraint. In our experiment, this failure to implement the budget constraint on individual bidding behavior causes bidders to make donation promises higher than their budget constraint in a hypothetical context. Our second result shows this overbidding effect is only part of the story. While a budget constraint violation unambiguously suggests that there is a gap between true preferences and revealed demand in a hypothetical context, the bias could be less dramatic than generally thought: once the experimental design puts subjects in a position to spend some money, it is just as possible that they understate their true preferences to opt out of the mechanism.
While those two results help to better understand the nature and extent of hypothetical bias, they still leave open the question of why it happens. Our most important result substantiates recent evidence suggesting that HB might not arise in induced-value contexts (Taylor, McKee, Laury, and Cummings, 2001; Vossler and McKee, 2006), even with an open-ended elicitation mechanism (Mitani and Flores, 2009a) that has been argued to drive up the difference between real and hypothetical contexts (List and Gallet, 2001; Murphy, Stevens, and Weatherhead, 2005). Murphy, Stevens, and Yadav (2010) further confirm this result by comparing hypothetical bias between IV and HG contexts (for three kinds of goods: public good, private good, and publicly provided private good): all (but only) HG contexts give rise to the bias. One common explanation for such a difference between IV and HG contexts lies in the provision uncertainty associated with dichotomous choice contexts: there is some heterogeneity among bidders in the perceived probability that answers are influential on policy decisions (Mitani and Flores, 2009b). Landry and List (2007) provide evidence from the field that supports this idea, which they label Realism or Consequentialism. This view is however challenged by the results of Ehmke, Lusk, and List (2008), who found no bias in referenda on the valuation of a private good (bottles of water) in France. As stressed by Schläpfer (2008), hypothetical bias can however be an issue even in non-dichotomous choice questions if survey respondents are unable to form consistent preferences about unfamiliar goods.
Our experimental design deliberately rules out the issues raised by dichotomous choice contexts. Because we rely on a second-price auction, we elicit continuous preferences and observe the revealed demand function of all subjects. Our two-experiment design allows a precise comparison between IV and HG settings by holding most features constant except for the nature of the good itself. We find the incentive context (real or hypothetical) has no effect on French bidders' behavior in a second-price, induced-value auction: our bidders pay what they say. This is only partly a surprise: in an IV context, bidding allows subjects to win money rather than spend some. This rules out the need to opt out that affects bids in a real context. Interestingly, though, subjects misrepresent their true valuation whatever the incentive context, mainly because of high underbidding at the lower end of the value distribution. Jacquemet, Joule, Luchini, and Shogren (2009a) relate such behavior to the commitment of subjects to the valuation exercise, obtained through well-identified property rights on the budget spent in the experiment.
The main difference between the HG and IV settings is that the latter focuses on preference elicitation while the former also involves preference formation. The contrast with the hypothetical bias we observe when people state their homegrown preferences through the same mechanism suggests the bias arises due to how the mechanism induces subjects to "form" their valuations. The various models of preference formation in valuation studies point to the weakness of the link between subjects and the valuation exercise. According to Ajzen, Brown, and Carvajal (2004), the activation of positive attitudes induced by selling desirable public goods could give rise to context-dependent answers to the mechanism, resulting in a discrepancy between intentions and actions. This discrepancy leads to a social desirability bias (Lusk and Norwood, 2009). In the same spirit, Guzman and Kolstad (2007) hypothesize that hypothetical behavior may be related to bidders' reluctance to invest in costly information about their own value in a hypothetical context. As a result, subjects' uncertainty about their homegrown values might be what drives a wedge between hypothetical and real values (Johannesson, Blomquist, Blumenschein, Johansson, Liljas, and O'Conor, 1999).

Conclusion
Since the earliest work on non-market valuation, researchers have speculated over the precise reason why hypothetical bias arises (Cummings, Brookshire, and Schulze, 1986; Murphy and Stevens, 2004; Shogren, 2005; Harrison, 2006). Some researchers have focused on preference elicitation, such as binding budget constraints and open-ended statements of value (e.g., Arrow, Solow, Portney, Leamer, Radner, and Schuman, 1993); others have focused on preference formation, that is, how a person forms his or her preferences for a nonmarket good (e.g., Johannesson, Blomquist, Blumenschein, Johansson, Liljas, and O'Conor, 1999). Herein we explore this debate further by using a two-treatment design that allows us to separate preference elicitation from preference formation given an identical demand revelation mechanism, the second-price auction. The first treatment focuses on preference elicitation by using induced values (IV); the second treatment focuses on preference formation by eliciting homegrown values (HG) to adopt a wild dolphin.
Overall, our results point to poor preference formation as the reason bidders overbid in a hypothetical context (as expected given non-binding budget constraints) and underbid in the real context (unexpectedly, bidders opt out of the auction by bidding zero or low values); "true" preferences lie somewhere in between. This result suggests that the hypothetical bias arises from the interaction between the elicitation mechanism and the bidder's value formation for the good under consideration. We speculate that this poor preference formation problem arises because bidders are uncommitted to the valuation exercise. This suggests the second-price auction needs an ancillary mechanism that causes bidders to take their budget constraint seriously without restricting their freedom of choice when eliciting preferences for the good.