1 Introduction

Laboratory experiments are an important tool to gain various economic insights that cannot easily be obtained using market data or field experiment data. While experiments in the laboratory, with greater control over the situation, give higher confidence in the internal validity, the questions about external validity or parallelism of laboratory experiments remain. One crucial question is whether subjects’ behavior in the laboratory is consistent with their behavior outside the lab. There are of course many differences between the laboratory and the field; therefore it is difficult to compare behaviors in these two settings. For example, if people do not give away a large share of their income to charity, it does not prove that the behavior in a dictator game, where subjects on average give away 20 % of their endowment (e.g., Camerer 2003), is not externally valid. Levitt and List (2007) argue that a number of factors can explain the behavioral differences found between the laboratory and the real world: scrutiny, context, stakes, selection of subjects, and restrictions on time horizons and choice sets. It is therefore important to carry out empirical studies that are able to examine the potential behavioral differences directly, identify factors that can reduce the differences, or do both (see, e.g., Smith 1982; List 2008; Falk and Heckman 2009).

One focus in the methodological development of lab experiments is to understand and reduce the differences between the lab and the field by, for example, using non-standard subject pools or having subjects earn the endowment. An important reason for the increased use of earned endowments is the intent to mimic the setting outside the lab, where almost all incomes are earned rather than obtained as windfalls. The evidence of the effect of windfall money on subject behavior in the lab is mixed. In dictator games, the dictators contribute less when the endowment is earned (Cherry et al. 2002; Cherry and Shogren 2008; Ruffle 1998; Oxoby and Spraggon 2008). In a public good experiment, Clark (2002) does not find a significant effect of earned endowment on the share of free-riding subjects, while Harrison (2007) shows that the windfall gain in the experiment by Clark (2002) does have a significant effect in a re-analysis of the data. Cherry et al. (2005) find no significant evidence of a windfall-gains effect on the contributions in a public good experiment, saying that although there seemed to be an effect, it was hidden within the more complex considerations of a public good game. However, in a follow-up paper using the best-shot game, Kroll et al. (2007) find significant differences in a public good experiment with heterogeneous endowment. By and large, previous findings seem to indicate that windfall endowment does have an effect on behavior.

In a recent paper, Smith (2010) argues that laboratory experiments have yielded many insights into human behavior, but that the extent to which these carry over to behavior when people’s own money is involved is questionable.Footnote 1 Note that this should not be seen as a general critique of laboratory experiments. In many instances, when researchers would like to test the effects of a certain treatment or stimulus while keeping all other factors constant, there are strong arguments for conducting laboratory experiments, not least the high degree of control over the environment in which the decision is made (Falk and Heckman 2009).

In this paper, we are interested in analyzing the behavioral differences between conducting experiments in the lab and the field, and in particular we investigate the role of windfall and earned endowments in the lab and the field. To do this, we use a 2×2 experimental design. We let the subjects participate in a dictator game with a charity organization as the recipient (see, e.g., Eckel and Grossman 1996, for a similar experiment). In the experiment, we keep all factors such as stake, selection of subjects, and the choice sets and time horizons of the experiment constant, varying only how the endowment is obtained (windfall or earned) and whether the experiment is conducted in the lab or in the field. This means that the main differences between the lab and the field in our experiment are due to the environment per se and the degree of scrutiny.Footnote 2 Thus, we can make two comparisons between the lab and the field. The first one is to what extent they provide similar results in terms of the level of donation, under various conditions. The second one is to what extent a change in the context (in our case, a change in how the endowment is obtained) affects behavior differently in the lab and in the field. The difference between lab and field can thus also be seen as part of a broader and more complex area related to how behavior is affected by context. It is evident that subjects are potentially sensitive to the context of the experiment and to factors such as the choice set (e.g., List 2007), social distance (Hoffman et al. 1996), and experimenter demand effects (e.g., Zizzo 2010). For a general discussion of context, see the recent work by, e.g., Bardsley et al. (2010) and Smith (2010). The contexts of the lab and the field are in many ways very different. In this experiment we have tried to reduce these differences, but some fundamental differences between the lab and the field remain in our experiment as well.

The advantage of using a dictator game is that the game is very easy to understand and there are no strategic motives involved. The game also resembles a charitable giving situation, which means that it is possible for us to compare the behavior with that in a field experiment involving charitable giving. Treatment 1 is a standard lab experiment with windfall endowment, and Treatment 2 is a lab experiment with earned endowment. Treatment 3 is a field experiment with windfall endowment, and Treatment 4 is a field experiment with earned endowment. Our design allows us to make two important comparisons. First, we can investigate the effect of windfall gain in the lab (by comparing Treatments 1 and 2) and in the field (by comparing Treatments 3 and 4). Second, by comparing Treatments 1 and 3, and 2 and 4, we can make an overall comparison between the lab and the field, conditional on how the endowment is obtained, and thus also compare the effect of windfall gains in the lab and in the field. In addition, in a follow-up experiment with two treatments (treatments 5 and 6), we investigate the effect of show-up fees in a traditional lab experiment with windfall gain, in order to examine whether subjects view the show-up fee as compensation for their time or as a windfall gain.Footnote 3
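The structure of the 2×2 design and the two sets of pairwise comparisons described above can be sketched in a few lines of Python. This is only an illustration of the design logic; the names `TREATMENTS` and `comparisons` are hypothetical and do not come from the study materials.

```python
from itertools import product

# Factor levels as described in the text; treatment numbering follows the paper.
ENVIRONMENTS = ("lab", "field")
ENDOWMENTS = ("windfall", "earned")

# Treatment 1 = lab/windfall, 2 = lab/earned, 3 = field/windfall, 4 = field/earned.
TREATMENTS = {
    number: {"environment": env, "endowment": endow}
    for number, (env, endow) in enumerate(product(ENVIRONMENTS, ENDOWMENTS), start=1)
}


def comparisons(varying):
    """Return treatment pairs that differ only in the factor named by `varying`."""
    fixed = "endowment" if varying == "environment" else "environment"
    pairs = []
    for a, b in product(TREATMENTS, repeat=2):
        same_fixed = TREATMENTS[a][fixed] == TREATMENTS[b][fixed]
        differ = TREATMENTS[a][varying] != TREATMENTS[b][varying]
        if a < b and same_fixed and differ:
            pairs.append((a, b))
    return pairs


# Effect of windfall gains within each environment: Treatments 1 vs 2 and 3 vs 4.
windfall_effect_pairs = comparisons("endowment")
# Lab versus field for a given endowment type: Treatments 1 vs 3 and 2 vs 4.
lab_field_pairs = comparisons("environment")
```

Each pair holds one factor fixed and varies the other, which is what allows a difference in donations to be attributed to that factor alone.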

Why would windfall money matter in a dictator game? One explanation for the potential difference is that people’s preferences for the distribution of money depend on, among other things, the input of the subjects (Konow 2000). When the endowment is a windfall gain, the dictator prefers to split the money more evenly, since she has not done anything to receive the money. Cherry et al. (2002) make a similar argument: earned money legitimizes the endowment and invokes more selfish behavior. In psychology, it has been suggested that subjects use different mental accounts for earned and windfall money (Arkes et al. 1994).

Several previous studies have examined differences in behavior between the lab and the field (e.g., Carpenter et al. 2005a; List 2006; Karlan 2006; Benz and Meier 2008; Laury and Taylor 2008; Antonovics et al. 2009; Carpenter and Seki 2010). However, the only other study we are aware of that makes a direct comparison between lab and field using a dictator game with control for a possible subject effect is the one by Benz and Meier (2008), who use an ingenious within-subject design to compare university students’ donation behaviors in the field and in the lab. They conduct a dictator game with two social funds as external recipients, and compare the behavior in the experiment with actual charitable giving by the same subjects. They find stronger donation behavior in the lab and a positive correlation between behavior in the lab and in the field. An important reason for the difference between the lab and the field settings could be that the lab experiment uses windfall money while the field experiment does not involve an experimental endowment at all. This is exactly what our experimental design allows us to test. By applying a between-subject design and keeping the difference between the laboratory and field experiments to a minimum, our experiment allows us to make clear comparisons of the behavior in the lab and the field. In addition, we find no significant effect on behavior from offering or not offering lab subjects a show-up fee. The remainder of the paper is organized as follows. Section 2 introduces the experimental design, and Sect. 3 reports the experimental results. Section 4 concludes.

2 Experimental design

The experiment was conducted in October 2008 at Renmin University of China, which is located in the northern part of the capital Beijing and has approximately 22,000 full-time and 13,000 part-time students. We conducted a one-shot dictator experiment. The subjects were given ten 5-YuanFootnote 4 bills and were subsequently asked how much they would like to donate to the China Foundation for Poverty Alleviation.Footnote 5 This type of campaign, where people are asked to donate to a charity, is not uncommon in China, and the China Foundation for Poverty Alleviation occasionally conducts similar campaigns on campus to give students the opportunity to donate money, old clothes, or other consumer goods to the poor or those in need. In order to test for (i) the difference between the lab and the field and (ii) the effect of windfall gains, we designed an experiment with four treatments using a 2×2 experimental design.

The laboratory experiment was conducted at the School of Economics at Renmin University of China while a supermarket located on the campus of Renmin University of China was used as the setting for our field experiment. The endowment was given either as a windfall or had to be earned by answering a lengthy questionnaire. Since the experiment is a dictator experiment and since we wanted to compare across treatments in a simple way, the earned endowment was the same for all subjects and did not depend on their performance. However, it was clear to the subjects that they earned the money by answering the questionnaire, and always had the possibility to not answer the survey and hence not receive the compensation. The recruitment for all treatments was such that every third male and every third female customer that exited the supermarket was approached.Footnote 6 For the laboratory experiment, the customers were approached by one of our experimenters and asked if they would like to participate in a study conducted by university researchers. The field experiments were done in collaboration with the supermarket, and the supermarket employed the experimenters. Therefore, in the field experiments the customers were approached by one of our experimenters dressed in a supermarket uniform and asked if they would like to participate in a campaign conducted by the supermarket. Since we wanted to keep the subject pool variations to a minimum, we only allowed students from Renmin University to participate and therefore all treatments began with a screening question asking whether or not they were students at the university. In addition, all the treatments were double-blind.

We begin by describing the laboratory experiment treatments and then the field experiment treatments. The full scripts are presented in Appendix 2 (Table 7). Table 1, below, summarizes the key features of the experimental design.

Table 1 Summary of the experimental design

In the laboratory experiment treatments, subjects were asked to participate in an experiment conducted by the School of Economics at Renmin University at a scheduled time.Footnote 7 They were told they would receive 10 Yuan as a show-up payment at the end of the experiment to compensate for the inconvenience of coming to the experimental session on a specific date and time.Footnote 8 When subjects arrived at the lab, they were randomly assigned to either the windfall or the earned endowment treatment. In the treatment with the windfall endowment (Treatment 1), an experimenter welcomed the subject who was then led to a cashier where the 50 Yuan payment was given in ten 5-Yuan notes. After the subject had received the money, the experimenter presented the opportunity to donate to the China Foundation for Poverty Alleviation using the money that had just been received. The objectives of the foundation and for what purpose the donations would be used were then explained. At this point, the subjects were again told that the donation campaign was part of a research study. In order to ensure that the decision was anonymous, we put up a booth in which the subjects could make their decisions privately. The subjects were asked to seal any donation they chose to make in a supplied envelope, put it in an official donation box from the China Foundation for Poverty Alleviation and keep the remaining money.Footnote 9

The lab experiment with earned endowment (Treatment 2) was the same as Treatment 1 except that, when the subjects arrived at the lab, the experimenter asked them whether they would be willing to answer a survey on the use of plastic bags and their views on the supermarket in general.Footnote 10 They were told that if they completed the survey they would receive 50 Yuan. The subjects were again reminded that the donation campaign was part of a study conducted by researchers from the School of Economics. It was made clear that the money was to compensate them for their time and effort. Once the survey had been completed, the experimenter asked the subject to go to the cashier, who paid the 50 Yuan in ten 5-Yuan notes. After the subject had received the money, the dictator game was conducted in exactly the same way as in Treatment 1.

In the field experiment with the windfall endowment (Treatment 3), the experimenter informed the subject that the supermarket was conducting a “Thank you Customer” campaign and that the subject had been randomly selected to receive 50 Yuan. In China, it is common for supermarkets to conduct commercial campaigns to improve their customer relations, although in most cases vouchers valid at the supermarket are used rather than cash. It is important to stress that the subjects were given the money without any conditions and, not least, with no relation to what they had purchased or would purchase at the supermarket.Footnote 11 In order to keep the logistics the same, the money was given by the cashier. Once the subject had received the money, the experimenter explained that there was an opportunity to donate to the China Foundation for Poverty Alleviation using the money that had just been received. The donation was made in private in a booth. In order to keep the differences between the laboratory and the field settings to a minimum, we used the same recruitment procedure, the same experimenters, the same payout and donation procedure, the same cashiers, the same charity and dictator game introduction script, and the same donation booth.

Finally, in the field experiment with earned endowment (Treatment 4), the experimenter asked the subjects if they would be willing to participate in a survey carried out by the supermarket on the use of plastic bags and on their views about the supermarket in general. The survey was exactly the same as in Treatment 2. They were told that if they chose to participate, they would be paid 50 Yuan in cash. It was made clear that the money was a compensation for their time and effort. Once the survey had been completed, the experimenter asked the subject to go to the cashier, who paid the 50 Yuan in ten 5-Yuan notes. After the subject had received her earnings, the dictator game was conducted in the same way as in the previous treatments.

We used the same experimenters in all treatments, i.e., female university students not from Renmin University of China. The cashiers who handed out the money were always the same male students (not from Renmin University of China). Each experimenter and cashier conducted the same number of experiments in each treatment. The supermarket where the experiments were conducted is the largest supermarket on the campus of Renmin University with around 1,000 customers per day. Treatments 3 and 4 were conducted first over a two-day period. Then the recruitments to Treatments 1 and 2 were made over a two-day period, and the lab experiments were conducted on the two days that followed.Footnote 12

3 Results

In total 211 subjects participated in the main experiments (Treatments 1–4). Table 2 reports the descriptive statistics of the donations for all treatments. The mean donation amount and the proportion of subjects who donated the whole endowment of 50 Yuan vary considerably across treatments.Footnote 13 In the standard dictator game (Treatment 1), the average donation is 37.1 Yuan, corresponding to 74 % of the endowment. In the other three treatments, the donations are much lower. The mean donations are higher in the laboratory experiment treatments (Treatments 1–2) than in the field experiment treatments (Treatments 3–4). This is to a large extent explained by a higher fraction of subjects donating everything in the laboratory experiments.

Table 2 Description of donation behavior for each treatment

The mean donations in our experiment are in general higher than in other similar experiments. For example, with a similar experimental design with a charity recipient, Eckel and Grossman (1996) find that subjects donate on average 30 % of the endowment, while in our experiment the average donation is 74 % of the endowment. On the other hand, the average proportion donated in Benz and Meier (2008) is also rather high, between 62 % and 67 % of the endowment. Clearly, the amount donated is very context specific, but one potentially important reason for the high donation rate in our experiment is that China had just experienced several large natural disasters resulting in a general increase in charitable giving. For example, the total amount of money from individual charitable giving increased in 2008 to 13 times the level of 2007 (Chinese Ministry of Civil Affairs 2007, 2008). On the other hand, in a follow-up experiment to investigate the effect of a show-up fee in the lab, reported below, we observed similar high donation rates.

Table 3 reports the results from statistical tests of the effects of windfall money in the lab and the field environments. We conduct a t-test to test for mean differences as well as a Wilcoxon rank-sum test of equality of distributions for amounts donated across treatments. Moreover, we test the hypothesis of equally sized zero-Yuan and 50-Yuan donation shares and perform t-tests and rank-sum tests for the amount donated conditional on giving a positive amount but less than 50 Yuan.
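As an illustration of the two tests named above, the following is a minimal Python sketch using only the standard library. It assumes Welch's unequal-variance form of the t statistic and a tie-corrected normal approximation for the rank-sum test; the text does not state which variants were actually used. The donation vectors are hypothetical and are not the experimental data.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for a difference in means (unequal variances)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(x, y):
    """Wilcoxon rank-sum z statistic (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    w = sum(midranks(pooled)[:n1])  # rank sum of the first sample
    expected = n1 * (n + 1) / 2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (w - expected) / math.sqrt(var)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical donations in Yuan (multiples of 5 up to 50); NOT the paper's data.
windfall = [50, 50, 50, 40, 30, 25, 20, 50, 10, 50]
earned = [0, 10, 20, 50, 5, 15, 25, 0, 30, 10]

t_stat = welch_t(windfall, earned)
z_stat = rank_sum_z(windfall, earned)
p_rank_sum = two_sided_p(z_stat)
```

The tie correction matters here because donations are made in 5-Yuan notes, so many subjects necessarily give identical amounts.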

Table 3 Test of difference between windfall and earned endowment

We can reject the null hypothesis of no effect of windfall gain both in the lab and in the field. In both cases, the mean donation is significantly lower when the subjects have to earn their endowment, and this is largely explained by the large difference in the share of subjects donating 50 Yuan. The proportion of subjects giving 0 Yuan or 50 Yuan is significantly different between the windfall and the earned endowment at the 5 % significance level for both the lab and the field experiments, except for the proportion of subjects giving 0 Yuan in the lab. However, there is no difference in the amount donated if the two extreme donations, nothing (0 Yuan) and everything (50 Yuan), are removed. This is true for both the lab and the field experiments. Consequently, in both the lab and the field, the major effect of introducing an earned endowment is that it increases the share of zero donations and decreases the share of full (50 Yuan) donations.
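The comparison of donation shares and the restriction to interior donations can be sketched as follows. This assumes a pooled two-sample z test for proportions, which is one common choice; the text does not specify the exact test used. The function names and the data are hypothetical.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-sample z statistic for a difference in proportions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


def interior(donations, endowment=50):
    """Keep only donations strictly between zero and the full endowment."""
    return [d for d in donations if 0 < d < endowment]


# Hypothetical donations in Yuan; NOT the paper's data.
windfall = [50, 50, 50, 40, 30, 25, 20, 50, 10, 50]
earned = [0, 10, 20, 50, 5, 15, 25, 0, 30, 10]

# Share donating the full 50 Yuan, windfall versus earned.
z_full = two_proportion_z(
    windfall.count(50), len(windfall), earned.count(50), len(earned)
)

# Donations with both extremes removed, for the conditional comparison.
interior_windfall = interior(windfall)
interior_earned = interior(earned)
```

Separating the extremes from the interior donations in this way shows whether a treatment effect operates through the share of subjects giving nothing or everything, or through the amounts given in between.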

The effect of the earned endowment is similar to that of previous studies using dictator games in a laboratory environment in the sense that the mean contributions decrease when the endowment is earned. However, since there are a number of differences in design and context, it is difficult to make direct comparisons. Cherry et al. (2002) find a stronger effect in terms of subjects offering zero: around 15–20 percent offered zero in the treatment with the windfall, whereas 70–79 percent offered zero in the treatments with the earned endowment. In Oxoby and Spraggon (2008), all subjects offered zero when their endowment was earned, but only between 11 and 35 percent did so when the endowment was a windfall gain.

Finally, Table 4 reports the statistical test results of the null hypothesis of no difference between the lab and the field, conditional on the endowment being obtained in the same manner. We can reject the hypothesis of equal donation amounts for both the windfall and the earned endowment treatments. However, the difference is much smaller when the endowment is earned. If the extreme donations are deleted, the difference in mean donations is reduced substantially. For the two treatments with earned endowments, the difference in mean donations is not significant using either a t-test or a rank-sum test. For the two treatments with windfall endowments, the difference is significant using a rank-sum test, but not significant using a t-test.

Table 4 Test of differences between the lab and field experiment contexts

The study of Benz and Meier (2008) is perhaps the study that comes closest to our experiment. They conduct a dictator game with two social funds as external recipients and compare the behavior in the experiment with actual charitable giving by the same subjects. Their study is not tailored to test whether people are more pro-social in the lab compared with the field. However, they do find some indications of stronger pro-social behavior in the lab, for example subjects who did not donate in their field experiment did donate a substantial amount in the lab experiments. As discussed in the introduction of our paper, an explanation for this difference could be that the endowment in the field experiment is earned. Our results suggest that this is an important explanation for the difference between the laboratory and the field results, but even when this is controlled for, a difference in pro-social behavior remains.

One potential explanation for the difference between the lab and the field could be that the lab experiment involved a show-up payment of 10 Yuan, while the field experiment did not for the obvious reason that subjects do not know that they are a part of an experiment.Footnote 14 The behavior in the lab could potentially depend on whether subjects viewed the show-up fee as a compensation for their time or as a windfall gain.Footnote 15 In particular they might be more generous in the dictator game if they also viewed the show-up fee as a windfall gain. In order to rule out that it is the show-up fee that drives the differences we conducted a follow-up experiment with two treatments. The first, denoted treatment 5, is exactly the same as treatment 1, a lab experiment with a windfall endowment and a show-up fee of 10 Yuan. The new treatment, treatment 6, is a lab experiment with a windfall endowment but no show-up fee. Recruitment was made at the same supermarket and the experiment was conducted in the same place by using exactly the same procedures. We conducted a new round of treatment 1, i.e., treatment 5, since it is conceivable that the donation is dependent on factors such as media coverage of charitable organizations and recent disaster events. The results of the two treatments are presented in Table 5.Footnote 16

Table 5 Description of donation behavior for treatments with and without show-up fee

It is clear from Table 5 that there is very limited influence from the show-up fee on donations. The average amount donated is actually larger in the treatment without a show-up fee, but, using a rank-sum test, the difference is not statistically significant (p-value = 0.456). The same pattern is observed when comparing the share of zero donations and the share donating everything. Moreover, there is no difference in the proportion of subjects that declined to take part in the experiment after we had given all of the information, including the information about the lack of a show-up fee in treatment 6. Thus, we do not have any direct evidence of a difference in subject pools depending on whether a show-up fee is paid or not. It is also worth noting that the average donations in the two treatments are similar to the ones obtained in treatment 1. To conclude, the observed difference between the lab and field cannot be attributed to the fact that the lab experiment involved a show-up fee while the field experiment did not.

4 Conclusions

The present paper investigates how behavior is affected by windfall endowments as well as by laboratory and field environments. This includes features ranging from the recruitment process and the experimenters used to the place where subjects make their decision (in our case the booth). We also vary, in both the lab and the field, how the endowment was obtained. First, we find a substantial and significant difference in behavior between using windfall and earned endowments both in the lab and in the field. The absolute and relative differences are larger in the lab environment, but this can partly be due to the overall higher contribution levels in the lab. Consequently, the strong effects of windfall money found in previous lab experiment studies are not only an artifact of the lab environment per se: even outside the lab, subjects consider how the endowment is obtained, and are much less pro-social when the endowment is earned. Second, there are sizeable and significant differences in behavior between the lab and the field, particularly with the windfall endowment. The overall differences are smaller but still significant when an earned endowment is used. It should be noted that in the more detailed analyses of the data for the earned endowment treatments, on the proportion giving as well as on the amount conditional on giving, there were no significant differences between the lab and the field at the 10 % significance level. The present study is the first attempt to investigate the issue of windfall gains in different experimental environments while keeping all other things constant, including the subject pool, with the exception of the basic characteristics of lab and field environments. The field experiment was designed to mimic a lab experiment in as many ways as possible, which means that it had to be designed in a very specific way. It is possible that comparisons among the treatments depend on a number of characteristics of our experiment. For example, in the earned endowment treatments all subjects received the same amount of compensation. Future studies are needed to discover how sensitive our results are to various design characteristics and contexts by using, for example, different donation recipients, different ways in which the endowment is earned, and different environments in which the decision is made, and, at a more general level, how different games, e.g., public good games, are affected by these design features. In addition, we do not have any direct evidence of a difference in donations depending on whether a show-up fee is paid or not in the lab.

Our experiment and results should not be interpreted as an argument against conducting laboratory experiments. For example, we find similar effects from the windfall endowment in the laboratory and the field. Moreover, it is clear that there are many advantages to conducting laboratory experiments (Falk and Heckman 2009). Our results show the importance of using earned money to reduce the gap between the laboratory and the field, but whether this calibration results in different policy conclusions when, for example, comparing different institutions or whether it is a pure level shift is an important question for future research. Although the experimental design was intended to control for all effects other than the environment, we still find differences. This points to the importance of discussing the environment when interpreting both laboratory and field experimental results, as well as of conducting replication studies, especially for field experiments. Behavior in the field is also likely to depend on the context and the environment. For example, it is likely that subjects would have been more generous in the field experiment if we had conducted the experiment at a meeting of the Communist Party in China (or at a church in a Christian country).