Title: Accusations of Unfairness Bias Subsequent Decisions: A Study of Major

League Umpires

Authors: Travis J. Carter1*, Erik G. Helzer2.

Affiliations:

1Department of Psychology, Colby College.

2The Johns Hopkins Carey Business School.

*Correspondence to: travis.carter@colby.edu

Abstract: What happens when decision-makers are accused of bias by an aggrieved party? We

examined the ball-and-strike calls of Major League Baseball umpires before and after arguments

from players or managers resulting in ejection. Prior to ejection, the accusing team was, in fact,

disadvantaged by the home plate umpire’s calls. After the ejection, umpires did not revert to

neutrality—they exhibited the opposite bias, advantaging the accusing team. This pattern was

only evident when the ejection was related to pitch location, not other kinds of ejections. Using a

laboratory analogue of the umpires’ situation, we replicated this post-accusation tendency with

experimental participants. This study further revealed that decision-makers were unaware of the

shifts in their behavior in response to the accusations, and another survey indicated that this

tendency violates beliefs about fairness. These results suggest that performance following

accusations may unwittingly succumb to this insidious tendency to favor the accusing party.

One Sentence Summary: After being (rightly) accused of biased behavior toward one team,

MLB Umpires responded by committing the opposite bias, now giving more favorable calls to

the accuser’s team.

MANUSCRIPT UNDER REVIEW—PLEASE DO NOT CITE WITHOUT PERMISSION

Main Text:

Among the many responsibilities leaders bear is a commitment to fairness. Perceptions of

fairness are important for many organizational and interpersonal outcomes (1–3), and leaders, as

decision-makers, find themselves in the unique position to uphold fairness standards. Even the

best-intentioned leaders, however, will occasionally have their decisions questioned on the

grounds of fairness. In this paper, we ask how accusations of bias (one type of fairness violation)

affect the subsequent judgments of decision-makers.

Herein, we examine the perceived fairness of repeated judgments. In such cases, a

decision-maker is expected to base her decisions on an evaluation of the evidence, applying

some pre-ordained standard consistently to each case. Conducted correctly, this procedure should

result in fair outcomes on average. Common examples include judges’ rulings for courtroom

objections, managers’ application of company standards to different job candidates, and (most

relevant to the present studies) baseball umpires’ application of a common strike zone to batters

on both teams.

Systematic bias in serial decisions may be particularly insidious: over time, small

procedural biases can compound into large absolute differences in the distribution of outcomes

(4). Although much research has examined the factors that influence judgments of fairness, as

well as recipients’ reactions to decisions that are judged as fair or unfair (5), little is known about

the effects of accusations of unfairness on decision-makers’ subsequent decisions.

In the present analysis, we view such accusations as a form of performance feedback: the

decision-maker is made aware of a perceived pattern of uneven decisions, presumably indicative

of a flawed (or biased) process. Specifically, we examined fairness-related feedback delivered to

an evaluator by a self-interested party—someone directly (and negatively) affected by those

decisions. For instance, an employee might complain that his performance reviews are unfairly

lower than others’, implying that the manager is exhibiting a bias against him.

In such cases, decision-makers are keenly aware of others’ biases (6), and may be

particularly apt to dismiss an accusation of bias as being self-interested, consequently making no

attempts to investigate or alter subsequent decisions. This reaction may not be warranted,

however, given that even self-interested assessments are constrained by reality (7). Even if the

accusation is considered seriously, the decision-maker may search for evidence of bias and find

none—not because it does not exist, but because the process by which the judgment is formed is

impervious to introspection (8). All of this suggests that even if a pattern of unfairness exists,

decision-makers will remain unaware of the presence of bias.

We considered three major possibilities for how decision-makers would respond to an

accusation from an interested party. First, it could have no systematic effect on subsequent

decisions, either because of a successful attempt to ignore the feedback or because the decision-

making process is immune to conscious intervention (9). Second, it could exacerbate existing

biases, strengthening the existing response tendency, either due to motivational processes, such

as reactance (10), or inherent cognitive biases, such as escalation of commitment (11). Third, it

could lead to overcorrection, producing a new bias in the opposite direction, either resulting from

overzealous conscious attempts to correct for past errors or from a non-conscious correction

mechanism. In testing these possibilities, we sought to understand the role that conscious

processes, as well as explicit beliefs about feedback accuracy, play in shaping decision-makers’

responses.

We began by examining a near-ideal decision context in Study 1: the ball-and-strike

judgments of Major League Baseball (MLB) umpires. Hundreds of times in each game, the home

plate umpire declares a pitch a ball or a strike based on their perception of whether it was inside

or outside the strike zone, a decision that must be made immediately, and that carries opposite

consequences for the two teams involved. Umpires are regularly accused of bias in their

decisions, but we focused on the clearest and most discernible cases: when a player or manager

is ejected from a game as a result of arguing a call with the umpire. Ultimately we sought to

compare the relative favorability of the umpire’s calls toward the ejected and the non-ejected

team both before and after the ejection. We further expected that accusations of unfairness would

exert the most direct influence on subsequent judgments in the same domain. In this case, only

arguments related to pitch location (i.e. ball-and-strike calls) should lead to shifts in umpires’

subsequent ball-and-strike calls; ejections prompted by other circumstances (e.g. a close play at

third base) should lead to no such shifts, thus providing an important point of comparison and a

crucial benchmark for our predictions.

In order to measure the relative favorability of umpires’ ball-and-strike judgments, we

employed a data-driven approach that allows for both absolute and relative comparisons, making

use of the PITCHf/x pitch location data from 2008-2013. Specifically, we divided the x-z

coordinate plane into bins, then calculated a deviance score for each called pitch by comparing

the actual ball or strike call made by the umpire to the long-run probability of pitches in that

same location being called a ball or a strike. This approach not only allows for the aggregation

and comparison of pitches regardless of their location, it also minimizes the impact of shifts in

players’ behavior as a result of an ejection; as long as the umpire is the arbiter of judgment for a

given pitch, the prior probability should serve as a neutral baseline against which to compare any

individual judgment. The details of the calculation can be found in the Supplemental Materials,

but put simply, positive deviance scores reflect calls favorable to the batting team, and negative

scores reflect calls unfavorable to the batting team.

Examining deviance scores for pitches thrown during the pre- and post-ejection periods

of games featuring a single ejection using linear mixed-effects models, it was clear that for

ejections unrelated to pitch location (396 games, n = 42,414 pitches), the ejection had no impact

on the relative favorability of the umpire’s calls, b = 0.000, 95% CI [–0.012, 0.013], t(402.79) =

0.06, p = .954 (Fig. 1, bottom panel).

Pitch-related ejections (311 games, n = 34,563 pitches), however, did lead to different

patterns of favorability before vs. after the ejection, b = 0.042, 95% CI [0.026, 0.057], t(314.61)

= 5.37, p < .001 (Fig. 1, top panel). The accusation of bias was apparently made with good

reason: prior to the ejection, the ejected team received less favorable calls than the non-ejected

team, t(442.10) = 3.99, phb < .001.1 In the post-ejection period, however, umpires reversed this

bias; the ejected team received more favorable calls than the non-ejected team, t(744.20) = –3.52,

phb < .001. Further examinations indicated that this reversal remained for the rest of the game,

rather than fading shortly after the ejection (see Supplemental Materials). Thus, the data are

consistent with the third possibility outlined above: in response to an accusation of unfairness,

umpires overcorrected, introducing the opposite bias—but only when the argument was relevant

to the domain of judgment.

1 The subscript hb (i.e. phb) indicates a p value that was adjusted using the Holm-Bonferroni method (12)

in order to account for multiple comparisons.
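The Holm-Bonferroni adjustment mentioned in this note can be sketched in a few lines. This is a generic illustration of the standard step-down procedure, not the authors' own code; the function name `holm_bonferroni` and the raw p values are hypothetical and chosen only for the example.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the raw p values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value (rank = k - 1) is multiplied by (m - k + 1),
        # then capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Illustrative raw p values (assumed for the example, not taken from the paper).
raw = [0.01, 0.04, 0.03]
print(holm_bonferroni(raw))
```

The smallest raw p value is multiplied by the number of comparisons, the next by one fewer, and so on, which makes the method less conservative than a plain Bonferroni correction while still controlling the family-wise error rate.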

Fig. 1. Deviance scores by batting team, period of game, and type of ejection (Study 1). Top

panel: Ejections resulting from arguments related to pitch location (311 games, n = 34,563

pitches). Bottom panel: Ejections unrelated to pitch location (396 games, n = 42,414 pitches).

Error bars represent +/– 1 standard error.

[Figure 1 plots: deviance score (y-axis) by pre- vs. post-ejection period (x-axis), with separate points for the Ejected Team and the Non-Ejected Team; panel titles "Ejection Type: Pitch-related Ejections" and "Ejection Type: Ejections Unrelated to Pitch Location".]

Given that umpires’ primary goal is accurate judgments, it seems likely that the bias

would be most pronounced when there is some ambiguity about whether a given pitch should be

called a ball or strike—pitches near the edge of the strike zone. To test this hypothesis, we

created a proxy variable for ambiguity based on each pitch’s prior probability of being called a

ball or strike. Indeed, the bias exhibited by umpires was moderated by the ambiguity of the pitch

location, b = 0.134, 95% CI [0.086, 0.181], t(34429.69) = 5.49, p < .001 (see Table S5). The bias

against the ejected team pre-ejection (Fig. 2, top panel), and in favor of the ejected team post-

ejection (Fig. 2, bottom panel), was strongest for the most ambiguous pitches.

Fig. 2. Deviance scores by batting team, ambiguity of pitch location, and period of game (Study

1). Top panel: Pre-ejection period. Bottom panel: Post-ejection period. Depicted scores and 95%

confidence intervals (shaded regions) derived from model predictions.

[Figure 2 plots: deviance score (y-axis) by ambiguity of pitch location (x-axis), with separate lines for the Ejected Team and the Non-Ejected Team; one panel per game period.]

Given that umpires have the most control over the ball and strike calls, we consider the

analysis of deviance scores to be the primary test of our hypothesis. It is, of course, interesting to

consider whether the observed changes in umpires’ post-ejection judgments had downstream

consequences, such as changing the likelihood of batters getting on base (as measured by on-

base percentage; OBP) and scoring runs. However, because these outcomes are less directly

under the influence of the home plate umpire, we would expect to observe smaller effects

compared to deviance scores.

Using the at-bat as the unit of analysis, we tested for the effects of batting team and

period of game on the likelihood of getting on base (OBP; Fig. S4) and runs scored per at-bat

(R/AB; Fig. S5) using linear mixed-effects models. In both cases, the ejected team experienced

poorer outcomes than the non-ejected team in the pre-ejection period (OBP: z = 5.44, phb < .001;

R/AB: z = 6.31, phb < .001), just like deviance scores. However, unlike deviance scores, which

showed a reversal in fortunes in the post-ejection period, the ejected team’s disadvantage relative

to the non-ejected team was merely eliminated for OBP (z = 0.21, phb = .831), and merely

attenuated for R/AB (z = 2.31, phb = .021). Baseball games are certainly dynamic systems in that

players and managers act and react as circumstances change, but the lack of a reversal in the

post-ejection period for these two measures cannot be written off as simple regression to the

mean. Indeed, a mediation analysis indicated that the improvements in the ejected teams’ OBP

(indirect effect: 0.039, 95% CI [0.022, 0.058]) and R/AB (indirect effect: 0.016, 95% CI [0.008,

0.027]) following pitch-related ejections were at least partially due to changes in the umpires’

balls-and-strikes calls after the ejection, even if those improvements were insufficient to overtake

the non-ejected team.

In order to examine the effects of fairness accusations in a setting that allows for causal

inferences, in Study 2 we created a laboratory task mimicking the situation that umpires face,

with 100 participants randomly assigned to receive accusatory feedback (or not). The task

involved viewing a series of images and judging whether the number of dots on each image was

higher or lower than a target number (see Fig. S6). Participants were all told they had been

assigned to the role of Judge, and would actually perform the dot-estimation task. Ostensibly,

they had been partnered with another participant (the Observer) whose job was to observe the

Judge’s performance and provide feedback after each block of trials. Both the Judge and the

Observer were paid a bonus based on the Judge’s performance, but with misaligned incentives.

That is, the Judge (participant) was paid based on the number of accurate responses she gave,

whereas the Observer (partner) was paid based on the number of directional responses (i.e. the

number of “higher” vs. “lower” responses, counterbalanced) given by the Judge, regardless of

accuracy. Thus, any feedback by the Observer suggesting more directional responses could be

seen as purely self-serving—to the detriment of the participant’s own interests.

After a period of relatively neutral “feedback” from the Observer, participants in the

Critical Feedback condition began receiving feedback accusing them of giving too few

directional responses—easily interpreted as an attempt to garner a more favorable outcome for

themselves. Participants in the Control condition received neutral feedback throughout the

experiment.

Mimicking the umpires’ response to an accusation of bias, the feedback manipulation

impacted the proportion of participants’ directional responses, F(1,91) = 4.09, p = .046, $\eta_p^2$ =

.043 (see Fig. 3, top panel). Participants in the critical feedback condition shifted their judgments

to be more favorable to the observer after the critical feedback began, t(91) = –2.26, phb = .052,

whereas participants in the control condition showed no systematic change, t(91) = 0.66, phb =

.513. What’s more, examining participants’ explicit estimates after each block of trials showed

no evidence that they were aware of the shifts in their directional responses (all Fs < 1.05; see

Fig. 3, bottom panel) or in accuracy (see Supplemental Materials). Those findings, coupled with

participants’ strong belief that the Observer’s judgment was both biased (p < .001) and inferior to

their own (p < .001), strongly suggest that the shifts were not intentional (see Supplemental

Materials).
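As a quick check on the effect size reported with the F test above (.043), the value matches partial eta squared recovered from the F statistic and its degrees of freedom. The sketch below uses the standard identity for that conversion; the function name is hypothetical, and reading the reported statistic as partial eta squared is an assumption based on that match.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported in the text: F(1, 91) = 4.09.
print(round(partial_eta_squared(4.09, 1, 91), 3))  # 0.043
```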

Fig. 3. Actual (top panel) and Estimated (bottom panel) proportion of directional responses by

period of the study and feedback condition (N = 93; see Materials and Methods). Error bars

represent +/– 1 standard error.

[Figure 3 plots: actual (top panel) and estimated (bottom panel) proportion of directional responses (y-axis) by pre- vs. post-critical-feedback period (x-axis), for the Control and Critical Feedback conditions.]

The tendency to respond to accusations of unfairness by offsetting one bias with another

poses interesting questions about the nature of fairness. To see how the pattern of behavior we

observed comports with folk intuitions about fairness, we presented another group of participants

with a scenario where there was a clear pattern of bias in a decision-maker’s past decisions, and

asked them what would constitute a fair outcome going forward. The vast majority (73%)

indicated that the fairest response would be to eliminate bias toward both parties, and only 19%

indicated that the pattern we observed in both studies (instituting a new bias in favor of the

aggrieved party) would be fair. Thus, there may be important implications for this work with

regard to perceptions of procedural fairness following an accusation of bias.

To be sure, the precise mechanisms underlying changes in cognitive and behavioral

responses following accusations of bias are not easily identified by the present studies. As such,

it is unclear how decision-makers might avoid the negative consequences of feedback. Our

experimental participants appeared to have minimal access to the effects feedback had on their

judgments, suggesting that these biases may be particularly insidious and resistant to conscious

control (8). Objective performance feedback following accusations of bias (such as a computer-

generated report on umpires’ accuracy calling balls and strikes), followed by recalibration and

practice, may help reduce these biases in the long run (13); however, it is unclear how feasible

such interventions would be in the context of real-world serial decisions.

All in all, the present studies provide some insight into an overlooked aspect of the

psychology of fairness. Despite their best intentions, decision-makers charged with upholding

fairness may from time to time slip in their duties. These studies suggest that the most obvious

resolution to this problem—making decision-makers aware of such slips through informal

channels—may promote new patterns of unfairness instead of eliminating the underlying

problem. This may be very welcome news to the aggrieved party, though certainly not to the

decision-maker who must hear a new round of complaints.

References and Notes:

1. Y. Cohen-Charash, P. E. Spector, The role of justice in organizations: A meta-analysis.

Organ. Behav. Hum. Decis. Process. 86, 278–321 (2001).

2. B. A. Mellers, J. Baron, Eds., Psychological perspectives on justice: Theory and

applications (Cambridge University Press, Cambridge, 1993;

http://ebooks.cambridge.org/ref/id/CBO9780511552069).

3. D. T. Miller, Disrespect and the experience of injustice. Annu. Rev. Psychol. 52, 527–553

(2001).

4. L. Babcock, G. Loewenstein, Explaining bargaining impasse: The role of self-serving

biases. J. Econ. Perspect. 11, 109–126 (1997).

5. T. R. Tyler, What is procedural justice?: Criteria used by citizens to assess the fairness of

legal procedures. Law Soc. Rev. 22, 103 (1988).

6. E. Pronin, D. Y. Lin, L. Ross, The bias blind spot: Perceptions of bias in self versus others.

Pers. Soc. Psychol. Bull. 28, 369–381 (2002).

7. Z. Kunda, The case for motivated reasoning. Psychol. Bull. 108, 480–498 (1990).

8. T. D. Wilson, N. Brekke, Mental contamination and mental correction: Unwanted

influences on judgments and evaluations. Psychol. Bull. 116, 117–142 (1994).

9. R. E. Nisbett, T. D. Wilson, Telling more than we can know: Verbal reports on mental

processes. Psychol. Rev. 84, 231–259 (1977).

10. J. Brehm, A theory of psychological reactance. (Academic Press, Oxford, England, 1966).

11. B. M. Staw, Knee-deep in the big muddy: A study of escalating commitment to a chosen

course of action. Organ. Behav. Hum. Perform. 16, 27–44 (1976).

12. S. Holm, A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70

(1979).

13. Y.-W. Chien, D. T. Wegener, R. E. Petty, C.-C. Hsiao, The flexible correction model: Bias

correction guided by naïve theories of bias: theory-based bias correction. Soc. Personal.

Psychol. Compass. 8, 275–286 (2014).

14. Major League Baseball (Organization), The official rules of Major League Baseball.

(Triumph Books, Chicago, 2014).

15. B. M. Mills, Technological innovations in monitoring and evaluation: Evidence of

performance impacts among Major League Baseball umpires (2015), (available at

http://www.brianmmills.com/uploads/2/3/9/3/23936510/full_revised_manuscript.pdf).

16. R Development Core Team, R: A language and environment for statistical computing

(2016; https://www.r-project.org/index.html).

17. D. Bates, M. Mächler, B. Bolker, S. Walker, Fitting linear mixed-effects models using

lme4. J. Stat. Softw. 67, 1–48 (2015).

18. A. Kuznetsova, P. B. Brockhoff, R. H. B. Christensen, lmerTest: Tests in linear mixed

effects models (2016; https://CRAN.R-project.org/package=lmerTest).

19. D. J. Barr, R. Levy, C. Scheepers, H. J. Tily, Random effects structure for confirmatory

hypothesis testing: Keep it maximal. J. Mem. Lang. 68, 255–278 (2013).

20. T. J. Moskowitz, L. J. Wertheim, Scorecasting: The hidden influences behind how sports

are played and games are won (Three Rivers Press, New York, first paperback edition,

2011).

21. R. P. Larrick, T. A. Timmerman, A. M. Carton, J. Abrevaya, Temper, temperature, and

temptation: Heat-related retaliation in baseball. Psychol. Sci. 22, 423–428 (2011).

Acknowledgments:

The authors declare no conflicts of interest. The data used in all studies will be made

available via the Open Science Framework (OSF). The complete PITCHf/x data are

available from the MLB website (http://mlb.mlb.com/gdcross/components/game/mlb/).

The authors wish to thank Devin Pope for his advice on the calculation of the deviance

measure.

Supplementary Materials for

Accusations of Unfairness Bias Subsequent Decisions: A Study of Major League

Umpires

Travis J. Carter, Erik G. Helzer

correspondence to: travis.carter@colby.edu

Materials and Methods (Study 1)

Data Sources and Experimental Design

Data pertaining to pitch location and game conditions for every regular season Major

League Baseball game from 2008-2013 were drawn from several sources. The Major League

Baseball website makes available data on the location (x and z coordinates) of each pitch

(PITCHf/x), game event information (runners on base, steals, and ejections at each point in each

game), and game personnel (batters, pitchers, home plate umpires). We obtained data on Win

Expectancy (WE; the calculated probability of a team winning the game based upon the score,

inning, number of outs, runners on base, and game environment) and Leverage Index (LI; an

index of the amount of pressure in a game based upon its current conditions, such as score and

inning) from Fangraphs.com. Attendance and temperature information for each game was

obtained from Retrosheet.com. Many of the analyses that control for these secondary game-level

variables can be found in the supplemental materials.

Critically, a list of ejections was populated from these data, including all relevant

information about who was ejected by whom and at what point in the game the ejection

occurred. This list was then compared against the data and descriptions found on the Umpire

Ejection Fantasy League (UEFL) website (http://portal.closecallsports.com), which catalogues

ejections, to create a list of 1,126 ejections occurring across regular season games between 2008

and 2013, ranging from 164 (2009) to 207 (2008) per year.

Ultimately, this created a 2 (Ejection Type: Pitch-related vs. Other) × 2 (Batting Team:

Ejected vs. Non-ejected) × 2 (Period of Game: pre- vs. post-ejection) design, with the latter two factors

varying within each game, and the first factor varying between games. The primary dependent

variable is the pitch deviance measure (described in detail below), which occurred at the level of

the individual pitch. We also examined two major offensive outcomes: on-base percentage

(OBP) and runs scored per at-bat (R/AB), which occurred at the level of the at-bat.

Calculation of Pitch Deviance Measure:

The rulebook strike zone is defined as a box with horizontal limits (on the x-axis) that range

from one edge of home plate to the other, with vertical (on the z-axis) limits defined based on the

batter: the upper limit is defined as “a horizontal line at the midpoint between the top of the

shoulders and the top of the uniform pants,” and the lower limit is defined as “a line at the

hollow beneath the kneecap” (14). In practice, however, umpires call pitches according to a

strike zone that differs from the rulebook strike zone in systematic ways, ignoring the corners

and making adjustments for right- and left-handed batters, for instance (15). Thus, using the

rulebook strike zone is clearly inadequate as a neutral and valid reference point to assess

umpires’ judgment. Although it might seem straightforward to simply use the “practical strike

zone” instead, there is simply too much ambiguity in the data to identify hard boundaries

between strike and ball. Additionally, it is important to account for the possibility that, if umpires

are altering their practical strike zone based on arguments leading to ejections, then batters and

pitchers might similarly be altering their behavior to adapt to that new strike zone. For instance,

if a batter argues after being called out on a third strike that was well above the strike zone, a

pitcher may attempt to throw more pitches in the same location expecting the same result, and

batters may feel it necessary to swing at those pitches to avoid a similar called-strikeout.

Although it may be impossible to completely account for these changes in players’ behavior, we

employed a data-driven approach that should minimize its influence while simultaneously

defining the practical strike zone in a much more fluid way: calculating the probability that a

pitch in a given location would be called a ball or a strike based on what call umpires typically

make for a pitch in that specific location. Thus, regardless of any shifts in behavior on the part of

batters or pitchers as a result of an ejection or perceptions of an umpire’s shifting strike zone, as

long as the umpire is the arbiter of judgment for a given pitch, the prior probability should serve

as a neutral baseline against which to compare any individual judgment.

In order to group pitches based on their location, the x-z coordinate plane was divided into a

grid of bins. To calculate the bin size, we used the size of a baseball as a guide. The size of an

official MLB baseball is defined as measuring “not less than nine nor more than 9 1/4 inches in

circumference” (14). With the circumference defined by the rulebook as a range, we used the

midpoint of that range (9.00-9.25 in., 22.86-23.50 cm) as the value for the circumference, 9.125

in. (23.178 cm). Using basic geometry to find the diameter from the circumference, 9.125/π =

2.905 in. (7.379 cm), we calculated bins based on one-quarter (0.726 in., 1.844 cm) the diameter

of a baseball. This size was intended to reflect a balance between the precision of the location

and the precision of the probability. Smaller bins provide more meaningful discrimination based

on location, but larger bins provide more confident estimates of the prior probability by ensuring

a sufficient sample size of pitches located within the bin. It is worth noting, however, that the

analyses were robust to different bin sizes. The results were the same using either larger (one-half

baseball diameter; 1.452 in., 3.689 cm) or smaller (one-eighth baseball diameter; 0.363 in., 0.922

cm) bins.
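The bin-size arithmetic above can be reproduced directly. The following is a minimal sketch that uses only the figures stated in the text (the rulebook circumference range and the three fractions of a baseball diameter); the constant and function names are hypothetical.

```python
import math

CIRC_MIN_IN, CIRC_MAX_IN = 9.00, 9.25  # rulebook circumference range, inches
CM_PER_IN = 2.54

# Midpoint of the rulebook range, then diameter = circumference / pi.
circumference_in = (CIRC_MIN_IN + CIRC_MAX_IN) / 2
diameter_in = circumference_in / math.pi


def bin_size_in(fraction):
    """Bin edge length in inches, as a fraction of one baseball diameter."""
    return diameter_in * fraction


for label, fraction in [("one-half", 1 / 2), ("one-quarter", 1 / 4), ("one-eighth", 1 / 8)]:
    size = bin_size_in(fraction)
    print(f"{label}: {size:.3f} in., {size * CM_PER_IN:.3f} cm")
```

The one-quarter size is the one used in the main analyses; the other two correspond to the robustness checks.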

Translating the PITCHf/x data in its raw form onto a grid of bins of a fixed size required

further calculations. To understand how we accomplished this, it’s helpful to know a bit more

about how the raw data reflect the rulebook strike zone (defined above). Having a fixed

definition for the horizontal dimension (the width of home plate) allows for all bins to have an

equal width. The PITCHf/x data scale the x-axis relative to the strike zone’s horizontal

boundaries (-1 and +1 correspond to the left and right edges of the strike zone). In order to

determine the horizontal boundaries of individual bins based on an absolute size (fractional

diameter of a baseball, as described above), we translated this into an absolute width. Because a

pitch is still considered a strike if only part of the baseball crosses the plate, we calculated this as

the width of home plate (17.00 in., 43.18 cm) plus the full width of one baseball (2.905 in., 7.379

cm)—allowing for pitches where the middle of the ball touches any part of the plate to be within

the horizontal boundaries of the strike zone—for a total of 19.905 in. (50.559 cm). Thus,

knowing that 2.0 units on the relative scale (the range of –1 to +1) corresponds to 19.905 in.

(50.559 cm) on the absolute scale, allows for an easy translation of the absolute bin widths to the

relative scale.
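The translation from absolute bin widths to the relative x-axis scale amounts to a single proportion. The sketch below assumes only the figures given above (plate width, baseball diameter, and a relative span of 2.0 units); the names are hypothetical.

```python
PLATE_WIDTH_IN = 17.00
BALL_DIAMETER_IN = 2.905  # diameter derived from the 9.125 in. circumference
ZONE_WIDTH_IN = PLATE_WIDTH_IN + BALL_DIAMETER_IN  # 19.905 in.
RELATIVE_SPAN = 2.0  # the relative scale runs from -1 to +1


def inches_to_relative(width_in, zone_width_in=ZONE_WIDTH_IN):
    """Convert an absolute width in inches to units on the relative x-axis."""
    return width_in * RELATIVE_SPAN / zone_width_in


# One-quarter-diameter bins (0.726 in.) expressed on the relative scale.
bin_width_rel = inches_to_relative(0.726)
print(round(bin_width_rel, 4))
```

Under these figures each bin spans roughly 0.073 relative units, so a little over 27 bins fit across the horizontal extent of the strike zone.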

This same basic approach was applied to the z-axis, but because the vertical dimension of

the strike zone is defined based on each player’s height and stance, which can even vary from

pitch to pitch, it was problematic both practically and theoretically to keep a constant bin height.

The most reasonable solution, in our minds, was to ensure that the x-z coordinate plane would be

divided into the same number of bins for each batter, even if that meant the absolute bin height

would vary from pitch to pitch. Fortunately, the PITCHf/x system identifies and reports the

absolute height of the strike zone on each pitch, which can be used to normalize the z-axis to be

on the same relative scale as the x-axis, with the vertical boundaries of the strike zone defined as

–1 and +1. We used the average absolute height of the strike zone across the entire data set

(21.761 in., 55.272 cm) to translate the intended absolute bin heights to the relative scale.
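One way to picture the vertical normalization is as a linear rescaling of each pitch's height so that the batter's own strike-zone limits land on -1 and +1. The sketch below is an illustration under that assumption; the parameter names `zone_bottom` and `zone_top` are hypothetical stand-ins for the per-pitch limits reported by the tracking system, and the sample heights are invented.

```python
AVG_ZONE_HEIGHT_IN = 21.761  # average strike-zone height reported in the text


def normalize_z(z, zone_bottom, zone_top):
    """Rescale a pitch height so the zone's bottom maps to -1 and its top to +1."""
    half_height = (zone_top - zone_bottom) / 2
    midpoint = (zone_top + zone_bottom) / 2
    return (z - midpoint) / half_height


def bin_height_relative(bin_height_in, avg_zone_height_in=AVG_ZONE_HEIGHT_IN):
    """Convert an intended absolute bin height to the relative z-axis scale."""
    return bin_height_in * 2.0 / avg_zone_height_in


# Invented example: a zone running from 1.5 to 3.5 (same units as z).
print(normalize_z(2.5, 1.5, 3.5))            # middle of the zone
print(round(bin_height_relative(0.726), 4))  # one-quarter-diameter bin
```

Because the rescaling is applied per pitch, every batter ends up with the same number of bins between the zone limits, even though the absolute bin height differs with the batter's stance.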

Having defined the bin boundaries, we identified which bin each pitch in the entire data set

of 2,212,150 called pitches (2008-2013) fell into. Next, we calculated the percentage of pitches

located in each bin that were called balls, effectively identifying the probability that a pitch in

that location would be called a ball. Each bin’s probability ranged from 0 (100% strikes) to 1

(100% balls; see Fig. S1, which depicts a heat-map of these prior probabilities). Note that the

probabilities were calculated separately for left and right-handed batters, given known

differences in the practical strike zones based on handedness (15). Thus, for any given pitch, the

actual call made by the umpire can be compared to the long-run probability of pitches in that

same location being called a ball or a strike. This allows us to calculate the degree to which a call

deviated from that probability by subtracting the bin’s probability of a called ball from the

actual call (called ball = 1; called strike = 0). For instance, a called ball located in a bin where

80% of the called pitches were called balls (bin-probability = .80) would have a deviance value

of 0.20 (1.0 – 0.80 = 0.20). A called strike located in that same bin would have a deviance value

of –0.80. Thus, the range of possible deviance scores for a pitch goes from –0.999 (an extremely

unlikely called strike) to +0.999 (an extremely unlikely called ball). Put more simply, positive

deviance values reflect calls that were favorable to the batting team, and negative deviance

values reflect calls that were unfavorable to the batting team, relative to what would be expected.
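The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analyses; the record layout `(bin_id, batter_hand, is_ball)` and the function names are assumptions made for the example.

```python
from collections import defaultdict


def bin_ball_probabilities(pitches):
    """Return {(bin_id, batter_hand): share of called pitches that were balls}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [balls, total]
    for bin_id, hand, is_ball in pitches:
        tally = counts[(bin_id, hand)]
        tally[0] += is_ball
        tally[1] += 1
    return {key: balls / total for key, (balls, total) in counts.items()}


def deviance(is_ball, prior_ball_probability):
    """Actual call (ball = 1, strike = 0) minus the prior ball probability."""
    return is_ball - prior_ball_probability


# Toy data: 4 of the 5 called pitches in this bin were balls, so P(ball) = 0.80.
toy = [("b1", "R", 1)] * 4 + [("b1", "R", 0)]
priors = bin_ball_probabilities(toy)
p = priors[("b1", "R")]
ball_deviance = deviance(1, p)    # about +0.20, as in the worked example
strike_deviance = deviance(0, p)  # about -0.80
```

Keying the priors on batter handedness as well as bin mirrors the note that probabilities were calculated separately for left- and right-handed batters.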

Across the entire sample of pitches, the deviance scores form a distribution that is, by

definition, centered on zero, meaning that zero is the expected deviance value for any given pitch

or collection of pitches, and provides an absolute baseline to test for evidence of bias. That is, if

the average deviance score for a given collection of pitches is significantly different from zero,

that would be evidence of systematic bias in favor of one team, with the magnitude and valence

of the average indicating the amount and direction of bias, respectively. For instance, if the

average deviance for pitches thrown to the visiting team over the course of a single game was

+.03, that would roughly translate to ball-and-strike calls being an average of 3% more favorable

than expected.

Having an absolute comparison point is helpful, but relative comparisons are the most

relevant for the present analyses—particularly if a given umpire has a slightly more expansive

(or constrictive) strike zone than the league average. For instance, imagine a situation where the

visiting and home teams had average deviance scores of +.03 and +.06, respectively, for a single

game. The fact that both scores are positive would indicate that the umpire generally employed a

slightly smaller strike zone (more called balls) than the league average, but the home team’s

relatively larger score indicates that it received more favorable calls than the visitors, on average.

Thus, regardless of how a given umpire’s strike zone compares to the league as a whole,

comparing the deviation scores of opposing teams can reveal which team, if any, was the

recipient of undue generosity. (We also deal with umpire- and game-level variation statistically.)
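The relative comparison can be illustrated with a small Python sketch. The toy deviance values are invented so that the team averages match the +.03 and +.06 example above; the function name is hypothetical.

```python
from statistics import mean


def mean_deviance_by_team(pitches):
    """Average deviance per batting team from (batting_team, deviance) pairs."""
    by_team = {}
    for team, dev in pitches:
        by_team.setdefault(team, []).append(dev)
    return {team: mean(values) for team, values in by_team.items()}


# Toy values chosen so the averages match the example in the text.
toy = [("visitor", 0.23), ("visitor", -0.17), ("home", 0.26), ("home", -0.14)]
averages = mean_deviance_by_team(toy)

# Positive means the home team received the more favorable calls.
relative_advantage = averages["home"] - averages["visitor"]
```

Both averages are positive (a tighter zone than the league average), yet the difference of about +.03 still isolates which team was favored within the game.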

Thus, examining these deviance values in the aggregate allows us to detect systematic shifts

in the umpires’ calls before and after an ejection, and whether they favor one team over another.

It is worth noting that, because umpires are generally quite accurate in their calls—as depicted by

the relatively small band of ambiguity in Fig. S1—the vast majority of pitches show a fairly

small amount of deviation from their prior probability. Thus, examining aggregated outcomes

represents a very conservative test of a hypothesis of shifting favorability.

Calculation of Bin Ambiguity

For each pitch, the prior probability (P) of being called a ball ranges from 0 (all strikes) to 1

(all balls), with 0.5 representing equal likelihood of being called a ball or strike. Thus, pitches in

bins with a prior probability closer to 0.5 would be considered more ambiguous. The ambiguity

index was thus calculated using the following formula:

$$\text{Ambiguity} = -\lvert P - 0.5 \rvert \times 2 + 0.5$$

Although this index would theoretically range from –0.5 (no ambiguity) to +0.5 (complete

ambiguity—equal likelihood of called ball or strike), because pitches located in bins with zero

variability were excluded from the analyses, the lower end of the range of possible ambiguity

scores was actually –0.4967. It is worth noting that the prior probability for each pitch factors

into the calculation of both the ambiguity index and the deviation measure, and thus factors into

both sides of the regression equation. By its very definition, there are limits to the amount of

variation that can be observed for very high and low probability values (i.e. the highly

unambiguous pitches). Nonetheless, observing differences in the predictive power of that pitch’s

ambiguity based on whether the ejected or non-ejected team was batting and whether it occurred

in the pre- or post-ejection period should be informative.
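A minimal Python sketch of the ambiguity index follows. It assumes the index is 0.5 minus twice the absolute distance of P from 0.5, which is the form consistent with the stated range of –0.5 (no ambiguity) to +0.5 (complete ambiguity); the function name is hypothetical.

```python
def ambiguity_index(prior_ball_probability):
    """0.5 - 2 * |P - 0.5|: +0.5 at P = 0.5, -0.5 at P = 0 or P = 1."""
    return 0.5 - 2.0 * abs(prior_ball_probability - 0.5)
```

Under this form, a bin with P = 0.80 scores about –0.10, and bins with P = 0.20 and P = 0.80 are treated as equally ambiguous.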

Calculation of Offensive Statistics

To calculate on-base percentage (OBP), each at-bat was categorized as resulting in the

batter getting on base (coded as 1) or not (coded as 0). This allows us to examine OBP at the

level of the at-bat, essentially the likelihood that the batter reached base. According to standard

scorekeeping procedure, some at-bats are not figured into the calculation of OBP, such as when

the batter successfully executes a sacrifice bunt, or is given a base due to catcher’s interference.

These instances were coded as missing, and were thus not included in the analysis.

We also examined the number of runs scored per at-bat (R/AB), which is simply the number

of runs scored by the batting team during a given at-bat. Note that we counted runs scored for

any reason, not just runs resulting from the batter getting a hit, or even a sacrifice (bunt or fly).

Runs scored from a player stealing home on a passed ball, or from the umpire issuing a walk with

the bases loaded, were also counted.
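The at-bat coding can be sketched as follows. The event labels are hypothetical placeholders rather than the codes used in the MLB data, and the set of events counted as reaching base is an assumption made for illustration.

```python
# Event labels are hypothetical; real play-by-play feeds use their own codes.
ON_BASE = {"single", "double", "triple", "home_run", "walk", "hit_by_pitch"}
EXCLUDED = {"sacrifice_bunt", "catcher_interference"}


def code_on_base(event):
    """1 if the batter reached base, 0 if not, None if excluded from OBP."""
    if event in EXCLUDED:
        return None
    return 1 if event in ON_BASE else 0


def obp(events):
    """Mean of the 0/1 codes, ignoring at-bats coded as missing."""
    codes = [c for c in map(code_on_base, events) if c is not None]
    return sum(codes) / len(codes) if codes else None


def runs_per_at_bat(runs_scored_each_at_bat):
    """All runs scored during the at-bats, for any reason, divided by at-bats."""
    return sum(runs_scored_each_at_bat) / len(runs_scored_each_at_bat)
```

For example, `obp(["single", "strikeout", "sacrifice_bunt", "walk"])` drops the sacrifice bunt and averages the remaining three codes.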

Coding Ejection Type

Although baseball players and managers frequently express their displeasure with

unfavorable calls, it is not obvious exactly how one should quantify such an expression, nor how

to identify those instances where the umpire is specifically accused of being biased. We believe

that the least ambiguous examples of such expressions are cases when a player or manager

argues a call with the umpire, and is subsequently ejected from the game. Although it is not

possible to know the exact content of the arguments that lead to ejections, the authors’ collective

lifetime of experience watching baseball² certainly suggests that the arguments typically involve

the player or manager accusing the umpire of a pattern of unfavorable (biased) calls, with the

latest pitch being only the most recent example.

² The Atlanta Braves and Seattle Mariners, respectively.

In order to identify the underlying reason for each ejection, which is not included in the data

provided by MLB, an independent coder categorized each ejection based on the UEFL

descriptions of the game context in which the ejection was made. Each ejection was coded as

resulting from an argument that was either related or unrelated to pitch location. The coder was

not blind to the hypotheses, but because the descriptions of the ejection event were completely

separate from the PITCHf/x data—meaning that the coder had no knowledge of the location of

the antecedent or subsequent pitches when making the determination—there was virtually no risk

of this knowledge biasing the coder’s judgments. The vast majority of the ejections were

unambiguous, such as a batter arguing that a called third strike should have been called a ball

(pitch-related ejection), or the manager arguing that a runner called safe at first base should have

been called out (other ejection). Some cases, however, introduced some difficulty, such as an

argument about a batter who begins to swing and then attempts to hold back the swing (a “check

swing”). In this case, if the umpire rules the ambiguous motion as a swing, then it is a strike

regardless of the location. If instead it is ruled as a non-swing, then the umpire must judge it a

ball or a strike based on the pitch location—meaning that the pitching team could argue about the

lack of a check swing call, and either team could argue about a called ball or strike. In these

cases, the coder attempted to discern, based on the context of the ejection and the description of

the ejection, whether the true crux of the argument was about the location of the pitch or about

something else. Any ambiguity was resolved through discussion with the authors prior to

examining the actual pitch data.

Selection of Games and Pitches:

Of the 1,126 ejections, 489 (43.4%) were coded as being related to pitch-location, and 637

(56.6%) were coded as “other.” We excluded from the analysis any game with more than one

ejection event, or when members of both teams were ejected, so that it would be clear which

team suffered the ejection, and exactly when it occurred. That is, if several members of the same

team were ejected as a result of the same at-bat (e.g. the umpire ejects both the batter and

manager of the batting team for arguing the same called third strike), this would be considered a

single ejection event. If two members of the same team were ejected after different at-bats (e.g. a

batter was ejected immediately after arguing a called third strike, but the manager was ejected

after arguing about a call several at-bats later), then it would be considered two ejection events.

Of the 815 (72.4%) games that met the definition of a single ejection event, we further excluded

games where either team did not have at least one at-bat with a called pitch before and after the

ejection, leaving a total of 707 games in the final data set (n = 311 involving pitch-related

ejections, n = 396 involving other kinds of ejections). It is worth noting that the results do not

change if the full set of 815 games is included in the analyses.
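The single-ejection-event rule can be sketched in Python. The record layout `(game_id, ejected_team, at_bat_number)` and the function names are assumptions for illustration; the further requirement of at least one called pitch per team before and after the ejection is not shown.

```python
def ejection_events(ejections):
    """Collapse ejections into events keyed by (team, at-bat) within each game."""
    events = {}
    for game_id, team, at_bat in ejections:
        events.setdefault(game_id, set()).add((team, at_bat))
    return events


def single_event_games(ejections):
    """Games with exactly one ejection event (so only one team is involved)."""
    return sorted(g for g, ev in ejection_events(ejections).items() if len(ev) == 1)


# Toy records: (game_id, ejected_team, at_bat_number).
toy = [
    ("g1", "A", 30), ("g1", "A", 30),   # batter and manager, same at-bat: one event
    ("g2", "A", 12), ("g2", "A", 20),   # same team, different at-bats: two events
    ("g3", "A", 40), ("g3", "B", 40),   # both teams ejected: excluded
    ("g4", "B", 55),
]
kept = single_event_games(toy)
```

Only games g1 and g4 survive in this toy example, since each has a single team ejected during a single at-bat.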

From that set of games, there were 110,806 called pitches with valid x and z coordinates.³

To ensure the robustness of the analysis, we excluded pitches falling in bins that contained too

few pitches (fewer than 100) to be confident that the prior probability was reasonably accurate

(15,398 pitches). These pitches were virtually all called balls (99.94%), indicating that they were

well outside the strike zone. We also excluded pitches from the analysis where there was zero

variability in the bin (100% called balls or 100% called strikes; 17,195 pitches), the vast majority

of which (95.00%) were called balls. After these exclusions, the final data set we examined

consisted of 79,220 pitches, which we can be certain required at least some interpretation on the

part of the umpire, defined quite conservatively.
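The two pitch-level exclusions amount to a simple filter, sketched here in Python. The threshold of 100 pitches comes from the text; the function and argument names are hypothetical.

```python
MIN_PITCHES_PER_BIN = 100  # threshold stated in the text


def keep_pitch(bin_count, prior_ball_probability):
    """True if the pitch's bin is well populated and not unanimous."""
    if bin_count < MIN_PITCHES_PER_BIN:
        return False  # too few pitches for a reliable prior probability
    # Bins at exactly 0 or 1 have zero variability and are dropped.
    return 0.0 < prior_ball_probability < 1.0
```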

³ Some data were missing because of occasional technical errors in the PITCHf/x system and

because the system was not fully deployed in every stadium until midway through the 2008 season.

Statistical Analysis:

Examining the effects of these independent variables involves comparing the outcomes of

opposing teams within the same game, with many of the outcomes determined by the umpire

behind home plate, who also officiated other games in the data set. By using outcomes between

opposing teams within the same game, any fixed bias in outcomes for that particular game or that

particular umpire should not be problematic (such as a given umpire’s tendency to have a larger

or smaller strike zone than the league average). However, because observations from the same

game came from the same umpire, they would violate the assumption of independence

underlying standard linear models or ANOVA. Thus, we employed linear mixed-effects models

for all analyses for Study 1, treating the three main independent variables as fixed effects.

To aid the interpretation of model parameters, we used contrast coding rather than dummy

coding for all categorical variables (i.e. Pitch-related ejections: +0.5, Other types of ejections: –0.5;

Ejected team: +0.5, Non-ejected team: –0.5; Post-ejection period: +0.5, Pre-ejection period: –0.5).
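The contrast codes listed above can be written out as a small lookup, sketched here in Python. The dictionary keys and function name are hypothetical. With ±0.5 codes, each predictor spans exactly one unit between its two levels, so a coefficient can be read as a difference between levels rather than a deviation from a reference category.

```python
CONTRAST_CODES = {
    "ejection_type": {"pitch_related": 0.5, "other": -0.5},
    "batting_team": {"ejected": 0.5, "non_ejected": -0.5},
    "period": {"post": 0.5, "pre": -0.5},
}


def code_row(ejection_type, batting_team, period):
    """Return the three contrast codes and their three-way product."""
    e = CONTRAST_CODES["ejection_type"][ejection_type]
    t = CONTRAST_CODES["batting_team"][batting_team]
    p = CONTRAST_CODES["period"][period]
    return e, t, p, e * t * p
```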

All analyses were conducted in R (16), using the lme4 package (17) for the linear mixed-

effects models and confidence intervals, and the lmerTest package (18) to calculate p values

(using Satterthwaite’s approximations for the degrees of freedom), estimate cell means, and to

perform any post hoc or pairwise comparisons. For any post hoc or pairwise comparisons, p

values were corrected for multiple comparisons using the Holm-Bonferroni procedure (12);

corrected p values are indicated with a subscript (i.e. phb).
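For readers unfamiliar with the Holm-Bonferroni procedure, the adjustment can be sketched in plain Python. This is a generic illustration of the step-down correction, not the routine called in the reported analyses, and the function name is hypothetical.

```python
def holm_bonferroni(p_values):
    """Return Holm-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

For the raw values .01, .04, and .03, the adjusted values are .03, .06, and .06: the largest raw p value inherits the adjusted value of the one ranked before it.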

For the random-effects structure, we began with a maximal model (19), which we identified

as having random intercepts for home plate umpire and for game (nested within umpire),

allowing for the effects of Batting Team, Period of Game, and their interaction to vary within

individual games (i.e. random slopes). Although the maximal model also allowed for correlated

slopes and intercepts, the model would not reliably converge unless those were dropped from the

model. It is worth noting that in the few models with correlated slopes and intercepts that did

converge, the model fit was no better than a model without correlated slopes and intercepts, as

evidenced by a likelihood ratio test, χ²(6) = 6.86, p = .334. The final model we employed should

thus account for any non-independence of the individual observations, while also allowing us to

examine or control for the impact of pitch, at-bat, and game-level variables (e.g. whether there

was an impact of game attendance on deviance scores).

The structure defined above was used for all analyses of the pitch-level deviance scores.

Because the two at-bat-level measures (OBP and R/AB) involved non-normal data, the analyses

for those measures involved generalized linear mixed-effects models, specifically a mixed-

effects binary logistic regression for OBP, and mixed-effects Poisson regression for R/AB. In

both cases, even the slightly reduced version of the maximal model failed to converge, so the

complexity of the random-effects structure was selectively reduced until convergence could be

achieved. This resulted in a model with random intercepts for game, and random slopes for

Batting Team, Period of Game, and their interaction within games (without allowing for

correlated slopes and intercepts). In other words, only the random intercepts for umpires were

dropped from the model.

For the pitch-level deviance measure, we first tested a linear mixed-effects model featuring

a 2 (Batting Team: Non-ejected team vs. Ejected team) × 2 (Game Period: Pre-ejection vs. Post-

ejection) × 2 (Ejection Type: Pitch-related vs. Other kinds of ejections) design for the fixed

effects. The results of this model are presented in Table S1. Based on the significant three-way

interaction, we conducted a linear mixed-effects model testing the effects of Batting Team and

Game Period separately for pitch-related ejections (see Table S2) and other kinds of ejections

(see Table S3), which is what is reported in the main text. Based on the results of the deviance

measure, for the at-bat-level measures, we only conducted the analyses on the subset of data with

pitch-related ejections.

Mediation analysis. As reported in the main text, we conducted a mediation analysis to

confirm that the observed relative improvements in at-bat-level offensive outcomes (OBP and

R/AB) by the ejected team after the ejection do in fact result from shifts in the umpires’ behavior

(as measured by the deviance scores), rather than being solely due to regression to the mean.

That is, the observed pattern of effects for OBP and R/AB—a mere attenuation of the pre-

ejection bias—could be explained by regression to the mean. The pitch-level deviance scores,

however, show a reversal (not attenuation) of the pre-ejection bias in the post-ejection period, so

a regression-to-the-mean explanation does not apply. Thus, if we can demonstrate that the effect

of the ejection on the at-bat-level variables was statistically mediated by deviance scores, then

we can be reasonably sure that those effects were not merely a regression to the mean. In order to

ensure that the mediator (deviance scores) and the dependent variables (OBP and R/AB) were all

operating at the level of the at-bat, we calculated at-bat-level deviance scores as the mean

deviance score for each at-bat. For this analysis, we treated the batting team × period of game

interaction as the independent variable. Consistent with the pitch-level analysis, there was a

significant effect of the independent variable (batting team × period of game) on the mediator

(at-bat-level deviance scores), b = 0.040, 95% CI [0.023, 0.058], t(328.20) = 4.52, p < .001.

When the mediator was included in the same analyses described above predicting OBP (mixed-

effects binary logistic regression) and R/AB (mixed-effects Poisson regression), it was a strong

predictor in both cases (ps < .001). To test the significance of the indirect effect (the product of

the effect of the independent variable on the mediator, and the effect of the mediator on the

dependent variable, controlling for the independent variable), we calculated Monte Carlo

confidence intervals with 50,000 repetitions (MacKinnon, Lockwood, & Williams, 2004;

Preacher & Selig, 2012), which did not include zero in either case, as reported in the main text

(OBP: 0.039, 95% CI [0.022, 0.058]; R/AB: 0.016, 95% CI [0.008, 0.027]).
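The Monte Carlo approach to the indirect effect can be sketched with the Python standard library. The sketch assumes independent normal sampling distributions for the two paths. The first path uses the reported estimate (0.040) with a standard error approximated from the reported t value; the second path's estimate and standard error are invented placeholders, because those values are not given here.

```python
import random


def monte_carlo_ci(a, se_a, b, se_b, reps=50_000, level=0.95, seed=1):
    """Percentile interval for the product a*b under independent normal draws."""
    rng = random.Random(seed)
    products = sorted(rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(reps))
    tail = (1.0 - level) / 2.0
    lower = products[int(tail * reps)]
    upper = products[int((1.0 - tail) * reps) - 1]
    return lower, upper


# a and se_a follow from the reported b = 0.040, t = 4.52 (approximation).
a_hat, se_a = 0.040, 0.040 / 4.52
# Placeholder values for the mediator-to-outcome path (hypothetical).
b_hat, se_b = 1.0, 0.1
low, high = monte_carlo_ci(a_hat, se_a, b_hat, se_b)
```

The indirect effect is judged significant when the resulting interval excludes zero. The interval need not be symmetric around the point estimate, which is one reason the product is simulated rather than tested with a normal-theory standard error.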

Supplementary Text (Study 1)

Examining the At-Bat Prompting the Ejection.

For all analyses, we examined only outcomes occurring during the pre- and post-ejection

periods, excluding the at-bat that prompted the ejection (hereafter referred to as the ejection at-

bat). To confirm that the ejection at-bat featured particularly egregious calls from the perspective

of the team that suffered the ejection, at least for pitch-related ejections, we examined the

deviance scores of pitches thrown during those at-bats (n = 1,236 pitches). This would most

clearly be evident in highly unlikely strike (vs. ball) calls when the ejected team was batting (vs.

pitching) during the ejection at-bat. Indeed, there was a significant interaction between Ejection

Type (Pitch-related vs. Other) and Ejected Team Role (Batting vs. Pitching) on the deviance

measure, b = 0.384, 95% CI [0.298, 0.470], t(901.90) = 8.74, p < .001 (see Fig. S2). For pitch-

related ejections, the average deviance score was considerably lower when the ejected team was

batting (M = –0.210, 95% CI [–0.241, –0.179]) compared to when the ejected team was pitching

(M = 0.194, 95% CI [0.145, 0.243]), t(944.0) = –13.80, phb < .001. For other types of ejections,

there was no difference in deviation scores depending on whether the ejected team was batting

(M = –0.007, 95% CI [–0.050, 0.036]) or pitching (M = 0.013, 95% CI [–0.035, 0.061]) at the

time of the ejection, t(1007.7) = –0.62, phb = .535. As expected, the at-bat prior to a pitch-related

ejection featured calls that were highly unfavorable for the team that was ultimately ejected

(unlikely strike calls for the batting team, unlikely ball calls for the pitching team), suggesting

that it was these egregious calls that ultimately prompted the ejection-inducing argument.

Control variables.

The interaction between batting team (ejected vs. non-ejected) and period of game (pre- vs.

post-ejection) holds when controlling for characteristics of the pitch location (prior probability

within the bin, number of pitches in the bin; p < .0001), characteristics of the current at-bat and

game situation (current count of balls and strikes, number of outs, number of runners in scoring

position; p < .0001), as well as other metrics of situational pressure and importance, such as Win

Expectancy (WE; probability the batting team would win), Run Expectancy (RE; probability the

batting team would score a run during this at-bat), and Leverage Index (LI; an index of the

importance of the situation), p < .0001. Although the home plate umpire issued pitch-related

ejections in all but two cases (99.4% of games), the interaction also held when limiting the

analysis to ejections issued by the home plate umpire (p < .0001).

Considering a continuous measure of game period.

The models described above and reported in the main text treat Period of Game as a

categorical variable (i.e. pre- vs. post-ejection), largely as a matter of simplicity. Such an

approach implicitly assumes that any bias exhibited by the umpire (i.e. favoring one team over

another) is constant within the pre- and post-ejection periods, but it’s worth considering whether

the reality is more complicated than can be accommodated by a dichotomous variable. For

instance, it could be that the apparent reversal in bias in the post-ejection period is limited to a

few at-bats immediately following the ejection: the umpire could issue a few make-up calls to

assuage the ejected team’s anger before reverting to a more neutral (unbiased) baseline, or

perhaps even revert back to the previous pattern of bias.

To test this possibility, we first created a continuous measure of game period by calculating

the distance from the current outcome to the ejection event (hereafter referred to as AB distance)

by subtracting the ejection at-bat from the current at-bat. For instance, if the ejection occurred

during the 25th at-bat of the game, then pitches thrown during the 29th at-bat would all have a

distance score of +4 (29 – 25), and all pitches thrown during the 12th at-bat would have a

distance score of –13 (12 – 25). Thus, all events in the pre-ejection period have negative values,

and all events in the post-ejection period have positive values.
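The AB distance measure reduces to a signed difference, sketched below in Python with hypothetical function names. The sign convention follows the worked example, in which the 29th at-bat of a game with an ejection in the 25th at-bat receives a score of +4.

```python
def ab_distance(current_at_bat, ejection_at_bat):
    """Signed distance in at-bats: negative before the ejection, positive after."""
    return current_at_bat - ejection_at_bat


def period(distance):
    """Map a non-zero distance to the categorical game period."""
    if distance == 0:
        return None  # the ejection at-bat itself is excluded from the analyses
    return "post" if distance > 0 else "pre"
```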

First, in a model with batting team, AB distance, and their interaction predicting deviance

scores (pitch-related ejections only), the batting team × AB distance interaction was significant,

b = 0.0005, 95% CI [0.0002, 0.0007], t(237.13) = 3.62, p < .001, which is consistent with the

categorical variable (see Fig. S3, top panel). However, because it’s likely that any irregularities

would be non-linear, we also considered a model testing linear, quadratic, and cubic versions of

the AB distance measure (and each version’s interaction with batting team). Intriguingly, all

three of the interaction terms were significant (Linear: p < .001; Quadratic: p = .033; Cubic: p =

.012; see Table S4). As can be seen in Fig. S3 (bottom panel), which depicts the predicted values

from this model 50 at-bats before and after the ejection, there is some evidence of non-linearity

in the relative favorability of the umpire’s calls, but those seem to occur primarily at more

extreme values of AB distance, which are also the least represented in the data (hence the

increasingly large confidence intervals).

To take a slightly different approach, we examined the post-ejection period separately to see

if the bias in favor of the ejected team (relative to the non-ejected team) remained constant as the

distance from the ejection at-bat increased. Consistent with a constant effect, in a model with

batting team, AB distance, and their interaction, there was the expected main effect of batting

team, b = 0.021, 95% CI [0.009, 0.033], t(255.95) = 3.51, p < .001, but no interaction between

batting team and AB distance, b = –0.0002, 95% CI [–0.0009, 0.0005], t(2482.39) = –0.50, p =

.615. A similar model including linear, quadratic, and cubic terms for AB distance showed the

same result: a main effect for batting team (p < .001), but not for any terms involving AB

distance (all ps > .14). Based on these two results, it appears that the shift in bias after the

ejection remains—and remains relatively constant—throughout the game, and that treating game

period as a categorical variable is a valid approach.

Testing potential moderators.

We began by considering whether properties of the ejection itself may have moderated the

observed reversal in bias exhibited by the umpires in games featuring pitch-related ejections.

Although there is evidence that much of the proverbial home team advantage is due to more

favorable calls from the umpires (20), there was no evidence that the shift in the umpires’ favor

after an ejection was solely granted to the home team. Indeed, the shift in favor of the ejected

team was evident whether it was the home team or the away team that was ejected (both ps <

.001). None of the other properties of the ejection we examined moderated the effect, such as

whether the ejected team was batting or pitching at the time of the ejection, or whether the

person ejected was a manager or a player. In each case, the interaction between batting team

(Ejected vs. Non-Ejected) and period of game (Pre vs. Post Ejection) remained significant (all ps

< .001), but there was no significant three-way interaction with the moderator (all ps > .17).

We also tested whether variables related to the game itself might have mattered, including

the ambient temperature (see 21), game duration (in minutes; log transformed due to right skew),

and game attendance. Neither temperature nor game duration moderated the effect (both ps >

.11), nor diminished it (interaction: all ps < .0001). Game attendance (centered on the grand

mean, 31,311.74), however, did show some promise as a moderator. Although the interaction

between batting team and period of game remained significant (p < .0001), there was also a

three-way interaction with game attendance (p = .018). The interaction was such that the basic

effect—bias against the ejected team prior to the ejection, and in favor of the ejected team after

the ejection—was larger when attendance was high, and nearly absent when it was low.

However, we hesitate to draw strong conclusions from this finding, as attendance is no doubt

correlated with other relevant variables, such as the home team’s current record, or the

importance of the game.

We also tested whether variables related to the importance and impact of the situation

surrounding the ejection at-bat moderated the effect. That is, it’s possible that umpires may exhibit a stronger bias in

favor of the ejected team when the ejection came at a particularly bad time for that team.

Specifically, we tested the score differential prior to the ejection, and situational importance

metrics associated with the ejection at-bat, including RE, RE24 (the change in RE as a result of

the ejection at-bat), LI, WE, and WPA (Win Probability Added; the change in WE as a result of

the ejection at-bat). In every case, the interaction between batting team (ejected vs. non-ejected)

and period of game (pre- vs. post-ejection) remained significant (all ps < .0001), but there was no

significant three-way interaction with the moderator (all ps > .34).

Finally, we tested whether the umpires’ calls might be sensitive to the importance of the

current situation (specifically the current at-bat’s WE, LI, and RE), especially depending on

which team is batting. For instance, the shift in bias from the pre- to post-ejection period might

only occur for relatively unimportant situations, perhaps to avoid having an undue influence on

the game in very important situations. However, there was no evidence of any sensitivity to

context. For all three of the variables we tested, the interaction between batting team (ejected vs.

non-ejected) and period of game (pre- vs. post-ejection) remained significant (all ps < .0001), but

the three-way interaction was not significant (all ps > .16).

Materials and Methods (Study 2)

Experimental Design

The study employed a 2 (Feedback: Control vs. Critical) × 2 (Period of Study: Pre- vs. Post-

Critical Feedback) factorial design, with the first factor manipulated between-participants, and

the second factor manipulated within-participants.

Participants.

We recruited 100 participants (64 male, 36 female) from Amazon.com’s Mechanical Turk

to play a “visual estimation game” with incentives for accuracy. Data collection stopped upon

reaching the target sample size (N = 100), and the data were not examined prior to that point.

We excluded participants who did not appear to make a reasonable effort at the task by

setting minimum standards for accuracy (at least 55%) and variable responding (at most 90% of

responses could be in the same direction). Based on these criteria, which were the only criteria

we considered, seven participants were excluded from the analyses, though including them does

not change the outcome of any analysis.
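The two exclusion criteria can be expressed as a short Python check. The thresholds come from the text; the data layout (one "higher" or "lower" string per trial) and the function name are assumptions for illustration.

```python
MIN_ACCURACY = 0.55         # at least 55% correct
MAX_SAME_DIRECTION = 0.90   # at most 90% of responses in one direction


def keep_participant(responses, correct_answers):
    """Both lists hold 'higher'/'lower' strings, one entry per trial."""
    n = len(responses)
    accuracy = sum(r == c for r, c in zip(responses, correct_answers)) / n
    share_higher = responses.count("higher") / n
    most_common_share = max(share_higher, 1.0 - share_higher)
    return accuracy >= MIN_ACCURACY and most_common_share <= MAX_SAME_DIRECTION
```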

Dot-Estimation Task.

After consenting to participate, participants were introduced to a game designed to test

“perceptual acuity.” The game required participants to make perceptual judgments similar to those

made by umpires in Study 1. Participants completed 10 blocks of 10 trials. To begin, for each

block, participants were assigned a target number that varied randomly between 12 and 20. Then,

for each of the ten trials within the block, an array of dots was flashed on the screen and

participants had to judge whether the number of dots in the array was higher or lower than the

target number (indicating their response with a key press). The dot arrays were randomly

generated for each trial such that the actual number of dots was within a certain range (between

5% and 25%) of the target number (but never equal to the target number), ensuring variability in

difficulty. Because the number of dots was generated randomly, the number of trials in each

block where the correct response was “higher” also varied (ranging from 0-10, but typically 3-7).

This was made explicit to participants to ensure they were not deliberately giving an equal

number of higher and lower responses.
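One way the trial generation could work is sketched below in Python. The target range (12 to 20) and the 5% to 25% offset come from the text; the rounding rule, the minimum offset of one dot, and the equal chance of a higher or lower array are assumptions, since the exact generation procedure is not specified.

```python
import random


def make_block(rng, n_trials=10):
    """Return (target, dot_counts) for one block of trials."""
    target = rng.randint(12, 20)  # inclusive on both ends
    dot_counts = []
    for _ in range(n_trials):
        # Offset of 5% to 25% of the target, rounded, and at least one dot
        # so the array never matches the target exactly (assumed rule).
        offset = max(1, round(target * rng.uniform(0.05, 0.25)))
        sign = rng.choice((-1, 1))
        dot_counts.append(target + sign * offset)
    return target, dot_counts


def correct_response(target, dots):
    """The response that would be scored as accurate for this trial."""
    return "higher" if dots > target else "lower"


target, dots = make_block(random.Random(0))
```

Because the sign of each offset is drawn independently, the number of "higher" trials in a block varies from block to block, as described above.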

On each trial, the target number was displayed on the screen for 600ms, followed by the dot

image for 400ms. Participants were required to respond within 2 seconds, or they had to repeat

the trial with a newly generated dot array. This was intended both to make the task somewhat

difficult and to encourage snap judgments. Prior to starting the game, participants completed a

practice block of easy trials, on which they were given feedback about their performance, to

ensure they understood the game.

Over the 100 trials, participants averaged 75.91% accuracy, 95% CI [74.46%, 77.37%],

which was significantly greater than chance, t(92) = 35.40, p < .001, but nowhere near a ceiling

effect. Thus, the task was neither impossibly difficult nor trivial.

Partner Description and Incentive Structure.

Participants were told that they had been assigned the role of Judge, and were paired with a

partner who had been assigned to the role of Observer, whose role was to watch the Judge’s

performance and provide feedback after each block of trials. In truth, the partner did not exist,

and all feedback provided by the Observer was bogus.

The incentives for participants assigned to the role of Judge (i.e. all actual participants) were

based on accuracy. In addition to their base pay, participants received an additional $0.02 for

each correct response. To penalize guessing, this bonus was only awarded for the number of

correct responses above chance (50%). Thus, over the course of 100 trials, participants with

perfect accuracy would earn a bonus of $1.00 ($0.02 for each of 50 correct responses above

chance), and participants with a mere 60% accuracy would earn $0.20 ($0.02 for each of 10

correct responses above chance). Accuracy of 50% or less would earn no bonus. Participants

were not given feedback about their performance, and therefore their bonus, until the very end of

the experiment.
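The Judge's bonus rule can be checked with a short Python sketch. Amounts are kept in cents to avoid floating-point rounding; the function name is hypothetical.

```python
BONUS_CENTS_PER_RESPONSE = 2  # $0.02, kept in cents to avoid float rounding


def judge_bonus_cents(n_correct, n_trials=100):
    """Bonus in cents for correct responses above the 50% chance level."""
    chance_level = n_trials // 2
    return max(0, n_correct - chance_level) * BONUS_CENTS_PER_RESPONSE
```

This reproduces the two worked examples: 100 correct responses earn 100 cents ($1.00), and 60 correct responses earn 20 cents ($0.20).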

The Observer’s monetary incentives were also explained to participants. Whereas the Judge

(participant) was paid for accuracy, the Observer (partner) was paid based on the direction of the

response (i.e. higher or lower) given by the Judge, regardless of its accuracy. For instance, for

each “higher” response, the Observer’s bonus would increase by $0.01, and for each “lower”

response, it would decrease by $0.01. The particular response associated with a positive or

negative outcome was counterbalanced. As in the main text, the response that yielded a higher

monetary bonus for the Observer is referred to as the “directional” response. Thus, if the

participant gave 64 directional responses out of 100 trials, the Observer would earn a bonus of

$0.14. With 50 directional responses or fewer, the Observer’s bonus would be zero. This

misaligned incentive structure was designed to make it clear that the Observer’s feedback,

particularly any instruction to give more directional responses, might be purely self-serving—to

the detriment of the participant’s own interests.

To ensure that participants clearly understood both their own and their partner’s incentives,

participants were required to pass a “quiz” about the incentive structure before the first block of

trials.

Feedback Manipulation.

At the end of each block of trials, participants received some feedback ostensibly written by

their partner. In the beginning, the feedback was relatively neutral (e.g. “man those dots move

fast. Nice!”) or served to remind the participant of the partner’s incentives (e.g. “ur getting

quick! Remember, higher is better! jk”). After block 5, the feedback diverged depending on

condition. Participants in the critical feedback condition began receiving feedback critical of

their performance, but always suggesting that the participant’s errors were systematically biased

in a direction that hurt the partner’s bonus (e.g. “too many lows! I think you might have missed a

couple of highs there! help us both out!”). This critical feedback continued and intensified (e.g.

“what, do you have something against highs??” and “you’re killing me here!”) until the final

block of trials. Participants in the control condition received feedback that did not point to a

particular directional bias (e.g. “so many highs and lows! it’s hard to keep up with all of them.

sorry i can't be more help!” and “i’m getting tired! keep your head in the game, ur almost done”).

The final feedback, after the last block of trials, was somewhat neutral and identical in both

conditions. Thus, participants responded to 50 trials before and 50 trials after the critical

feedback began, allowing us to compare responses not just between conditions, but also over

time.

Explicit Beliefs and Manipulation Checks

After each block of trials, following the partner feedback, participants estimated the number

of correct responses (from 0 to 10) they gave, as well as the number of higher/lower responses

they gave in that block. These explicit estimates allowed us to ascertain whether participants

were aware of any bias creeping into their responses.

After the final block of trials and the last round of feedback, participants evaluated both

their own and their partner’s overall ability at the dot estimation task (1 = Very Poor; 6 =

Average; 11 = Very Good), and estimated the number of trials, out of 100, that they thought their

partner would have answered correctly. These items were intended to rule out the possibility that

participants began to doubt their own abilities in the face of critical feedback. To confirm that

this is not a likely explanation for the results, we conducted a 2 (Feedback: Control vs. Critical)

× 2 (Target: Self vs. Other) mixed-model ANOVA, with feedback as a between-participants

variable, and target as a within-participants variable. As expected, there was a strong main effect

of target, F(1,91) = 29.18, p < .001, ηp² = .092, indicating that participants clearly thought that

they were more skilled than their partner at the game. More importantly, this main effect was

qualified by a significant feedback × target interaction, F(1,91) = 4.47, p = .037, ηp² = .047 (see

Fig. S7). Specifically, although participants in the control condition did indeed think more highly

of their own abilities, t(91) = 2.12, phb = .036, this tendency was exaggerated in the critical

feedback condition, t(91) = 5.40, phb < .001, suggesting that participants only became more sure

of their own abilities (and more skeptical of their partner’s abilities) as a result of the critical

feedback. Furthermore, there was no difference in participants’ estimates of how many trials

their partner would have gotten correct (Critical feedback: M = 61.92, 95% CI [57.17, 66.67];

Control: M = 60.00, 95% CI [55.72, 64.28]), t < 1, ns. Thus, across multiple measures, we found

no support for the idea that critical feedback led participants to doubt their abilities. In fact, just

the opposite appeared to be true: as described above, participants gave higher estimates of their

own abilities in the critical feedback condition compared to participants in the control condition,

t(91) = 2.86, phb = .010.
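For readers who want to relate an F statistic to its effect size, the sketch below applies the standard conversion for partial eta squared. It assumes that the effect sizes reported with the F tests are partial eta squared; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Standard conversion from an F statistic to partial eta squared.

    Assumption: the reported effect sizes are partial eta squared.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Feedback x target interaction reported above: F(1, 91) = 4.47
eta = partial_eta_squared(4.47, df_effect=1, df_error=91)
print(round(eta, 3))  # 0.047
```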

During the last block of questions, participants also indicated the degree to which they

thought their “partner was judging the dot images accurately, or had a biased perspective” (1 =

Definitely biased to judge Lower; 6 = Partner’s judgment was accurate; 11 = Definitely biased to

judge Higher). This item, intended as a check on the incentive structure, was reverse-scored for

participants whose partner’s incentive was for “lower” responses, so that higher numbers always

indicated greater bias. (The direction of the partner’s incentive did not impact perceptions of

bias, p = .106.) Overall, participants did perceive a great deal of bias in their partner (M = 7.58,

95% CI [7.07, 8.09]), as confirmed by a one-sample t-test against the scale midpoint, t(92) =

6.21, p < .001. However, this belief was stronger for participants in the critical feedback

condition (M = 8.44, 95% CI [7.67, 9.21]) than participants in the control condition (M = 6.58,

95% CI [6.07, 7.09]), t(82.47) = –4.05, p < .001, d = 0.814. This confirms that participants did in

fact perceive a bias consistent with the partner’s monetary incentives, and that the perception of

bias was especially large when the partner appeared to give feedback consistent with her own

self-interest.
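The reverse-scoring step for the bias item can be sketched as follows. On an 11-point scale, a rating r is flipped to 12 - r, which leaves the midpoint (6) unchanged. The function names and the "higher"/"lower" labels are illustrative assumptions, not the authors' code.

```python
SCALE_MIN, SCALE_MAX = 1, 11  # endpoints of the 11-point bias item


def reverse_score(rating):
    """Flip a rating around the scale midpoint (1 <-> 11, 6 stays 6)."""
    if not SCALE_MIN <= rating <= SCALE_MAX:
        raise ValueError("rating must be between 1 and 11")
    return SCALE_MIN + SCALE_MAX - rating


def bias_score(rating, partner_incentive):
    """Recode so that higher numbers mean bias toward the partner's incentive.

    partner_incentive is a hypothetical label: "higher" or "lower".
    """
    if partner_incentive == "lower":
        return reverse_score(rating)
    return rating


print(bias_score(2, "lower"))   # 10
print(bias_score(9, "higher"))  # 9
```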

Participants also answered two questions related to the cover story, one about the degree to

which their partner motivated them (1 = Not at all; 11 = A great deal), and one about how

pleasant it was to have someone watching their performance (1 = Extremely Unpleasant; 11 =

Extremely Pleasant). Although it was not explicitly intended as such, participants’ responses to

this last question can speak to the possibility that participants liked their partner, and made more

directional responses in order to ensure that she got a reasonable bonus. Contradicting that

account, participants in the critical feedback condition reported that it was less pleasant to have

someone else watching their performance (M = 5.06, 95% CI [4.28, 5.84]), compared to

participants in the control condition (M = 6.33, 95% CI [5.75, 6.90]), t(91) = 2.56, p = .012, d =

.680.

Finally, after providing basic demographic information (gender, age, race, household

income), participants were asked for “any comments or thoughts you might have about your

experience in the task, the experiment, the incentives, or about your partner.” Some participants

expressed mild suspicion about whether their partner actually existed, but none with certainty, so

no one was excluded from the analyses based on suspicion.

Statistical Analysis

All analyses were conducted in R (16). For all tests of simple effects, p values were

corrected for multiple comparisons using the Holm-Bonferroni procedure (12); corrected p

values are indicated with a subscript (i.e. phb).
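The Holm-Bonferroni correction can be sketched in a few lines. The analyses themselves were run in R; the Python function below is an independent illustration of the step-down procedure, and the p values in the example are made up.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order.

    Step-down rule: sort the p values in ascending order, multiply the
    smallest by m, the next by m - 1, and so on, then enforce monotonicity
    (a running maximum) and cap at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical uncorrected p values for a pair of simple-effects tests.
print(holm_bonferroni([0.08, 0.15]))
```

Because of the monotonicity step, two tests can end up with the same corrected p value even though their uncorrected values differ.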

As reported in the main text, we tested the effect of the feedback manipulation by

conducting a 2 (Feedback: Critical Feedback vs. Control) × 2 (Timing: Pre- vs. Post-Feedback)

mixed-model ANOVA, with feedback as a between-subjects variable and timing as a within-

subjects variable.

Supplementary Text (Study 2)

Effects on Accuracy

Although our main prediction was about the shift in directional responses, we considered

whether the shift in directional responses had an impact on participants’ accuracy. The results of

a similar mixed-model ANOVA predicting accuracy of responses also yielded an interaction

between feedback condition and timing, F(1,91) = 4.97, p = .028, ηp² = .052. Participants in the

critical feedback condition became less accurate after the introduction of the critical feedback

(Mpre = 0.772, 95% CI [0.747, 0.798]; Mpost = 0.746, 95% CI [0.720, 0.771]), though this

difference was not statistically significant, t(91) = –1.76, phb = .164. Conversely, participants in

the control condition showed a non-significant improvement in accuracy (Mpre = 0.747, 95% CI

[0.720, 0.775]; Mpost = 0.771, 95% CI [0.744, 0.798]), t(91) = 1.44, phb = .164.

To see if participants were able to detect the shifts in accuracy, we conducted a 2

(Feedback: Critical Feedback vs. Control) × 2 (Timing: Pre- vs. Post-Feedback) × 2 (Response

Type: Actual vs. Estimated) mixed-model ANOVA on the proportion of correct responses, with

feedback as a between-subjects variable, and timing and response type as within-subjects

variables. This analysis revealed main effects of feedback condition (p = .016) and response type

(p < .001, indicating a general tendency to underestimate accuracy), a two-way interaction

between feedback condition and response type (p = .022), all of which were qualified by a three-

way interaction, F(1,91) = 9.22, p = .003, ηp² = .092. The interaction was such that, despite

actually becoming less accurate after the critical feedback started (as described above),

participants in the critical feedback condition estimated that their accuracy increased, t(91) =

2.18, phb = .064. Participants in the control condition, however, did not change their estimates

over time, t(91) = 0.60, phb = .547.

Materials and Methods (Explicit Beliefs Study)

We recruited 90 participants from MTurk for a brief study about fairness. Participants were

randomly assigned to read one of three versions of the scenario: Disadvantage-to-self,

Advantage-to-self, or Self-as-reviewer. Each version described a situation where a manager is

found to be exhibiting a biased pattern of behavior (similar to that of the umpires in Study 1). As

an example, the disadvantage-to-self scenario is presented below:

Suppose you and a coworker are being evaluated for a promotion at work. Only one of

you will be promoted. To decide who gets the promotion, an independent reviewer from

Human Resources (HR) will be monitoring both your and your coworker’s performance

over a two-week period and judging it for its quality.

The reviewer evaluates your and your coworker’s work in this way: She keeps close

tabs on the amount that each of you worked and what you accomplished each day, and

assigns a score for each of you based on both criteria. Then, at the end of each day, you and

your coworker receive a “scorecard,” which displays the rating that each of you received

for the day. For example, on a particular day, you might receive a score of 4 to 3,

indicating that you outscored your coworker on that day. This scorecard system is intended

to keep both of you performing at your best. It’s also worth noting that the reviewer’s only

goal is to render a fair and accurate judgment.

By the halfway point in the judging period, you have noticed a bias in the daily scores.

It seems to you that your coworker and you are performing at more or less at the same

level, yet at the end of most days, your coworker receives a higher score than you. You are

concerned that if this continues you will be passed up for the promotion based upon these

skewed scores.

You don’t want to get the reviewer in trouble, but you decide to bring up the issue of

bias to the reviewer’s boss in HR. The boss takes a look at the scores, alongside the

concrete work that each of you has done, and decides that he, too, sees a bias that disfavors

you. He decides to get in touch with the reviewer and let her know that her reviews appear

to be biased.

Suppose that after talking to her boss, the reviewer agrees that her ratings have shown

bias. Because of the way the system works, she cannot change her past ratings, so she is

faced with a question about what is the right thing to do. There is one week remaining in the

review process and then the promotion decision will be made. What is the fair thing for the

reviewer to do?

a. To be fair, the reviewer should simply attempt to rid herself of bias, judging

you and your coworker on the merits of the work you both do for the next week

b. To be fair, the reviewer should keep using the same criteria to judge you and

your coworker, even if there is a bias in those criteria

c. To be fair, the reviewer should “reverse” her bias, so that she now favors you

over your coworker for an equal number of evaluations

What varied between conditions was the role in which participants imagined themselves to

be. In the disadvantage-to-self condition (above), participants imagined that they had been

disadvantaged as a result of the manager’s decisions; in the advantage-to-self condition,

participants imagined themselves to be the co-worker who had benefitted from the manager’s

bias; in the self-as-reviewer condition, participants imagined themselves in the role of the

manager. In every case, they answered a version of the question (above) appropriate to their role.

The different conditions were intended to allow for the possibility that self-interest would

color participants’ views of what was fair. Although the pattern of responses is consistent with

that notion, because the three conditions did not significantly differ, we collapsed across the

three versions of the scenario.

Fig. S1.

Heat map depicting the likelihood of ball or strike call based on pitch location as viewed from

the umpire’s perspective (Study 1). Probabilities are calculated from 2,212,150 called pitches in

regular season games between 2008 and 2013.

[Figure panels: Left-Handed Batters; Right-Handed Batters. x-axis: X (normalized units); y-axis: Z (normalized units). Color scale: Proportion of Called Balls, from 0% Balls (100% Strikes) to 100% Balls (0% Strikes).]

Fig. S2

Pitch-level deviance scores from pitches thrown during the ejection at-bat (Study 1). Error bars

represent +/– 1 standard error.

[Figure. x-axis: Ejection Type (Pitch-related ejections; Other kinds of ejections); y-axis: Deviance score, from -0.3 to 0.3; legend: Batting Team, Pitching Team.]

Fig. S3

Predicted values from a linear mixed-effects model predicting pitch-level deviance scores by

Batting Team and AB Distance (top panel; Study 1). The bottom panel depicts a model that

includes linear, quadratic, and cubic terms for AB distance. Shaded regions represent 95%

confidence intervals.

[Figure, two panels. x-axis: At-bat from ejection (Ejection at-bat = 0), from -50 to 50; y-axis: Deviation score; legend: Ejected Team, Non-Ejected Team.]

Fig. S4

On-base percentage (OBP; Study 1). Error bars represent +/- 1 standard error.

Fig. S5

Runs scored per at-bat (R/AB; Study 1). Error bars represent +/- 1 standard error.

[Fig. S4 plot. x-axis: Pre/Post Ejection (Pre-Ejection Period; Post-Ejection Period); y-axis: On-Base Percentage (OBP); legend: Ejected Team, Non-Ejected Team.]

[Fig. S5 plot. x-axis: Pre/Post Ejection (Pre-Ejection Period; Post-Ejection Period); y-axis: Runs scored per at-bat; legend: Ejected Team, Non-Ejected Team.]

Fig. S6

Example image used in the dot-estimation task (Study 2).

Fig. S7

Participants’ perceptions of their own and their partner’s ability in the game by feedback

condition (Study 2). Error bars represent +/– 1 standard error.

[Figure. x-axis: Critical Feedback Condition (Control; Critical Feedback); y-axis: Task Ability Evaluation; legend: Self, Partner.]

Table S1.

Linear mixed-effects model predicting pitch-level deviance scores by Game Period, Batting Team, and Ejection Type (Study 1).

Random effects                                SD      CI
Home Plate Umpire
  Intercept                                   0.016   [0.011, 0.021]
Game ID
  Intercept                                   0.036   [0.032, 0.039]
  Game Period (Pre- vs. Post-ejection)        0.014   [0.000, 0.027]
  Batting Team (Non-ejected vs. Ejected)      0.036   [0.027, 0.043]
  Game Period × Batting Team                  0.034   [0.000, 0.055]
Residual                                      0.318   [0.316, 0.320]

Fixed effects                                 Estimate  CI                 t       df        p
Intercept                                     -0.001    [-0.006, 0.004]    -0.48   85.71     .634
Game Period (Pre- vs. Post-ejection)          0.005     [-0.000, 0.010]    1.84    676.05    .066
Batting Team (Non-ejected vs. Ejected)        0.000     [-0.005, 0.006]    0.09    716.83    .932
Ejection Type (Other vs. Pitch-related)       -0.002    [-0.009, 0.006]    -0.41   716.4     .680
Game Period × Batting Team                    0.021     [0.011, 0.031]     4.23    712.72    < .001
Game Period × Ejection Type                   0.012     [0.002, 0.022]     2.30    677.85    .022
Batting Team × Ejection Type                  0.001     [-0.009, 0.012]    0.24    716.78    .808
Game Period × Batting Team × Ejection Type    0.042     [0.022, 0.061]     4.11    712.65    < .001

Table S2.

Linear mixed-effects model predicting pitch-level deviance scores by Game Period, Batting Team (pitch-related ejections only; Study 1).

Random effects                                SD      CI
Home Plate Umpire
  Intercept                                   0.014   [0.000, 0.022]
Game ID
  Intercept^a                                 0.037   [0.033, 0.042]
  Game Period (Pre- vs. Post-ejection)        0.016   [0.000, 0.033]
  Batting Team (Non-ejected vs. Ejected)      0.033   [0.018, 0.044]
  Game Period × Batting Team                  0.049   [0.000, 0.075]
Residual                                      0.321   [0.319, 0.324]

Fixed effects                                 Estimate  CI                 t       df        p
Intercept                                     -0.001    [-0.008, 0.005]    -0.44   73.74     .660
Game Period (Pre- vs. Post-ejection)          0.011     [0.003, 0.018]     2.77    284.33    .006
Batting Team (Non-ejected vs. Ejected)        0.001     [-0.007, 0.009]    0.15    319.24    .878
Game Period × Batting Team                    0.042     [0.026, 0.057]     5.37    314.61    < .001

Table S3.

Linear mixed-effects model predicting pitch-level deviance scores by Game Period, Batting Team (other kinds of ejections only; Study 1).

Random effects                                SD      CI
Home Plate Umpire
  Intercept                                   0.014   [0.004, 0.021]
Game ID
  Intercept^a                                 0.036   [0.031, 0.041]
  Game Period (Pre- vs. Post-ejection)        0.013   [0.000, 0.029]
  Batting Team (Non-ejected vs. Ejected)      0.038   [0.028, 0.048]
  Game Period × Batting Team                  0.009   [0.000, 0.051]
Residual                                      0.315   [0.313, 0.318]

Fixed effects                                 Estimate  CI                 t       df        p
Intercept                                     -0.001    [-0.007, 0.004]    -0.49   78.36     0.624
Game Period (Pre- vs. Post-ejection)          0.001     [-0.006, 0.008]    0.32    398.67    0.751
Batting Team (Non-ejected vs. Ejected)        0.000     [-0.007, 0.008]    0.13    402.89    0.898
Game Period × Batting Team                    0.000     [-0.012, 0.013]    0.06    402.79    0.954

Table S4.

The table below reports the results of a linear mixed-effects model predicting pitch-level deviance scores by Batting Team (non-ejected vs. ejected) and a continuous measure of game period (AB Distance). To allow for non-linear effects of AB Distance, the model included linear (L), quadratic (Q), and cubic (C) terms, as well as their interaction with Team.

Random effects                                SD      CI
Home Plate Umpire
  Intercept                                   0.014   [0.000, 0.022]
Game ID
  Intercept^a                                 0.037   [0.032, 0.043]
  Distance from Ejection AB (Linear)          1.743   [0.000, Inf]
  Batting Team (Non-ejected vs. Ejected)      0.033   [0.019, 0.044]
  Distance from Ejection AB × Batting Team    4.268   [0.000, 6.731]
Residual

Fixed effects                                 Estimate  CI                 t       df        p
Intercept                                     -0.003    [-0.009, 0.004]    -0.79   71.58     .430
Distance from Ejection AB (Linear)            0.776     [0.027, 1.526]     2.03    268.89    .043
Distance from Ejection AB (Quadratic)         -0.100    [-0.790, 0.593]    -0.28   568.81    .778
Distance from Ejection AB (Cubic)             -0.300    [-0.974, 0.373]    -0.87   910.19    .384
Batting Team (Non-ejected vs. Ejected)        0.004     [-0.004, 0.011]    0.88    301.60    .379
Distance from Ejection AB (L) × Batting Team  -2.588    [-4.021, -1.153]   -3.55   253.60    < .001
Distance from Ejection AB (Q) × Batting Team  -1.497    [-2.868, -0.127]   -2.14   543.20    .033
Distance from Ejection AB (C) × Batting Team  1.720     [0.374, 3.070]     2.50    919.98    .012

Table S5.

Linear mixed-effects model predicting pitch-level deviance scores by Game Period, Batting Team, and Bin Ambiguity (only pitch-related ejections; Study 1).

Random effects                                SD      CI
Home Plate Umpire
  Intercept                                   0.014   [0.000, 0.022]
Game ID
  Intercept^a                                 0.037   [0.033, 0.043]
  Game Period (Pre- vs. Post-ejection)        0.016   [0.000, 0.032]
  Batting Team (Non-ejected vs. Ejected)      0.033   [0.018, 0.044]
  Game Period × Batting Team                  0.049   [0.000, 0.075]
Residual                                      0.321   [0.319, 0.324]

Fixed effects                                 Estimate  CI                 t       df        p
Intercept                                     -0.002    [-0.009, 0.005]    -0.46   98.36     .649
Bin Ambiguity                                 -0.001    [-0.013, 0.011]    -0.15   34430.77  .880
Game Period (Pre- vs. Post-ejection)          0.018     [0.009, 0.027]     3.92    591.79    < .001
Batting Team (Non-ejected vs. Ejected)        -0.000    [-0.010, 0.009]    -0.10   627.40    .923
Bin Ambiguity × Game Period                   0.035     [0.012, 0.059]     2.91    34426.65  .004
Bin Ambiguity × Batting Team                  -0.005    [-0.029, 0.018]    -0.45   34416.04  .653
Game Period × Batting Team                    0.070     [0.052, 0.089]     7.51    648.52    < .001
Bin Ambiguity × Game Period × Batting Team    0.134     [0.086, 0.181]     5.49    34429.69  < .001

Chapter 10

The Psychological Science of Spending Money

Travis J. Carter

T. J. Carter (*)
Department of Psychology, Colby College, 5550 Mayflower Hill Drive, Waterville, ME 04901-8855, USA
e-mail: tjcarter@colby.edu

E. Bijleveld and H. Aarts (eds.), The Psychological Science of Money, DOI 10.1007/978-1-4939-0959-9_10, © Springer Science+Business Media New York 2014

Abstract This chapter discusses the psychological research related to the act of spending money, with the aim of understanding the underlying psychological processes involved. To that end, the emotions involved in spending money before, during, and after the money changes hands are explored, including the role of anticipated and anticipatory emotions, different orientations to the gains and losses inherent in an act of spending, and the process of hedonic adaptation. Additionally, given how fundamental choice is to the act of spending money, factors that influence the decision-making process are discussed, including the role that comparative processes and expectations play in the process of making decisions and evaluating their outcomes. In each case, particular attention is paid to the psychological forces that influence the ultimate goal underlying any act of spending: happiness. Finally, several concrete strategies for making purchases most likely to lead to success on this goal are identified, including purchasing experiences over possessions, spending pro-socially, and making meaningful purchases.

The Act of Spending Money

The act of spending money is absolutely ubiquitous in modern life. It is the primary way that we meet our basic needs, spending it on food, clothing, shelter, health care, transportation, and entertainment, and is so ingrained in modern life that we rarely reflect on what that act represents. At its most basic level, the act of spending is nothing more than an exchange: one person gives money to another and receives some good or service in return. This definition is serviceably descriptive, but omits any psychological antecedents or consequences for the spender. For one thing, it leaves out the element of choice. Money isn’t spent by accident, the result of tripping over an errant shoelace; one chooses to exchange money for some particular purchase instead of other possible purchases, or instead of purchasing nothing at all. Choices are made with a purpose, intended to create some outcome. That particular choice is based on the belief that the purchase will produce a greater hedonic benefit (for oneself, or for others) than the alternatives over some period of time (Mellers & McGraw, 2001; Mellers, Schwartz, & Ritov, 1999). In addition to that expected hedonic gain, spending money also inherently involves costs. There is obviously the direct monetary cost, but also the opportunity cost: all of the other ways that one could have spent this money must now be forgone. Thus, a more psychological definition of the act of spending money would be a simultaneous loss (of money and opportunity) and gain (of some good or service) for oneself and/or someone else that one chooses to undertake based on some beliefs about future hedonic states.

To see the implications, it’s worth unpacking the various components of this definition further. First, gains and losses are inherently affectively laden constructs; they are important because they create feelings of pleasure and pain, even when merely anticipating a potential gain or loss (see Knutson, Rick, Wimmer, Prelec, & Loewenstein, 2007). Although it can be seen as the output of some cost–benefit analysis, the choice to spend money is not merely some cold cognitive calculation; it is an affective event involving some balance of pleasure and pain paid out over some period of time. Purchases are certainly made with the intention of producing an emotional experience, but emotions felt during the act of considering a purchase can also influence the decision-making process and its outcome (Andrade & Ariely, 2009; Isen, 2001; Lerner, Small, & Loewenstein, 2004; Mattila & Wirtz, 2000). Second, the exact nature of the pleasure and pain experienced as a result of a given purchase is by no means certain. Rather, it is how we anticipate we will feel as a result of the purchase, a forecast based on some imagined future. Making a forecast requires that we first imagine what the basic facts of the situation will be like before estimating how that imagined situation will make us feel. Unfortunately, we tend to be overconfident and optimistic in our predictions about the basic facts of a future situation (e.g., Griffin, Dunning, & Ross, 1990; Newby-Clark, Ross, Buehler, Koehler, & Griffin, 2000), so perhaps it is not surprising that predictions of future emotional states are also typically inaccurate (Wilson & Gilbert, 2003). This is especially important because of a third aspect of the act of spending: choice. The act of spending inherently involves an act of choosing: choosing not only if but also which thing to purchase. Thus, forecasting a single imagined future is insufficient. In order to choose which option to purchase, we must imagine a future scenario for each possible choice we might make, and predict how each one will make us feel. The uncertainties and biases involved can multiply quite quickly, turning what could have been a simple exchange into a daring act of mentalism. Fourth, the self is an important component of any purchase (see Belk, 1988). The decisions we make help make us who we are, and purchase decisions are no different. Indeed, some purchases are explicitly intended to reflect or convey aspects of our personalities (Tian, Bearden, & Hunter, 2001). Finally, and relatedly, other people are certainly present in our forecasted futures. In addition to predicting how something will make you feel, you must often imagine how a given purchase will make someone else feel (a spouse or friend who might share in the outcome, for instance) and factor these other feelings into the decision-making process.

The remainder of this chapter will explore these facets of the act of spending money in greater depth, but always keeping in mind why people choose to spend money: in order to make themselves happier (see Csikszentmihalyi, 2000; Diener & Fujita, 1995). Indeed, based in part on the belief that accumulating wealth will allow them to spend more money and further improve their welfare (Aknin, Norton, & Dunn, 2009; Kahneman, Krueger, Schkade, Schwarz, & Stone, 2004; Van Praag & Frijters, 1999), people work very hard to acquire money (see Ahuvia, 2008),¹ often sacrificing time with family and friends in the pursuit of wealth (Kasser, Cohn, Kanner, & Ryan, 2007; Nickerson, Schwarz, Diener, & Kahneman, 2003), even to the point that wealth acquisition has become a mindless enterprise (Hsee, Zhang, Cai, & Zhang, 2013). This chapter will examine how each of the different aspects of the act of spending money highlighted above connects to the broader goal of happiness, but it’s worth first asking the more global question: does spending money, on average, make people happier?

One fairly straightforward approach to answering this question is simply to

examine the relationship between wealth and happiness. Having money is, after all,

a precondition to spending it (ignoring for the moment the perils of using credit

cards to spend money one doesn’t have). Thus, if spending money is effective in

serving its purpose, then the richest individuals, who have more money to spend,

should be the happiest. If not, then the pursuit of additional wealth seems futile;

having more money wouldn’t actually make people any happier. An abundance of

research over many decades shows that although there is most definitely a positive

relationship between wealth (typically measured as income) and happiness, it is

typically quite modest and suffers considerably from diminishing returns (for recent

reviews, see Diener, Tay, & Oishi, 2013; Sacks, Stevenson, & Wolfers, 2012). That

is, although richer people are generally happier than poorer people, the hedonic

impact of additional wealth levels off. The same amount of additional wealth has a

fairly dramatic impact on the happiness of the impoverished, but it has a fairly small

impact on the wealthy.

One of the generally accepted reasons for this has to do with how money is spent at different levels of wealth. At lower income levels, money is generally being spent to meet basic human needs, like food and shelter, which, not surprisingly, produces a fairly large hedonic return (Biswas-Diener & Diener, 2001).² At higher income levels, where basic needs can be taken for granted, much of the money that people spend can be considered discretionary: spending on wants instead of needs, with the express intention of making themselves happier. It is this general realm of spending, where the pressures of basic survival don’t apply, and indeed where the relationship between wealth and happiness is fairly modest, that will be the focus of this chapter, because it is the one that requires more explanation. If money spent on discretionary purchases seems to make a relatively small contribution to well-being, then we are left with two possibilities. Either discretionary spending is simply ill suited to producing happiness (despite our intuitions and intentions) or people simply have misguided notions about how to spend their money to actually make themselves happier (Dunn, Gilbert, & Wilson, 2011). In the sections that follow, I will focus on the role that emotions and choice play before, during, and after one engages in an act of spending, and in particular on identifying issues that prevent purchases from producing their intended effect: happiness. Then, I will outline some strategies, including the types of purchases and the recipient of the expenditure, that can maximize each individual act of spending’s contribution toward that overarching goal of happiness.

¹ It is worth noting, of course, that people accumulate wealth for reasons that have nothing to do with specific planned expenditures, such as to prevent an unexpected and catastrophic life event (like an expensive health care emergency) from destroying one’s ability to meet basic needs. Indeed, the anxiety associated with debt has devastating effects on well-being (Brown, Taylor, & Price, 2005). The status that comes with wealth is also seen by some as an end in and of itself (Kasser & Ryan, 1993). While these factors undoubtedly play a role in the acquisition of wealth, because this chapter is specifically exploring the act of spending money and not its acquisition, they are better suited for discussion elsewhere.

Emotions

As described above, the mere act of spending money itself is not hedonically neu-

tral. It’s important to note, however, that equivalent gains and losses produce asym-

metrical hedonic outcomes (pleasure and pain, respectively). As put forth by

prospect theory (Kahneman & Tversky, 1979), from the same reference point,

losses are felt more strongly than gains (Kahneman & Tversky, 1984; Tversky &

Kahneman, 1991; cf. Novemsky & Kahneman, 2005)—dropping $20 down a storm

sewer would feel worse than finding $20 on the street would feel good. Thus, when

considering a purchase, it is no surprise that people naturally focus on the losses that

they will incur (Carmon & Ariely, 2000), because that is often the more potent emotional

experience.
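
The asymmetry between gains and losses can be made concrete with a small numerical sketch. The piecewise power form below is one common way of writing a prospect-theory value function; the function name `subjective_value` and the parameter values (`alpha = 0.88`, `loss_aversion = 2.25`) are illustrative assumptions, not figures reported in this chapter.

```python
def subjective_value(x, alpha=0.88, loss_aversion=2.25):
    """Felt value of a monetary change x relative to a reference point of 0.

    alpha (< 1) gives diminishing sensitivity to larger amounts;
    loss_aversion (> 1) scales losses so they loom larger than equal gains.
    Both defaults are assumed values chosen for illustration only.
    """
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** alpha)


gain = subjective_value(20)    # finding $20 on the street
loss = subjective_value(-20)   # dropping $20 down a storm sewer

print(f"felt value of +$20: {gain:.2f}")
print(f"felt value of -$20: {loss:.2f}")
print(f"loss-to-gain magnitude ratio: {abs(loss) / gain:.2f}")
```

Under these assumed parameters, the same $20 weighs more heavily as a loss than as a gain, which is the pattern the storm-sewer example describes.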

Anticipated vs. Anticipatory

However, the affect experienced as a result of a given purchase does not simply start

at the moment the money is spent; there are emotions felt well prior to the purchase,

and which continue to reverberate long into the future. That is, there is a distinction

to be made between anticipated emotions and anticipatory emotions (Loewenstein,

Weber, Hsee, & Welch, 2001). Anticipated emotions are the emotions you expect

2 At the extreme low end of the income spectrum, spending money might even be better thought of

as intended to decrease misery rather than increase happiness (see Martin & Hill, 2012).

T.J. Carter

217

to feel when you actually take possession of the new purchase—the joy you’d

experience when using a new iPhone, or the guilt you might feel after eating a tub

of popcorn at the movies—and aren’t really emotions at all. They are cognitions, a

forecast of what your experience with the purchase will be like at some point in the

future, and the emotions you predict that experience will stir up. The role of antici-

pated emotions on choice and evaluation is a largely conscious one: we decide

whether and how to spend money based on how we anticipate the various courses of

action will make us feel (Mellers et al., 1999; Shiv & Huber, 2000), and evaluate the

outcome based partly on how the actual outcome compares to our expectations

(Bell, 1985).

Anticipatory emotions, on the other hand, are the emotions you experience at the

very moment you are considering the purchase: imagining the pleasure you will

experience when you finally get to use your new iPhone might very well make you

giddy in the present, or you might feel some immediate guilt as a result of imagining

gorging yourself on buttery popcorn. Or, instead of thinking about how the purchase

you’re considering might make you feel, you might think about the opportunity

costs—purchases you’ll have to delay or forgo as a result of spending this money.

Buying a new car might mean you have less money to spend on dinners at restau-

rants, and you might feel some negative emotions while merely considering missing

those opportunities. The role of anticipatory emotions in choice and evaluation tends

to be less conscious, and as a result, people may not realize how large an impact it

might have (Andrade & Ariely, 2009). These immediate emotions can be used as a

cue for how one should choose in normal circumstances (e.g., Pham, 1998), but can

also exert a considerably more powerful (and hard to control) influence when the

emotions are more intense (see Loewenstein, 1996).

Because they play different roles in guiding the choice and evaluation process,

the distinction between anticipated and anticipatory emotions is important to understanding

the act of spending money. However, it can be difficult to tease their roles

apart in practice, largely because they influence each other both directly and indirectly

(Loewenstein & Lerner, 2003). The type and magnitude of the expected

(anticipated) emotions resulting from some event in the future (eating a delicious

meal, for instance) will influence the type and magnitude of the anticipatory emotions

you experience immediately upon imagining that future state. At the same

time, anticipatory emotions can influence exactly how that future state is imagined,

which will, in turn, influence the emotional experience predicted to result from it.

What’s more, because the act that sets it all in motion is imagining a future state,

that entire process will also be influenced by any number of other factors that are

important to future-oriented thinking. For instance, simply thinking about an event

that is close in time, as opposed to one that is further off into the future, will lead

people to imagine it very differently. The closer in time an event is, the more likely

people are to focus on its more concrete aspects (Trope & Liberman, 2003), to

reduce their subjective confidence about what exactly will transpire (Gilovich, Kerr,

& Medvec, 1993), and to experience more intense immediate emotions (Loewenstein,

1996). This difficulty notwithstanding, researchers have had a great deal of success

both measuring and manipulating the separate cognitive (anticipated) and affective

(anticipatory) processes involved in decision-making and outcome evaluation

(see Loewenstein & Lerner, 2003 for a review). One notable issue that has arisen

relates to the pleasure and pain—both anticipated and anticipatory—evoked by the

gain and loss side of a monetary transaction, respectively, and the psychological

consequences of focusing on one side or the other.

The Pain of Paying

Because people vary in the degree to which they tend to focus on acquiring pleasurable

gains (promotion goals), rather than avoiding painful losses (prevention goals; Higgins,

1997), focusing on the gain rather than the loss side when pondering a purchase

decision will have a big impact on both anticipated and anticipatory emotions, and as a

result, the likelihood of actually spending money. The different spending habits of

so-called spendthrifts and tightwads illustrate the consequences of gain/loss focus

quite well (Rick, Cryder, & Loewenstein, 2008). Spendthrifts tend to focus on what

they’ll gain from spending money, and all but ignore the costs, and so end up spending

too freely on purchases whose hedonic impact is fleeting at best. Tightwads generally

focus on the losses involved when spending money and will often refuse to spend

money that might nonetheless yield significant hedonic gains.3 Indeed, in addition to

concentrating on the “pain of paying” (Prelec & Loewenstein, 1998), tightwads worry

about opportunity costs, something that most people do not do spontaneously

(Frederick, Novemsky, Wang, Dhar, & Nowlis, 2009) unless they are actively considering

many different options and must forgo all but the one they choose (Carmon,

Wertenbroch, & Zeelenberg, 2003; see also Ariely, Huber, & Wertenbroch, 2005).

The context in which a decision is made can create a sense of “fit” with one’s natural

focus and lead to better outcomes, such as greater satisfaction (Avnet & Higgins,

2006). As such, one way to encourage tightwads to part with their money is to emphasize

aspects of the purchase situation that reduce the perceived pain of paying.

For instance, in one experiment, participants were asked to imagine that they could

choose to receive a boxed set of DVDs from Amazon.com for free, if they were will-

ing to pay $5 to cover shipping costs. In the baseline condition, true to form, spend-

thrifts were considerably more willing than tightwads to pay the $5 in order to receive

the DVDs. However, when the shipping charge was described as “a small fee,”

making the amount seem insignificant and reducing the perceived pain of paying it,

tightwads were just as willing as spendthrifts to pay the fee (Rick et al., 2008).

Perhaps examining these different spending tendencies, rather than looking at

the relationship between wealth and happiness, can provide a more direct answer to

the question of whether spending money makes people happier. That is, if spending

money does increase well-being on average, then tightwads, who are generally quite

reluctant to part with their money, may be missing genuine opportunities to impact

3 Those who generally feel that they spend and save appropriately are referred to as unconflicted

(Rick et al., 2008).

their happiness. Conversely, spendthrifts, who engage in spending opportunities

they probably shouldn’t, might actually be measurably happier than both tightwads

and unconflicted spenders as a result. To find out how these different attitudes

toward spending money relate to more global measures of happiness, I recruited

participants from Amazon.com’s Mechanical Turk to complete the Spendthrift–

Tightwad scale (ST–TW; Rick et al., 2008) and the Subjective Happiness Scale

(Lyubomirsky & Lepper, 1999). Even when controlling for relevant demographic

differences (income and age), participants classified as tightwads did report lower

subjective happiness (M = 4.47, SD = 1.28) than the other two groups, β = .232,

t(309) = 2.07, p < .05, but spendthrifts (M = 4.76, SD = 1.22) and the unconflicted

(M = 4.76, SD = 1.29) were equally happy, t < 1, ns.

Why do spendthrifts, who experience the least pain of paying, and who should

presumably be reaping some hedonic rewards from their unrestrained spending, show

no gains in happiness relative to the unconflicted? Or, put another way, what does this

non-difference say about the ability of purchases to actually make people happy?

One reason might be related to how people adapt to hedonic events, like the short-term

shifts in happiness produced by spending money. That is, since spendthrifts are more

focused on the potential gains (or at least less concerned with the potential losses),

they may be more likely to succumb to a classic forecasting error: failing to anticipate

how quickly they will adapt to their future circumstance (Gilbert, Pinel, Wilson,

Blumberg, & Wheatley, 1998; Wilson, Wheatley, Meyers, Gilbert, & Axsom, 2000),

an issue to which I’ll return below. There is also the possibility that, by not confronting

the pain of paying, spendthrifts are not forced to fully consider whether a given

purchase’s predicted benefits will outweigh its costs, and as a result are making the

kinds of purchases least likely to actually increase happiness.

It’s worth noting that although tightwads experience the pain of paying to a much

greater degree than most, the loss of money is an inevitable part of any purchase,

meaning that everyone will experience the pain of paying to some degree. In many

circumstances, the exchange of money for goods and services is simultaneous,

meaning that the pains and pleasures are also experienced simultaneously, the pain

thus robbing some of the pleasure. However, the exchange need not be simultane-

ous, and by temporally decoupling the gain and loss, one can reduce the chances

that pain experienced from the loss of money will negatively impact the pleasure

experienced from the new purchase (Prelec & Loewenstein, 1998). One way to do

this is to consume first and delay the pain of payment for as long as possible, hoping

that it will be less painful in the future than it would be right now (Kassam, Gilbert,

Boston, & Wilson, 2008). To an extent, this has its intended effect: the immediate

pleasures are unspoiled by an immediate loss. The allure of this approach is evident

in the difference between paying with cash and with credit card. Cash payments are

immediate and visceral—the money literally leaves your hands and becomes some-

one else’s possession. Credit cards, on the other hand, are abstract and distant; they

allow you to put off the pain of paying until next month, often while enjoying the

benefit immediately. Spending money this way may seem painless, and almost

certainly does reduce the negative anticipatory emotions that might prevent one from

making a purchase, but it only forestalls the inevitable. When the end of the month

rolls around and the credit card bill comes due, that pain may actually be magnified

because the pleasure you experienced is already in the past. What’s more, because

credit cards diminish the pain in the present, they can encourage reckless spending—

you’re much more likely to have a “what was I thinking?!” moment for purchases

made with credit cards than with cash (e.g., Prelec & Simester, 2001; Soman, 2001).

A somewhat counterintuitive alternative that seems to have considerable hedonic

benefits is to endure the pain of paying immediately and delay consumption until

later. Paying in advance may be painful initially, but it allows two distinct benefits.

First, you get the benefits of anticipating a positive experience (e.g., Nowlis,

Mandel, & McCabe, 2004; an issue discussed further below), and second, because

the pain of paying is behind you when actually consuming, there is no anticipated

pain to dampen the experience. All-inclusive resorts might cost a bundle up front,

and they do hold some risk of paying more for the same amount of consumption, but

they do effectively decouple the payment from the experience. Rather than feeling

a slight twinge of pain each time you shell out the money for a cocktail, you can feel

like you’re getting a better and better deal with each drink—putting the sunk cost

effect (Arkes & Blumer, 1985) to work in your favor, though with the possible side

effect of severe hangovers. If making yourself happy is the goal, then it might be

worth the risk of overpaying to feel better about the money you’re spending. In

short, it’s often far better to pay up front and delay consumption until later (for a

review, see Dunn & Norton, 2013).

Hedonic Adaptation

Purchases, like anything else that produces hedonic gains, are subject to one of the

fundamental facets of human experience: hedonic adaptation (Frederick &

Loewenstein, 1999; see also Diener, Lucas, & Scollon, 2006). That is, over time, the

same experience that once made you dizzyingly happy will merely bring a smile to

your face. Hedonic adaptation to a new car may be inevitable, but it isn’t necessarily

problematic unless it’s unaccounted for in the decision-making process. Unfortunately,

when people anticipate how a given purchase will make them feel, they can recognize

that it will become less intense over time, but generally fail to consider this fact at the

time of purchase (Ubel, Loewenstein, & Jepson, 2005; Wang, Novemsky, & Dhar,

2009). Focusing only on the immediate spike in happiness and ignoring the subsequent

decline means that the anticipated experience—the one on which people base

their expectations, and thus, their decisions—may be quite different from the actual

experience, increasing the chances of disappointment. Accurately predicting not just

the initial hedonic experience that a given purchase will provide, but also how it will

change over time, is important in making sound purchase decisions.

In order to accomplish more accurate predictions, it’s helpful to know a little

more about how hedonic adaptation operates. One of the reasons our experiences

become less intense over time is through the process of satiation with repeated

experiences. For instance, people know not to eat their favorite meal seven nights in

a row for fear that, by the time night seven rolls around, the mere smell of it will at

best be unappetizing, and at worst will be stomach-churning. People seek variety

and novelty to prevent satiation with repeated experiences, but probably don’t do it

optimally (for a review, see Alba & Williams, 2013). Even with adequate intervals

between events, sometimes we gain expertise that renders the earlier experience less

impressive. For instance, many novice wine drinkers are quite happy to drink whatever

wine is put in front of them. The flavors that are easiest to discern (sweetness,

for instance) are often the flavors characteristic of less expensive wine. But, over

time, as the palate grows more sophisticated, many wine drinkers start to crave

more complex and subtle flavors, and must pay handsomely for the privilege.

Thus, they must spend more money to achieve the same hedonic benefit—a certain

amount of happiness from drinking a glass of wine—than would have been neces-

sary earlier in their wine-drinking career. What was once a favorite bottle will

eventually begin to taste cloyingly sweet, or perhaps bland and muted. Indeed,

many positive life changes, like purchasing a new car or getting a raise, create

aspirations over time that make the previously great change seem unimpressive

(see Sheldon & Lyubomirsky, 2012).

One obvious lesson of hedonic adaptation, of course, is that novices should not

spend a lot of money on something that requires more sophistication than they possess

to fully appreciate. Another implication is that attempting to maintain a relatively

stable level of happiness may require spending ever-increasing amounts of money.

This is, in many ways, similar to the way that drug addiction operates. Neurological

systems respond to repeated use of addictive drugs with neuroadaptation: since for-

eign chemicals (e.g., cocaine) are doing the same job as natively produced neurotrans-

mitters (e.g., acting on dopamine receptors), the systems that produce those

neurotransmitters begin to produce less and less over time. With fewer neurotransmit-

ters naturally available to bind to those receptors, those systems will require increas-

ing amounts of the drug to achieve the same level of activation. Plus, since those

systems are typically involved in the experience of pleasure, the reduced activation of

those systems during any period of abstention reduces positive affect, which fuels a

desire for the drug just to get back to baseline levels—the neurochemical equivalent

of loss aversion (Koob & Le Moal, 2001). In just the same way, if you decide to

upgrade from the 1994 Ford Fiesta you’ve driven for years to a new Mercedes, the first

drive off the lot will be thrilling. After a year or two, that thrill will mostly be gone, and

the feeling of luxury provided by the Mercedes will eventually begin to feel normal.

The only way to get that thrill again will be to increase your dosage with the new

model, which will not be cheap. During any abstention from that new baseline, say if you

go back to driving your old Fiesta while the Mercedes is in the shop, what was once

perfectly adequate will feel perfectly intolerable—your baseline level of activation

has changed, and you’ll jones for that new normal.

4 A recent blind taste-test study found that those with some training with wine show a positive

(though small) relationship between price and enjoyment, meaning that they enjoyed the more

expensive wines more. Novices, however, actually showed a negative correlation; they liked the

cheaper wines better (Goldstein et al., 2008).

In fact, this is one explanation for the very modest relationship between wealth

and happiness: as income rises, people adapt to their new standard of living, and

must spend more to feel the same amount of happiness they had at their old salary

(Diener & Biswas-Diener, 2002). A reduction in salary is now treated as a loss,

which has more severe negative consequences for well-being than the initial increase

did positive consequences (Boyce, Wood, Banks, Clark, & Brown, in press). What’s

more, new evidence suggests that wealth may actually hinder the ability to savor

positive experiences and emotions. In one study, participants were given a series of

vignettes, such as discovering an amazing waterfall, and asked how they would

behave in each scenario. Wealthier participants, as well as participants who were

merely exposed to a reminder of wealth (a photograph of a stack of money), were less

likely to claim that they’d use a savoring strategy, such as reminiscing or telling

friends about the experience. That reduced ability to savor seems to explain some of

the relatively weak correlation between wealth and happiness; wealthy participants

were less happy because they were less likely to engage in savoring activities

(Quoidbach, Dunn, Petrides, & Mikolajczak, 2010). Thus, it may not be that spending

more money is absolutely required in order to overcome the forces of adaptation.

Rather, focusing on the experiences, savoring them each time they happen, may

prevent the need to spend an ever-increasing amount of money (Chancellor &

Lyubomirsky, 2011; Kasser, 2011).

Choices, Choices, Choices

Aside from having money to spend, the initial step toward the act of spending

money is to choose which particular good or service you’ll be purchasing. In the

simplest case, you are faced with a single purchase option, and the decision is sim-

ply whether or not to make the purchase. Presumably, as described above, that deci-

sion is based on some assessment of the expected costs compared with the expected

hedonic gains. For instance, you might hear that the new Daft Punk album just came

out, and decide whether or not it is worth $10 to own the album. The calculus is

fairly simple: if you think that you’ll get a greater hedonic gain from listening to the

synthesized singing of French robots than the other ways you can think of spending

$10, then you should choose to buy it. Otherwise, keep the money.

This extremely simple scenario is becoming less and less common, however.

The more likely case is that there are multiple options you are considering that

would fill the same need, and you must choose only one of them. When buying

lunch, for example, it’s often not a simple question of whether or not to buy a salad

(and “not” isn’t really an option, since you’re not about to go hungry). Instead,

you’ll need to decide whether to buy a salad, a burrito, a slice of pizza, a bowl of

curry, a falafel sandwich, or any of the myriad lunch options that happen to be avail-

able to you at the time. Each of these options carries with it some potential hedonic

gain, some monetary cost, and choosing any one of them requires that you forgo

the other options—at least for the day.

Even assembling the set of options you intend to choose from—the consideration

set—is becoming an increasingly difficult task in and of itself (see Schwartz, 2004).

In theory, more options should lead to better outcomes for consumers, as the likelihood

of finding an option that exactly matches one’s preferences should increase

with the size of the choice set (e.g., Johnson & Payne, 1985; Kahn & Lehmann,

1991; Shugan, 1980), and indeed, people generally share this intuition, preferring to

have a lot of options to choose from (Chernev, 2003). However, the number of

options available within product categories has ballooned well past what is actually

good for consumers (Schwartz, 2004),5 sapping people of the motivation to engage

in the decision-making process (Iyengar & Lepper, 2000).6 In practice, the cognitive

burdens created by large choice sets and time constraints can leave people feeling

confused and unconfident (Haynes, 2009; Lee & Lee, 2004), even when they have

a great deal of control over the information presented to them (Ariely, 2000).

To illustrate how you might approach a choice from a large set of options, imagine

that you are deciding which television to buy. You should be able to narrow your

options by excluding options that are too expensive or too small (or large, for that mat-

ter) pretty easily, but you may still have hundreds of options to choose from, and no

easy way to know which one to choose. There are at least two major strategies for

whittling one’s consideration set down to a single chosen option. One approach is to

compare the relevant attributes of all of the options you’re considering, and attempt to

identify the very best option. This strategy is referred to as maximizing. An alternative

approach to making such a decision is to use a satisficing strategy: simply set a standard

for quality and select the very first option you come across that meets this standard

(Simon, 1955). Although maximizing should theoretically yield better

outcomes—done properly, you should always get the best option available—in practice,

people who tend to engage in maximizing (rather than satisficing) are subject to

a host of negative psychological outcomes, such as increased depression and decreased

life satisfaction (Schwartz et al., 2002). What’s more, maximizers have a hard time

committing to any one option, showing less of the post-decision rationalizing that

helps us feel good about our choices no matter how good a choice it was (Sparks,

Ehrlinger, & Eibach, 2012). This helps explain why maximizers report less satisfaction

than satisficers despite obtaining objectively better outcomes (Iyengar, Wells, &

Schwartz, 2006). The differences between using a maximizing and a satisficing

approach, and particularly the differences in the resulting psychological well-being,

help illustrate two of the big reasons why large choice sets can be problematic: the

large number of comparisons required and unreasonable expectations.
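
The two strategies can be contrasted in a short sketch. Everything in it is hypothetical and chosen only for illustration: the television labels, the single "quality" score standing in for all relevant attributes, and the standard of 8.0. Real purchase decisions involve many attributes that do not reduce to one number, so this shows only the difference in search effort, not a model of actual consumer behavior.

```python
from typing import Optional, Sequence, Tuple

# (hypothetical model label, hypothetical overall quality score)
Option = Tuple[str, float]


def maximize(options: Sequence[Option]) -> Tuple[Option, int]:
    """Inspect every option; return the best one and how many were inspected.

    Assumes a non-empty consideration set.
    """
    best = options[0]
    inspected = 0
    for option in options:
        inspected += 1
        if option[1] > best[1]:
            best = option
    return best, inspected


def satisfice(options: Sequence[Option], standard: float) -> Tuple[Optional[Option], int]:
    """Return the first option meeting the standard and how many were inspected.

    Returns (None, count) if nothing in the set meets the standard.
    """
    inspected = 0
    for option in options:
        inspected += 1
        if option[1] >= standard:
            return option, inspected
    return None, inspected


televisions = [("A", 6.5), ("B", 8.1), ("C", 7.9), ("D", 9.4), ("E", 8.8)]

best, n_maximize = maximize(televisions)
good_enough, n_satisfice = satisfice(televisions, standard=8.0)

print(f"maximizing chose {best[0]} after inspecting {n_maximize} options")
print(f"satisficing chose {good_enough[0]} after inspecting {n_satisfice} options")
```

In this toy set, maximizing ends with the higher-scoring option but has to inspect the whole set, while satisficing stops at the first acceptable option. The gap in effort widens as the consideration set grows.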

5 This is in part due to companies attempting to distinguish themselves in a crowded marketplace.

For any given brand, adding more options leads consumers to infer that the brand has expertise

in the area, and therefore that its offerings are better (Berger, Draganska, & Simonson, 2007).

This approach is, of course, less effective when everyone does it, starting the arms race that created

ultra-specific options like Diet Caffeine-Free Cherry Vanilla Coke, and resulted in sagging store

shelves and bewildered consumers.

6 A recent meta-analysis suggests that the demotivating effect of too-much-choice may be present

in only certain circumstances, such as under time constraints or when the need to justify one’s

choice is high (see Scheibehenne et al., 2009, 2010). This is described further below.

Comparisons

Making a choice from a large consideration set can require a large number of

comparisons, particularly when using a maximizing strategy. To be sure, it is quite

natural to engage in comparative processes (Gilbert, Giesler, & Morris, 1995), and

people often do need comparative information in order to evaluate something prop-

erly. In one particularly telling example, participants were willing to pay more for

7 oz of ice cream when it overflowed a tiny cup than for 8 oz of ice cream when it

only partially filled an enormous cup—they used the size of the cup to inform their

judgments, when it really should be extraneous to how much the ice cream itself is

worth (Hsee, 1998; Sevdalis & Harvey, 2006). Without the ability to make certain

comparisons (e.g., the actual amount of ice cream), misleading cues (like inappro-

priately sized cups) can cause people to make poor decisions.

Indeed, some comparisons might be quite helpful, particularly when they are

easy to make, and there is little chance for error. In the television example above, it’s

quite easy to compare models on price and size, because those attributes are alignable

(e.g., Gentner & Markman, 1994). Clearly, cheaper is better than more expensive,

and larger is better than smaller (within reason, of course). If price and size

were the only attributes televisions had, it would be relatively trivial to make a

choice; you’d still need to find the sweet spot in the apparent trade-off between price

and size, but that’s it. Unfortunately, there will quite often be other features that do

not align—a feature that is present in one option but absent in others. One set might

have a smart dimming feature, while another might have a suite of internet-connected

apps, and still another might include a camera so that you can video chat with fam-

ily and friends. How can you possibly compare these features or decide which one

you’ll appreciate more over time? Attempting to compare incomparable features

can be very frustrating, incredibly demanding (Zhang & Markman, 2001), and

because people tend to search for more options as they learn more about the different

nonalignable features available (Griffin & Broniarczyk, 2010), it can exacerbate

the problem by making the choice set even larger. As the size of the choice set

increases, so does the number of difficult comparisons required, which has negative

consequences for your ultimate satisfaction with your choice (Reutskaja & Hogarth,

2009; Scheibehenne, Greifeneder, & Todd, 2010). Perhaps it is no surprise that having

more alignable features can mitigate some of the downsides of large choice sets

(Herrmann, Heitmann, Morgan, Henneberg, & Landwehr, 2009).

A big part of the reason that nonalignable features are such an issue is related to

the different modes in which we make evaluations (see Hsee, Loewenstein, Blount,

& Bazerman, 1999). In the store, making a decision between ten different televisions,

you are in joint evaluation (JE) mode. In your living room, where you’ll actually

watch the television, you’re in separate evaluation (SE) mode (Hsee & Zhang, 2004).

People can rely on comparative information in JE, when the options are side by side,

but less so in SE, when the other comparison targets are not present. For instance, in

the store, you might see that Television A has a slightly better picture quality than

Television B and decide that this justifies its higher price. However, because it’s very

difficult to evaluate small differences in attributes like picture quality without a direct

comparison, you may not be able to appreciate that slightly better picture once you

bring the television home, removing the justification for the extra money

spent. Attributes that may seem important on a relative level (i.e., when in JE mode)

might not matter at all on an absolute level (i.e., when in SE mode), as long as they’re

above some threshold of quality.

This can work slightly differently for nonalignable attributes because, unlike with alignable attributes, your memory for the presence or absence of some feature can make SE mode feel like JE mode. If you decide not to spend the extra money to get Television A’s better picture quality (an alignable attribute), as long as the picture quality of Television B generally looks good to you, it is unlikely to impact your day-to-day enjoyment. However, if you choose a set without the smart dimming feature (a nonalignable attribute), each time you are nearly blinded by the screen when turning on the television at night, you might recall that you could have avoided that experience by getting a different television, and that knowledge can diminish your satisfaction. Even though you’re not in the store anymore, because you learned about and retained information that does not require the comparison target to be present to evaluate, you may find yourself in JE mode and lose some of the benefits of getting away from comparative information. This is not to say that these nonalignable attributes cannot contribute to enjoying the money you spend, but that they can come with unanticipated costs. Engaging in an extensive comparison process can haunt you later on (Dhar, Nowlis, & Sherman, 1999); it can even feel like the unchosen options that you considered closely are being taken away from you (Carmon et al., 2003). Without such extensive comparisons, you might remain blissfully unaware.

Expectations

When deciding how to spend your money, your expectations will play a role in how you decide as well as how you evaluate the outcome. While pondering whether or not to make a particular purchase, people certainly do try to anticipate how that purchase will ultimately make them feel and make their choices based on these beliefs (Mellers et al., 1999; Shiv & Huber, 2000). Later, when evaluating the purchase, people compare their actual experience with the purchase to their prior expectations of its performance (e.g., Bell, 1985; Oliver, 1980) as well as how their experienced affect matches their expected affect (Patrick, Macinnis, & Park, 2007; Phillips & Baumgartner, 2002). It’s easy to see how people might be wrong on either count and in either direction. In terms of performance, you might correctly expect a new wool sweater to be warm and comfortable but fail to anticipate how itchy it gets, or you might be pleasantly surprised that a new jacket is much better in the rain than you expected. In terms of affect, even if your predictions about how a new pair of shoes will feel are very close to the reality, you might find that you get much more or much less enjoyment out of them than you expected you would (particularly if you fail to consider the role of adaptation, as described above). Money is generally considered well-spent when expectations of performance and experience are met or exceeded, creating happiness and satisfaction, and ill-spent if those expectations are not met, creating dissatisfaction and regret (Bell, 1985; Oliver, 1980).

Expectations are tricky, however, because they are not completely independent of how the event itself is experienced (Wilson, Lisle, Kraft, & Wetzel, 1989). For instance, participants in one study who spent some time thinking about how great a Hershey’s kiss would taste, thus inflating their expectations, ended up enjoying the chocolate more than participants who simply ate it right away (Nowlis et al., 2004). Delaying consumption thus has additional benefits beyond decoupling the pleasures of consumption from the pain of paying, as described above. It provides hedonic benefits from the mere act of anticipating something positive, and it provides time for positive expectations to increase enjoyment of the event. There are limits to how much expectations can positively influence our experiences, of course, so it’s important not to raise expectations well beyond what is reasonable, or dissatisfaction and regret are the likely outcomes. That is, there is a sweet spot in which we are able to reap the benefits of anticipation without succumbing to the problems of missed expectations. This is particularly true of our affective expectations, since affective experience is generally more intense during anticipation than recall (Van Boven & Ashworth, 2007), and people aren’t particularly good at predicting the magnitude (Buehler & McFarland, 2001; Gilbert et al., 1998) or duration (Wilson et al., 2000) of the emotions brought on by some future event. When people inevitably do misforecast their affective reaction, it seems that feeling worse than expected negatively impacts evaluations, but feeling better than expected doesn’t have an equivalent positive impact (Patrick et al., 2007). Consistent with the notion that losses loom larger than gains (Kahneman & Tversky, 1984), people spend a lot more time thinking about why an affective experience didn’t live up to their expectations, but simply accept a more positive affective experience without further elaboration (Gilovich, 1983; Hastie, 1984).

The downsides of expectations are especially evident in large choice sets, since the large number of options can create the expectation that the perfect option is actually available (Diehl & Poynor, 2010). This expectation certainly seems reasonable: how could you not find exactly the right television for you from the hundreds of models available? Having such high expectations can lead to a more extensive search if that perfect option does not present itself quickly, further encouraging a maximizing approach. Plus, as described above, the more extensive your search, the more you learn about nonalignable features (Griffin & Broniarczyk, 2010). That is, as you browse through the available television sets, you will start with a certain number of features that you know you should be checking and comparing, such as price, screen size, picture quality, and energy consumption. When you encounter a set that has a smart dimming feature, something you didn’t previously realize you might want, you now must add it to the list. Each new attribute that you encounter teaches you something about the possibilities, and changes your expectations about what it means to be a good choice. The longer you search, the more you learn, the higher your expectations, and the less likely you are to ultimately end up being satisfied with your choice (Griffin & Broniarczyk, 2010).

High expectations can influence not just the search and decision-making process but also what people end up choosing. When the choice is difficult, as it typically is from large choice sets, many people feel a greater pressure to make a decision that is justifiable to others, and the justifiable choice isn’t necessarily the best choice, at least in terms of happiness. For instance, people are more likely to select a utilitarian option than a hedonic option, since it’s easier to justify buying something that’s useful than something that could be considered indulgent (Sela, Berger, & Liu, 2009). People also place a greater emphasis on alignable features than nonalignable features because they are easier to compare and therefore easier to justify (Markman & Medin, 1995). In fact, the negative effects of choice overload may only occur when decision-makers have some expectation of needing to justify their choice, since the strategy most likely to produce a justifiable choice is maximizing; in the absence of that pressure, large choice sets might not be detrimental at all (Scheibehenne, Greifeneder, & Todd, 2009; Scheibehenne et al., 2010; see also Botti & McGill, 2006; Tsiros, Mittal, & Ross, 2004). Once you have engaged in an extensive search and comparison process, with expectations for a good outcome high, the pressure to get a really good option may be quite high. After all, if you’ve put in a great deal of effort to find a good option and it doesn’t turn out well, you can blame yourself for not doing just a little bit more searching or comparing.

For all the reasons outlined above, it may be no surprise that the kind of extensive search process that maximizers engage in, with all its comparisons and effort, might provide an objectively better outcome, but might actually produce less enjoyment (Iyengar et al., 2006). Thus, whenever possible, you should avoid large choice sets, engage in relatively few comparisons, keep the pressure to get the very best option low, and try to keep in mind whether the relative differences between options will actually produce a meaningful gain in enjoyment. To be sure, many choice contexts are set up in ways that make it difficult to take that advice. Plus, much of that advice is of the “thou shalt not” variety, which isn’t always particularly helpful. To provide more positive approaches, the next section specifically discusses purchases that, by their very nature, eliminate (or at least lower) many of the roadblocks between the act of spending money and the expected hedonic payout.

On What, and on Whom, Should You Spend Money?

The sections above defined and described the act of spending in terms of the psychological processes involved, with a special emphasis on issues that prevent a purchase from achieving its intended outcome: happiness. This section focuses on specific types of purchases that tap more directly into the psychological processes most likely to yield satisfaction and increase overall well-being. To start, the distinction between material possessions (tangible objects like jewelry, clothes, and electronic gadgets) and experiences (intangible purchases like vacations, meals at restaurants, and concerts) has proven quite useful (Van Boven & Gilovich, 2003). Generally, research suggests that for the same amount of money, experiences tend to be more satisfying, and make people happier, than possessions (Carter & Gilovich, 2010, 2012; Howell & Hill, 2009; Howell, Pchelin, & Iyer, 2012; Nicolao, Irwin, & Goodman, 2009; Van Boven & Gilovich, 2003; cf. Caprariello & Reis, 2013).

Although there are several specific reasons why experiences seem to offer hedonic benefits, much of the explanation has to do with the features inherent to each type of purchase. It’s worth stating, of course, that the defining features vary by degree, and thus the distinction between experiences and possessions isn’t always clear-cut. Although most experiences are indeed intangible, there are certainly physical objects that are highly experiential when they are being used, allowing them to change states like ice melting and refreezing. Although a good fiction book is a physical object, it is highly experiential while you are reading it: mentally transporting you to other places, times, or even to other realities. Similarly, a physical copy of your favorite movie is indeed a tangible object, but your main interaction with it is through the experience of watching the film. Once that experience is over, the object goes back on the shelf, just like any other material possession. The existence of these purchases with ambiguous properties does not, however, impugn the importance of the distinction between material and experiential purchases. Even though some purchases might seem quite slushy, not easily categorized as solid ice or fluid water, focusing attention on the ice or the water makes different psychological processes salient, thus creating different psychological outcomes, as if the mere act of focusing on the water melted all of the ice. For instance, when the exact same purchase (e.g., a boxed set of music or a 3D TV) is described in terms of its material or experiential qualities, it has the same beneficial psychological effects as more canonical possessions or experiences (Carter & Gilovich, 2010, 2012; Rosenzweig & Gilovich, 2012). Plus, people generally have little trouble understanding the distinction and can readily identify examples that observers agree fit the categories well, apparently interpreting a gradient as distinct hues (Carter & Gilovich, 2010). Indeed, in the studies investigating that distinction, recalling different types of purchases based on even the barest description of the categories seems to have hedonic consequences for participants, suggesting that the categories are both useful and consequential. Still, it might be better to think of the distinction between experiences and possessions as a continuum, and the position of any one purchase on that continuum as a function of not just its inherent properties, but also which properties are psychologically salient at the moment (see Carter & Gilovich, 2013).

So what is it about experiences that seems to make people happier? Although it is undoubtedly multiply determined, there are several distinct reasons that have been identified so far. The sections below will discuss several such reasons: the benefits of experiences’ intangibility to issues of expectations and adaptation, the smaller role that comparisons play in experiential decision-making and evaluation, the ability for experiences to strengthen social bonds, and the greater contribution that experiential purchases make to the self-concept.

Expectations and Adaptation

Prior to making the purchase, expectations can exert both a positive influence (via positive anticipation) and a negative influence (when raised to unreasonable levels) on satisfaction. How might you find the sweet spot, allowing positive anticipation to increase your expectations so that they increase actual enjoyment, without setting the bar so high that disappointment is the only possible result? Experiences seem to offer some benefits over possessions in this regard, both in terms of allowing high expectations to increase enjoyment and in terms of reducing disappointment when the outcome isn’t as positive as expected.

For instance, in a study of spring break experiences, participants reported their expectations for how their vacation would go, their enjoyment while actually on the vacation, and their retrospective memories for the event weeks later (Wirtz, Kruger, Scollon, & Diener, 2003). In this study, participants’ expectations were positively related to both their online (in-the-moment) reports and their memories for the event, suggesting that they were positively anticipating the event and that those increased expectations actually improved both the experience itself and their memories of it. Why might this be the case more so for experiences than possessions? Because an experience is intangible, abstract, and fleeting, with a fair amount of uncertainty about exactly how it will transpire. A small amount of uncertainty alone can make a positive experience more enjoyable by encouraging a pleasant elaboration on potential explanations (Wilson, Centerbar, Kermer, & Gilbert, 2005). And because experiences are more abstract (in fact, merely taking time to think about a recent material or experiential purchase puts people into a more concrete or an abstract mindset, respectively; Carter, 2013), that positive elaboration can be more effective.

If your expectations for a vacation in Grand Cayman are particularly high (indeed, it would be hard not to expect a week sipping drinks on a white sand beach to be fantastic), then even if that positive anticipation improved the experience, the odds that the reality truly lives up to your expectation may be quite low, partly because you won’t bother to imagine any potential downsides (Newby-Clark et al., 2000). Chances are pretty good that you failed to foresee the frustration of constant sunscreen application, the embittering effect of overpriced drinks, or the baffled annoyance at a nearby couple’s decision to blast Jock Jams ’96 for the entire beach to hear. Over time, however, the actual feeling of anger created by those nuisances will fade and seem trivial, allowing you to see them as a learning experience, or a funny story; the more positive aspects eventually dominate memories (Mitchell, Thompson, Peterson, & Cronk, 1997). Indeed, in the spring break study mentioned above, it was only memories of the experience, not the experience itself, that predicted how likely participants were to want to repeat the experience (Wirtz et al., 2003). However, because possessions are more concrete and physically endure through time, they are not as easily reconstrued or reimagined. Thus, if your new couch turns out not to be the paragon of comfort and style you’d expected, it will sit in your living room each day as a constant reminder of your folly. That greater ability to reconstrue the negative aspects of an experience is one reason why happiness with experiences seems to hold steady or even improve over time, whereas happiness with possessions tends to decline (Carter & Gilovich, 2010).

As described above, well before physical decline sets in, hedonic adaptation can begin to leach away a purchase’s initial pleasure, so any disruption of adaptation processes will help that initial pleasure endure. Here too, experiences offer a benefit, since they seem to do a better job than possessions in resisting hedonic adaptation (Nicolao et al., 2009). One reason is that, because experiences are by definition transient states, it can be very difficult to get used to them. Possessions, being physical, tangible objects that persist in space and time, are more prone to this sort of adaptation. That initial thrill from owning a new dining room table will fade as it sits there, unchanged, day after day. That is not to say that one cannot adapt to a transient state if it is repeated too often. As mentioned in the example above, eating your favorite meal too frequently can rob you of its pleasure. Adding variety, surprise, and uncertainty can help prevent the natural process of affective adaptation to pleasurable events (Wilson & Gilbert, 2008). For instance, adding short interruptions to experiences can be sufficient to prevent them from getting old, to the point that commercials, typically derided as unpleasant, may actually increase enjoyment of a television show (Nelson & Meyvis, 2008). Applying a similar logic, frequent small purchases may actually provide a greater hedonic benefit than a single large purchase (Dunn et al., 2011; Dunn & Norton, 2013). Because pleasurable experiences are subject to diminishing marginal utility (another insight of prospect theory; Kahneman & Tversky, 1979), you can get a greater total amount of pleasure by consuming several small experiences than one big one. Taking frequent small vacations is likely to make a bigger impact on your well-being than taking one big one. This is also likely true of possessions; frequently buying small material possessions may make you happier than one extravagant purchase. Small frequent material purchases suffer from one significant disadvantage, however: they accumulate over time and clutter up your life.
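
The arithmetic behind splitting a budget into several smaller purchases can be sketched under one loudly stated assumption: that the pleasure from a purchase is a concave function of its cost. The square-root form below is purely illustrative and is not taken from any of the studies cited, and `pleasure` and `total_pleasure` are hypothetical names chosen for this sketch.

```python
import math


def pleasure(cost: float) -> float:
    """Illustrative concave utility: each extra dollar adds less pleasure.

    The square root is an assumption chosen for simplicity, not an
    empirically fitted value function.
    """
    return math.sqrt(cost)


def total_pleasure(budget: float, n_purchases: int) -> float:
    """Total pleasure when a budget is split evenly across n purchases."""
    per_purchase = budget / n_purchases
    return n_purchases * pleasure(per_purchase)


if __name__ == "__main__":
    budget = 1000.0
    # Compare one large purchase with progressively finer splits.
    for n in (1, 2, 4, 10):
        print(f"{n:>2} purchase(s): total pleasure = {total_pleasure(budget, n):.1f}")
```

Under this assumed curve, any even split of the budget yields more total pleasure than a single purchase of the same total cost. The sketch deliberately ignores adaptation, fixed costs per purchase, and the clutter that accumulates from many small possessions, so it illustrates only why concavity favors several small purchases, not how far the splitting should go.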

Invidious Comparisons

As described above, large choice sets and decision-making strategies that emphasize comparative information (i.e., maximizing) can have negative hedonic consequences. However, many of these effects are much more true of possessions than experiences. To start, maximizing appears to be the strategy that offers a more natural fit for material possessions, in no small part because of the tangible nature of possessions. It was no accident that many of the examples used to describe maximizing in the sections above were physical objects. Televisions, for instance, can fairly easily be compared side by side, inviting comparisons that quite often don’t matter after you’ve brought your purchase home. You might be able to see that one television offers deeper blacks than another when they’re right next to each other (in JE), but in your living room (in SE), that direct comparison will be impossible and therefore will not impact your enjoyment (Hsee, 1996; Hsee et al., 1999; Hsee & Leclerc, 1998; Hsee & Zhang, 2004). With possessions, because the comparisons are so easy and prevalent, people seem inclined, perhaps even feel obligated, to use the more comparison-oriented strategy of maximizing. Indeed, when faced with a material purchase decision, people report that they’re more likely to use a maximizing strategy (Carter & Gilovich, 2010).

Experiences, on the other hand, seem to offer a more natural fit with the satisficing approach. For instance, imagine that you’re deciding where to go on vacation. There is certainly no shortage of places to visit, meaning that the best decision will by no means be obvious. There is also plenty of opportunity to compare all of the various destinations, but those comparisons are much more difficult than comparing two televisions: the attributes of experiential purchases tend to be much less alignable than the attributes of possessions. Plus, the intangible nature of experiences makes it impossible to truly compare two vacation destinations side by side, except on the more tangible and concrete attributes, like price. Most of the comparisons will be either entirely hypothetical (imagining yourself on a beach is very different than actually being at one) or even completely incomparable (comparing the sun of Aruba to the culture of Venice is very much an apples-to-oranges proposition). If one cannot make such comparisons, then a maximizing approach is decidedly unsuitable, and it makes more sense to evaluate each option on its own merits. Indeed, participants report that they are more likely to use a satisficing approach for experiential purchase decisions (Carter & Gilovich, 2010).

The different decision-making strategies evoked by material and experiential purchase decisions show downstream consequences in line with what you’d expect from maximizing and satisficing, respectively. In one experiment, participants were assigned to recall either a material or experiential purchase they had made from a large array of options. Consistent with a more extensive decision process, participants reported that making a material purchase decision was simply more difficult than making an experiential purchase decision. If, because of the more extensive comparison process involved in the material purchase decision, information about the foregone options was retained, possessions might be particularly likely to provoke the kind of negative counterfactuals that create feelings of regret and dissatisfaction (see Rosenzweig & Gilovich, 2012). Indeed, participants who recalled a possession were still being bothered by thoughts of the foregone options, and it was these nagging thoughts that explained why possessions were less satisfying than experiences in the present (Carter & Gilovich, 2010).

Although making comparisons between experiential options is certainly more difficult, comparative information is also less important for experiences, forming a smaller part of satisfaction judgments than is the case for possessions. When people evaluate a possession, they need some frame of reference or point of comparison in order to come up with a judgment; with experiences, the experience itself, on its own merits, provides the lion’s share of the evaluation process (Carter & Gilovich, 2010; Hsee, Yang, Li, & Shen, 2008; Ma & Roese, 2013). Thus, even when negative comparative information is salient, experiences are relatively immune to its influence. For instance, in an experiment where participants were given either a material prize (a good pen) or an experiential prize (chips) in the context of either much better or much worse prizes, the context played a big role in how participants evaluated the pen (rating it lower when it was worse than the other prizes) but had no impact on how much they enjoyed the chips (Carter & Gilovich, 2010). Even when that information is made quite salient, such as when participants in other experiments were told that the price had dropped on a purchase they had made, or that new and better options were now available, that information sapped participants’ satisfaction with material purchases but not experiential purchases (Carter & Gilovich, 2010).

This evidence suggests two hedonic advantages experiences have when it comes to the act of spending money. First, experiences nudge people into using decision strategies that are less comparative, and thus more conducive to happiness. Second, because they are relatively immune to potentially invidious comparisons, when negative comparative information inevitably does arise, it has a much smaller detrimental impact on satisfaction. Of course, you cannot live on vacations and concerts alone, so when you are making material purchase decisions, try to treat them more like experiences: make your choices using something closer to a satisficing process, use comparisons only when they’re most helpful (between alignable attributes when actually making the decision, not after the decision is made), and do your best to evaluate your purchase on its own merits.

Making Meaning

Some of the purchases that offer the most enduring satisfaction are those that become personally meaningful, making some contribution to our sense of self (see Belk, 1988). Experiences, more so than possessions, seem to embody this principle as well (Carter & Gilovich, 2012). Why might this be the case? One reason has to do with how the different types of purchases persist over time. As mentioned above, experiences persist only as memories, and memories of an event tend to be rosier than the actual experience (Mitchell et al., 1997). With a little temporal distance, you’ll forget about the ravenous mosquitos and the overcooked eggs on your camping trip, but you will retain the memory of the incredible starry sky and the sense of relaxation (even if it didn’t feel all that relaxing at the time). Possessions, on the other hand, will be ravaged by time just like any other physical object. Shoes get scuffed and wear out; cell phones become obsolete. To be sure, that difference in tangibility is another reason why experiences seem to retain, or even improve, their value over time, whereas satisfaction with possessions seems to decline (Carter & Gilovich, 2010).

But the intangibility of experiences also means that they are more directly connected to the self-concept (memories being an essential component of the self; e.g., Kihlstrom, Beer, & Klein, 2003; McAdams, 2001; Wilson & Ross, 2003), whereas possessions are more physically distant from the self. Experiments confirm this intuition. For instance, participants in one study were first asked to recall a number of both material and experiential purchases. Then, they were given an example of the diagrams used in the independent–interdependent selves literature, where circles representing family members are plotted around a central “self” circle, with the proximity of each circle relative to the self-circle indicative of the degree to which that family member contributes to the self-concept (see Markus & Kitayama, 1991). They were then given a blank self-circle and asked to use the same logic to plot the circles representing the purchases they had recalled earlier, literally diagramming the centrality of each purchase to their self-concept. As expected, participants plotted their experiential purchases closer to the self-circle than their material purchases. In another experiment, participants were more likely to include experiential than material purchases in a narrative telling their life story. These two experiments together suggest that people do consider their experiences more central to the self-concept. More importantly, is centrality to the self-concept part of the reason why they are more satisfying? Participants in another experiment were asked to recall either a material or an experiential purchase, and then were asked to imagine that they could go back in time and make a different choice, selecting a different option instead, but without changing their current circumstances, essentially swapping out their memories for new ones. Participants were less willing to make that memory swap for an experience than a possession, and that relative willingness did indeed explain why the possessions were less satisfying than the experiences (Carter & Gilovich, 2012). Experiences did more to create participants’ sense of self, so changing an experience meant changing the very nature of their self-concept, something people strongly resist (Gilovich, 1991). Indeed, it’s no accident that people talk of “formative experiences” and not “formative possessions.”

Overall, it seems that money spent on purchases that are personally meaningful, or contribute to our sense of self, is going to produce greater hedonic returns, and choosing experiences over possessions is just one easy way to accomplish this. There are certainly other types of purchases that are likely to be personally meaningful. Other work suggests that purchasing products that are aligned with your own ethical code, such as environmentally friendly products, can be associated with greater well-being (Welsch & Kühling, 2010; Xiao & Li, 2010; cf. Griskevicius, Tybur, & Van den Bergh, 2010). Purchases that require you to invest a bit of yourself into them, such as self-assembled furniture, also seem to provide more enduring satisfaction, partly because they create a feeling of competence, fulfilling another basic psychological need (Mochon, Norton, & Ariely, 2012; Norton, Mochon, & Ariely, 2012). In fact, people are willing to give up higher wages in exchange for the feeling that the work they’re doing is meaningful (Ariely, Kamenica, & Prelec, 2008). Clearly, meaning matters. When deciding how to spend your money, you should take into consideration whether any given purchase is likely to provide meaning, that is, to contribute to your sense of self.

Social Relationships

Probably the single most robust predictor of well-being is having strong social relationships (e.g., Diener & Seligman, 2002; Myers, 2000), so spending money in service of nurturing your social relationships is nearly always going to be money well spent. A difference in the social nature of purchases also helps to explain why experiences seem to be so satisfying. First, experiences are simply more likely than possessions to involve other people. After all, many experiential purchases are expressly meant to foster social interaction or to spend time with loved ones, whereas many possessions are meant to be enjoyed alone. If you go see the Rolling Stones in concert, it’s likely that you’ll share the experience with a good friend or spouse (not to mention 20,000 strangers), but it’s unlikely that a new sweater will be used by more than one person (certainly at any given time). Indeed, many possessions can do more to isolate us from, rather than connect us to, our social surroundings. Even though a smartphone’s primary use is ostensibly as a telephone (an inherently social purpose), daily train commuters know just how common it is to see the entire train car full of people sitting silently, staring at their phones, playing games or attempting to keep up with their work email. Perhaps it’s no surprise that when people are experimentally induced to leave their gadgets in their pockets and actually talk to the other passengers, making even a fleeting social connection, their commutes are considerably more pleasant. In a telling study, daily train commuters in Chicago either were asked to do what they normally did during their commute (which was almost universally solitary, reading or working, often on some kind of electronic device) or were asked to start a conversation with a total stranger. As daunting as making small talk for 15 to 30 minutes might have seemed (and indeed the commuters generally believed that this would not be pleasant), in fact it was those participants who had a conversation who enjoyed their commutes the most, and even considered it at least as productive as if they’d read or worked as they normally did (Schroeder & Epley, 2013).

Participants in another study who reflected on an experiential purchase, compared with participants who reflected on a material purchase, reported not only greater happiness with the purchase that they had made but also greater satisfaction of the higher-order psychological need of relatedness (Howell & Hill, 2009). Meeting this need for relatedness may even be quite crucial to enduring satisfaction from a purchase; social purchases, whether experiential or material, foster considerably more happiness and satisfaction than solitary purchases (Caprariello & Reis, 2013). In fact, spending money on other people has been shown to be more satisfying than spending a larger amount of money on oneself. In one study, participants were given an envelope with either $5 or $20 inside and were assigned to spend that money either on themselves or on another person by 5 pm. Incredibly, participants who spent their money on someone else were happier than participants who spent the money on themselves, but how much money they were given didn’t make a difference (Dunn, Aknin, & Norton, 2008). This basic phenomenon has been replicated in a variety of other countries (Aknin et al., 2013), and even 2-year-old children are happier when giving their own resources (in this case, Goldfish crackers) to others than when they receive the treats themselves (Aknin, Hamlin, & Dunn, 2012). There’s even evidence that this prosocial spending is self-reinforcing: the happier participants in one study were, the more likely they were to spend a windfall on others (Aknin, Dunn, & Norton, 2011). Thus, if you are going to spend money on possessions instead of experiences, you’re probably better off buying them for someone else.

Other work has shown that experiences confer a social benefit even further downstream, when conversing with people who were not directly involved in the purchase itself. For instance, participants in one experiment were asked to have a conversation with a stranger (also a participant), but were limited in their conversation topics. Half of the pairs were confined to talking about experiences they’d purchased, and the other half were confined to talking about their possessions. After the conversation was over, participants who had talked about experiences felt the conversation went better and liked their conversation partner more (Van Boven, Campbell, & Gilovich, 2010). In other words, while you might be excited to talk about your shiny new laptop, people will be much more receptive to hearing the stories from your recent trip to San Francisco. Part of the reason may be that experiences are more resistant to social comparisons than possessions, so talking about your experiences with others is less likely to incite feelings of jealousy (Carter & Gilovich, 2010; see also Solnick & Hemenway, 1998). There’s also evidence that people are more likely to spontaneously talk about their experiences than their possessions, which not only provides the opportunity to make meaningful social connections as described above but also helps people to “reconsume” that experience, embellishing and improving the memory (Kumar & Gilovich, 2013). What’s more, people seem to cherish that mechanism of sharing. In an experiment, after ranking either a variety of beach vacations (experiential condition) or electronic gadgets (material condition), participants were asked to imagine that they had to choose between getting their top-ranked option, but with the caveat that they weren’t allowed to talk about it with anyone, or their second-ranked option, which had no restrictions. Participants in the material condition apparently didn’t care about sharing: they simply wanted their top choice and were perfectly happy to forgo the social element in order to get it, further illustrating the more solitary nature of possessions. Not so with participants in the experiential condition: the ability to talk about their experience with others was far more important, so they greatly preferred the socially unrestricted second-ranked option (Kumar & Gilovich, 2013).

Thus, a big part of the reason why experiences end up being more satisfying ways to spend money than possessions is that they confer greater social benefits both during and long after the purchase itself. Given how important other people are to our well-being, spending money that reinforces your social relationships, or helps you feel a sense of connectedness to the world, is going to be money well spent, even if you don’t get to consume it yourself.

Conclusion

The act of spending money is an emotional decision, with hedonic consequences that can last far into the future. Greater attention to how we approach that act, and especially the processes by which we make our decisions, can help us accomplish the overarching goal of improving our well-being. The attention we pay need not be exhausting, however. The approaches outlined above offer a few ways that may help reduce the anxiety many people feel when pondering an act of spending (worrying about the prospect of buyer’s remorse), an anxiety that robs the moment of some of its excitement. It may not be easy to make peace with the fact that spending money is always going to involve a loss and focus instead on what you’ll gain, but perhaps a good way to start is simply to choose to take a good friend out to share a nice meal, savor each bite, and make a memory that you’ll cherish for a lifetime.

References

Ahuvia, A. (2008). If money doesn’t make us happy, why do we act as if it does? Journal of Economic Psychology, 29(4), 491–507. doi:10.1016/j.joep.2007.11.005

Aknin, L. B., Barrington-Leigh, C. P., Dunn, E. W., Helliwell, J. F., Burns, J., Biswas-Diener, R., et al. (2013). Prosocial spending and well-being: Cross-cultural evidence for a psychological universal. Journal of Personality and Social Psychology, 104(4), 635–652. doi:10.1037/a0031578

Aknin, L. B., Dunn, E. W., & Norton, M. I. (2011). Happiness runs in a circular motion: Evidence for a positive feedback loop between prosocial spending and happiness. Journal of Happiness Studies, 13(2), 347–355. doi:10.1007/s10902-011-9267-5

Aknin, L. B., Hamlin, J. K., & Dunn, E. W. (2012). Giving leads to happiness in young children. (A. H. Kemp, Ed.). PLoS One, 7(6), 1–4. doi:10.1371/journal.pone.0039211.g002

Aknin, L. B., Norton, M. I., & Dunn, E. W. (2009). From wealth to well-being? Money matters, but less than people think. The Journal of Positive Psychology, 4(6), 523–527. doi:10.1080/17439760903271421

Alba, J. W., & Williams, E. F. (2013). Pleasure principles: A review of research on hedonic consumption. Journal of Consumer Psychology, 23(1), 2–18. doi:10.1016/j.jcps.2012.07.003

Andrade, E. B., & Ariely, D. (2009). The enduring impact of transient emotions on decision making. Organizational Behavior and Human Decision Processes, 109(1), 1–8. doi:10.1016/j.obhdp.2009.02.003

Ariely, D. (2000). Controlling the information flow: Effects on consumers’ decision making and preferences. Journal of Consumer Research, 27(2), 233–248. doi:10.1086/314322

Ariely, D., Huber, J., & Wertenbroch, K. (2005). When do losses loom larger than gains? Journal of Marketing Research, 42(2), 134–138.

Ariely, D., Kamenica, E., & Prelec, D. (2008). Man’s search for meaning: The case of Legos. Journal of Economic Behavior and Organization, 67(3–4), 671–677. doi:10.1016/j.jebo.2008.01.004

Arkes, H., & Blumer, C. (1985). The psychology of sunk cost. Organizational Behavior and Human Decision Processes, 35, 124–140. doi:10.1016/0749-5978(85)90049-4

Avnet, T., & Higgins, E. T. (2006). How regulatory fit affects value in consumer choices and opinions. Journal of Marketing Research, 43(1), 1–10. doi:10.2307/30163364

Belk, R. W. (1988). Possessions and the extended self. Journal of Consumer Research, 15, 139–168.

Bell, D. E. (1985). Disappointment in decision making under uncertainty. Operations Research, 33(1), 1–27. doi:10.1287/opre.33.1.1

Berger, J., Draganska, M., & Simonson, I. (2007). The influence of product variety on brand perception and choice. Marketing Science, 26(4), 460–472. doi:10.1287/mksc.1060.0253

Biswas-Diener, R., & Diener, E. (2001). Making the best of a bad situation: Satisfaction in the slums of Calcutta. Social Indicators Research, 55(3), 329–352.

Botti, S., & McGill, A. L. (2006). When choosing is not deciding: The effect of perceived responsibility on satisfaction. Journal of Consumer Research, 33(2), 211–219. doi:10.1086/506302

Boyce, C. J., Wood, A. M., Banks, J., Clark, A. E., & Brown, G. D. A. (2013). Money, well-being, and loss aversion: Does an income loss have a greater effect on well-being than an equivalent income gain? Psychological Science, 24(12), 2557–2562. doi:10.1177/0956797613496436

Brown, S., Taylor, K., & Price, S. W. (2005). Debt and distress: Evaluating the psychological cost of credit. Journal of Economic Psychology, 26(5), 642–663. doi:10.1016/j.joep.2005.01.002

Buehler, R., & McFarland, C. (2001). Intensity bias in affective forecasting: The role of temporal focus. Personality and Social Psychology Bulletin, 27(11), 1480–1493.

Caprariello, P. A., & Reis, H. T. (2013). To do, to have, or to share? Valuing experiences over material possessions depends on the involvement of others. Journal of Personality and Social Psychology, 104(2), 199–215. doi:10.1037/a0030953

Carmon, Z., & Ariely, D. (2000). Focusing on the forgone: How value can appear so different to buyers and sellers. Journal of Consumer Research, 27(12), 360–370.

Carmon, Z., Wertenbroch, K., & Zeelenberg, M. (2003). Option attachment: When deliberating makes choosing feel like losing. Journal of Consumer Research, 30(1), 15–29. doi:10.1086/374701

Carter, T. J. (2013). The abstract and concrete nature of experiences and possessions. Unpublished manuscript.

Carter, T. J., & Gilovich, T. (2010). The relative relativity of material and experiential purchases. Journal of Personality and Social Psychology, 98(1), 146–159. doi:10.1037/a0017145

Carter, T. J., & Gilovich, T. (2012). I am what I do, not what I have: The differential centrality of experiential and material purchases to the self. Journal of Personality and Social Psychology, 102(6), 1304–1317. doi:10.1037/a0027407

Carter, T. J., & Gilovich, T. (2013). Getting the most for the money: The hedonic return on experiential and material purchases. In M. Tatzel (Ed.), Consumption and well-being in the material world. New York, NY: Springer. doi:10.1007/978-94-007-7368-4_3

Chancellor, J., & Lyubomirsky, S. (2011). Happiness and thrift: When (spending) less is (hedonically) more. Journal of Consumer Psychology, 21(2), 131–138. doi:10.1016/j.jcps.2011.02.004

Chernev, A. (2003). Product assortment and individual decision processes. Journal of Personality and Social Psychology, 85(1), 151–162. doi:10.1037/0022-3514.85.1.151

Csikszentmihalyi, M. (2000). The costs and benefits of consuming. Journal of Consumer Research, 27(2), 267–272. doi:10.1086/314324

Dhar, R., Nowlis, S. M., & Sherman, S. J. (1999). Comparison effects on preference construction. Journal of Consumer Research, 26(3), 293–306. doi:10.1086/209564

Diehl, K., & Poynor, C. (2010). Great expectations?! Assortment size, expectations and satisfaction. Journal of Marketing Research, 47(2), 312–322.

Diener, E., & Biswas-Diener, R. (2002). Will money increase subjective well-being?: A literature review and guide to needed research. Social Indicators Research, 57(2), 119–169.

Diener, E., & Fujita, F. (1995). Resources, personal strivings, and subjective well-being: A nomothetic and idiographic approach. Journal of Personality and Social Psychology, 68(5), 926–935. doi:10.1037/0022-3514.68.5.926

Diener, E., Lucas, R. E., & Scollon, C. N. (2006). Beyond the hedonic treadmill. American Psychologist, 61(4), 305–314.

Diener, E., & Seligman, M. E. P. (2002). Very happy people. Psychological Science, 13(1), 81–84.

Diener, E., Tay, L., & Oishi, S. (2013). Rising income and the subjective well-being of nations. Journal of Personality and Social Psychology, 104(2), 267–276. doi:10.1037/a0030487

Dunn, E. W., Aknin, L. B., & Norton, M. I. (2008). Spending money on others promotes happiness. Science, 319(5870), 1687–1688. doi:10.1126/science.1150952

Dunn, E. W., Gilbert, D. T., & Wilson, T. D. (2011). If money doesn’t make you happy, then you probably aren’t spending it right. Journal of Consumer Psychology, 21(2), 115–125. doi:10.1016/j.jcps.2011.02.002

Dunn, E. W., & Norton, M. I. (2013). Happy money. New York, NY: Simon & Schuster.

Frederick, S., & Loewenstein, G. F. (1999). Hedonic adaptation. In D. Kahneman, E. Diener, & N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 302–329). New York, NY: Russell Sage.

Frederick, S., Novemsky, N., Wang, J., Dhar, R., & Nowlis, S. (2009). Opportunity cost neglect. Journal of Consumer Research, 36(4), 553–561. doi:10.1086/599764

Gentner, D., & Markman, A. B. (1994). Structural alignment in comparison: No difference without similarity. Psychological Science, 5(3), 152–158. doi:10.1111/j.1467-9280.1994.tb00652.x

Gilbert, D. T., Giesler, R. B., & Morris, K. A. (1995). When comparisons arise. Journal of Personality and Social Psychology, 69(2), 227–236.

Gilbert, D. T., Pinel, E. C., Wilson, T. D., Blumberg, S. J., & Wheatley, T. P. (1998). Immune neglect: A source of durability bias in affective forecasting. Journal of Personality and Social Psychology, 75(3), 617–638. doi:10.1037/0022-3514.75.3.617

Gilovich, T. (1983). Biased evaluation and persistence in gambling. Journal of Personality and Social Psychology, 44(6), 1110–1126. doi:10.1037/0022-3514.44.6.1110

Gilovich, T. (1991). How we know what isn’t so: The fallibility of human reason in everyday life. New York, NY: The Free Press.

Gilovich, T., Kerr, M., & Medvec, V. H. (1993). Effect of temporal perspective on subjective confidence. Journal of Personality and Social Psychology, 64(4), 552–560. doi:10.1037/0022-3514.64.4.552

Goldstein, R., Almenberg, J., Dreber, A., Emerson, J. W., Herschkowitsch, A., & Katz, J. (2008). Do more expensive wines taste better? Evidence from a large sample of blind tastings. Journal of Wine Economics, 3(1), 1–9.

Griffin, D. W., Dunning, D., & Ross, L. D. (1990). The role of construal processes in overconfident predictions about the self and others. Journal of Personality and Social Psychology, 59(6), 1128–1139. doi:10.1037/0022-3514.59.6.1128

Griffin, J. G., & Broniarczyk, S. M. (2010). The slippery slope: The impact of feature alignability on search and satisfaction. Journal of Marketing Research, 47(2), 323–334.

Griskevicius, V., Tybur, J. M., & Van den Bergh, B. (2010). Going green to be seen: Status, reputation, and conspicuous conservation. Journal of Personality and Social Psychology, 98(3), 392–404. doi:10.1037/a0017346

Hastie, R. (1984). Causes and effects of causal attribution. Journal of Personality and Social Psychology, 46(1), 44–56.

Haynes, G. A. (2009). Testing the boundaries of the choice overload phenomenon: The effect of number of options and time pressure on decision difficulty and satisfaction. Psychology and Marketing, 26(3), 204–212. doi:10.1002/mar.20269

Herrmann, A., Heitmann, M., Morgan, R., Henneberg, S. C., & Landwehr, J. (2009). Consumer decision making and variety of offerings: The effect of attribute alignability. Psychology and Marketing, 26(4), 333–358. doi:10.1002/mar.20276

Higgins, E. T. (1997). Beyond pleasure and pain. American Psychologist, 52(12), 1280–1300. doi:10.1037/0003-066X.52.12.1280

Howell, R. T., & Hill, G. (2009). The mediators of experiential purchases: Determining the impact of psychological needs satisfaction and social comparison. The Journal of Positive Psychology, 4(6), 511–522. doi:10.1080/17439760903270993

Howell, R. T., Pchelin, P., & Iyer, R. (2012). The preference for experiences over possessions: Measurement and construct validation of the Experiential Buying Tendency Scale. The Journal of Positive Psychology, 7(1), 57–71. doi:10.1080/17439760.2011.626791

Hsee, C. K. (1996). The evaluability hypothesis: An explanation for preference reversals between joint and separate evaluations of alternatives. Organizational Behavior and Human Decision Processes, 67(3), 247–257. doi:10.1006/obhd.1996.0077

Hsee, C. K. (1998). Less is better: When low-value options are valued more highly than high-value options. Journal of Behavioral Decision Making, 11(2), 107–121.

Hsee, C. K., & Leclerc, F. (1998). Will products look more attractive when presented separately or together? Journal of Consumer Research, 25(2), 175–186.

Hsee, C. K., Loewenstein, G. F., Blount, S., & Bazerman, M. H. (1999). Preference reversals between joint and separate evaluations of options: A review and theoretical analysis. Psychological Bulletin, 125(5), 576–590.

Hsee, C. K., Yang, Y., Li, N., & Shen, L. (2008). Wealth, warmth and wellbeing: Whether happiness is relative or absolute depends on whether it is about money, acquisition or consumption. Journal of Marketing Research, 46(3), 396–409. doi:10.1509/jmkr.46.3.396

Hsee, C. K., & Zhang, J. (2004). Distinction bias: Misprediction and mischoice due to joint evaluation. Journal of Personality and Social Psychology, 86(5), 680–695. doi:10.1037/0022-3514.86.5.680

Hsee, C. K., Zhang, J., Cai, C. F., & Zhang, S. (2013). Overearning. Psychological Science, 24(6), 852–859. doi:10.1177/0956797612464785

Isen, A. M. (2001). An influence of positive affect on decision making in complex situations: Theoretical issues with practical implications. Journal of Consumer Psychology, 11(2), 75–85.

Iyengar, S. S., & Lepper, M. R. (2000). When choice is demotivating: Can one desire too much of a good thing. Journal of Personality and Social Psychology, 79(6), 995–1006.

Iyengar, S. S., Wells, R. E., & Schwartz, B. (2006). Doing better but feeling worse: Looking for the “best” job undermines satisfaction. Psychological Science, 17(2), 143–150.

Johnson, E. J., & Payne, J. W. (1985). Effort and accuracy in choice. Management Science, 31(4), 395–414.

Kahn, B. E., & Lehmann, D. R. (1991). Modeling choice among assortments. Journal of Retailing, 67(3), 274–299.

Kahneman, D., Krueger, A. B., Schkade, D., Schwarz, N., & Stone, A. (2004). A survey method for characterizing daily life experience: The day reconstruction method. Science, 306, 1776–1780.

Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–292.

Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39(4), 341–350.

Kassam, K. S., Gilbert, D. T., Boston, A., & Wilson, T. D. (2008). Future anhedonia and time discounting. Journal of Experimental Social Psychology, 44(6), 1533–1537. doi:10.1016/j.jesp.2008.07.008

Kasser, T. (2011). Can thrift bring well-being? A review of the research and a tentative theory. Social and Personality Psychology Compass, 5(11), 865–877. doi:10.1111/j.1751-9004.2011.00396.x

Kasser, T., Cohn, S., Kanner, A. D., & Ryan, R. M. (2007). Some costs of American corporate capitalism: A psychological exploration of value and goal conflicts. Psychological Inquiry, 18(1), 1–22.

Kasser, T., & Ryan, R. M. (1993). A dark side of the American dream: Correlates of financial success as a central life aspiration. Journal of Personality and Social Psychology, 65(2), 410–422.

Kihlstrom, J. F., Beer, J. S., & Klein, S. B. (2003). Self and identity as memory. In M. R. Leary & J. P. Tangney (Eds.), Handbook of self and identity (pp. 68–90). New York, NY: Guilford Press.

Knutson, B., Rick, S., Wimmer, G. E., Prelec, D., & Loewenstein, G. (2007). Neural predictors of purchases. Neuron, 53(1), 147–156. doi:10.1016/j.neuron.2006.11.010

Koob, G. F., & Le Moal, M. (2001). Drug addiction, dysregulation of reward, and allostasis. Neuropsychopharmacology, 24(2), 97–129. doi:10.1016/S0893-133X(00)00195-0

Kumar, A., & Gilovich, T. (2013). We’ll always have Paris: Differential story utility from experiential and material purchases. Manuscript under review.

Lee, B.-K., & Lee, W.-N. (2004). The effect of information overload on consumer choice quality in an on-line environment. Psychology and Marketing, 21(3), 159–183. doi:10.1002/mar.20000

Lerner, J. S., Small, D. A., & Loewenstein, G. (2004). Heart strings and purse strings: Carryover effects of emotions on economic decisions. Psychological Science, 15(5), 337–341. doi:10.1111/j.0956-7976.2004.00679.x

Loewenstein, G. (1996). Out of control: Visceral influences on behavior. Organizational Behavior and Human Decision Processes, 65(3), 272–292.

Loewenstein, G., & Lerner, J. S. (2003). The role of affect in decision making. In R. J. Davidson, K. R. Scherer, & H. H. Goldsmith (Eds.), Handbook of affective sciences (pp. 619–642). Oxford, UK: Oxford University Press.

Loewenstein, G. F., Weber, E., Hsee, C. K., & Welch, N. (2001). Risk as feelings. Psychological Bulletin, 127(2), 267–286.

Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social Indicators Research, 46(2), 137–155.

Ma, J., & Roese, N. J. (2013). The countability effect: Comparative versus experiential reactions to reward distributions. Journal of Consumer Research, 39(6), 1219–1233. doi:10.1086/668087

Markman, A. B., & Medin, D. L. (1995). Similarity and alignment in choice. Organizational Behavior and Human Decision Processes, 63(2), 117–130.

Markus, H. R., & Kitayama, S. (1991). Culture and the self: Implications for cognition, emotion, and motivation. Psychological Review, 98(2), 224–253.

Martin, K. D., & Hill, R. P. (2012). Life satisfaction, self-determination, and consumption adequacy at the bottom of the pyramid. Journal of Consumer Research, 38(6), 1155–1168. doi:10.1086/661528

Mattila, A., & Wirtz, J. (2000). The role of preconsumption affect in postpurchase evaluation of services. Psychology and Marketing, 17(7), 587–605.

McAdams, D. P. (2001). The psychology of life stories. Review of General Psychology, 5(2), 100–122. doi:10.1037//1089-2680.5.2.100

Mellers, B. A., & McGraw, A. P. (2001). Anticipated emotions as guides to choice. Current Directions in Psychological Science, 10(6), 210–214. doi:10.1111/1467-8721.00151

Mellers, B., Schwartz, A., & Ritov, I. (1999). Emotion-based choice. Journal of Experimental Psychology: General, 128, 332–345.

Mitchell, T., Thompson, L., Peterson, E., & Cronk, R. (1997). Temporal adjustments in the evaluation of events: The “rosy view”. Journal of Experimental Social Psychology, 33(4), 421–448. doi:10.1006/jesp.1997.1333

Mochon, D., Norton, M. I., & Ariely, D. (2012). Bolstering and restoring feelings of competence via the IKEA effect. International Journal of Research in Marketing, 29(4), 363–369. doi:10.1016/j.ijresmar.2012.05.001

Myers, D. G. (2000). The funds, friends, and faith of happy people. American Psychologist, 55(1), 56–67. doi:10.1037//0003-066X.55.1.56

Nelson, L. D., & Meyvis, T. (2008). Interrupted consumption: Disrupting adaptation to hedonic experiences. Journal of Marketing Research, 45(6), 654–664.

Newby-Clark, I. R., Ross, M., Buehler, R., Koehler, D. J., & Griffin, D. (2000). People focus on optimistic scenarios and disregard pessimistic scenarios while predicting task completion times. Journal of Experimental Psychology: Applied, 6(3), 171–182. doi:10.1037//1076-898X.6.3.171

Nickerson, C. C., Schwarz, N. N., Diener, E., & Kahneman, D. D. (2003). Zeroing in on the dark side of the American dream: A closer look at the negative consequences of the goal for financial success. Psychological Science, 14(6), 531–536. doi:10.1046/j.0956-7976.2003.psci_1461.x

Nicolao, L., Irwin, J. R., & Goodman, J. K. (2009). Happiness for sale: Do experiential purchases make consumers happier than material purchases? Journal of Consumer Research, 36(2), 188–198. doi:10.1086/597049

Norton, M. I., Mochon, D., & Ariely, D. (2012). The IKEA effect: When labor leads to love. Journal of Consumer Psychology, 22(3), 453–460. doi:10.1016/j.jcps.2011.08.002

Novemsky, N., & Kahneman, D. (2005). The boundaries of loss aversion. Journal of Marketing Research, 42(2), 119–128.

Nowlis, S., Mandel, N., & McCabe, D. (2004). The effect of a delay between choice and consumption on consumption enjoyment. Journal of Consumer Research, 31(3), 502–510.

Oliver, R. L. (1980). A cognitive model of the antecedents and consequences of satisfaction decisions. Journal of Marketing Research, 17, 460–469.

Patrick, V. M., Macinnis, D. J., & Park, C. W. (2007). Not as happy as I thought I’d be? Affective misforecasting and product evaluations. Journal of Consumer Research, 33(4), 479–489. doi:10.1086/510221

Pham, M. T. (1998). Representativeness, relevance, and the use of feelings in decision making. Journal of Consumer Research, 25(2), 144–159. doi:10.1086/209532

Phillips, D. M., & Baumgartner, H. (2002). The role of consumption emotions in the satisfaction response. Journal of Consumer Psychology, 12(3), 243–252. doi:10.1207/S15327663JCP1203_06

Prelec, D., & Loewenstein, G. F. (1998). The red and the black: Mental accounting of savings and debt. Marketing Science, 17(1), 4–28.

Prelec, D., & Simester, D. (2001). Always leave home without it: A further investigation of the credit-card effect on willingness to pay. Marketing Letters, 12(1), 5–12. doi:10.1023/A:1008196717017

Quoidbach, J., Dunn, E. W., Petrides, K. V., & Mikolajczak, M. (2010). Money giveth, money taketh away: The dual effect of wealth on happiness. Psychological Science, 21(6), 759–763. doi:10.1177/0956797610371963

Reutskaja, E., & Hogarth, R. M. (2009). Satisfaction in choice as a function of the number of alternatives: When “goods satiate”. (B. Scheibehenne & P. M. Todd, Eds.). Psychology and Marketing, 26(3), 197–203. doi:10.1002/mar.20268

Rick, S. I., Cryder, C. E., & Loewenstein, G. F. (2008). Tightwads and spendthrifts. Journal of Consumer Research, 34(6), 767–782. doi:10.1086/523285

Rosenzweig, E., & Gilovich, T. (2012). Buyer’s remorse or missed opportunity? Differential regrets for material and experiential purchases. Journal of Personality and Social Psychology, 102(2), 215–223. doi:10.1037/a0024999

Sacks, D. W., Stevenson, B., & Wolfers, J. (2012). The new stylized facts about income and subjective well-being. Emotion, 12(6), 1181–1187. doi:10.1037/a0029873

Scheibehenne, B., Greifeneder, R., & Todd, P. M. (2009). What moderates the too-much-choice effect? Psychology and Marketing, 26(3), 229–253. doi:10.1002/mar.20271

Scheibehenne, B., Greifeneder, R., & Todd, P. M. (2010). Can there ever be too many options? A meta-analytic review of choice overload. Journal of Consumer Research, 37(3), 409–425. doi:10.1086/651235

Schroeder, J., & Epley, N. (2013). Mistakenly seeking solitude. Manuscript under review.

Schwartz, B. (2004). The paradox of choice: Why more is less. New York, NY: Harper Perennial.

Schwartz, B., Ward, A., Monterosso, J., Lyubomirsky, S., White, K., & Lehman, D. R. (2002). Maximizing versus satisficing: Happiness is a matter of choice. Journal of Personality and Social Psychology, 83(5), 1178–1197.

Sela, A., Berger, J., & Liu, W. (2009). Variety, vice, and virtue: How assortment size influences option choice. Journal of Consumer Research, 35(6), 941–951. doi:10.1086/593692

Sevdalis, N., & Harvey, N. (2006). Determinants of willingness to pay in separate and joint evaluations of options: Context matters. Journal of Economic Psychology, 27(3), 377–385. doi:10.1016/j.joep.2005.07.001

Sheldon, K. M., & Lyubomirsky, S. (2012). The challenge of staying happier: Testing the hedonic adaptation prevention model. Personality and Social Psychology Bulletin, 38(5), 670–680. doi:10.1177/0146167212436400

Shiv, B., & Huber, J. (2000). The impact of anticipating satisfaction on consumer choice. Journal of Consumer Research, 27, 202–216.

Shugan, S. M. (1980). The cost of thinking. Journal of Consumer Research, 7(2), 99–111.

Simon, H. A. (1955). A behavioral model of rational choice. The Quarterly Journal of Economics, 69(1), 99–118.

Solnick, S., & Hemenway, D. (1998). Is more always better?: A survey on positional concerns. Journal of Economic Behavior and Organization, 37, 373–383.

Soman, D. (2001). Effects of payment mechanism on spending behavior: The role of rehearsal and immediacy of payments. Journal of Consumer Research, 27(4), 460–474. doi:10.1086/319621

Sparks, E. A., Ehrlinger, J., & Eibach, R. P. (2012). Failing to commit: Maximizers avoid commitment in a way that contributes to reduced satisfaction. Personality and Individual Differences, 52(1), 72–77. doi:10.1016/j.paid.2011.09.002

Tian, K. T., Bearden, W. O., & Hunter, G. L. (2001). Consumers’ need for uniqueness: Scale development and validation. Journal of Consumer Research, 28(1), 50–66.

Trope, Y., & Liberman, N. (2003). Temporal construal. Psychological Review, 110(3), 403–421. doi:10.1037/0033-295X.110.3.403

Tsiros, M., Mittal, V., & Ross, W. T., Jr. (2004). The role of attributions in customer satisfaction: A reexamination. Journal of Consumer Research, 31(2), 476–483. doi:10.1086/422124

Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. The Quarterly Journal of Economics, 106(4), 1039–1061.

Ubel, P., Loewenstein, G. F., & Jepson, C. (2005). Disability and sunshine: Can hedonic predictions be improved by drawing attention to focusing illusions or emotional adaptation? Journal of Experimental Psychology: Applied, 11(2), 111–123. doi:10.1037/1076-898X.11.2.111

Van Boven, L., & Ashworth, L. (2007). Looking forward, looking back: Anticipation is more evocative than retrospection. Journal of Experimental Psychology: General, 136, 289–300. doi:10.1037/0096-3445.136.2.289

10 Psychological Science of Spending Money

242

Van Boven, L., Campbell, M. C., & Gilovich, T. (2010). Stigmatizing materialism: On stereotypes

and Impressions of materialistic and experiential pursuits. Personality and Social Psychology

Bulletin, 36 (4), 551–563. doi:

10.1177/0146167210362790 .

Van Boven, L., & Gilovich, T. (2003). To do or to have? That is the question. Journal of Personality

and Social Psychology, 85 (6), 1193–1202. doi:

10.1037/0022-3514.85.6.1193 .

Van Praag, B. M., & Frijters, P. (1999). The measurement of welfare and well-being: The Leyden

approach. In Well-being: The foundations of hedonic psychology (pp. 413–433).

Wang, J., Novemsky, N., & Dhar, R. (2009). Anticipating adaptation to products. Journal of

Consumer Research, 36 (2), 149–159. doi:

10.1086/597050 .

Welsch, H., & Kühling, J. (2010). Pro-environmental behavior and rational consumer choice:

Evidence from surveys of life satisfaction. Journal of Economic Psychology, 31 (3), 405–420.

doi:

10.1016/j.joep.2010.01.009 .

Wilson, T. D., Centerbar, D., Kermer, D., & Gilbert, D. T. (2005). The pleasures of uncertainty:

Prolonging positive moods in ways people do not anticipate. Journal of Personality and Social

Psychology, 88 (1), 5–21. doi:

10.1037/0022-3514.88.1.5 .

Wilson, T. D., & Gilbert, D. T. (2003). Affective forecasting , 35 , 345–411.

Wilson, T. D., & Gilbert, D. T. (2008). Explaining away: A model of affective adaptation.

Perspectives on Psychological Science, 3 (5), 370–386. doi:

10.1111/j.1745-6924.2008.00085.x .

Wilson, T. D., Lisle, D. J., Kraft, D., & Wetzel, C. G. (1989). Preferences as expectation-driven

inferences: Effects of affective expectations on affective experience. Journal of Personality and

Social Psychology, 56 (4), 519–530. doi:

10.1037/0022-3514.56.4.519 .

Wilson, A., & Ross, M. (2003). The identity function of autobiographical memory: Time is on our

side. Memory, 11 (2), 137–149. doi:

10.1080/741938210 .

Wilson, T. D., Wheatley, T., Meyers, J., Gilbert, D. T., & Axsom, D. (2000). Focalism: A source of

durability bias in affective forecasting. Journal of Personality and Social Psychology, 78 (5),

821–836.

Wirtz, D., Kruger, J., Scollon, C., & Diener, E. (2003). What to do on spring break? The role of

predicted, on-line, and remembered experience in future choice. Psychological Science, 14 (5),

520–524.

Xiao, J. J., & Li, H. (2010). Sustainable consumption and life satisfaction. Social Indicators

Research, 104 (2), 323–329. doi:

10.1007/s11205-010-9746-9 .

Zhang, S., & Markman, A. B. (2001). Processing product unique features: Alignability and

involvement in preference construction. Journal of Consumer Psychology, 11 (1), 13–27.

T.J. Carter

Psychological Science

22(8) 1011–1018

Reprints and permission: sagepub.com/journalsPermissions.nav

DOI: 10.1177/0956797611414726

http://pss.sagepub.com

Research Article

A Single Exposure to the American Flag Shifts Support Toward Republicanism up to 8 Months Later

Travis J. Carter¹, Melissa J. Ferguson², and Ran R. Hassin³

¹Center for Decision Research, Booth School of Business, University of Chicago; ²Department of Psychology, Cornell University; and ³Department of Psychology and The Center for the Study of Rationality, Hebrew University

Abstract

There is scant evidence that incidental cues in the environment significantly alter people’s political judgments and behavior in a durable way. We report that a brief exposure to the American flag led to a shift toward Republican beliefs, attitudes, and voting behavior among both Republican and Democratic participants, despite their overwhelming belief that exposure to the flag would not influence their behavior. In Experiment 1, which was conducted online during the 2008 U.S. presidential election, a single exposure to an American flag resulted in a significant increase in participants’ Republican voting intentions, voting behavior, political beliefs, and implicit and explicit attitudes, with some effects lasting 8 months after the exposure to the prime. In Experiment 2, we replicated the findings more than a year into the current Democratic presidential term. These results constitute the first evidence that nonconscious priming effects from exposure to a national flag can bias the citizenry toward one political party and can have considerable durability.

Keywords

political psychology, priming, voting behavior, American flag

Received 12/13/10; Revision accepted 3/8/11

Corresponding Authors:

Travis J. Carter, Center for Decision Research, University of Chicago Booth School of Business, C74 Harper Center, Chicago, IL 60637

E-mail: travis.carter@chicagobooth.edu

Melissa J. Ferguson, Department of Psychology, 230 Uris Hall, Cornell University, Ithaca, NY 14853

E-mail: mjf44@cornell.edu

Downloaded from pss.sagepub.com at UNIV OF CHICAGO LIBRARY on August 19, 2011

How do people decide which political candidate to support, or whether their country goes to war? In the social science literature, it has traditionally been assumed that political behavior reflects a thoughtful and rational analysis of the pros and cons of the options (e.g., Baum & Jamison, 2006; Downs, 1959; Lau & Redlawsk, 1997). Recent work in social and cognitive psychology suggests, however, that political behavior can also be unconsciously influenced by contextual cues, such as voting location (Berger, Meredith, & Wheeler, 2008) and the facial characteristics of candidates (Todorov, Mandisodza, Goren, & Hall, 2005).

But how robust and durable is the influence of such incidental cues on political decisions and behavior? In the research reported here, we examined one of the most iconic political symbols of a nation, its flag, and tested the direction and durability of its influence on political behavior, attitudes, and judgment.

National flags are pervasive cues in the political landscapes of many nations, appearing on houses, schools, government buildings, and the lapels of political candidates (Gellner, 2005). Flags constitute particularly powerful political cues because they may reinforce national sentiments without being consciously noticed by the citizenry (e.g., Billig, 1995). Although social scientists have speculated that national flags might exert an unnoticed influence on political thought and behavior, there is little empirical evidence to support this claim.

How might a national flag influence the political behavior of the citizenry? National flags have traditionally been seen as rallying symbols that bring citizens together (Baker & O’Neal, 2001; Mueller, 1970). For instance, citizens and members of government often intentionally display the national flag during wartime in an effort to unify the populace behind the war efforts (Skitka, 2005). Recent research has shown that even subtle exposure to a national flag can have unifying effects. Hassin, Ferguson, Shidlovski, and Gross (2007) found that subliminal exposure to a national flag led citizens to vote in a manner that reflected politically moderate views, such that participants at each end of the political spectrum moved toward the ideological center. This was the first evidence that national flags can change people’s political behavior in a subtle, nonconscious fashion.

Yet the psychological effects of exposure to a national flag

are likely to vary considerably according to a given country’s

characteristics, such as its culture, history, and political atmo-

sphere. Although there may be cases in which a national flag

unifies people by pushing them toward the center of the ideo-

logical spectrum, there may be other cases in which a national

flag instead moves people toward one end of the spectrum. We

argue that this possibility is particularly likely in a country in

which the political landscape is polarized by what is largely a

two-party system, and in which one of the two major parties

has come to be more associated with the flag. In these cases,

the flag may bias the citizenry toward a particular political

party, potentially without their awareness (Billig, 1995).

We tested this prediction in the United States, a country in

which the political system is sharply divided between Demo-

crats and Republicans. To examine the associations between

the flag and each political party, we conducted a pilot study

in which we asked 51 participants which party “tends to

brandish the American flag more often (e.g., by wearing it,

waving it, holding it, having it on their house).” Participants

in our sample strongly believed that the tendency to display

the flag was more common among Republicans; responses

differed significantly from the midpoint of the scale, t(50) =

6.50, p < .001 (see also Carney, Jost, Gosling, & Potter,

2008). The same sample of participants overwhelmingly

(90.2%) believed that their voting behavior would not be

influenced by the presence of a flag, and the few who thought

it might did not agree on the direction of its influence. Thus,

despite associating the American flag more strongly with one

political party than with the other, participants in our pilot

study did not believe that exposure to the flag would have

any effect on their behavior.

In contrast to the beliefs of the participants in the pilot study,

the results from the experiments reported here show that expo-

sure to the American flag introduces a bias toward the Republi-

can Party over the Democratic Party. In one experiment, we

tested whether subtle exposure to the American flag shifted

people’s beliefs, attitudes, and behaviors toward the Republican

end of the political continuum. We found that a single exposure

to a small American flag during deliberation about voting inten-

tions prior to a general election led to significant and robust

changes in participants’ voting intentions, voting behavior, and

political attitudes, all in the politically conservative direction. In

a separate experiment, we replicated these patterns more than a

year into a Democratic presidential term.

We also tested the longevity of this priming effect on

judgment and attitudes. Flag-priming effects may be espe-

cially potent if priming occurs while a person is consciously

deliberating about politics and voting intentions. We exposed

participants to the American flag once during such an argu-

ably critical psychological window and found that the effects

from this single exposure lasted up to 8 months later. This

prolonged influence represents one of the most durable prim-

ing effects in the cognitive sciences literature, and shows not

only that contextual effects can influence important political

decisions, but also that this influence can be robust and long

lasting.

Experiment 1

In this experiment, we tested whether a single exposure to the

American flag would lead participants to shift their attitudes,

beliefs, and behavior in the politically conservative direction.

We conducted a multisession study during the 2008 U.S. presi-

dential election. Starting in September 2008, we recruited

American adults across the United States to participate in a

paid online study of political beliefs and attitudes. We col-

lected measures from the same sample of participants at four

times over a period of 8 months.

Participants and recruitment

Between September 19 and October 10, 2008 (Session 1), 396

participants were recruited through advertising in online

social-networking sites (e.g., Facebook.com) to participate in

an online survey in exchange for a $10 Amazon.com gift cer-

tificate. In order to avoid the possibility that our priming

manipulation might alter the outcome of the election, we used

measurements from Session 1 to identify participants (n =

235) from the initial pool who planned to vote in a state where

polling indicated that a significant margin separated Obama

and McCain. These participants were randomly assigned to

either the flag-prime or the control condition.

The participants who were in solidly Republican or Demo-

cratic states were contacted to complete questionnaires for

Session 2 (starting on October 11, 2008, and ending on the day

before the election, November 3, 2008) and Session 3 (Novem-

ber 5 through November 12, 2008) in exchange for a $15

Amazon.com gift certificate. Of the participants contacted,

197 completed Session 2, and 191 completed Session 3. More

than 79% of participants completed Session 2 by October 21;

thus, the vast majority of participants voted at least 2 weeks

after their exposure to the prime. In early July of 2009, the

participants who had completed Session 3 were contacted to

complete Session 4 in exchange for a 1 in 20 chance to win a

$25 Amazon.com gift certificate. Seventy-one participants

completed this session (37.2%). We attribute this relatively

high rate of attrition to the use of a lottery rather than guaran-

teed payment.

There were no significant differences on any variables of

importance (e.g., political ideology, voting intentions, beliefs

about specific political issues, religiousness, nationalism, need

for cognition) between the participants who did and did not

complete the 8-month follow-up.

We excluded 8 participants (4 in each of the two condi-

tions) from the analyses because they completed the measures

in Session 1 in less than 10 min (median time = 36 min).


Long-Term Effects of U.S. Flag Exposure on Republicanism 1013

Materials and procedure

Session 1. Measures directly or potentially relevant to our

hypotheses were embedded within a larger set of personality

measures that participants completed in Session 1. Relevant

measures included the Patriotism and Nationalism subscales

of the Patriotism and Nationalism Scale (Kosterman &

Feshbach, 1989), a measure of warmth toward the candidates,

a demographics questionnaire, a measure of political orienta-

tion and exposure to news media, and a survey of attitudes

regarding specific political issues (to view the survey, see

Instrument Details in the Supplemental Material available

online). Participants also completed measures of intention to

vote for Barack Obama and Joseph Biden, and for John

McCain and Sarah Palin, using separate 11-point scales (from

1, definitely not, to 11, absolutely). Surveys were presented in

random order. None of these measures moderated the effects

observed in subsequent sessions.

Session 2. In Session 2, all participants first reported their

voting intentions, using the same 11-point scales used in

Session 1. For participants assigned to the flag-prime condition,

a small picture (72 × 45 pixels) of an American flag was present

in the top left corner of the survey. For participants in the control

condition, there was nothing in the corner of the survey (to view

the survey, see Experimental Manipulations in the Supplemen-

tal Material). Except for this single presentation of the Ameri-

can flag on this particular survey, the procedure and materials

in all sessions were identical for all participants.

Participants also answered several questions unrelated to the

present hypothesis. They then rated their warmth toward the

Democratic and Republican Parties, presidential candidates,

and vice presidential candidates (using 500-point analog sliding

scales); completed measures of political orientation, news-

consumption habits, and exposure to specific news sources;

answered the same questions about political issues asked in Ses-

sion 1; and rated the importance of those political issues.

After completing all of the surveys, participants completed

a number of Implicit Association Tests (IATs; Nosek, Green-

wald, & Banaji, 2007), presented in random order. The IAT

measures that were directly relevant to the current hypothesis

included a Barack Obama/John McCain IAT, a Joseph Biden/

Sarah Palin IAT, and a Democrat/Republican IAT. These tests

were presented and scored in accordance with the procedures

outlined by Nosek, Greenwald, and their colleagues (following

Lane, Banaji, Nosek, & Greenwald, 2007). Higher scores repre-

sent greater positivity toward the Republican candidate or party.

Session 3. In Session 3, participants were first asked to report

which candidate they voted for, selecting their choice from a

list that included the major- and minor-party candidates who

appeared on the ballots in most states, as well as “other” and

“did not vote.” Participants also answered questions about

their vote choice and the attributes of Barack Obama and John

McCain. They then rated how fairly they felt the media had

treated each presidential and vice presidential candidate, using

9-point scales (−4 = very unfairly negatively, −2 = somewhat

unfairly negatively, 0 = accurately, +2 = somewhat unfairly

positively, +4 = very unfairly positively).

Finally, participants completed measures about their news-

consumption habits and their exposure to specific television,

print, and radio news sources. After completing Session 3, par-

ticipants were referred to a Web site containing questions that

probed for suspicion about the experiment. Once participants

had answered these questions, they were debriefed on the

nature of the study. No participants expressed any suspicion

about the presence of the American flag during Session 2.

Session 4. In the final session, participants first answered

a number of questions about their current feelings about

President Obama and his job performance to date, using

11-point Likert scales. Next, participants indicated how

warmly they felt toward a variety of liberal and conservative

leaders using the same analog sliding scales used previously,

and answered the same questions about political beliefs used

in previous sessions. Participants were also asked to report

their personal political ideology, their religiousness, the impor-

tance of being an American to their identity, their media-

consumption habits, and their exposure to the same variety of

news sources asked about in Session 3.

Participants were then thanked and presented with further

debriefing information about the study.

Session 2 results

Voting intentions. We created composite measures of voting

intentions for both Sessions 1 and 2 by calculating the differ-

ence between intentions to vote for McCain and intentions to

vote for Obama; higher numbers indicate a greater intention to

vote for McCain than for Obama. We then regressed the cen-

tered Session 2 intentions on centered Session 1 intentions and

used the residuals from this analysis as our main measure of

voting intentions. Thus, we measured the impact of the flag

prime on voting intentions during Session 2 that could not be

explained by voting intentions from Session 1.
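The residualized measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the ratings are hypothetical, the helper names `center` and `residualize` are not from the original analysis, and the fit is a plain least-squares regression of centered Session 2 difference scores on centered Session 1 difference scores.

```python
from statistics import mean


def center(values):
    """Subtract the mean so the scores have a mean of zero."""
    m = mean(values)
    return [v - m for v in values]


def residualize(later, earlier):
    """Residuals of centered `later` scores after a least-squares fit on centered `earlier` scores."""
    x = center(earlier)
    y = center(later)
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    # Both variables are centered, so the intercept is zero.
    return [b - slope * a for a, b in zip(x, y)]


# Hypothetical 11-point ratings (1 = definitely not, 11 = absolutely); not the study's data.
s1_mccain = [2, 9, 5, 11, 1, 6]
s1_obama = [10, 3, 7, 1, 11, 6]
s2_mccain = [3, 9, 6, 11, 1, 7]
s2_obama = [9, 3, 6, 1, 11, 6]

# Composite: intention to vote for McCain minus intention to vote for Obama.
s1_diff = [m - o for m, o in zip(s1_mccain, s1_obama)]
s2_diff = [m - o for m, o in zip(s2_mccain, s2_obama)]

# Session 2 intentions that Session 1 intentions cannot explain.
residuals = residualize(s2_diff, s1_diff)
```

By construction the residuals average zero and are uncorrelated with the Session 1 scores, so a difference between conditions on this measure reflects change since Session 1 rather than preexisting preferences.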

As predicted, participants in the flag-prime condition (M =

0.072, SD = 0.47) reported a greater intention to vote for

McCain than did participants in the control condition (M =

−0.070, SD = 0.48), t(181) = 2.02, p = .04, d = 0.298 (see

Fig. 1).
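As a rough check on the effect size, Cohen's d can be recomputed from the reported means and standard deviations. The sketch below assumes equal group sizes (the per-condition ns are not reported here) and uses a hypothetical helper named `cohens_d`; it is not the authors' computation.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference, pooling the two SDs as if the groups were equal in size."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Reported Session 2 voting-intention residuals: flag prime versus control.
d = cohens_d(0.072, 0.47, -0.070, 0.48)
print(round(d, 3))
```

Under the equal-n assumption the value comes out at roughly 0.30, in line with the reported d = 0.298.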

Explicit attitudes. We created a composite score of partici-

pants’ ratings of warmth toward the Republican and Demo-

cratic Parties, presidential candidates, and vice presidential

candidates, controlling for the same measures administered at

Session 1. Higher scores indicate more positive feelings

toward the Republican Party and candidates than toward the

Democratic Party and candidates. As predicted, participants in

the flag-prime condition (M = 0.424, SD = 2.73) felt relatively

more warmth toward the Republican Party and Republican


candidates than did participants in the control condition (M =

−0.410, SD = 2.37), t(181) = −2.21, p = .03, d = 0.354 (see

Fig. 2).

Implicit attitudes. We created a composite measure from

scores on the three political IATs to represent the aggregate

positivity toward the Republican Party and Republican candi-

dates relative to the Democratic Party and Democratic candi-

dates. Participants in the flag-prime condition (D = −0.006)

showed significantly more positivity toward the Republican

Party and candidates than did participants in the control condi-

tion (D = −0.102), t(173) = 2.03, p < .05, d = 0.313, an effect

that was mirrored in each of the IATs separately.

Political beliefs. Participants’ responses were reverse-scored

when needed and then averaged into a composite measure of

political beliefs (α = .84). This index was correlated with self-

reported party affiliation and political ideology (r = .73, p <

.001), which confirmed that reported political beliefs did cor-

respond with participants’ reported political ideology.
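The composite described above (reverse-score where needed, average, and report Cronbach's α) can be sketched as follows. The responses, the 7-point range, and the choice of which item is reverse-keyed are all hypothetical, and the function names are illustrative rather than taken from the study's materials.

```python
from statistics import mean, variance


def reverse_score(response, low, high):
    """Flip a response on a scale running from `low` to `high`."""
    return low + high - response


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item responses per participant."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses on a 1-7 scale; the third item is worded in the opposite direction.
raw = [[2, 3, 6], [5, 5, 3], [6, 7, 1], [3, 4, 5], [7, 6, 2]]
scored = [[a, b, reverse_score(c, 1, 7)] for a, b, c in raw]

alpha = cronbach_alpha(scored)             # internal consistency of the three items
composites = [mean(row) for row in scored]  # one belief score per participant
```

With these made-up responses α is about .97; a real scale with more heterogeneous items would typically give a lower value, such as the .84 reported here.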

Participants in the flag-prime condition reported margin-

ally more conservative beliefs (M = 3.25, SD = 0.82) than did

participants in the control condition (M = 3.03, SD = 0.79),

t(181) = 1.80, p = .07, d = 0.274. This result held, and even

increased slightly, when we controlled for responses to mea-

sures of political beliefs in Session 1, β = 0.141, t(180) = 1.84,

p = .06 (see Fig. 1).

Session 3 results

Voting behavior. To maximize statistical power in measuring

voting behavior, we analyzed data only from participants who

reported voting for McCain or Obama (n = 166). Although

participants in the control condition generally tended to vote

for Obama (83.5% for Obama, 16.5% for McCain), this ten-

dency was significantly reduced in the flag-prime condition

(72.8% for Obama, 27.2% for McCain), χ²(1, N = 166) = 2.81, p < .05, one-tailed (see Fig. 3). This pattern held when we analyzed the data from all participants, although the significance level dropped. It is worth noting that voting behavior was strongly predicted by voting intentions reported in Session 2. Indeed, when we included voting intentions and priming condition as predictors of voting behavior in a regression analysis, voting intentions remained reliably predictive, β = 3.26, χ²(1, N = 166) = 27.67, p < .0001, whereas the effect of priming condition dropped to nonsignificance (p = .25). These results suggest that the effect of priming condition on voting behavior was mediated by voting intentions, rather than reflecting an unmediated, direct effect of priming condition on voting behavior (see also Hassin et al., 2007).

Fig. 1. Voting intentions and political attitudes at Session 2 in Experiment 1 as a function of condition (flag prime or control). The graph presents standardized residual scores that control for responses to the same measures administered at Session 1. Higher numbers indicate a greater intention to vote for the Republican candidates relative to the Democratic candidates and greater support for the politically conservative position relative to the politically liberal position. Error bars indicate ±1 SEM. [Bar graph: Voting Intentions and Political Beliefs, by condition; axis: Standardized Residual Score.]

Fig. 2. Relative preference for the Republican and Democratic Parties and presidential and vice presidential candidates as a function of condition (flag prime or control), at Sessions 2 and 4 in Experiment 1. The graph presents standardized residual scores that control for responses to the same measures administered at Session 1. Higher numbers indicate greater preference for the Republican Party and candidates relative to the Democratic Party and candidates. Error bars indicate ±1 SEM. [Bar graph: Session 2 and Session 4, by condition; axis: Warmth Toward Candidates and Parties.]

Fig. 3. Percentage of participants in the control and flag-prime conditions who reported voting for McCain and for Obama in Session 3 of Experiment 1 (n = 166). Control condition: McCain 16.5%, Obama 83.5%. Flag-prime condition: McCain 27.2%, Obama 72.8%.
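The Session 3 vote split can be checked with a Pearson chi-square test on a 2 × 2 table. The cell counts below are an assumption: they are not reported in the text but are inferred as the whole numbers that reproduce the reported percentages with N = 166 (control: 71 Obama, 14 McCain; flag prime: 59 Obama, 22 McCain). The function names are illustrative.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_two_tailed_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


# Inferred counts (assumption), chosen to match 83.5%/16.5% and 72.8%/27.2% with N = 166.
control_obama, control_mccain = 71, 14
flag_obama, flag_mccain = 59, 22

chi2 = chi_square_2x2(control_obama, control_mccain, flag_obama, flag_mccain)
p_two = p_two_tailed_df1(chi2)
p_one = p_two / 2  # directional test, as in the reported one-tailed p
```

With these inferred counts the statistic is about 2.79, close to but not identical to the reported 2.81, and the p value falls below .05 only when the test is one-tailed.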

Treatment in the media. We created a composite index of

how fairly participants believed the media treated the candi-

dates; on this index, positive values indicate the belief that the

media treated the Republican candidates better than they

treated the Democratic candidates, and negative numbers indi-

cate the opposite belief. Although participants in the control

condition generally believed that the media were unduly harsh

in their treatment of the Republican candidates (M = −1.39,

SD = 3.54), this tendency was significantly greater in the flag-

prime condition (M = −2.69, SD = 4.43), t(181) = 2.20, p =

.029, d = 0.370.

Session 4 results

Obama’s job performance. We averaged the ratings of

Obama’s job performance to create a composite measure (α =

.97). As predicted, participants in the flag-prime condition felt

less positively about Obama’s job performance at the 8-month

follow-up (M = 6.76, SD = 2.88) than did participants in the

control condition (M = 8.01, SD = 2.25), t(69) = 2.04, p < .05,

d = 0.44.

Explicit attitudes. We created a composite attitude index by

subtracting the average rating of warmth toward liberal lead-

ers from the average rating of warmth toward conservative

leaders. Participants in both conditions generally felt more

warmth toward the Democrats than toward the Republicans,

but participants in the flag-prime condition (M = −54.76, SD =

182.18) were less warm toward Democrats than were partici-

pants in the control condition (M = −193.47, SD = 176.16),

t(69) = 3.26, p = .002, d = 0.80. We found the same pattern of

results using the composite measure used in Session 2 (partici-

pants’ ratings of warmth toward the political parties, presiden-

tial candidates, and vice presidential candidates, controlling

for the same measures administered at Session 1), t(69) = 2.77,

p < .01, d = 0.71 (see Fig. 2).

Political beliefs. As was the case in Session 2, participants in

the flag-prime condition exhibited significantly more conser-

vative beliefs (M = 3.35, SD = 0.85) than did participants in

the control condition (M = 2.85, SD = 0.88), t(68) = 2.43, p <

.02, d = 0.60.¹

Discussion

Our results demonstrate that a single exposure to an unobtrusive

American flag shifted participants’ voting intentions, voting

behavior, attitudes, and beliefs toward the Republican end of the

ideological spectrum. It is important to note that political ideol-

ogy and party affiliation did not moderate these effects. That is,

both liberal and conservative participants were influenced by

the flag prime, and in the same (conservative) direction. These

effects lasted 8 months after the initial exposure. Why did they

last so long? One possibility is that voting behavior (Session 3)

had an especially influential effect on beliefs and attitudes

reported in Session 4. Indeed, voting behavior did significantly

predict beliefs about policy and warmth toward political leaders

and parties at Session 4—beliefs: t(60) = 4.71, p < .001; warmth:

t(61) = 6.7, p < .001. This pattern raises the question of whether

the effects observed in Session 4 could be explained entirely by

a self-perception account, whereby participants at Session 4

merely recalled their voting choice. The data do not support this

account. Controlling for voting behavior at Session 3, priming

condition still significantly predicted warmth toward Demo-

crats and Republicans in Session 4 (p < .01), and marginally

significantly predicted attitudes regarding political issues (p <

.09). Moreover, analyses controlling for voting intentions as

measured in Session 2 also showed that priming condition still

significantly predicted warmth (p < .01) and marginally signifi-

cantly predicted attitudes regarding political issues (p < .08).

These results suggest that the flag prime’s initial influence was

not restricted to voting intentions but also extended to attitudes

and beliefs more broadly, and that it was the accumulation, and

perhaps the compounding, of these influences that affected voting

behavior at Session 3 and attitudes and beliefs at Session 4.

It is noteworthy that the size of the priming effect was con-

siderably larger in Session 4 than in the earlier sessions. Might

this have been due to the selective attrition of participants? Of

the participants who completed Session 4, those in the flag-

prime and control conditions did not differ in their political ide-

ology or voting intentions as measured in Session 1; this

suggests that any between-condition differences in Session 4

were not the product of a particular coincidence of attrition of

liberal participants from the flag-prime condition and attrition

of conservative participants from the control condition. Further-

more, participants who chose to take part in Session 4 showed

no baseline differences (on more than 20 variables) from those

who did not. It is of course impossible to definitively rule out

the possibility of selective attrition, as participants may have

differed on some unmeasured variable. There is some evidence

that people who have been exposed to persuasive appeals show

increasingly strong effects of those appeals over time (i.e.,

“sleeper effects”; Kumkale & Albarracín, 2004; see also Cook

& Flay, 1978; Pratkanis, Greenwald, Leippe, & Baumgardner,

1988), although the applicability of that evidence to the current

findings remains speculative.

Experiment 2

Before concluding that exposure to the American flag pro-

duces a bias toward Republicanism, we tested whether the flag

creates a shift specifically toward Republicanism, rather than

toward whichever party currently controls the executive


branch of the government. Thus, we conducted Experiment 2

in the spring of 2010, more than a year after the election of

President Obama and while the Democrats still had the major-

ity in both houses of Congress.

Participants and recruitment

Seventy participants completed the experiment for either $5 or

extra credit in a psychology class. Four participants were

excluded from the analyses: 1 who had previously taken part

in a highly similar experiment, 1 who did not complete the part

of the experiment that contained the priming, and 2 who

guessed the hypothesis.

Materials and procedure

Once participants arrived at the lab, they completed a task that

they were told concerned the ability to discern the time of day

that a photograph had been taken. They were presented with

four photographs of buildings and asked to estimate whether

they thought each photograph had been taken during the morn-

ing, afternoon, or evening (for examples, see Experimental

Manipulations in the Supplemental Material). For participants

randomly assigned to the flag-prime condition, two of the four

photographs had American flags in them (on flag poles or

hanging from the front of the building). For participants in the

control condition, the flags were digitally removed. After this

task, participants completed a short (eight-item) version of the

political belief survey used in Experiment 1; responses were

made on a 7-point scale.

Results and discussion

The responses were reverse-coded when needed and averaged together (α = .67). Attitudes of participants in the flag-prime condition (M = 3.10) were significantly closer to the Republican end of the scale than were attitudes of participants in the control condition (M = 2.65), t(64) = −2.04, p < .05. This finding suggests that the American flag introduced a shift toward the Republican worldview, even during a Democratic administration. Again, the effect was not moderated by political ideology or any other measured variable, which suggests that the flag produced the same conservative shift for both liberal and conservative participants.
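As a rough illustration of the scoring and test reported in this section, the sketch below reverse-codes items on a 7-point scale, averages them into a composite, computes Cronbach's α, and runs a pooled-variance independent-samples t test. With 66 participants retained (70 recruited minus 4 excluded), a pooled-variance test has 66 − 2 = 64 degrees of freedom, which is consistent with the reported t(64). The function names, the choice of reverse-keyed item, and the example responses are hypothetical; the article does not describe its analysis code.

```python
from statistics import mean, variance

SCALE_MIN, SCALE_MAX = 1, 7  # 7-point response scale, as stated in the text


def reverse_code(response):
    """Flip a response on the 1-7 scale (1 <-> 7, 2 <-> 6, ...)."""
    return SCALE_MAX + SCALE_MIN - response


def code_items(responses, reversed_items):
    """Reverse-code the items whose indices are in reversed_items."""
    return [reverse_code(r) if i in reversed_items else r
            for i, r in enumerate(responses)]


def composite(responses, reversed_items):
    """Average one participant's items after reverse-coding."""
    return mean(code_items(responses, reversed_items))


def cronbach_alpha(rows):
    """Cronbach's alpha for participants' already-coded item rows."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def student_t(group_a, group_b):
    """Independent-samples t with pooled variance; returns (t, df)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a)
              + (nb - 1) * variance(group_b)) / df
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (mean(group_a) - mean(group_b)) / se, df


if __name__ == "__main__":
    # Hypothetical data: four participants, three items, item index 1 reverse-keyed.
    raw = [[2, 6, 3], [3, 5, 3], [5, 3, 4], [6, 2, 6]]
    coded = [code_items(row, {1}) for row in raw]
    scores = [mean(row) for row in coded]
    print("alpha:", cronbach_alpha(coded))
    print("t, df:", student_t(scores[:2], scores[2:]))
```

The sign of t depends only on which group is entered first, so a negative value such as the one reported need not conflict with the flag-prime mean being the larger of the two.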

General Discussion

Although the American flag is assumed to represent the entire country, our findings suggest that the psychological processes put in motion by flag priming yield increased support for the beliefs of a particular political party. Subtle exposure to the American flag significantly shifted both Democratic and Republican participants’ beliefs, attitudes, and voting behavior toward Republicanism.

These findings provide the first empirical evidence that a national flag can push citizens toward a specific end of the ideological spectrum, rather than having the unifying effect documented extensively in the social sciences literature (Baker & O’Neal, 2001; Hassin et al., 2007; Mueller, 1970). Why did a national flag have an ideologically specific effect (i.e., creating a bias toward Republicanism) in our study, even though previous research has shown a unifying effect? As we noted in the introduction, the American flag seems to be perceived (at least in our samples) as more closely linked with the Republican than with the Democratic Party, and this “flag branding” may be especially influential in a two-party system in which there are typically only two viable voting choices. In other words, the American flag conjures up Republican beliefs and attitudes, and these primes collectively push people in the Republican direction. By contrast, if any flag branding of a particular party or viewpoint exists in a political system that allows for multiple parties and viewpoints, such branding may be relatively diluted and thus less influential.

It is possible that the American flag does indeed have a unifying influence that can manifest itself as increased Republicanism. In other words, the flag might trigger concepts of unity or political moderation that move people toward the center of the ideological spectrum, but because the samples in our studies were relatively Democratic and liberal, their movement toward the center was a move toward Republicanism. The effects of the American flag observed in our experiments are therefore consistent with the flag either having a unifying effect or inducing a movement toward conservative beliefs and attitudes. If the former explanation is correct, exposing a highly conservative sample to an American flag prime would lead to a shift toward the Democratic end of the spectrum. If the latter explanation is correct, participants already located at the Republican end of the ideological spectrum would show little movement toward the center if exposed to an American flag prime.

The mechanism may be more nuanced than either of these possibilities, however. As we have argued elsewhere (Hassin et al., 2009), national flags may be strongly associated specifically with prototypes of national citizens and may influence people by shifting their attitudes toward those of the (imaginary) prototypical citizen. The direction of the shift for a given sample of people would depend on whether those people believe the prototypical citizen is more liberal or more conservative than they are themselves. In a way, this would be a unifying effect, because the flag would move people toward what they perceive to be the typical or average citizen. And yet, as long as people believe that the typical American is more conservative than they are, this “unifying” effect would result in a shift toward Republicanism. We do have some evidence that our participants generally believed that the prototypical American is more conservative than they are themselves. At the end of Session 4, we asked participants in Experiment 1 about their views of the “typical American.” Although participants generally anchored on their own beliefs in estimating those of the typical American, they felt that the typical American would feel more warmly toward Republican politicians than they did themselves, paired t(68) = 2.34, p < .03, and that the typical American would give more Republican answers to the specific policy questions than they had themselves, paired t(68) = 7.07, p < .001. Future research can test more directly how people’s beliefs about the prototypical citizen predict the effect of flag priming on political thought and behavior (see Hassin et al., 2009, for a more detailed discussion).
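The paired comparisons reported in this paragraph contrast the rating each participant attributed to the typical American with that participant's own rating. A minimal sketch of such a paired t test follows; with 69 paired observations it has 68 degrees of freedom, as in the reported t(68). The function name and the example ratings are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-samples t test on matched ratings; returns (t, df).

    first[i] and second[i] must come from the same participant.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / n ** 0.5
    return mean(diffs) / standard_error, n - 1


if __name__ == "__main__":
    # Hypothetical ratings: perceived typical American vs. self.
    typical_american = [5, 4, 6, 5, 4]
    own_rating = [3, 4, 4, 2, 3]
    print(paired_t(typical_american, own_rating))
```

Because the test works on within-person differences, it remains informative even when participants anchor on their own beliefs: what matters is whether the differences are consistently in one direction.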

Our results also demonstrate that a single exposure to a national flag can have wide-ranging effects. Why did a single, brief exposure to the American flag in Experiment 1 have such an enduring impact? Indeed, considering how often Americans are exposed to their flag, why would this one exposure have any impact at all? In contrast with the vast majority of instances in which people are exposed to the American flag, this particular exposure occurred when participants were reporting their voting intentions, an act that has been shown to strongly predict and shape voting behavior (Greenwald, Carnot, Beach, & Young, 1987). For some participants, explicitly declaring voting intentions may have been a rare event that further crystallized their stated intentions and attitudes, incorporating any bias introduced by the presence of the flag at that critical moment. Indeed, when we controlled for participants’ voting intentions at Session 2, the effect of the flag exposure on voting behavior dropped to nonsignificance (see also Hassin et al., 2007). Thus, exposure to the American flag may have an especially strong influence when it occurs immediately before or during a person’s consideration of political issues or declaration of political decisions (e.g., in the voting booth).

It is also important to note that exposure to the American flag can have a range of short-term effects that are not dependent on conscious declarations, and are not even overtly political (Carter, Ferguson, & Hassin, 2011; Ferguson & Hassin, 2007). For example, Ferguson and Hassin (2007) found that brief exposure to the American flag increased aggressive thoughts and behavior, specifically among people who followed news about politics.

Our data suggest that American people are not aware of this effect: Participants in our pilot study erroneously believed that exposure to the American flag would not influence their political behavior or attitudes. This mistaken belief is in line with the standard claim in psychology and political science that important political behavior results from careful and rational deliberation (Baum & Jamison, 2006; Downs, 1959; Lau & Redlawsk, 1997). Thus, our findings challenge laypeople’s assumptions as well as the standard claim in the literature, and extend recent research showing that subtle cues in the environment, from polling locations (Berger et al., 2008), to the facial characteristics of political candidates (Greenwald, Smith, Sriram, Bar-Anan, & Nosek, 2009; Rule et al., 2010; Todorov et al., 2005), to the presence of national flags (Hassin et al., 2007), can significantly influence how people vote.

Declaration of Conflicting Interests

The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.

Supplemental Material

Additional supporting information may be found at http://pss.sagepub.com/content/by/supplemental-data

Note

1. In Session 4, participants responded to an additional item about extreme interrogation techniques that was not included in previous sessions. Including this measure in the composite measure did not change the results.

References

Baker, W. D., & O’Neal, J. R. (2001). Patriotism or opinion leadership? The nature and origins of the “rally ’round the flag” effect. Journal of Conflict Resolution, 45, 661–687. doi:10.1177/0022002701045005006

Baum, M. A., & Jamison, A. S. (2006). The Oprah effect: How soft news helps inattentive citizens vote consistently. The Journal of Politics, 68, 946–959. doi:10.1111/j.1468-2508.2006.00482.x

Berger, J., Meredith, M., & Wheeler, S. C. (2008). Contextual priming: Where people vote affects how they vote. Proceedings of the National Academy of Sciences, USA, 105, 8846–8849. doi:10.1073/pnas.0711988105

Billig, M. (1995). Banal nationalism. London, England: Sage.

Carney, D. R., Jost, J. T., Gosling, S. D., & Potter, J. (2008). The secret lives of liberals and conservatives: Personality profiles, interaction styles, and the things they leave behind. Political Psychology, 29, 807–840.

Carter, T. J., Ferguson, M. J., & Hassin, R. R. (2011). Supporting the American system: The relationship between implicit American nationalism and system justification. Social Cognition, 29, 341–359.

Cook, T. D., & Flay, B. R. (1978). The persistence of experimentally-induced attitude change. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 11, pp. 1–57). New York, NY: Academic Press.

Downs, A. (1959). An economic theory of democracy. New York, NY: Harper and Row.

Ferguson, M. J., & Hassin, R. R. (2007). On the automatic association between America and aggression for news watchers. Personality and Social Psychology Bulletin, 33, 1632–1647.

Gellner, E. (2005). Nations and nationalism. Reno: University of Nevada Press.

Greenwald, A. G., Carnot, C. G., Beach, R., & Young, B. (1987). Increasing voting behavior by asking people if they expect to vote. Journal of Applied Psychology, 72, 315–318. doi:10.1037/0021-9010.72.2.315

Greenwald, A. G., Smith, C. T., Sriram, N., Bar-Anan, Y., & Nosek, B. A. (2009). Implicit race attitudes predicted vote in the 2008 U.S. presidential election. Analyses of Social Issues and Public Policy, 9, 241–253. doi:10.1111/j.1530-2415.2009.01195.x

Hassin, R. R., Ferguson, M. J., Kardosh, R., Porter, S. C., Carter, T. J., & Dudareva, V. (2009). Précis of implicit nationalism. In O. Vilarroya, S. Atran, A. Navarro, K. Ochsner, & A. Tobena (Eds.), Values, empathy, and fairness across social barriers (pp. 135–145). Hoboken, NJ: Wiley-Blackwell.

Hassin, R. R., Ferguson, M. J., Shidlovski, D., & Gross, L. (2007). Subliminal exposure to national flags affects political thought and behavior. Proceedings of the National Academy of Sciences, USA, 104, 19757–19761. doi:10.1073/pnas.0704679104

Kosterman, R., & Feshbach, S. (1989). Toward a measure of patriotic and nationalistic attitudes. Political Psychology, 10, 257–274.

Kumkale, G. T., & Albarracín, D. (2004). The sleeper effect in persuasion: A meta-analytic review. Psychological Bulletin, 130, 143–172.

Lane, K. A., Banaji, M. R., Nosek, B. A., & Greenwald, A. G. (2007). Understanding and using the Implicit Association Test: IV. What we know (so far) about the method. In B. Wittenbrink & N. Schwarz (Eds.), Implicit measures of attitudes (pp. 59–102). New York, NY: Guilford Press.

Lau, R. R., & Redlawsk, D. P. (1997). Voting correctly. American Political Science Review, 91, 585.

Mueller, J. E. (1970). Presidential popularity from Truman to Johnson. American Political Science Review, 64, 18–34.

Nosek, B. A., Greenwald, A. G., & Banaji, M. R. (2007). The implicit association test at age 7: A methodological and conceptual review. In J. A. Bargh (Ed.), Social psychology and the unconscious: The automaticity of higher mental processes (pp. 265–292). New York, NY: Psychology Press.

Pratkanis, A. R., Greenwald, A. G., Leippe, M. R., & Baumgardner, M. H. (1988). In search of reliable persuasion effects: III. The sleeper effect is dead. Long live the sleeper effect. Journal of Personality and Social Psychology, 54, 203–218.

Rule, N. O., Ambady, N., Adams, R. B., Jr., Ozono, H., Nakashima, S., Yoshikawa, S., & Watabe, M. (2010). Polling the face: Prediction and consensus across cultures. Journal of Personality and Social Psychology, 98, 1–15. doi:10.1037/a0017673

Skitka, L. J. (2005). Patriotism or nationalism? Understanding post-September 11, 2001, flag-display behavior. Journal of Applied Social Psychology, 35, 1995–2011. doi:10.1111/j.1559-1816.2005.tb02206.x

Todorov, A., Mandisodza, A. N., Goren, A., & Hall, C. C. (2005). Inferences of competence from faces predict election outcomes. Science, 308, 1623–1626. doi:10.1126/science.1110589
