Title: Accusations of Unfairness Bias Subsequent Decisions: A Study of Major
League Umpires
Authors: Travis J. Carter1*, Erik G. Helzer2.
Affiliations:
1Department of Psychology, Colby College.
2The Johns Hopkins Carey Business School.
*Correspondence to: travis.carter@colby.edu
Abstract: What happens when decision-makers are accused of bias by an aggrieved party? We
examined the ball-and-strike calls of Major League Baseball umpires before and after arguments
from players or managers resulting in ejection. Prior to ejection, the accusing team was, in fact,
disadvantaged by the home plate umpire’s calls. After the ejection, umpires did not revert to
neutrality—they exhibited the opposite bias, advantaging the accusing team. This pattern was
only evident when the ejection was related to pitch location, not other kinds of ejections. Using a
laboratory analogue of the umpires’ situation, we replicated this post-accusation tendency with
experimental participants. This study further revealed that decision-makers were unaware of the
shifts in their behavior in response to the accusations, and another survey indicated that this
tendency violates beliefs about fairness. These results suggest that performance following
accusations may unwittingly succumb to this insidious tendency to favor the accusing party.
One Sentence Summary: After being (rightly) accused of biased behavior toward one team,
MLB Umpires responded by committing the opposite bias, now giving more favorable calls to
the accuser’s team.
MANUSCRIPT UNDER REVIEW—PLEASE DO NOT CITE WITHOUT PERMISSION
Main Text:
Among the many responsibilities leaders bear is a commitment to fairness. Perceptions of
fairness are important for many organizational and interpersonal outcomes (1–3), and leaders, as
decision-makers, find themselves in the unique position to uphold fairness standards. Even the
best-intentioned leaders, however, will occasionally have their decisions questioned on the
grounds of fairness. In this paper, we ask how accusations of bias (one type of fairness violation)
affect the subsequent judgments of decision-makers.
Herein, we examine the perceived fairness of repeated judgments. In such cases, a
decision-maker is expected to base her decisions on an evaluation of the evidence, applying
some pre-ordained standard consistently to each case. Conducted correctly, this procedure should
result in fair outcomes on average. Common examples include judges’ rulings for courtroom
objections, managers’ application of company standards to different job candidates, and (most
relevant to the present studies) baseball umpires’ application of a common strike zone to batters
on both teams.
Systematic bias in serial decisions may be particularly insidious: over time, small
procedural biases can compound into large absolute differences in the distribution of outcomes
(4). Although much research has examined the factors that influence judgments of fairness, as
well as recipients’ reactions to decisions that are judged as fair or unfair (5), little is known about
the effects of accusations of unfairness on decision-makers’ subsequent decisions.
In the present analysis, we view such accusations as a form of performance feedback: the
decision-maker is made aware of a perceived pattern of uneven decisions, presumably indicative
of a flawed (or biased) process. Specifically, we examined fairness-related feedback delivered to
an evaluator by a self-interested party—someone directly (and negatively) affected by those
decisions. For instance, an employee might complain that his performance reviews are unfairly
lower than others’, implying that the manager is exhibiting a bias against him.
In such cases, decision-makers are keenly aware of others’ biases (6), and may be
particularly apt to dismiss an accusation of bias as being self-interested, consequently making no
attempts to investigate or alter subsequent decisions. This reaction may not be warranted,
however, given that even self-interested assessments are constrained by reality (7). Even if the
accusation is considered seriously, the decision-maker may search for evidence of bias and find
none—not because it does not exist, but because the process by which the judgment is formed is
impervious to introspection (8). All of this suggests that even if a pattern of unfairness exists,
decision-makers will remain unaware of the presence of bias.
We considered three major possibilities for how decision-makers would respond to an
accusation from an interested party. First, it could have no systematic effect on subsequent
decisions, either because of a successful attempt to ignore the feedback or because the decision-
making process is immune to conscious intervention (9). Second, it could exacerbate existing
biases, strengthening the existing response tendency, either due to motivational processes, such
as reactance (10), or inherent cognitive biases, such as escalation of commitment (11). Third, it
could lead to overcorrection, producing a new bias in the opposite direction, either resulting from
overzealous conscious attempts to correct for past errors or from a non-conscious correction
mechanism. In testing these possibilities, we sought to understand the role that conscious
processes, as well as explicit beliefs about feedback accuracy, play in shaping decision-makers’
responses.
We began by examining a near-ideal decision context in Study 1: the ball-and-strike
judgments of Major League Baseball (MLB) umpires. Hundreds of times in each game, the home
plate umpire declares a pitch a ball or a strike based on their perception of whether it was inside
or outside the strike zone, a decision that must be made immediately, and that carries opposite
consequences for the two teams involved. Umpires are regularly accused of bias in their
decisions, but we focused on the clearest and most discernible cases: when a player or manager
is ejected from a game as a result of arguing a call with the umpire. Ultimately we sought to
compare the relative favorability of the umpire’s calls toward the ejected and the non-ejected
team both before and after the ejection. We further expected that accusations of unfairness would
exert the most direct influence on subsequent judgments in the same domain. In this case, only
arguments related to pitch location (i.e. ball-and-strike calls) should lead to shifts in umpires’
subsequent ball-and-strike calls; ejections prompted by other circumstances (e.g. a close play at
third base) should lead to no such shifts, thus providing an important point of comparison and a
crucial benchmark for our predictions.
In order to measure the relative favorability of umpires’ ball-and-strike judgments, we
employed a data-driven approach that allows for both absolute and relative comparisons, making
use of the PITCHf/x pitch location data from 2008-2013. Specifically, we divided the x-z
coordinate plane into bins, then calculated a deviance score for each called pitch by comparing
the actual ball or strike call made by the umpire to the long-run probability of pitches in that
same location being called a ball or a strike. This approach not only allows for the aggregation
and comparison of pitches regardless of their location, it also minimizes the impact of shifts in
players’ behavior as a result of an ejection; as long as the umpire is the arbiter of judgment for a
given pitch, the prior probability should serve as a neutral baseline against which to compare any
individual judgment. The details of the calculation can be found in the Supplemental Materials,
but put simply, positive deviance scores reflect calls favorable to the batting team, and negative
scores reflect calls unfavorable to the batting team.
Examining deviance scores for pitches thrown during the pre- and post-ejection periods
of games featuring a single ejection using linear mixed-effects models, it was clear that for
ejections unrelated to pitch location (396 games, n = 42,414 pitches), the ejection had no impact
on the relative favorability of the umpire’s calls, b = 0.000, 95% CI [–0.012, 0.013], t(402.79) =
0.06, p = .954 (Fig. 1, bottom panel).
Pitch-related ejections (311 games, n = 34,563 pitches), however, did lead to different
patterns of favorability before vs. after the ejection, b = 0.042, 95% CI [0.026, 0.057], t(314.61)
= 5.37, p < .001 (Fig. 1, top panel). The accusation of bias was apparently made with good
reason: prior to the ejection, the ejected team received less favorable calls than the non-ejected
team, t(442.10) = 3.99, phb < .001 (see footnote 1). In the post-ejection period, however, umpires reversed this
bias; the ejected team received more favorable calls than the non-ejected team, t(744.20) = –3.52,
phb < .001. Further examinations indicated that this reversal remained for the rest of the game,
rather than fading shortly after the ejection (see Supplemental Materials). Thus, the data are
consistent with the third possibility outlined above: in response to an accusation of unfairness,
umpires overcorrected, introducing the opposite bias—but only when the argument was relevant
to the domain of judgment.
Footnote 1. The subscript hb (i.e. phb) indicates a p value that was adjusted using the Holm-Bonferroni method (12)
in order to account for multiple comparisons.
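As an illustration of the adjustment named in the footnote, the following is a minimal sketch of the standard Holm-Bonferroni step-down procedure. It is not the authors' code; the function name and the example p values are hypothetical.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p values, in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value (rank = k - 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p value is never smaller than
        # the adjusted p value of a more significant test.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.010, 0.040, 0.030]  # hypothetical unadjusted p values
print(holm_bonferroni(raw))
```

The adjusted values are compared against the usual alpha level in place of the raw p values.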
Fig. 1. Deviance scores by batting team, period of game, and type of ejection (Study 1). Top
panel: Ejections resulting from arguments related to pitch location (311 games, n = 34,563
pitches). Bottom panel: Ejections unrelated to pitch location (396 games, n = 42,414 pitches).
Error bars represent +/– 1 standard error.
[Figure: x-axis, Pre/Post Ejection (Pre-Ejection Period, Post-Ejection Period); y-axis, Deviance
score; bars for Ejected Team and Non-Ejected Team.]
Given that umpires’ primary goal is accurate judgments, it seems likely that the bias
would be most pronounced when there is some ambiguity about whether a given pitch should be
called a ball or strike—pitches near the edge of the strike zone. To test this hypothesis, we
created a proxy variable for ambiguity based on each pitch’s prior probability of being called a
ball or strike. Indeed, the bias exhibited by umpires was moderated by the ambiguity of the pitch
location, b = 0.134, 95% CI [0.086, 0.181], t(34429.69) = 5.49, p < .001 (see Table S5). The bias
against the ejected team pre-ejection (Fig. 2, top panel), and in favor of the ejected team post-
ejection (Fig. 2, bottom panel), was strongest for the most ambiguous pitches.
Fig. 2. Deviance scores by batting team, ambiguity of pitch location, and period of game (Study
1). Top panel: Pre-ejection period. Bottom panel: Post-ejection period. Depicted scores and 95%
confidence intervals (shaded regions) derived from model predictions.
[Figure: x-axis, Ambiguity of pitch location; y-axis, Deviance score; lines for Ejected Team and
Non-Ejected Team.]
Given that umpires have the most control over the ball and strike calls, we consider the
analysis of deviance scores to be the primary test of our hypothesis. It is, of course, interesting to
consider whether the observed changes in umpires’ post-ejection judgments had downstream
consequences, such as changing the likelihood of batters getting on base (as measured by on-
base percentage; OBP) and scoring runs. However, because these outcomes are less directly
under the influence of the home plate umpire, we would expect to observe smaller effects
compared to deviance scores.
Using the at-bat as the unit of analysis, we tested for the effects of batting team and
period of game on the likelihood of getting on base (OBP; Fig. S4) and runs scored per at-bat
(R/AB; Fig. S5) using linear mixed-effects models. In both cases, the ejected team experienced
poorer outcomes than the non-ejected team in the pre-ejection period (OBP: z = 5.44, phb < .001;
R/AB: z = 6.31, phb < .001), just like deviance scores. However, unlike deviance scores, which
showed a reversal in fortunes in the post-ejection period, the ejected team’s disadvantage relative
to the non-ejected team was merely eliminated for OBP (z = 0.21, phb = .831), and merely
attenuated for R/AB (z = 2.31, phb = .021). Baseball games are certainly dynamic systems in that
players and managers act and react as circumstances change, but the lack of a reversal in the
post-ejection period for these two measures cannot be written off as simple regression to the
mean. Indeed, a mediation analysis indicated that the improvements in the ejected teams’ OBP
(indirect effect: 0.039, 95% CI [0.022, 0.058]) and R/AB (indirect effect: 0.016, 95% CI [0.008,
0.027]) following pitch-related ejections were at least partially due to changes in the umpires’
balls-and-strikes calls after the ejection, even if those improvements were insufficient to overtake
the non-ejected team.
In order to examine the effects of fairness accusations in a setting that allows for causal
inferences, in Study 2 we created a laboratory task mimicking the situation that umpires face,
with 100 participants randomly assigned to receive accusatory feedback (or not). The task
involved viewing a series of images and judging whether the number of dots on each image was
higher or lower than a target number (see Fig. S6). Participants were all told they had been
assigned to the role of Judge, and would actually perform the dot-estimation task. Ostensibly,
they had been partnered with another participant (the Observer) whose job was to observe the
Judge’s performance and provide feedback after each block of trials. Both the Judge and the
Observer were paid a bonus based on the Judge’s performance, but with misaligned incentives.
That is, the Judge (participant) was paid based on the number of accurate responses she gave,
whereas the Observer (partner) was paid based on the number of directional responses (i.e. the
number of “higher” vs. “lower” responses, counterbalanced) given by the Judge, regardless of
accuracy. Thus, any feedback by the Observer suggesting more directional responses could be
seen as purely self-serving—to the detriment of the participant’s own interests.
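The misaligned incentives can be made concrete with a small sketch. The function names, the choice of "higher" as the Observer's favored direction, and the trial data are all hypothetical; the manuscript does not report payment amounts, so the functions only count the responses that would earn each party a bonus.

```python
def judge_bonus_units(responses, truths):
    """Number of accurate responses (what the Judge is paid for)."""
    return sum(r == t for r, t in zip(responses, truths))


def observer_bonus_units(responses, favored="higher"):
    """Number of responses in the Observer's favored direction, regardless of accuracy."""
    return sum(r == favored for r in responses)


# Hypothetical block of four trials.
truths = ["higher", "lower", "lower", "higher"]
accurate = list(truths)        # the Judge answers every trial correctly
directional = ["higher"] * 4   # the Judge always gives the favored answer

print(judge_bonus_units(accurate, truths), observer_bonus_units(accurate))
print(judge_bonus_units(directional, truths), observer_bonus_units(directional))
```

In this toy block, the response pattern that is best for the Observer costs the Judge half of her accurate responses, which is why the Observer's feedback can be read as self-serving.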
After a period of relatively neutral “feedback” from the Observer, participants in the
Critical Feedback condition began receiving feedback accusing them of giving too few
directional responses—easily interpreted as an attempt to garner a more favorable outcome for
themselves. Participants in the Control condition received neutral feedback throughout the
experiment.
Mimicking the umpires’ response to an accusation of bias, the feedback manipulation
impacted the proportion of participants’ directional responses, F(1,91) = 4.09, p = .046, ηp² = .043
(see Fig. 3, top panel). Participants in the Critical Feedback condition shifted their judgments
to be more favorable to the Observer after the critical feedback began, t(91) = –2.26, phb = .052,
whereas participants in the Control condition showed no systematic change, t(91) = 0.66, phb =
.513. What’s more, examining participants’ explicit estimates after each block of trials showed
no evidence that they were aware of the shifts in their directional responses (all Fs < 1.05; see
Fig. 3, bottom panel) or in accuracy (see Supplemental Materials). Those findings, coupled with
participants’ strong belief that the Observer’s judgment was both biased (p < .001) and inferior to
their own (p < .001), strongly suggest that the shifts were not intentional (see Supplemental
Materials).
Fig. 3. Actual (top panel) and Estimated (bottom panel) proportion of directional responses by
period of the study and feedback condition (N = 93; see Materials and Methods). Error bars
represent +/– 1 standard error.
[Figure: x-axis, Pre/Post Critical Feedback (Pre, Post); y-axes, Actual Proportion Directional
Responses and Estimated Proportion Directional Responses; series for Control and Critical
Feedback.]
The tendency to respond to accusations of unfairness by offsetting one bias with another
poses interesting questions about the nature of fairness. To see how the pattern of behavior we
observed comports with folk intuitions about fairness, we presented another group of participants
with a scenario where there was a clear pattern of bias in a decision-maker’s past decisions, and
asked them what would constitute a fair outcome going forward. The vast majority (73%)
indicated that the fairest response would be to eliminate bias toward both parties, and only 19%
indicated that the pattern we observed in both studies (instituting a new bias in favor of the
aggrieved party) would be fair. Thus, there may be important implications for this work with
regard to perceptions of procedural fairness following an accusation of bias.
To be sure, the precise mechanisms underlying changes in cognitive and behavioral
responses following accusations of bias are not easily identified by the present studies. As such,
it is unclear how decision-makers might avoid the negative consequences of feedback. Our
experimental participants appeared to have minimal access to the effects feedback had on their
judgments, suggesting that these biases may be particularly insidious and resistant to conscious
control (8). Objective performance feedback following accusations of bias (such as a computer-
generated report on umpires’ accuracy calling balls and strikes), followed by recalibration and
practice, may help reduce these biases in the long run (13); however, it is unclear how feasible
such interventions would be in the context of real-world serial decisions.
All in all, the present studies provide some insight into an overlooked aspect of the
psychology of fairness. Despite their best intentions, decision-makers charged with upholding
fairness may from time to time slip in their duties. These studies suggest that the most obvious
resolution to this problem—making decision-makers aware of such slips through informal
channels—may promote new patterns of unfairness instead of eliminating the underlying
problem. This may be very welcome news to the aggrieved party, though certainly not to the
decision-maker who must hear a new round of complaints.
References and Notes:
1. Y. Cohen-Charash, P. E. Spector, The role of justice in organizations: A meta-analysis.
Organ. Behav. Hum. Decis. Process. 86, 278–321 (2001).
2. B. A. Mellers, J. Baron, Eds., Psychological perspectives on justice: Theory and
applications (Cambridge University Press, Cambridge, 1993;
http://ebooks.cambridge.org/ref/id/CBO9780511552069).
3. D. T. Miller, Disrespect and the experience of injustice. Annu. Rev. Psychol. 52, 527–553
(2001).
4. L. Babcock, G. Loewenstein, Explaining bargaining impasse: The role of self-serving
biases. J. Econ. Perspect. 11, 109–126 (1997).
5. T. R. Tyler, What is procedural justice?: Criteria used by citizens to assess the fairness of
legal procedures. Law Soc. Rev. 22, 103 (1988).
6. E. Pronin, D. Y. Lin, L. Ross, The bias blind spot: Perceptions of bias in self versus others.
Pers. Soc. Psychol. Bull. 28, 369–381 (2002).
7. Z. Kunda, The case for motivated reasoning. Psychol. Bull. 108, 480–498 (1990).
8. T. D. Wilson, N. Brekke, Mental contamination and mental correction: Unwanted
influences on judgments and evaluations. Psychol. Bull. 116, 117–142 (1994).
9. R. E. Nisbett, T. D. Wilson, Telling more than we can know: Verbal reports on mental
processes. Psychol. Rev. 84, 231–259 (1977).
10. J. Brehm, A theory of psychological reactance. (Academic Press, Oxford, England, 1966).
11. B. M. Staw, Knee-deep in the big muddy: A study of escalating commitment to a chosen
course of action. Organ. Behav. Hum. Perform. 16, 27–44 (1976).
12. S. Holm, A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70
(1979).
13. Y.-W. Chien, D. T. Wegener, R. E. Petty, C.-C. Hsiao, The flexible correction model: Bias
correction guided by naïve theories of bias. Soc. Personal. Psychol. Compass. 8, 275–286 (2014).
14. Major League Baseball (Organization), The official rules of Major League Baseball.
(Triumph Books, Chicago, 2014).
15. B. M. Mills, Technological innovations in monitoring and evaluation: Evidence of
performance impacts among Major League Baseball umpires (2015), (available at
http://www.brianmmills.com/uploads/2/3/9/3/23936510/full_revised_manuscript.pdf).
16. R Development Core Team, R: A language and environment for statistical computing
(2016; https://www.r-project.org/index.html).
17. D. Bates, M. Mächler, B. Bolker, S. Walker, Fitting linear mixed-effects models using
lme4. J. Stat. Softw. 67, 1–48 (2015).
18. A. Kuznetsova, P. B. Brockhoff, R. H. B. Christensen, lmerTest: Tests in linear mixed
effects models (2016; https://CRAN.R-project.org/package=lmerTest).
19. D. J. Barr, R. Levy, C. Scheepers, H. J. Tily, Random effects structure for confirmatory
hypothesis testing: Keep it maximal. J. Mem. Lang. 68, 255–278 (2013).
20. T. J. Moskowitz, L. J. Wertheim, Scorecasting: The hidden influences behind how sports
are played and games are won (Three Rivers Press, New York, first paperback edition,
2011).
21. R. P. Larrick, T. A. Timmerman, A. M. Carton, J. Abrevaya, Temper, temperature, and
temptation: Heat-related retaliation in baseball. Psychol. Sci. 22, 423–428 (2011).
Acknowledgments:
The authors declare no conflicts of interest. The data used in all studies will be made
available via the Open Science Framework (OSF). The complete PITCHf/x data are
available from the MLB website (http://mlb.mlb.com/gdcross/components/game/mlb/).
The authors wish to thank Devin Pope for his advice on the calculation of the deviance
measure.
Supplementary Materials for
Accusations of Unfairness Bias Subsequent Decisions: A Study of Major League
Umpires
Travis J. Carter, Erik G. Helzer
correspondence to: travis.carter@colby.edu
Materials and Methods (Study 1)
Data Sources and Experimental Design
Data pertaining to pitch location and game conditions for every regular season Major
League Baseball game from 2008-2013 were drawn from several sources. The Major League
Baseball website makes available data on the location (x and z coordinates) of each pitch
(PITCHf/x), game event information (runners on base, steals, and ejections at each point in each
game), and game personnel (batters, pitchers, home plate umpires). We obtained data on Win
Expectancy (WE; the calculated probability of a team winning the game based upon the score,
inning, number of outs, runners on base, and game environment) and Leverage Index (LI; an
index of the amount of pressure in a game based upon its current conditions, such as score and
inning) from Fangraphs.com. Attendance and temperature information for each game was
obtained from Retrosheet.com. Many of the analyses that control for these secondary game-level
variables can be found in the supplemental materials.
Critically, a list of ejections was populated from these data, including all relevant
information about who was ejected by whom and at what point in the game the ejection
occurred. This list was then compared against the data and descriptions found on the Umpire
Ejection Fantasy League (UEFL) website (http://portal.closecallsports.com), which catalogues
ejections, to create a list of 1,126 ejections occurring across regular season games between 2008
and 2013, ranging from 164 (2009) to 207 (2008) per year.
Ultimately, this created a 2 (Ejection Type: Pitch-related vs. Other) × 2 (Batting Team:
Ejected vs. Non-ejected) × 2 (Period of Game: pre- vs. post-ejection) design, with the latter two factors
varying within each game, and the first factor varying between games. The primary dependent
variable is the pitch deviance measure (described in detail below), which occurred at the level of
the individual pitch. We also examined two major offensive outcomes: on-base percentage
(OBP) and runs scored per at-bat (R/AB), which occurred at the level of the at-bat.
Calculation of Pitch Deviance Measure:
The rulebook strike zone is defined as a box with horizontal limits (on the x-axis) that range
from one edge of home plate to the other, with vertical (on the z-axis) limits defined based on the
batter: the upper limit is defined as “a horizontal line at the midpoint between the top of the
shoulders and the top of the uniform pants,” and the lower limit is defined as “a line at the
hollow beneath the kneecap” (14). In practice, however, umpires call pitches according to a
strike zone that differs from the rulebook strike zone in systematic ways, ignoring the corners
and making adjustments for right- and left-handed batters, for instance (15). Thus, using the
rulebook strike zone is clearly inadequate as a neutral and valid reference point to assess
umpires’ judgment. Although it might seem straightforward to use the “practical strike
zone” instead, there is too much ambiguity in the data to identify hard boundaries
between strike and ball. Additionally, it is important to account for the possibility that, if umpires
are altering their practical strike zone based on arguments leading to ejections, then batters and
pitchers might similarly be altering their behavior to adapt to that new strike zone. For instance,
if a batter argues after being called out on a third strike that was well above the strike zone, a
pitcher may attempt to throw more pitches in the same location expecting the same result, and
batters may feel it necessary to swing at those pitches to avoid a similar called-strikeout.
Although it may be impossible to completely account for these changes in players’ behavior, we
employed a data-driven approach that should minimize its influence while simultaneously
defining the practical strike zone in a much more fluid way: calculating the probability that a
pitch in a given location would be called a ball or a strike based on what call umpires typically
make for a pitch in that specific location. Thus, regardless of any shifts in behavior on the part of
batters or pitchers as a result of an ejection or perceptions of an umpire’s shifting strike zone, as
long as the umpire is the arbiter of judgment for a given pitch, the prior probability should serve
as a neutral baseline against which to compare any individual judgment.
In order to group pitches based on their location, the x-z coordinate plane was divided into a
grid of bins. To calculate the bin size, we used the size of a baseball as a guide. The size of an
official MLB baseball is defined as measuring “not less than nine nor more than 9 1/4 inches in
circumference” (14). With the circumference defined by the rulebook as a range, we used the
midpoint of that range (9.00-9.25 in., 22.86-23.50 cm) as the value for the circumference, 9.125
in. (23.178 cm). Using basic geometry to find the diameter from the circumference, 9.125/π =
2.905 in. (7.379 cm), we calculated bins based on one-quarter (0.726 in., 1.844 cm) the diameter
of a baseball. This size was intended to reflect a balance between the precision of the location
and the precision of the probability. Smaller bins provide more meaningful discrimination based
on location, but larger bins provide more confident estimates of the prior probability by ensuring
a sufficient sample size of pitches located within the bin. It is worth noting, however, that the
analyses were robust to different bin sizes. The results were the same using either larger (one-half
baseball diameter; 1.452 in., 3.689 cm) or smaller (one-eighth baseball diameter; 0.363 in., 0.922
cm) bins.
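The bin-size arithmetic can be sketched in a few lines. The variable and function names are illustrative only; the inputs are the rulebook circumference range and the fractions of a ball diameter given in the text.

```python
import math

IN_TO_CM = 2.54

# Rulebook circumference range for an official baseball, in inches (14).
circ_min_in, circ_max_in = 9.00, 9.25
circumference_in = (circ_min_in + circ_max_in) / 2  # midpoint: 9.125 in.
diameter_in = circumference_in / math.pi            # about 2.905 in.


def bin_size_in(fraction):
    """Bin edge length in inches, as a fraction of the ball's diameter."""
    return diameter_in * fraction


for label, fraction in [("one-half", 1 / 2), ("one-quarter", 1 / 4), ("one-eighth", 1 / 8)]:
    size = bin_size_in(fraction)
    print(f"{label}: {size:.3f} in. ({size * IN_TO_CM:.3f} cm)")
```

The one-quarter value is the bin size used for the primary analyses; the other two correspond to the robustness checks.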
Translating the PITCHf/x data in its raw form onto a grid of bins of a fixed size required
further calculations. To understand how we accomplished this, it’s helpful to know a bit more
about how the raw data reflect the rulebook strike zone (defined above). Having a fixed
definition for the horizontal dimension (the width of home plate) allows for all bins to have an
equal width. The PITCHf/x data scale the x-axis relative to the strike zone’s horizontal
boundaries (-1 and +1 correspond to the left and right edges of the strike zone). In order to
determine the horizontal boundaries of individual bins based on an absolute size (fractional
diameter of a baseball, as described above), we translated this into an absolute width. Because a
pitch is still considered a strike if only part of the baseball crosses the plate, we calculated this as
the width of home plate (17.00 in., 43.18 cm) plus the full width of one baseball (2.905 in., 7.379
cm)—allowing for pitches where the middle of the ball touches any part of the plate to be within
the horizontal boundaries of the strike zone—for a total of 19.905 in. (50.559 cm). Thus,
because 2.0 units on the relative scale (the range of –1 to +1) correspond to 19.905 in.
(50.559 cm) on the absolute scale, the absolute bin widths can be translated easily to the
relative scale.
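A minimal sketch of this horizontal conversion follows, using the plate width and ball diameter stated above. The constant and function names are hypothetical.

```python
import math

PLATE_WIDTH_IN = 17.0                              # width of home plate
BALL_DIAMETER_IN = 9.125 / math.pi                 # about 2.905 in.
ZONE_WIDTH_IN = PLATE_WIDTH_IN + BALL_DIAMETER_IN  # about 19.905 in.
RELATIVE_SPAN = 2.0                                # the range -1 to +1


def inches_to_relative_x(width_in):
    """Convert an absolute width in inches to units of the relative x scale."""
    return width_in * RELATIVE_SPAN / ZONE_WIDTH_IN


bin_width_in = BALL_DIAMETER_IN / 4  # one-quarter ball diameter
bin_width_rel = inches_to_relative_x(bin_width_in)
print(f"bin width: {bin_width_in:.3f} in. = {bin_width_rel:.4f} relative units")
```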
This same basic approach was applied to the z-axis, but because the vertical dimension of
the strike zone is defined based on each player’s height and stance, which can even vary from
pitch to pitch, it was problematic both practically and theoretically to keep a constant bin height.
The most reasonable solution, in our minds, was to ensure that the x-z coordinate plane would be
divided into the same number of bins for each batter, even if that meant the absolute bin height
would vary from pitch to pitch. Fortunately, the PITCHf/x system identifies and reports the
absolute height of the strike zone on each pitch, which can be used to normalize the z-axis to be
on the same relative scale as the x-axis, with the vertical boundaries of the strike zone defined as
–1 and +1. We used the average absolute height of the strike zone across the entire data set
(21.761 in., 55.272 cm) to translate the intended absolute bin heights to the relative scale.
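The vertical normalization and bin assignment might look like the sketch below. The function names, the per-pitch zone arguments, and the grid alignment (bin 0 starting at relative coordinate 0) are assumptions made for illustration; the text does not specify where the grid is anchored.

```python
import math

MEAN_ZONE_HEIGHT_IN = 21.761        # average strike-zone height reported in the text
BALL_DIAMETER_IN = 9.125 / math.pi  # about 2.905 in.


def normalize_z(z, zone_bottom, zone_top):
    """Map a pitch height onto a scale where the zone bottom is -1 and the top is +1."""
    midpoint = (zone_top + zone_bottom) / 2
    half_height = (zone_top - zone_bottom) / 2
    return (z - midpoint) / half_height


# Relative bin height: one-quarter ball diameter, scaled by the average zone height.
bin_height_rel = (BALL_DIAMETER_IN / 4) * 2.0 / MEAN_ZONE_HEIGHT_IN


def bin_index(coord_rel, bin_size_rel):
    """Integer bin index along one axis (assumed alignment: bin 0 starts at 0)."""
    return math.floor(coord_rel / bin_size_rel)


# Hypothetical pitch: zone from 1.5 to 3.5 (same units as z), pitch at z = 3.0.
z_rel = normalize_z(3.0, 1.5, 3.5)
print(z_rel, bin_index(z_rel, bin_height_rel))
```

Because every batter's zone is mapped to the same –1 to +1 range, the number of bins is constant across batters even though the absolute bin height varies.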
Having defined the bin boundaries, we identified which bin each pitch in the entire data set
of 2,212,150 called pitches (2008-2013) fell into. Next, we calculated the percentage of pitches
located in each bin that were called balls, effectively identifying the probability that a pitch in
that location would be called a ball. Each bin’s probability ranged from 0 (100% strikes) to 1
(100% balls; see Fig. S1, which depicts a heat-map of these prior probabilities). Note that the
probabilities were calculated separately for left and right-handed batters, given known
differences in the practical strike zones based on handedness (15). Thus, for any given pitch, the
actual call made by the umpire can be compared to the long-run probability of pitches in that
same location being called a ball or a strike. This allows us to calculate the degree to which a call
deviated from that probability by subtracting the bin’s probability of a called ball from the
actual call (called ball = 1; called strike = 0). For instance, a called ball located in a bin where
80% of the called pitches were called balls (bin-probability = .80) would have a deviance value
of 0.20 (1.0 – 0.80 = 0.20). A called strike located in that same bin would have a deviance value
of –0.80. Thus, the range of possible deviance scores for a pitch goes from –0.999 (an extremely
unlikely called strike) to +0.999 (an extremely unlikely called ball). Put more simply, positive
deviance values reflect calls that were favorable to the batting team, and negative deviance
values reflect calls that were unfavorable to the batting team, relative to what would be expected.
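The two steps in this paragraph (estimating each bin's prior probability of a called ball, then scoring each call against it) can be sketched as below. The tuple layout and the names `handedness` and `bin_id` are assumptions made for illustration; the text does not specify how the data were stored.

```python
from collections import defaultdict


def bin_ball_probabilities(pitches):
    """Return {(handedness, bin_id): share of called pitches that were balls}.

    `pitches` is an iterable of (handedness, bin_id, is_ball) tuples, where
    is_ball is 1 for a called ball and 0 for a called strike. This layout is
    a hypothetical one chosen for the example.
    """
    totals = defaultdict(int)
    balls = defaultdict(int)
    for handedness, bin_id, is_ball in pitches:
        key = (handedness, bin_id)
        totals[key] += 1
        balls[key] += is_ball
    return {key: balls[key] / totals[key] for key in totals}


def deviance(is_ball, p_ball):
    """Actual call (ball = 1, strike = 0) minus the bin's prior probability."""
    return is_ball - p_ball
```

With a toy bin in which four of five called pitches were balls, the prior probability is .80, so a called ball scores approximately +0.20 and a called strike scores –0.80, matching the worked example in the text.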
Across the entire sample of pitches, the deviance scores form a distribution that is, by
definition, centered on zero, meaning that zero is the expected deviance value for any given pitch
or collection of pitches, and provides an absolute baseline to test for evidence of bias. That is, if
the average deviance score for a given collection of pitches is significantly different from zero,
that would be evidence of systematic bias in favor of one team, with the magnitude and valence
of the average indicating the amount and direction of bias, respectively. For instance, if the
average deviance for pitches thrown to the visiting team over the course of a single game was
+.03, that would roughly translate to ball-and-strike calls being an average of 3% more favorable
than expected.
Having an absolute comparison point is helpful, but relative comparisons are the most
relevant for the present analyses—particularly if a given umpire has a slightly more expansive
(or constrictive) strike zone than the league average. For instance, imagine a situation where the
visiting and home teams had average deviance scores of +.03 and +.06, respectively, for a single
game. The fact that both scores are positive would indicate that the umpire generally employed a
slightly smaller strike zone (more called balls) than the league average, but the home team’s
relatively larger score indicates that it received more favorable calls than the visitors, on average.
Thus, regardless of how a given umpire’s strike zone compares to the league as a whole,
comparing the deviation scores of opposing teams can reveal which team, if any, was the
recipient of undue generosity. (We also deal with umpire- and game-level variation statistically.)
Thus, examining these deviance values in the aggregate allows us to detect systematic shifts
in the umpires’ calls before and after an ejection, and whether they favor one team over another.
It is worth noting that, because umpires are generally quite accurate in their calls—as depicted by
the relatively small band of ambiguity in Fig. S1—the vast majority of pitches show a fairly
small amount of deviation from their prior probability. Thus, examining aggregated outcomes
represents a very conservative test of a hypothesis of shifting favorability.
Calculation of Bin Ambiguity
For each pitch, the prior probability (P) of being called a ball ranges from 0 (all strikes) to 1
(all balls), with 0.5 representing equal likelihood of being called a ball or strike. Thus, pitches in
bins with a prior probability closer to 0.5 would be considered more ambiguous. The ambiguity
index was thus calculated using the following formula:
$$\text{Ambiguity} = -2\,\lvert P - 0.5 \rvert + 0.5$$
Although this index would theoretically range from –0.5 (no ambiguity) to +0.5 (complete
ambiguity—equal likelihood of called ball or strike), because pitches located in bins with zero
variability were excluded from the analyses, the lower end of the range of possible ambiguity
scores was actually –0.4967. It is worth noting that the prior probability for each pitch factors
into the calculation of both the ambiguity index and the deviation measure, and thus factors into
both sides of the regression equation. By its very definition, there are limits to the amount of
variation that can be observed for very high and low probability values (i.e. the highly
unambiguous pitches). Nonetheless, observing differences in the predictive power of that pitch’s
ambiguity based on whether the ejected or non-ejected team was batting and whether it occurred
in the pre- or post-ejection period should be informative.
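As a small illustration, the index can be computed as below. The sketch assumes the index is linear in the distance of P from 0.5, a form chosen because it reproduces the stated range of –0.5 (no ambiguity) to +0.5 (complete ambiguity).

```python
def ambiguity_index(p_ball):
    """Ambiguity of a bin given its prior probability of a called ball.

    Assumed form: 0.5 - 2 * |P - 0.5|, so P = 0.5 gives +0.5 and
    P = 0 or P = 1 gives -0.5.
    """
    return 0.5 - 2.0 * abs(p_ball - 0.5)
```

Under this form the index is symmetric, so bins with prior probabilities of .20 and .80 receive the same ambiguity score.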
Calculation of Offensive Statistics
To calculate on-base percentage (OBP), each at-bat was categorized as resulting in the
batter getting on base (coded as 1) or not (coded as 0). This allows us to examine OBP at the
level of the at-bat—essentially the likelihood that the batter reached base. According to standard
scorekeeping procedure, some at-bats are not figured into the calculation of OBP, such as when
the batter successfully executes a sacrifice bunt, or is given a base due to catcher’s interference.
These instances were coded as missing, and were thus not included in the analysis.
We also examined the number of runs scored per at-bat (R/AB), which is simply the number
of runs scored by the batting team during a given at-bat. Note that we counted runs scored for
any reason, not just runs resulting from the batter getting a hit, or even a sacrifice (bunt or fly). A
run scored from a player stealing home on a passed ball, or from the umpire issuing a walk with
the bases loaded, was also counted.
Coding Ejection Type
Although baseball players and managers frequently express their displeasure with
unfavorable calls, it is not obvious exactly how one should quantify such an expression, nor how
to identify those instances where the umpire is specifically accused of being biased. We believe
that the least ambiguous examples of such expressions are cases when a player or manager
argues a call with the umpire, and is subsequently ejected from the game. Although it is not
possible to know the exact content of the arguments that lead to ejections, the authors’ collective
lifetime of experience watching baseball² certainly suggests that the arguments typically involve
the player or manager accusing the umpire of a pattern of unfavorable (biased) calls, with the
latest pitch being only the most recent example.
² The Atlanta Braves and Seattle Mariners, respectively.
In order to identify the underlying reason for each ejection, which is not included in the data
provided by MLB, an independent coder categorized each ejection based on the UEFL
descriptions of the game context in which the ejection was made. Each ejection was coded as
resulting from an argument that was either related or unrelated to pitch location. The coder was
not blind to the hypotheses, but because the descriptions of the ejection event were completely
separate from the PITCHf/x data—meaning that the coder had no knowledge of the location of
the antecedent or subsequent pitches when making the determination—there was virtually no risk
of this knowledge biasing the coder’s judgments. The vast majority of the ejections were
unambiguous, such as a batter arguing that a called third strike should have been called a ball
(pitch-related ejection), or the manager arguing that a runner called safe at first base should have
been called out (other ejection). Some cases, however, introduced some difficulty, such as an
argument about a batter who begins to swing and then attempts to hold back the swing (a “check
swing”). In this case, if the umpire rules the ambiguous motion as a swing, then it is a strike
regardless of the location. If instead it is ruled as a non-swing, then the umpire must judge it a
ball or a strike based on the pitch location—meaning that the pitching team could argue about the
lack of a check swing call, and either team could argue about a called ball or strike. In these
cases, the coder attempted to discern, based on the context of the ejection and the description of
the ejection, whether the true crux of the argument was about the location of the pitch or about
something else. Any ambiguity was resolved through discussion with the authors prior to
examining the actual pitch data.
Selection of Games and Pitches:
Of the 1,126 ejections, 489 (43.4%) were coded as being related to pitch-location, and 637
(56.6%) were coded as “other.” We excluded from the analysis any game with more than one
ejection event, or when members of both teams were ejected, so that it would be clear which
team suffered the ejection, and exactly when it occurred. That is, if several members of the same
team were ejected as a result of the same at-bat (e.g. the umpire ejects both the batter and
manager of the batting team for arguing the same called third strike), this would be considered a
single ejection event. If two members of the same team were ejected after different at-bats (e.g. a
batter was ejected immediately after arguing a called third strike, but the manager was ejected
after arguing about a call several at-bats later), then it would be considered two ejection events.
Of the 815 (72.4%) games that met the definition of a single ejection event, we further excluded
games where either team did not have at least one at-bat with a called pitch before and after the
ejection, leaving a total of 707 games in the final data set (n = 311 involving pitch-related
ejections, n = 396 involving other kinds of ejections). It is worth noting that the results do not
change if the full set of 815 games is included in the analyses.
From that set of games, there were 110,806 called pitches with valid x and z coordinates.³
To ensure the robustness of the analysis, we excluded pitches falling in bins that contained too
few pitches (fewer than 100) to be confident that the prior probability was reasonably accurate
(15,398 pitches). These pitches were virtually all called balls (99.94%), indicating that they were
well outside the strike zone. We also excluded pitches from the analysis where there was zero
variability in the bin (100% called balls or 100% called strikes; 17,195 pitches), the vast majority
of which (95.00%) were called balls. After these exclusions, the final data set we examined
consisted of 79,220 pitches, which we can be certain required at least some interpretation on the
part of the umpire, defined quite conservatively.
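The two bin-level exclusion rules can be expressed as a single eligibility check. This is a sketch only; the function name and arguments are hypothetical.

```python
MIN_PITCHES_PER_BIN = 100  # threshold stated in the text


def bin_is_eligible(n_pitches, p_ball):
    """Keep a bin only if it is well populated and its calls are not unanimous.

    n_pitches: number of called pitches observed in the bin.
    p_ball: share of those pitches that were called balls.
    """
    has_enough_pitches = n_pitches >= MIN_PITCHES_PER_BIN
    has_variability = 0.0 < p_ball < 1.0
    return has_enough_pitches and has_variability
```

Pitches would then be retained only if they fall in a bin for which this check returns true.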
³ Some data were missing because of occasional technical errors in the PITCHf/x system and
because the system was not fully deployed in every stadium until midway through the 2008
season.
Statistical Analysis:
Examining the effects of these independent variables involves comparing the outcomes of
opposing teams within the same game, with many of the outcomes determined by the umpire
behind home plate, who also officiated other games in the data set. By using outcomes between
opposing teams within the same game, any fixed bias in outcomes for that particular game or that
particular umpire should not be problematic (such as a given umpire’s tendency to have a larger
or smaller strike zone than the league average). However, because observations from the same
game came from the same umpire, they would violate the assumption of independence
underlying standard linear models or ANOVA. Thus, we employed linear mixed-effects models
for all analyses for Study 1, treating the three main independent variables as fixed effects.
To aid the interpretation of model parameters, we used contrast coding rather than dummy
coding for all categorical variables (i.e. Pitch-related ejections: +0.5, Other types of ejections:
–0.5; Ejected team: +0.5, Non-ejected team: –0.5; Post-ejection period: +0.5, Pre-ejection period:
–0.5).
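To illustrate why ±0.5 contrast codes aid interpretation, the sketch below recovers the coefficients of a saturated 2 × 2 model directly from four cell means. It ignores the random-effects structure entirely and uses hypothetical cell means, so it shows only how the coefficients read under this coding, not how the reported estimates were obtained.

```python
def contrast_coefficients(cell_means):
    """Coefficients of a saturated 2 x 2 model under +0.5 / -0.5 coding.

    `cell_means` maps (team_code, period_code) -> mean deviance, where
    team_code is +0.5 for the ejected team and period_code is +0.5 for
    the post-ejection period.
    """
    m_pp = cell_means[(+0.5, +0.5)]  # ejected team, post-ejection
    m_pm = cell_means[(+0.5, -0.5)]  # ejected team, pre-ejection
    m_mp = cell_means[(-0.5, +0.5)]  # non-ejected team, post-ejection
    m_mm = cell_means[(-0.5, -0.5)]  # non-ejected team, pre-ejection
    intercept = (m_pp + m_pm + m_mp + m_mm) / 4.0
    team = ((m_pp + m_pm) - (m_mp + m_mm)) / 2.0
    period = ((m_pp + m_mp) - (m_pm + m_mm)) / 2.0
    interaction = (m_pp - m_pm) - (m_mp - m_mm)
    return intercept, team, period, interaction


# Hypothetical cell means chosen only to show a crossover pattern.
example = {
    (+0.5, +0.5): 0.015,
    (+0.5, -0.5): -0.015,
    (-0.5, +0.5): -0.015,
    (-0.5, -0.5): 0.015,
}
```

With this coding, the intercept is the unweighted grand mean of the cells, each main effect is a difference between marginal means, and the interaction is a difference of differences. In the hypothetical example both main effects are zero and the interaction is 0.06.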
All analyses were conducted in R (16), using the lme4 package (17) for the linear mixed-
effects models and confidence intervals, and the lmerTest package (18) to calculate p values
(using Satterthwaite’s approximations for the degrees of freedom), estimate cell means, and to
perform any post hoc or pairwise comparisons. For any post hoc or pairwise comparisons, p
values were corrected for multiple comparisons using the Holm-Bonferroni procedure (12);
corrected p values are indicated with a subscript (i.e. phb).
For the random-effects structure, we began with a maximal model (19), which we identified
as having random intercepts for home plate umpire and for game (nested within umpire),
allowing for the effects of Batting Team, Period of Game, and their interaction to vary within
individual games (i.e. random slopes). Although the maximal model also allowed for correlated
slopes and intercepts, the model would not reliably converge unless those were dropped from the
model. It is worth noting that in the few models with correlated slopes and intercepts that did
converge, the model fit was no better than a model without correlated slopes and intercepts, as
evidenced by a likelihood ratio test, χ²(6) = 6.86, p = .334. The final model we employed should
thus account for any non-independence of the individual observations, while also allowing us to
examine or control for the impact of pitch, at-bat, and game-level variables (e.g. whether there
was an impact of game attendance on deviance scores).
The structure defined above was used for all analyses of the pitch-level deviance scores.
Because the two at-bat-level measures (OBP and R/AB) involved non-normal data, the analyses
for those measures involved generalized linear mixed-effects models, specifically a mixed-
effects binary logistic regression for OBP, and mixed-effects Poisson regression for R/AB. In
both cases, even the slightly reduced version of the maximal model failed to converge, so the
complexity of the random-effects structure was selectively reduced until convergence could be
achieved. This resulted in a model with random intercepts for game, and random slopes for
Batting Team, Period of Game, and their interaction within games (without allowing for
correlated slopes and intercepts). In other words, only the random intercepts for umpires were
dropped from the model.
For the pitch-level deviance measure, we first tested a linear mixed-effects model featuring
a 2 (Batting Team: Non-ejected team vs. Ejected team) × 2 (Game Period: Pre-ejection vs. Post-
ejection) × 2 (Ejection Type: Pitch-related vs. Other kinds of ejections) design for the fixed
effects. The results of this model are presented in Table S1. Based on the significant three-way
interaction, we conducted a linear mixed-effects model testing the effects of Batting Team and
Game Period separately for pitch-related ejections (see Table S2) and other kinds of ejections
(see Table S3), which is what is reported in the main text. Based on the results of the deviance
measure, for the at-bat-level measures, we only conducted the analyses on the subset of data with
pitch-related ejections.
Mediation analysis. As reported in the main text, we conducted a mediation analysis to
confirm that the observed relative improvements in at-bat-level offensive outcomes (OBP and
R/AB) by the ejected team after the ejection do in fact result from shifts in the umpires’ behavior
(as measured by the deviance scores), rather than being solely due to regression to the mean.
That is, the observed pattern of effects for OBP and R/AB—a mere attenuation of the pre-
ejection bias—could be explained by regression to the mean. The pitch-level deviance scores,
however, show a reversal (not attenuation) of the pre-ejection bias in the post-ejection period, so
a regression-to-the-mean explanation does not apply. Thus, if we can demonstrate that the effect
of the ejection on the at-bat-level variables was statistically mediated by deviance scores, then
we can be reasonably sure that those effects were not merely a regression to the mean. In order to
ensure that the mediator (deviance scores) and the dependent variables (OBP and R/AB) were all
operating at the level of the at-bat, we calculated at-bat-level deviance scores as the mean
deviance score for each at-bat. For this analysis, we treated the batting team × period of game
interaction as the independent variable. Consistent with the pitch-level analysis, there was a
significant effect of the independent variable (batting team × period of game) on the mediator
(at-bat-level deviance scores), b = 0.040, 95% CI [0.023, 0.058], t(328.20) = 4.52, p < .001.
When the mediator was included in the same analyses described above predicting OBP (mixed-
effects binary logistic regression) and R/AB (mixed-effects Poisson regression), it was a strong
predictor in both cases (ps < .001). To test the significance of the indirect effect (the product of
the effect of the independent variable on the mediator, and the effect of the mediator on the
dependent variable, controlling for the independent variable), we calculated Monte Carlo
confidence intervals with 50,000 repetitions (MacKinnon, Lockwood, & Williams, 2004;
Preacher & Selig, 2012), which did not include zero in either case, as reported in the main text
(OBP: 0.039, 95% CI [0.022, 0.058]; R/AB: 0.016, 95% CI [0.008, 0.027]).
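The Monte Carlo procedure for the indirect effect can be sketched as follows. This is a generic illustration of the method rather than the authors' code: it assumes the two path estimates have independent normal sampling distributions, and the function name, the seed, and the values used for the second path in the example are hypothetical.

```python
import random


def monte_carlo_indirect_ci(a, se_a, b, se_b, reps=50_000, level=0.95, seed=1):
    """Percentile confidence interval for the indirect effect a * b.

    a, se_a: estimate and standard error for the path from the independent
             variable to the mediator.
    b, se_b: estimate and standard error for the path from the mediator to
             the outcome, controlling for the independent variable.
    Assumes independent normal sampling distributions for both paths.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(reps)
    )
    tail = (1.0 - level) / 2.0
    lower = products[int(tail * reps)]
    upper = products[int((1.0 - tail) * reps) - 1]
    return lower, upper


# a = 0.040 is the reported effect on the mediator; se_a is roughly what the
# reported interval implies. The b-path values are hypothetical placeholders.
lower, upper = monte_carlo_indirect_ci(a=0.040, se_a=0.009, b=1.0, se_b=0.1)
```

An interval that excludes zero is taken as evidence of a significant indirect effect.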
Supplementary Text (Study 1)
Examining the At-Bat Prompting the Ejection.
For all analyses, we examined only outcomes occurring during the pre- and post-ejection
periods, excluding the at-bat that prompted the ejection (hereafter referred to as the ejection at-
bat). To confirm that the ejection at-bat featured particularly egregious calls from the perspective
of the team that suffered the ejection, at least for pitch-related ejections, we examined the
deviance scores of pitches thrown during those at-bats (n = 1,236 pitches). This would most
clearly be evident in highly unlikely strike (vs. ball) calls when the ejected team was batting (vs.
pitching) during the ejection at-bat. Indeed, there was a significant interaction between Ejection
Type (Pitch-related vs. Other) and Ejected Team Role (Batting vs. Pitching) on the deviance
measure, b = 0.384, 95% CI [0.298, 0.470], t(901.90) = 8.74, p < .001 (see Fig. S2). For pitch-
related ejections, the average deviance score was considerably lower when the ejected team was
batting (M = –0.210, 95% CI [–0.241, –0.179]) compared to when the ejected team was pitching
(M = 0.194, 95% CI [0.145, 0.243]), t(944.0) = –13.80, phb < .001. For other types of ejections,
there was no difference in deviation scores depending on whether the ejected team was batting
(M = –0.007, 95% CI [–0.050, 0.036]) or pitching (M = 0.013, 95% CI [–0.035, 0.061]) at the
time of the ejection, t(1007.7) = –0.62, phb = .535. As expected, the at-bat prior to a pitch-related
ejection featured calls that were highly unfavorable for the team that was ultimately ejected
(unlikely strike calls for the batting team, unlikely ball calls for the pitching team), suggesting
that it was these egregious calls that ultimately prompted the ejection-inducing argument.
Control variables.
The interaction between batting team (ejected vs. non-ejected) and period of game (pre- vs.
post-ejection) holds when controlling for characteristics of the pitch location (prior probability
within the bin, number of pitches in the bin; p < .0001), characteristics of the current at-bat and
game situation (current count of balls and strikes, number of outs, number of runners in scoring
position; p < .0001), as well as other metrics of situational pressure and importance, such as Win
Expectancy (WE; probability the batting team would win), Run Expectancy (RE; probability the
batting team would score a run during this at-bat), and Leverage Index (LI; an index of the
importance of the situation), p < .0001. Although the home plate umpire issued pitch-related
ejections in all but two cases (99.4% of games), the interaction also held when limiting the
analysis to ejections issued by the home plate umpire (p < .0001).
Considering a continuous measure of game period.
The models described above and reported in the main text treat Period of Game as a
categorical variable (i.e. pre- vs. post-ejection), largely as a matter of simplicity. Such an
approach implicitly assumes that any bias exhibited by the umpire (i.e. favoring one team over
another) is constant within the pre- and post-ejection periods, but it’s worth considering whether
the reality is more complicated than can be accommodated by a dichotomous variable. For
instance, it could be that the apparent reversal in bias in the post-ejection period is limited to a
few at-bats immediately following the ejection—the umpire could issue a few make-up calls to
assuage the ejected team’s anger before reverting to a more neutral (unbiased) baseline, or
perhaps even returning to the previous pattern of bias.
To test this possibility, we first created a continuous measure of game period by calculating
the distance from the current outcome to the ejection event (hereafter referred to as AB distance)
by subtracting the ejection at-bat from the current at-bat. For instance, if the ejection occurred
during the 25th at-bat of the game, then pitches thrown during the 29th at-bat would all have a
distance score of +4 (29 – 25), and all pitches thrown during the 12th at-bat would have a
distance score of –13 (12 – 25). Thus, all events in the pre-ejection period have negative values,
and all events in the post-ejection period have positive values.
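A minimal sketch of the AB distance measure, defined so that pre-ejection at-bats are negative and post-ejection at-bats are positive. The function names are hypothetical.

```python
def ab_distance(current_at_bat, ejection_at_bat):
    """Signed distance, in at-bats, from the ejection at-bat."""
    return current_at_bat - ejection_at_bat


def game_period(distance):
    """Categorical period implied by the signed distance.

    Returns None for the ejection at-bat itself, which is excluded
    from the analyses.
    """
    if distance < 0:
        return "pre"
    if distance > 0:
        return "post"
    return None
```

For an ejection during the 25th at-bat, the 29th at-bat has a distance of +4 and the 12th at-bat has a distance of –13.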
First, in a model with batting team, AB distance, and their interaction predicting deviance
scores (pitch-related ejections only), the batting team × AB distance interaction was significant,
b = 0.0005, 95% CI [0.0002, 0.0007], t(237.13) = 3.62, p < .001, which is consistent with the
categorical variable (see Fig. S3, top panel). However, because it’s likely that any irregularities
would be non-linear, we also considered a model testing linear, quadratic, and cubic versions of
the AB distance measure (and each version’s interaction with batting team). Intriguingly, all
three of the interaction terms were significant (Linear: p < .001; Quadratic: p = .033; Cubic: p =
.012; see Table S4). As can be seen in Fig. S3 (bottom panel), which depicts the predicted values
from this model 50 at-bats before and after the ejection, there is some evidence of non-linearity
in the relative favorability of the umpire’s calls, but those seem to occur primarily at more
extreme values of AB distance, which are also the least represented in the data (hence the
increasingly large confidence intervals).
To take a slightly different approach, we examined the post-ejection period separately to see
if the bias in favor of the ejected team (relative to the non-ejected team) remained constant as the
distance from the ejection at-bat increased. Consistent with a constant effect, in a model with
batting team, AB distance, and their interaction, there was the expected main effect of batting
team, b = 0.021, 95% CI [0.009, 0.033], t(255.95) = 3.51, p < .001, but no interaction between
batting team and AB distance, b = –0.0002, 95% CI [–0.0009, 0.0005], t(2482.39) = –0.50, p =
.615. A similar model including linear, quadratic, and cubic terms for AB distance showed the
same result: a main effect for batting team (p < .001), but not for any terms involving AB
distance (all ps > .14). Based on these two results, it appears that the shift in bias after the
ejection remains—and remains relatively constant—throughout the game, and that treating game
period as a categorical variable is a valid approach.
Testing potential moderators.
We began by considering whether properties of the ejection itself may have moderated the
observed reversal in bias exhibited by the umpires in games featuring pitch-related ejections.
Although there is evidence that much of the proverbial home team advantage is due to more
favorable calls from the umpires (20), there was no evidence that the shift in the umpires’ favor
after an ejection was solely granted to the home team. Indeed, the shift in favor of the ejected
team was evident whether it was the home team or the away team that was ejected (both ps <
.001). None of the other properties of the ejection we examined moderated the effect, such as
whether the ejected team was batting or pitching at the time of the ejection, or whether the
person ejected was a manager or a player. In each case, the interaction between batting team
(Ejected vs. Non-Ejected) and period of game (Pre vs. Post Ejection) remained significant (all ps
< .001), but there was no significant three-way interaction with the moderator (all ps > .17).
We also tested whether variables related to the game itself might have mattered, including
the ambient temperature (see 21), game duration (in minutes; log transformed due to right skew),
and game attendance. Neither temperature nor game duration moderated the effect (both ps >
.11), nor diminished it (interaction: all ps < .0001). Game attendance (centered on the grand
mean, 31,311.74), however, did show some promise as a moderator. Although the interaction
between batting team and period of game remained significant (p < .0001), there was also a
three-way interaction with game attendance (p = .018). The interaction was such that the basic
effect—bias against the ejected team prior to the ejection, and in favor of the ejected team after
the ejection—was larger when attendance was high, and nearly absent when it was low.
However, we hesitate to draw strong conclusions from this finding, as attendance is no doubt
correlated with other relevant variables, such as the home team’s current record, or the
importance of the game.
We also tested whether variables related to the importance and impact of the situation
surrounding the ejection at-bat moderated the effect. That is, it’s possible that umpires may exhibit a stronger bias in
favor of the ejected team when the ejection came at a particularly bad time for that team.
Specifically, we tested the score differential prior to the ejection, and situational importance
metrics associated with the ejection at-bat, including RE, RE24 (the change in RE as a result of
the ejection at-bat), LI, WE, and WPA (Win Probability Added; the change in WE as a result of
the ejection at-bat). In every case, the interaction between batting team (ejected vs. non-ejected)
and period of game (pre- vs. post-ejection) remained significant (all ps < .0001), but there was no
significant three-way interaction with the moderator (all ps > .34).
Finally, we tested whether the umpires’ calls might be sensitive to the importance of the
current situation (specifically the current at-bat’s WE, LI, and RE), especially depending on
which team is batting. For instance, the shift in bias from the pre- to post-ejection period might
only occur for relatively unimportant situations, perhaps to avoid having an undue influence on
the game in very important situations. However, there was no evidence of any sensitivity to
context. For all three of the variables we tested, the interaction between batting team (ejected vs.
non-ejected) and period of game (pre- vs. post-ejection) remained significant (all ps < .0001), but
the three-way interaction was not significant (all ps > .16).
Materials and Methods (Study 2)
Experimental Design
The study employed a 2 (Feedback: Control vs. Critical) × 2 (Period of Study: Pre- vs. Post-
Critical Feedback) mixed factorial design, with the first factor manipulated between participants and
the second factor manipulated within participants.
Participants.
We recruited 100 participants (64 male, 36 female) from Amazon.com’s Mechanical Turk
to play a “visual estimation game” with incentives for accuracy. Data collection stopped upon
reaching the target sample size (N = 100), and the data were not examined prior to that point.
We excluded participants who did not appear to make a reasonable effort at the task by
setting minimum standards for accuracy (at least 55%) and variable responding (at most 90% of
responses could be in the same direction). Based on these criteria, which were the only criteria
we considered, seven participants were excluded from the analyses, though including them does
not change the outcome of any analysis.
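The two exclusion criteria can be written as a single check. This is a sketch with hypothetical names; it uses integer arithmetic so that the 55% and 90% thresholds are compared exactly.

```python
def passes_effort_checks(n_correct, n_higher, n_trials=100):
    """Apply the accuracy and variable-responding criteria from the text.

    n_correct: number of correct responses.
    n_higher: number of "higher" responses (the rest are "lower").
    """
    # At least 55% accuracy.
    accurate_enough = n_correct * 100 >= 55 * n_trials
    # At most 90% of responses in the same direction.
    most_common_direction = max(n_higher, n_trials - n_higher)
    varied_enough = most_common_direction * 100 <= 90 * n_trials
    return accurate_enough and varied_enough
```

A participant with 76 correct responses and an even split of directions passes, while one who gave 95 responses in the same direction does not.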
Dot-Estimation Task.
After consenting to participate, participants were introduced to a game designed to test
“perceptual acuity.” The game required participants to make perceptual judgments similar to those
made by umpires in Study 1. Participants completed 10 blocks of 10 trials. To begin, for each
block, participants were assigned a target number that varied randomly between 12 and 20. Then,
for each of the ten trials within the block, an array of dots was flashed on the screen and
participants had to judge whether the number of dots in the array was higher or lower than the
target number (indicating their response with a key press). The dot arrays were randomly
generated for each trial such that the actual number of dots was within a certain range (between
5% and 25%) of the target number (but never equal to the target number), ensuring variability in
difficulty. Because the number of dots was generated randomly, the number of trials in each
block where the correct response was “higher” also varied (ranging from 0-10, but typically 3-7).
This was made explicit to participants to ensure they were not deliberately giving an equal
number of higher and lower responses.
On each trial, the target number was displayed on the screen for 600ms, followed by the dot
image for 400ms. Participants were required to respond within 2 seconds, or they had to repeat
the trial with a newly generated dot array. This was intended both to make the task somewhat
difficult and to encourage snap judgments. Prior to starting the game, participants completed a
practice block of easy trials, on which they were given feedback about their performance, to
ensure they understood the game.
Over the 100 trials, participants averaged 75.91% accuracy, 95% CI [74.46%, 77.37%],
which was significantly greater than chance, t(92) = 35.40, p < .001, but nowhere near a ceiling
effect. Thus, the task was neither impossibly difficult nor trivially easy.
Partner Description and Incentive Structure.
Participants were told that they had been assigned the role of Judge, and were paired with a
partner who had been assigned to the role of Observer, whose role was to watch the Judge’s
performance and provide feedback after each block of trials. In truth, the partner did not exist,
and all feedback provided by the Observer was bogus.
The incentives for participants assigned to the role of Judge (i.e. all actual participants) were
based on accuracy. In addition to their base pay, participants received an additional $0.02 for
each correct response. To penalize guessing, this bonus was only awarded for the number of
correct responses above chance (50%). Thus, over the course of 100 trials, participants with
perfect accuracy would earn a bonus of $1.00 ($0.02 for each of 50 correct responses above
chance), and participants with a mere 60% accuracy would earn $0.20 ($0.02 for each of 10
correct responses above chance). Accuracy of 50% or less would earn no bonus. Participants
were not given feedback about their performance, and therefore their bonus, until the very end of
the experiment.
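The Judge's bonus rule can be sketched as below, working in cents to avoid floating-point rounding. The function name is hypothetical.

```python
BONUS_CENTS_PER_CORRECT = 2  # $0.02 per correct response above chance


def judge_bonus_cents(n_correct, n_trials=100):
    """Bonus in cents: paid only for correct responses above chance (50%)."""
    chance_level = n_trials // 2
    return max(0, n_correct - chance_level) * BONUS_CENTS_PER_CORRECT
```

This reproduces the examples in the text: 100 correct responses earn 100 cents ($1.00), 60 correct responses earn 20 cents ($0.20), and 50 or fewer earn nothing.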
The Observer’s monetary incentives were also explained to participants. Whereas the Judge
(participant) was paid for accuracy, the Observer (partner) was paid based on the direction of the
response (i.e. higher or lower) given by the Judge, regardless of its accuracy. For instance, for
each “higher” response, the Observer’s bonus would increase by $0.01, and for each “lower”
response, it would decrease by $0.01. The particular response associated with a positive or
negative outcome was counterbalanced. As in the main text, the response that yielded a higher
monetary bonus for the Observer is referred to as the “directional” response. Thus, if the
participant gave 64 directional responses out of 100 trials, the Observer would earn a bonus of
$0.14. With 50 directional responses or fewer, the Observer’s bonus would be zero. This
misaligned incentive structure was designed to make it clear that the Observer’s feedback,
particularly any instruction to give more directional responses, might be purely self-serving—to
the detriment of the participant’s own interests.
To ensure that participants clearly understood both their own and their partner’s incentives,
participants were required to pass a “quiz” about the incentive structure before the first block of
trials.
Feedback Manipulation.
At the end of each block of trials, participants received some feedback ostensibly written by
their partner. In the beginning, the feedback was relatively neutral (e.g. “man those dots move
fast. Nice!”) or served to remind the participant of the partner’s incentives (e.g. “ur getting
quick! Remember, higher is better! jk”). After block 5, the feedback diverged depending on
condition. Participants in the critical feedback condition began receiving feedback critical of
their performance, but always suggesting that the participant’s errors were systematically biased
in a direction that hurt the partner’s bonus (e.g. “too many lows! I think you might have missed a
couple of highs there! help us both out!”). This critical feedback continued and intensified (e.g.
“what, do you have something against highs??” and “you’re killing me here!”) until the final
block of trials. Participants in the control condition received feedback that did not point to a
particular directional bias (e.g. “so many highs and lows! it’s hard to keep up with all of them.
sorry i can't be more help!” and “i’m getting tired! keep your head in the game, ur almost done”).
The final feedback, after the last block of trials, was somewhat neutral and identical in both
conditions. Thus, participants responded to 50 trials before and 50 trials after the critical
feedback began, allowing us to compare responses not just between conditions, but also over
time.
Explicit Beliefs and Manipulation Checks
After each block of trials, following the partner feedback, participants estimated the number
of correct responses (from 0 to 10) they gave, as well as the number of higher/lower responses
they gave in that block. These explicit estimates allowed us to ascertain whether participants
were aware of any bias creeping into their responses.
After the final block of trials and the last round of feedback, participants evaluated both
their own and their partner’s overall ability at the dot estimation task (1 = Very Poor; 6 =
Average; 11 = Very Good), and estimated the number of trials, out of 100, that they thought their
partner would have answered correctly. These items were intended to rule out the possibility that
participants began to doubt their own abilities in the face of critical feedback. To confirm that
this is not a likely explanation for the results, we conducted a 2 (Feedback: Control vs. Critical)
× 2 (Target: Self vs. Other) mixed-model ANOVA, with feedback as a between-participants
variable, and target as a within-participants variable. As expected, there was a strong main effect
of target, F(1,91) = 29.18, p < .001, ηp² = .092, indicating that participants clearly thought that
they were more skilled than their partner at the game. More importantly, this main effect was
qualified by a significant feedback × target interaction, F(1,91) = 4.47, p = .037, ηp² = .047 (see
Fig. S7). Specifically, although participants in the control condition did indeed think more highly
of their own abilities, t(91) = 2.12, phb = .036, this tendency was exaggerated in the critical
feedback condition, t(91) = 5.40, phb < .001, suggesting that participants only became more sure
of their own abilities (and more skeptical of their partner’s abilities) as a result of the critical
feedback. Furthermore, there was no difference in participants’ estimates of how many trials
their partner would have gotten correct (Critical feedback: M = 61.92, 95% CI [57.17, 66.67];
Control: M = 60.00, 95% CI [55.72, 64.28]), t < 1, ns. Thus, across multiple measures, we found
no support for the idea that critical feedback led participants to doubt their abilities. In fact, just
the opposite appeared to be true: as described above, participants gave higher estimates of their
own abilities in the critical feedback condition compared to participants in the control condition,
t(91) = 2.86, phb = .010.
During the last block of questions, participants also indicated the degree to which they
thought their “partner was judging the dot images accurately, or had a biased perspective” (1 =
Definitely biased to judge Lower; 6 = Partner’s judgment was accurate; 11 = Definitely biased to
judge Higher). This item, intended as a check on the incentive structure, was reverse-scored for
participants whose partner’s incentive was for “lower” responses, so that higher numbers always
indicated greater bias. (The direction of the partner’s incentive did not impact perceptions of
bias, p = .106.) Overall, participants did perceive a great deal of bias in their partner (M = 7.58,
95% CI [7.07, 8.09]), as confirmed by a one-sample t-test against the scale midpoint, t(92) =
6.21, p < .001. However, this belief was stronger for participants in the critical feedback
condition (M = 8.44, 95% CI [7.67, 9.21]) than participants in the control condition (M = 6.58,
95% CI [6.07, 7.09]), t(82.47) = –4.05, p < .001, d = 0.814. This confirms that participants did in
fact perceive a bias consistent with the partner’s monetary incentives, and that the perception of
bias was especially large when the partner appeared to give feedback consistent with her own
self-interest.
Participants also answered two questions related to the cover story, one about the degree to
which their partner motivated them (1 = Not at all; 11 = A great deal), and one about how
pleasant it was to have someone watching their performance (1 = Extremely Unpleasant; 11 =
Extremely Pleasant). Although it was not explicitly intended as such, participants’ responses to
this last question can speak to the possibility that participants liked their partner, and made more
directional responses in order to ensure that she got a reasonable bonus. Contradicting that
account, participants in the critical feedback condition reported that it was less pleasant to have
someone else watching their performance (M = 5.06, 95% CI [4.28, 5.84]), compared to
participants in the control condition (M = 6.33, 95% CI [5.75, 6.90]), t(91) = 2.56, p = .012, d =
.680.
Finally, after providing basic demographic information (gender, age, race, household
income), participants were asked for “any comments or thoughts you might have about your
experience in the task, the experiment, the incentives, or about your partner.” Some participants
expressed mild suspicion about whether their partner actually existed, but none with certainty, so
no one was excluded from the analyses based on suspicion.
Statistical Analysis
All analyses were conducted in R (16). For all tests of simple effects, p values were
corrected for multiple comparisons using the Holm-Bonferroni procedure (12); corrected p
values are indicated with a subscript (i.e. phb).
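For readers unfamiliar with the Holm-Bonferroni step-down procedure, the following is a minimal sketch in Python. The function name `holm_adjust` and the example p values are hypothetical and are not taken from the study; in R, the built-in `p.adjust` function offers a "holm" method, though the text does not state which routine was used.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order.

    The smallest p value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p values for three simple-effects tests.
print(holm_adjust([0.01, 0.04, 0.03]))
```

The running maximum is what keeps a larger raw p value from ending up with a smaller adjusted value than a smaller raw p value, which is why two tests in the same family can share an identical corrected value.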
As reported in the main text, we tested the effect of the feedback manipulation by
conducting a 2 (Feedback: Critical Feedback vs. Control) × 2 (Timing: Pre- vs. Post-Feedback)
mixed-model ANOVA, with feedback as a between-subjects variable and timing as a within-
subjects variable.
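The effect sizes reported alongside the F tests can be related to the F statistic and its degrees of freedom. The sketch below assumes that the reported effect size is partial eta-squared, which the text does not state explicitly; the function name `partial_eta_squared` is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared recovered from an F statistic (assumed effect-size metric)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Feedback x target interaction reported as F(1,91) = 4.47.
print(round(partial_eta_squared(4.47, 1, 91), 3))  # 0.047
```

Under that assumption, the value recovered from F(1,91) = 4.47 agrees with the effect size reported for that interaction.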
Supplementary Text (Study 2)
Effects on Accuracy
Although our main prediction was about the shift in directional responses, we considered
whether the shift in directional responses had an impact on participants’ accuracy. The results of
a similar mixed-model ANOVA predicting accuracy of responses also yielded an interaction
between feedback condition and timing, F(1,91) = 4.97, p = .028, ηp² = .052. Participants in the
critical feedback condition became less accurate after the introduction of the critical feedback
(Mpre = 0.772, 95% CI [0.747, 0.798]; Mpost = 0.746, 95% CI [0.720, 0.771]), though this
difference was not statistically significant, t(91) = –1.76, phb = .164. Conversely, participants in
the control condition showed a non-significant improvement in accuracy (Mpre = 0.747, 95% CI
[0.720, 0.775]; Mpost = 0.771, 95% CI [0.744, 0.798]), t(91) = 1.44, phb = .164.
To see if participants were able to detect the shifts in accuracy, we conducted a 2
(Feedback: Critical Feedback vs. Control) × 2 (Timing: Pre- vs. Post-Feedback) × 2 (Response
Type: Actual vs. Estimated) mixed-model ANOVA on the proportion of correct responses, with
feedback as a between-subjects variable, and timing and response type as within-subjects
variables. This analysis revealed main effects of feedback condition (p = .016) and response type
(p < .001, indicating a general tendency to underestimate accuracy), as well as a two-way interaction
between feedback condition and response type (p = .022), all of which were qualified by a three-way
interaction, F(1,91) = 9.22, p = .003, ηp² = .092. The interaction was such that, despite
actually becoming less accurate after the critical feedback started (as described above),
participants in the critical feedback condition estimated that their accuracy increased, t(91) =
2.18, phb = .064. Participants in the control condition, however, did not change their estimates
over time, t(91) = 0.60, phb = .547.
Materials and Methods (Explicit Beliefs Study)
We recruited 90 participants from MTurk for a brief study about fairness. Participants were
randomly assigned to read one of three versions of the scenario: Disadvantage-to-self,
Advantage-to-self, or Self-as-reviewer. Each version described a situation where a manager is
found to be exhibiting a biased pattern of behavior (similar to that of the umpires in Study 1). As
an example, the disadvantage-to-self scenario is presented below:
Suppose you and a coworker are being evaluated for a promotion at work. Only one of
you will be promoted. To decide who gets the promotion, an independent reviewer from
Human Resources (HR) will be monitoring both your and your coworker’s performance
over a two-week period and judging it for its quality.
The reviewer evaluates your and your coworker’s work in this way: She keeps close
tabs on the amount that each of you worked and what you accomplished each day, and
assigns a score for each of you based on both criteria. Then, at the end of each day, you and
your coworker receive a “scorecard,” which displays the rating that each of you received
for the day. For example, on a particular day, you might receive a score of 4 to 3,
indicating that you outscored your coworker on that day. This scorecard system is intended
to keep both of you performing at your best. It’s also worth noting that the reviewer’s only
goal is to render a fair and accurate judgment.
By the halfway point in the judging period, you have noticed a bias in the daily scores.
It seems to you that your coworker and you are performing at more or less at the same
level, yet at the end of most days, your coworker receives a higher score than you. You are
concerned that if this continues you will be passed up for the promotion based upon these
skewed scores.
You don’t want to get the reviewer in trouble, but you decide to bring up the issue of
bias to the reviewer’s boss in HR. The boss takes a look at the scores, alongside the
concrete work that each of you has done, and decides that he, too, sees a bias that disfavors
you. He decides to get in touch with the reviewer and let her know that her reviews appear
to be biased.
Suppose that after talking to her boss, the reviewer agrees that her ratings have shown
bias. Because of the way the system works, she cannot change her past ratings, so she is
faced with a question about what is the right thing to do. There is one week remaining in the
review process and then the promotion decision will be made. What is the fair thing for the
reviewer to do?
a. To be fair, the reviewer should simply attempt to rid herself of bias, judging
you and your coworker on the merits of the work you both do for the next week
b. To be fair, the reviewer should keep using the same criteria to judge you and
your coworker, even if there is a bias in those criteria
c. To be fair, the reviewer should “reverse” her bias, so that she now favors you
over your coworker for an equal number of evaluations
What varied between conditions was the role in which participants imagined themselves to
be. In the disadvantage-to-self condition (above), participants imagined that they had been
disadvantaged as a result of the manager’s decisions; in the advantage-to-self condition,
participants imagined themselves to be the co-worker who had benefitted from the manager’s
bias; in the self-as-reviewer condition, participants imagined themselves in the role of the
manager. In every case, they answered a version of the question (above) appropriate to their role.
The different conditions were intended to allow for the possibility that self-interest would
color participants’ views of what was fair. Although the pattern of responses is consistent with
that notion, because the three conditions did not significantly differ, we collapsed across the
three versions of the scenario.
Fig. S1.
Heat map depicting the likelihood of ball or strike call based on pitch location as viewed from
the umpire’s perspective (Study 1). Probabilities are calculated from 2,212,150 called pitches in
regular season games between 2008-2013.
[Figure: two panels (Left-Handed Batters, Right-Handed Batters). Axes: X (normalized units) and Z (normalized units), each with ticks from -3 to 3. Color scale: Proportion of Called Balls, from 0% balls (100% strikes) to 100% balls (0% strikes).]
Fig. S2
Pitch-level deviance scores from pitches thrown during the ejection at-bat (Study 1). Error bars
represent +/– 1 standard error.
[Figure: X-axis: Ejection Type (Pitch-related ejections, Other kinds of ejections). Y-axis: Deviance score, from -0.3 to 0.3. Legend: Batting Team, Pitching Team.]
Fig. S3
Predicted values from a linear mixed-effects model predicting pitch-level deviance scores by
Batting Team and AB Distance (top panel; Study 1). The bottom panel depicts a model that
includes linear, quadratic, and cubic terms for AB distance. Shaded regions represent 95%
confidence intervals.
[Figure: two panels. X-axis: At-bat from ejection (ejection at-bat = 0), from -50 to 50. Y-axis: Deviation score (top panel: -0.04 to 0.04; bottom panel: -0.25 to 0.25). Legend: Ejected Team, Non-Ejected Team.]
Fig. S4
On-base percentage (OBP; Study 1). Error bars represent +/- 1 standard error.
Fig. S5
Runs scored per at-bat (R/AB; Study 1). Error bars represent +/- 1 standard error.
[Figure S4: X-axis: Pre/Post Ejection (Pre-Ejection Period, Post-Ejection Period). Y-axis: On-Base Percentage (OBP), from 0.26 to 0.40. Legend: Ejected Team, Non-Ejected Team.]
[Figure S5: X-axis: Pre/Post Ejection (Pre-Ejection Period, Post-Ejection Period). Y-axis: Runs scored per at-bat, from 0.00 to 0.20. Legend: Ejected Team, Non-Ejected Team.]
Fig. S6
Example image used in the dot-estimation task (Study 2).
Fig. S7
Participants’ perceptions of their own and their partner’s ability in the game by feedback
condition (Study 2). Error bars represent +/– 1 standard error.
[Figure: X-axis: Critical Feedback Condition (Control, Critical Feedback). Y-axis: Task Ability Evaluation, from 4 to 8. Legend: Self, Partner.]
Chapter 10
The Psychological Science of Spending Money
Travis J. Carter
T. J. Carter (*)
Department of Psychology, Colby College,
5550 Mayflower Hill Drive, Waterville, ME 04901-8855, USA
e-mail: tjcarter@colby.edu
E. Bijleveld and H. Aarts (eds.), The Psychological Science of Money,
DOI 10.1007/978-1-4939-0959-9_10, © Springer Science+Business Media New York 2014
Abstract This chapter discusses the psychological research related to the act of
spending money, with the aim of understanding the underlying psychological
processes involved. To that end, the emotions involved in spending money before,
during, and after the money changes hands are explored, including the role of anticipated
and anticipatory emotions, different orientations to the gains and losses inherent
in an act of spending, and the process of hedonic adaptation. Additionally, given
how fundamental choice is to the act of spending money, factors that influence
the decision-making process are discussed, including the role that comparative
processes and expectations play in the process of making decisions and evaluating
their outcomes. In each case, particular attention is paid to the psychological forces
that influence the ultimate goal underlying any act of spending: happiness. Finally,
several concrete strategies for making purchases most likely to lead to success on
this goal are identified, including purchasing experiences over possessions, spending
pro-socially, and making meaningful purchases.
The Act of Spending Money
The act of spending money is absolutely ubiquitous in modern life. It is the primary
way that we meet our basic needs, spending it on food, clothing, shelter, health care,
transportation, and entertainment, and is so ingrained in modern life that we rarely
reflect on what that act represents. At its most basic level, the act of spending is
nothing more than an exchange: one person gives money to another and receives
some good or service in return. This definition is serviceably descriptive, but omits
any psychological antecedents or consequences for the spender. For one thing, it
leaves out the element of choice. Money isn’t spent by accident, the result of tripping
over an errant shoelace; one chooses to exchange money for some particular purchase
instead of other possible purchases, or instead of purchasing nothing at all.
Choices are made with a purpose, intended to create some outcome. That particular
choice is based on the belief that the purchase will produce a greater hedonic
benefit (for oneself, or for others) than the alternatives over some period of time
(Mellers & McGraw, 2001; Mellers, Schwartz, & Ritov, 1999). In addition to that
expected hedonic gain, spending money also inherently involves costs. There is
obviously the direct monetary cost, but also the opportunity cost: all of the other
ways that one could have spent this money must now be foregone. Thus, a more
psychological definition of the psychological act of spending money would be a
simultaneous loss (of money and opportunity) and gain (of some good or service)
for oneself and/or someone else that one chooses to undertake based on some beliefs
about future hedonic states.
To see the implications, it’s worth unpacking the various components of this definition
further. First, gains and losses are inherently affectively laden constructs;
they are important because they create feelings of pleasure and pain, even when
merely anticipating a potential gain or loss (see Knutson, Rick, Wimmer, Prelec, &
Loewenstein, 2007). Although it can be seen as the output of some cost–benefit
analysis, the choice to spend money is not merely some cold cognitive calculation;
it is an affective event involving some balance of pleasure and pain paid out over
some period of time. Purchases are certainly made with the intention of producing
an emotional experience, but emotions felt during the act of considering a purchase
can also influence the decision-making process and its outcome (Andrade & Ariely,
2009; Isen, 2001; Lerner, Small, & Loewenstein, 2004; Mattila & Wirtz, 2000).
Second, the exact nature of the pleasure and pain experienced as a result of a given
purchase is by no means certain. Rather, it is how we anticipate we will feel as a
result of the purchase, a forecast based on some imagined future. Making a forecast
requires that we first imagine what the basic facts of the situation will be like before
estimating how that imagined situation will make us feel. Unfortunately, we tend to
be overconfident and optimistic in our predictions about the basic facts of a future
situation (e.g., Griffin, Dunning, & Ross, 1990; Newby-Clark, Ross, Buehler,
Koehler, & Griffin, 2000), so perhaps it is not surprising that predictions of future
emotional states are also typically inaccurate (Wilson & Gilbert, 2003). This is
especially important because of a third aspect of the act of spending: choice. The act
of spending inherently involves an act of choosing: choosing not only if but also
which thing to purchase. Thus, forecasting a single imagined future is insufficient.
In order to choose which option to purchase, we must imagine a future scenario for
each possible choice we might make, and predict how each one will make us feel.
The uncertainties and biases involved can multiply quite quickly, turning what
could have been a simple exchange into a daring act of mentalism. Fourth, the self
is an important component to any purchase (see Belk, 1988). The decisions we
make help make us who we are, and purchase decisions are no different. Indeed
some purchases are explicitly intended to reflect or convey aspects of our personalities
(Tian, Bearden, & Hunter, 2001). Finally, and relatedly, other people are certainly
present in our forecasted futures. In addition to predicting how something will make
you feel, you must often imagine how a given purchase will make someone else
feel (a spouse or friend who might share in the outcome, for instance) and factor
these other feelings into the decision-making process.
The remainder of this chapter will explore these facets of the act of spending
money in greater depth, but always keeping in mind why people choose to spend
money: in order to make themselves happier (see Csikszentmihalyi, 2000; Diener &
Fujita, 1995). Indeed, based in part on the belief that accumulating wealth will allow
them to spend more money and further improve their welfare (Aknin, Norton, &
Dunn, 2009; Kahneman, Krueger, Schkade, Schwarz, & Stone, 2004; Van Praag &
Frijters, 1999), people work very hard to acquire money (see Ahuvia, 2008),1 often
sacrificing time with family and friends in the pursuit of wealth (Kasser, Cohn,
Kanner, & Ryan, 2007; Nickerson, Schwarz, Diener, & Kahneman, 2003), even to the
point that wealth acquisition has become a mindless enterprise (Hsee, Zhang, Cai,
& Zhang, 2013). This chapter will examine how each of the different aspects of the act
of spending money highlighted above connects to the broader goal of happiness, but
it’s worth first asking the more global question: does spending money, on average,
make people happier?
One fairly straightforward approach to answering this question is simply to
examine the relationship between wealth and happiness. Having money is, after all,
a precondition to spending it (ignoring for the moment the perils of using credit
cards to spend money one doesn’t have). Thus, if spending money is effective in
serving its purpose, then the richest individuals, who have more money to spend,
should be the happiest. If not, then the pursuit of additional wealth seems futile;
having more money wouldn’t actually make people any happier. An abundance of
research over many decades shows that although there is most definitely a positive
relationship between wealth (typically measured as income) and happiness, it is
typically quite modest and suffers considerably from diminishing returns (for recent
reviews, see Diener, Tay, & Oishi, 2013; Sacks, Stevenson, & Wolfers, 2012). That
is, although richer people are generally happier than poorer people, the hedonic
impact of additional wealth levels off. The same amount of additional wealth has a
fairly dramatic impact on the happiness of the impoverished, but it has a fairly small
impact on the wealthy.
One of the generally accepted reasons for this has to do with how money is spent
at different levels of wealth. At lower income levels, money is generally being spent
to meet basic human needs, like food and shelter, which, not surprisingly, produces
a fairly large hedonic return (Biswas-Diener & Diener, 2001).2 At higher income
levels, where basic needs can be taken for granted, much of the money that people
spend can be considered discretionary: spending on wants instead of needs, with the
express intention of making themselves happier. It is this general realm of spending,
where the pressures of basic survival don’t apply, and indeed where the relationship
between wealth and happiness is fairly modest, that will be the focus of this chapter,
because it is the one that requires more explanation. If money spent on discretionary
purchases seems to make a relatively small contribution to well-being, then we are
left with two possibilities. Either discretionary spending is simply ill suited to producing
happiness (despite our intuitions and intentions) or people simply have misguided
notions about how to spend their money to actually make themselves happier
(Dunn, Gilbert, & Wilson, 2011). In the sections that follow, I will focus on the role
that emotions and choice play before, during, and after one engages in an act of
spending, and in particular identifying issues that prevent purchases from producing
their intended effect: happiness. Then, I will outline some strategies, including the
types of purchases and the recipient of the expenditure, that can maximize each individual
act of spending’s contribution toward that overarching goal of happiness.
1 It is worth noting, of course, that people accumulate wealth for reasons that have nothing to do
with specific planned expenditures, such as to prevent an unexpected and catastrophic life event
(like an expensive health care emergency) from destroying one’s ability to meet basic needs.
Indeed, the anxiety associated with debt has devastating effects on well-being (Brown, Taylor, &
Price, 2005). The status that comes with wealth is also seen by some as an end in and of itself
(Kasser & Ryan, 1993). While these factors undoubtedly play a role in the acquisition of wealth,
because this chapter is specifically exploring the act of spending money and not its acquisition,
they are better suited for discussion elsewhere.
Emotions
As described above, the mere act of spending money itself is not hedonically neutral.
It’s important to note, however, that equivalent gains and losses produce asymmetrical
hedonic outcomes (pleasure and pain, respectively). As put forth by
prospect theory (Kahneman & Tversky, 1979), from the same reference point,
losses are felt more strongly than gains (Kahneman & Tversky, 1984; Tversky &
Kahneman, 1991; cf. Novemsky & Kahneman, 2005): dropping $20 down a storm
sewer would feel worse than finding $20 on the street would feel good. Thus, when
considering a purchase, it is no surprise that people naturally focus on the losses that
they will incur (Carmon & Ariely, 2000), because that is often the more potent emotional
experience.
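The asymmetry between gains and losses can be illustrated with a prospect-theory value function. The sketch below is an assumption-laden illustration, not something this chapter specifies: the power-function form and the parameter values (curvature 0.88, loss-aversion coefficient 2.25) are the commonly cited estimates from Tversky and Kahneman's 1992 work on cumulative prospect theory, and the function name `prospect_value` is hypothetical.

```python
def prospect_value(x: float, alpha: float = 0.88, loss_aversion: float = 2.25) -> float:
    """Subjective value of a gain or loss x relative to the reference point.

    Assumed functional form and parameters (not given in the chapter):
    gains are valued as x**alpha, losses as -loss_aversion * (-x)**alpha.
    """
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** alpha)


gain = prospect_value(20)    # finding $20 on the street
loss = prospect_value(-20)   # dropping $20 down a storm sewer
print(abs(loss) > gain)      # True: the loss looms larger than the equal-sized gain
```

With equal curvature for gains and losses, the ratio of the two magnitudes equals the loss-aversion coefficient, so under these assumed parameters the $20 loss is felt a little more than twice as strongly as the $20 gain.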
Anticipated vs. Anticipatory
However, the affect experienced as a result of a given purchase does not simply start
at the moment the money is spent; there are emotions felt well prior to the purchase,
and which continue to reverberate long into the future. That is, there is a distinction
to be made between anticipated emotions and anticipatory emotions (Loewenstein,
Weber, Hsee, & Welch, 2001). Anticipated emotions are the emotions you expect
to feel when you actually take possession of the new purchase (the joy you’d
experience when using a new iPhone, or the guilt you might feel after eating a tub
of popcorn at the movies) and aren’t really emotions at all. They are cognitions, a
forecast of what your experience with the purchase will be like at some point in the
future, and the emotions you predict that experience will stir up. The role of anticipated
emotions on choice and evaluation is a largely conscious one: we decide
whether and how to spend money based on how we anticipate the various courses of
action will make us feel (Mellers et al., 1999; Shiv & Huber, 2000), and evaluate the
outcome based partly on how the actual outcome compares to our expectations
(Bell, 1985).
2 At the extreme low end of the income spectrum, spending money might even be better thought of
as intended to decrease misery rather than increase happiness (see Martin & Hill, 2012).
Anticipatory emotions, on the other hand, are the emotions you experience at the
very moment you are considering the purchase: imagining the pleasure you will
experience when you finally get to use your new iPhone might very well make you
giddy in the present, or you might feel some immediate guilt as a result of imagining
gorging yourself on buttery popcorn. Or, instead of thinking about how the purchase
you’re considering might make you feel, you might think about the opportunity
costs: purchases you’ll have to delay or forgo as a result of spending this money.
Buying a new car might mean you have less money to spend on dinners at restaurants,
and you might feel some negative emotions while merely considering missing
those opportunities. The role of anticipatory emotions in choice and evaluation tends
to be less conscious, and as a result, people may not realize how large an impact it
might have (Andrade & Ariely, 2009). These immediate emotions can be used as a
cue for how one should choose in normal circumstances (e.g., Pham, 1998), but can
also exert a considerably more powerful (and hard to control) influence when the
emotions are more intense (see Loewenstein, 1996).
Because they play different roles in guiding the choice and evaluation process,
the distinction between anticipated and anticipatory emotions is important to understanding
the act of spending money. However, it can be difficult to tease their roles
apart in practice, largely because they influence each other both directly and indirectly
(Loewenstein & Lerner, 2003). The type and magnitude of the expected
(anticipated) emotions resulting from some event in the future (eating a delicious
meal, for instance) will influence the type and magnitude of the anticipatory emotions
you experience immediately upon imagining that future state. At the same
time, anticipatory emotions can influence exactly how that future state is imagined,
which will, in turn, influence the emotional experience predicted to result from it.
What’s more, because the act that sets it all in motion is imagining a future state,
that entire process will also be influenced by any number of other factors that are
important to future-oriented thinking. For instance, simply thinking about an event
that is close in time, as opposed to one that is further off into the future, will lead
people to imagine it very differently. The closer in time an event is, the more likely
people are to focus on its more concrete aspects (Trope & Liberman, 2003), to
reduce their subjective confidence about what exactly will transpire (Gilovich, Kerr,
& Medvec, 1993), and to experience more intense immediate emotions (Loewenstein,
1996). This difficulty notwithstanding, researchers have had a great deal of success
both measuring and manipulating the separate cognitive (anticipated) and affective
(anticipatory) processes involved in decision-making and outcome evaluation
(see Loewenstein & Lerner, 2003 for a review). One notable issue that has arisen
relates to the pleasure and pain (both anticipated and anticipatory) evoked by the
gain and loss side of a monetary transaction, respectively, and the psychological
consequences of focusing on one side or the other.
The Pain of Paying
Because people vary in the degree to which they tend to focus on acquiring pleasurable
gains (promotion goals), rather than avoiding painful losses (prevention goals; Higgins,
1997), focusing on the gain rather than the loss side when pondering a purchase
decision will have a big impact on both anticipated and anticipatory emotions, and as a
result, the likelihood of actually spending money. The different spending habits of
so-called spendthrifts and tightwads illustrate the consequences of gain/loss focus
quite well (Rick, Cryder, & Loewenstein, 2008). Spendthrifts tend to focus on what
they’ll gain from spending money, and all but ignore the costs, and so end up spending
too freely on purchases whose hedonic impact is fleeting at best. Tightwads generally
focus on the losses involved when spending money and will often refuse to spend
money that might nonetheless yield significant hedonic gains.[3] Indeed, in addition to
concentrating on the “pain of paying” (Prelec & Loewenstein, 1998), tightwads worry
about opportunity costs, something that most people do not do spontaneously
(Frederick, Novemsky, Wang, Dhar, & Nowlis, 2009) unless they are actively considering
many different options and must forgo all but the one they choose (Carmon,
Wertenbroch, & Zeelenberg, 2003; see also Ariely, Huber, & Wertenbroch, 2005).
The context in which a decision is made can create a sense of fit with one’s natural
focus and lead to better outcomes, such as greater satisfaction (Avnet & Higgins,
2006). As such, one way to encourage tightwads to part with their money is to emphasize
aspects of the purchase situation that reduce the perceived pain of paying.
For instance, in one experiment, participants were asked to imagine that they could
choose to receive a boxed set of DVDs from Amazon.com for free, if they were will-
ing to pay $5 to cover shipping costs. In the baseline condition, true to form, spend-
thrifts were considerably more willing than tightwads to pay the $5 in order to receive
the DVDs. However, when the shipping charge was described as “a small fee,”
making the amount seem insignificant and reducing the perceived pain of paying it,
tightwads were just as willing as spendthrifts to pay the fee (Rick et al., 2008).
Perhaps examining these different spending tendencies, rather than looking at
the relationship between wealth and happiness, can provide a more direct answer to
the question of whether spending money makes people happier. That is, if spending
money does increase well-being on average, then tightwads, who are generally quite
reluctant to part with their money, may be missing genuine opportunities to impact
their happiness. Conversely, spendthrifts, who engage in spending opportunities
they probably shouldn’t, might actually be measurably happier than both tightwads
and unconflicted spenders as a result. To find out how these different attitudes
toward spending money relate to more global measures of happiness, I recruited
participants from Amazon.com’s Mechanical Turk to complete the Spendthrift–Tightwad
scale (STTW; Rick et al., 2008) and the Subjective Happiness Scale
(Lyubomirsky & Lepper, 1999). Even when controlling for relevant demographic
differences (income and age), participants classified as tightwads did report lower
subjective happiness (M = 4.47, SD = 1.28) than the other two groups, β = .232,
t(309) = 2.07, p < .05, but spendthrifts (M = 4.76, SD = 1.22) and the unconflicted
(M = 4.76, SD = 1.29) were equally happy, t < 1, ns.
[3] Those who generally feel that they spend and save appropriately are referred to as unconflicted
(Rick et al., 2008).
Why do spendthrifts, who experience the least pain of paying, and who should
presumably be reaping some hedonic rewards from their unrestrained spending, show
no gains in happiness relative to the unconflicted? Or, put another way, what does this
non-difference say about the ability of purchases to actually make people happy?
One reason might be related to how people adapt to hedonic events, like the short-term
shifts in happiness produced by spending money. That is, since spendthrifts are more
focused on the potential gains (or at least less concerned with the potential losses),
they may be more likely to succumb to a classic forecasting error: failing to anticipate
how quickly they will adapt to their future circumstance (Gilbert, Pinel, Wilson,
Blumberg, & Wheatley, 1998; Wilson, Wheatley, Meyers, Gilbert, & Axsom, 2000),
an issue to which I’ll return below. There is also the possibility that, by not confronting
the pain of paying, spendthrifts are not forced to fully consider whether a given
purchase’s predicted benefits will outweigh its costs, and as a result are making the
kinds of purchases least likely to actually increase happiness.
It’s worth noting that although tightwads experience the pain of paying to a much
greater degree than most, the loss of money is an inevitable part of any purchase,
meaning that everyone will experience the pain of paying to some degree. In many
circumstances, the exchange of money for goods and services is simultaneous,
meaning that the pains and pleasures are also experienced simultaneously, the pain
thus robbing some of the pleasure. However, the exchange need not be simultane-
ous, and by temporally decoupling the gain and loss, one can reduce the chances
that pain experienced from the loss of money will negatively impact the pleasure
experienced from the new purchase (Prelec & Loewenstein, 1998). One way to do
this is to consume first and delay the pain of payment for as long as possible, hoping
that it will be less painful in the future than it would be right now (Kassam, Gilbert,
Boston, & Wilson, 2008). To an extent, this has its intended effect: the immediate
pleasures are unspoiled by an immediate loss. The allure of this approach is evident
in the difference between paying with cash and with credit card. Cash payments are
immediate and visceral—the money literally leaves your hands and becomes some-
one else’s possession. Credit cards, on the other hand, are abstract and distant; they
allow you to put off the pain of paying until next month, often while enjoying the
benefit immediately. Spending money this way may seem painless, and almost
certainly does reduce the negative anticipatory emotions that might prevent one from
making a purchase, but it only forestalls the inevitable. When the end of the month
rolls around and the credit card bill comes due, that pain may actually be magnified
because the pleasure you experienced is already in the past. What’s more, because
credit cards diminish the pain in the present, they can encourage reckless spending—
you’re much more likely to have a “what was I thinking?!” moment for purchases
made with credit cards than with cash (e.g., Prelec & Simester, 2001; Soman, 2001).
A somewhat counterintuitive alternative that seems to have considerable hedonic
benefits is to endure the pain of paying immediately and delay consumption until
later. Paying in advance may be painful initially, but it allows two distinct benefits.
First, you get the benefits of anticipating a positive experience (e.g., Nowlis,
Mandel, & McCabe, 2004; an issue discussed further below), and second, because
the pain of paying is behind you when actually consuming, there is no anticipated
pain to dampen the experience. All-inclusive resorts might cost a bundle up front,
and they do hold some risk of paying more for the same amount of consumption, but
they do effectively decouple the payment from the experience. Rather than feeling
a slight twinge of pain each time you shell out the money for a cocktail, you can feel
like you’re getting a better and better deal with each drink—putting the sunk cost
effect (Arkes & Blumer, 1985) to work in your favor, though with the possible side
effect of severe hangovers. If making yourself happy is the goal, then it might be
worth the risk of overpaying to feel better about the money you’re spending. In
short, it’s often far better to pay up front and delay consumption until later (for a
review, see Dunn & Norton, 2013).
Hedonic Adaptation
Purchases, like anything else that produces hedonic gains, are subject to one of the
fundamental facets of human experience: hedonic adaptation (Frederick &
Loewenstein, 1999; see also Diener, Lucas, & Scollon, 2006). That is, over time, the
same experience that once made you dizzyingly happy will merely bring a smile to
your face. Hedonic adaptation to a new car may be inevitable, but it isn’t necessarily
problematic unless it’s unaccounted for in the decision-making process. Unfortunately,
when people anticipate how a given purchase will make them feel, they can recognize
that it will become less intense over time, but generally fail to consider this fact at the
time of purchase (Ubel, Loewenstein, & Jepson, 2005; Wang, Novemsky, & Dhar,
2009). Focusing only on the immediate spike in happiness and ignoring the subsequent
decline means that the anticipated experience—the one on which people base
their expectations, and thus, their decisions—may be quite different from the actual
experience, increasing the chances of disappointment. Accurately predicting not just
the initial hedonic experience that a given purchase will provide, but also how it will
change over time, is important in making sound purchase decisions.
In order to accomplish more accurate predictions, it’s helpful to know a little
more about how hedonic adaptation operates. One of the reasons our experiences
become less intense over time is through the process of satiation with repeated
experiences. For instance, people know not to eat their favorite meal seven nights in
a row for fear that, by the time night seven rolls around, the mere smell of it will at
best be unappetizing, and at worst will be stomach-churning. People seek variety
and novelty to prevent satiation with repeated experiences, but probably don’t do it
optimally (for a review, see Alba & Williams, 2013). Even with adequate intervals
between events, sometimes we gain expertise that renders the earlier experience less
impressive. For instance, many novice wine drinkers are quite happy to drink whatever
wine is put in front of them. The flavors that are easiest to discern (sweetness,
for instance) are often the flavors characteristic of less expensive wine. But, over
time, as the palate grows more sophisticated, many wine drinkers start to crave
more complex and subtle flavors, and must pay handsomely for the privilege.[4]
Thus, they must spend more money to achieve the same hedonic benefit—a certain
amount of happiness from drinking a glass of wine—than would have been necessary
earlier in their wine-drinking career. What was once a favorite bottle will
eventually begin to taste cloyingly sweet, or perhaps bland and muted. Indeed,
many positive life changes, like purchasing a new car or getting a raise, create
aspirations over time that make the previously great change seem unimpressive
(see Sheldon & Lyubomirsky, 2012).
One obvious lesson of hedonic adaptation, of course, is that novices should not
spend a lot of money on something that requires more sophistication than they possess
to fully appreciate. Another implication is that attempting to maintain a relatively
stable level of happiness may require spending ever-increasing amounts of money.
This is, in many ways, similar to the way that drug addiction operates. Neurological
systems respond to repeated use of addictive drugs with neuroadaptation: since for-
eign chemicals (e.g., cocaine) are doing the same job as natively produced neurotrans-
mitters (e.g., acting on dopamine receptors), the systems that produce those
neurotransmitters begin to produce less and less over time. With fewer neurotransmit-
ters naturally available to bind to those receptors, those systems will require increas-
ing amounts of the drug to achieve the same level of activation. Plus, since those
systems are typically involved in the experience of pleasure, the reduced activation of
those systems during any period of abstention reduces positive affect, which fuels a
desire for the drug just to get back to baseline levels—the neurochemical equivalent
of loss aversion (Koob & Le Moal, 2001). In just the same way, if you decide to
upgrade from the 1994 Ford Fiesta you’ve driven for years to a new Mercedes, the first
drive off the lot will be thrilling. After a year or two, that thrill will mostly be gone, and
the feeling of luxury provided by the Mercedes will eventually begin to feel normal.
The only way to get that thrill again will be to increase your dosage with the new
model, which will not be cheap. During any abstention from that new baseline, say if you
go back to driving your old Fiesta while the Mercedes is in the shop, what was once
perfectly adequate will feel perfectly intolerable—your baseline level of activation
has changed, and you’ll jones for that new normal.
[4] A recent blind taste-test study found that those with some training with wine show a positive
(though small) relationship between price and enjoyment, meaning that they enjoyed the more
expensive wines more. Novices, however, actually showed a negative correlation; they liked the
cheaper wines better (Goldstein et al., 2008).
In fact, this is one explanation for the very modest relationship between wealth
and happiness: as income rises, people adapt to their new standard of living, and
must spend more to feel the same amount of happiness they had at their old salary
(Diener & Biswas-Diener, 2002). A reduction in salary is now treated as a loss,
which has more severe negative consequences for well-being than the initial increase
did positive consequences (Boyce, Wood, Banks, Clark, & Brown, in press). What’s
more, new evidence suggests that wealth may actually hinder the ability to savor
positive experiences and emotions. In one study, participants were given a series of
vignettes, such as discovering an amazing waterfall, and asked how they would
behave in each scenario. Wealthier participants, as well as participants who were
merely exposed to a reminder of wealth (a photograph of a stack of money), were less
likely to claim that they’d use a savoring strategy, such as reminiscing or telling
friends about the experience. That reduced ability to savor seems to explain some of
the relatively weak correlation between wealth and happiness; wealthy participants
were less happy because they were less likely to engage in savoring activities
(Quoidbach, Dunn, Petrides, & Mikolajczak, 2010). Thus, it may not be that spending
more money is absolutely required in order to overcome the forces of adaptation.
Rather, focusing on the experiences, savoring them each time they happen, may
prevent the need to spend an ever-increasing amount of money (Chancellor &
Lyubomirsky, 2011; Kasser, 2011).
Choices, Choices, Choices
Aside from having money to spend, the initial step toward the act of spending
money is to choose which particular good or service you’ll be purchasing. In the
simplest case, you are faced with a single purchase option, and the decision is sim-
ply whether or not to make the purchase. Presumably, as described above, that deci-
sion is based on some assessment of the expected costs compared with the expected
hedonic gains. For instance, you might hear that the new Daft Punk album just came
out, and decide whether or not it is worth $10 to own the album. The calculus is
fairly simple: if you think that you’ll get a greater hedonic gain from listening to the
synthesized singing of French robots than the other ways you can think of spending
$10, then you should choose to buy it. Otherwise, keep the money.
This extremely simple scenario is becoming less and less common, however.
The more likely case is that there are multiple options you are considering that
would fill the same need, and you must choose only one of them. When buying
lunch, for example, it’s often not a simple question of whether or not to buy a salad
(and “not” isn’t really an option, since you’re not about to go hungry). Instead,
you’ll need to decide whether to buy a salad, a burrito, a slice of pizza, a bowl of
curry, a falafel sandwich, or any of the myriad lunch options that happen to be avail-
able to you at the time. Each of these options carries with it some potential hedonic
gain and some monetary cost, and choosing any one of them requires that you forgo
the other options—at least for the day.
Even assembling the set of options you intend to choose from—the consideration
set—is becoming an increasingly difficult task in and of itself (see Schwartz, 2004).
In theory, more options should lead to better outcomes for consumers, as the likelihood
of finding an option that exactly matches one’s preferences should increase
with the size of the choice set (e.g., Johnson & Payne, 1985; Kahn & Lehmann,
1991; Shugan, 1980), and indeed, people generally share this intuition, preferring to
have a lot of options to choose from (Chernev, 2003). However, the number of
options available within product categories has ballooned well past what is actually
good for consumers (Schwartz, 2004),[5] sapping people of the motivation to engage
in the decision-making process (Iyengar & Lepper, 2000).[6] In practice, the cognitive
burdens created by large choice sets and time constraints can leave people feeling
confused and unconfident (Haynes, 2009; Lee & Lee, 2004), even when they have
a great deal of control over the information presented to them (Ariely, 2000).
To illustrate how you might approach a choice from a large set of options, imagine
that you are deciding which television to buy. You should be able to narrow your
options by excluding options that are too expensive or too small (or large, for that mat-
ter) pretty easily, but you may still have hundreds of options to choose from, and no
easy way to know which one to choose. There are at least two major strategies for
whittling one’s consideration set down to a single chosen option. One approach is to
compare the relevant attributes of all of the options you’re considering, and attempt to
identify the very best option. This strategy is referred to as maximizing. An alternative
approach to making such a decision is to use a satisficing strategy: simply set a standard
for quality and select the very first option you come across that meets this standard
(Simon, 1955). Although maximizing should theoretically yield better
outcomes—done properly, you should always get the best option available—in practice,
people who tend to engage in maximizing (rather than satisficing) are subject to
a host of negative psychological outcomes, such as increased depression and decreased
life satisfaction (Schwartz et al., 2002). What’s more, maximizers have a hard time
committing to any one option, showing less of the post-decision rationalizing that
helps us feel good about our choices no matter how good a choice it was (Sparks,
Ehrlinger, & Eibach, 2012). This helps explain why maximizers report less satisfaction
than satisficers despite obtaining objectively better outcomes (Iyengar, Wells, &
Schwartz, 2006). The differences between using a maximizing and a satisficing
approach, and particularly the differences in the resulting psychological well-being,
help illustrate two of the big reasons why large choice sets can be problematic: the
large number of comparisons required and unreasonable expectations.
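
The two strategies just described can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a model drawn from the studies cited: the Television class, the value_score function, the prices, the quality ratings, and the threshold are all hypothetical. The point it shows is structural. Maximizing must score every option before choosing, whereas satisficing stops at the first option that clears a preset standard.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Television:
    """A hypothetical option in the consideration set."""
    name: str
    price: float    # dollars (made-up numbers)
    quality: float  # made-up 0-10 rating


def value_score(tv: Television) -> float:
    """Hypothetical trade-off: quality minus a penalty for price."""
    return tv.quality - tv.price / 200.0


def maximize(options: Iterable[Television],
             score: Callable[[Television], float]) -> Television:
    """Maximizing: evaluate every option and keep the single best one."""
    return max(options, key=score)


def satisfice(options: Iterable[Television],
              score: Callable[[Television], float],
              threshold: float) -> Optional[Television]:
    """Satisficing: take the first option that meets the standard."""
    for option in options:
        if score(option) >= threshold:
            return option  # stop searching as soon as one is good enough
    return None  # nothing met the standard


# Options in the order a shopper happens to encounter them.
shelf = [
    Television("A", 400.0, 6.0),
    Television("B", 600.0, 8.0),
    Television("C", 800.0, 9.5),
    Television("D", 1000.0, 9.0),
]

best = maximize(shelf, value_score)
good_enough = satisfice(shelf, value_score, threshold=4.5)
print(best.name, good_enough.name)
```

In this made-up shelf the maximizer ends up with a slightly higher-scoring set, but only after scoring all four options; the satisficer stops at the second. With hundreds of options rather than four, that gap in search effort grows accordingly.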
[5] This is in part due to companies attempting to distinguish themselves in a crowded marketplace.
For any given brand, adding more options leads consumers to infer that the brand has expertise
in the area, and therefore that its offerings are better (Berger, Draganska, & Simonson, 2007).
This approach is, of course, less effective when everyone does it, starting the arms race that created
ultra-specific options like Diet Caffeine-Free Cherry Vanilla Coke, and resulted in sagging store
shelves and bewildered consumers.
[6] A recent meta-analysis suggests that the demotivating effect of too-much-choice may be present
in only certain circumstances, such as under time constraints or when the need to justify one’s
choice is high (see Scheibehenne et al., 2009, 2010). This is described further below.
Comparisons
Making a choice from a large consideration set can require a large number of
comparisons, particularly when using a maximizing strategy. To be sure, it is quite
natural to engage in comparative processes (Gilbert, Giesler, & Morris, 1995), and
people often do need comparative information in order to evaluate something prop-
erly. In one particularly telling example, participants were willing to pay more for
7 oz of ice cream when it overflowed a tiny cup than for 8 oz of ice cream when it
only partially filled an enormous cup—they used the size of the cup to inform their
judgments, when it really should be extraneous to how much the ice cream itself is
worth (Hsee, 1998; Sevdalis & Harvey, 2006). Without the ability to make certain
comparisons (e.g., the actual amount of ice cream), misleading cues (like inappro-
priately sized cups) can cause people to make poor decisions.
Indeed, some comparisons might be quite helpful, particularly when they are
easy to make, and there is little chance for error. In the television example above, it’s
quite easy to compare models on price and size, because those attributes are alignable
(e.g., Gentner & Markman, 1994). Clearly, cheaper is better than more expensive,
and larger is better than smaller (within reason, of course). If price and size
were the only attributes televisions had, it would be relatively trivial to make a
choice; you’d still need to find the sweet spot in the apparent trade-off between price
and size, but that’s it. Unfortunately, there will quite often be other features that do
not align—a feature that is present in one option but absent in others. One set might
have a smart dimming feature, while another might have a suite of internet-connected
apps, and still another might include a camera so that you can video chat with family
and friends. How can you possibly compare these features or decide which one
you’ll appreciate more over time? Attempting to compare incomparable features
can be very frustrating and incredibly demanding (Zhang & Markman, 2001), and
because people tend to search for more options as they learn more about the different
nonalignable features available (Griffin & Broniarczyk, 2010), it can exacerbate
the problem by making the choice set even larger. As the size of the choice set
increases, so does the number of difficult comparisons required, which has negative
consequences for your ultimate satisfaction with your choice (Reutskaja & Hogarth,
2009; Scheibehenne, Greifeneder, & Todd, 2010). Perhaps it is no surprise that having
more alignable features can mitigate some of the downsides of large choice sets
(Herrmann, Heitmann, Morgan, Henneberg, & Landwehr, 2009).
A big part of the reason that nonalignable features are such an issue is related to
the different modes in which we make evaluations (see Hsee, Loewenstein, Blount,
& Bazerman, 1999). In the store, making a decision between ten different televisions,
you are in joint evaluation (JE) mode. In your living room, where you’ll actually
watch the television, you’re in separate evaluation (SE) mode (Hsee & Zhang, 2004).
People can rely on comparative information in JE, when the options are side by side,
but less so in SE, when the other comparison targets are not present. For instance, in
the store, you might see that Television A has a slightly better picture quality than
Television B and decide that this justifies its higher price. However, because it’s very
difficult to evaluate small differences in attributes like picture quality without a direct
comparison, you may not be able to appreciate that slightly better picture once you
bring the television home, removing the justification for the extra money
spent. Attributes that may seem important on a relative level (i.e., when in JE mode)
might not matter at all on an absolute level (i.e., when in SE mode), as long as they’re
above some threshold of quality.
This can work slightly differently for nonalignable attributes, because unlike
alignable attributes, your memory for the presence or absence of some feature can
make SE mode feel like JE mode. If you decide not to spend the extra money to get
Television A’s better picture quality (an alignable attribute), as long as the picture
quality of Television B generally looks good to you, it is unlikely to impact your
day-to-day enjoyment. However, if you choose a set without the smart dimming
feature (a nonalignable attribute), each time you are nearly blinded by the screen
when turning on the television at night, you might recall that you could have avoided
that experience by getting a different television, and that knowledge can diminish
your satisfaction. Even though you’re not in the store anymore, because you learned
about and retained information that does not require the comparison target to be
present to evaluate, you may find yourself in JE mode and lose some of the benefits
of getting away from comparative information. This is not to say that these non-
alignable attributes cannot contribute to enjoying the money you spend, but that
they can come with unanticipated costs. Engaging in an extensive comparison process
can haunt you later on (Dhar, Nowlis, & Sherman, 1999)—it can even feel like
the unchosen options that you considered closely are being taken away from you
(Carmon et al., 2003). Without such extensive comparisons, you might remain
blissfully unaware.
Expectations
When deciding how to spend your money, your expectations will play a role in how
you decide as well as how you evaluate the outcome. While pondering whether or
not to make a particular purchase, people certainly do try to anticipate how that
purchase will ultimately make them feel and make their choices based on these
beliefs (Mellers et al., 1999; Shiv & Huber, 2000). Later, when evaluating the purchase,
people compare their actual experience with the purchase to their prior
expectations of its performance (e.g., Bell, 1985; Oliver, 1980) as well as how their
experienced affect matches their expected affect (Patrick, MacInnis, & Park, 2007;
Phillips & Baumgartner, 2002). It’s easy to see how people might be wrong on
either count and in either direction. In terms of performance, you might correctly
expect a new wool sweater to be warm and comfortable but fail to anticipate how
itchy it gets, or you might be pleasantly surprised that a new jacket is much better in
the rain than you expected. In terms of affect, even if your predictions about how a
new pair of shoes will feel are very close to the reality, you might find that you get
much more or much less enjoyment out of them than you expected you would
(particularly if you fail to consider the role of adaptation, as described above).
Money is generally considered well-spent when expectations of performance and
experience are met or exceeded, creating happiness and satisfaction, and ill-spent if
those expectations are not met, creating dissatisfaction and regret (Bell, 1985;
Oliver, 1980).
Expectations are tricky, however, because they are not completely independent
of how the event itself is experienced (Wilson, Lisle, Kraft, & Wetzel, 1989). For
instance, participants in one study who spent some time thinking about how great a
Hershey’s kiss would taste, thus inflating their expectations, ended up enjoying the
chocolate more than participants who simply ate it right away (Nowlis et al., 2004).
Delaying consumption thus has additional benefits beyond decoupling the pleasures
of consumption from the pain of paying, as described above. It provides hedonic
benefits from the mere act of anticipating something positive, and it provides time
for positive expectations to increase enjoyment of the event. There are limits to how
much expectations can positively influence our experiences, of course, so it’s important
not to raise expectations well beyond what is reasonable, or dissatisfaction and
regret are the likely outcomes. That is, there is a sweet spot in which we are able to
reap the benefits of anticipation without succumbing to the problems of missed
expectations. This is particularly true of our affective expectations, since affective
experience is generally more intense during anticipation than recall (Van Boven &
Ashworth, 2007), and people aren’t particularly good at predicting the magnitude
(Buehler & McFarland, 2001; Gilbert et al., 1998) or duration (Wilson et al., 2000)
of the emotions brought on by some future event. When people inevitably do
misforecast their affective reaction, it seems to be that feeling worse than expected
negatively impacts evaluations, but feeling better than expected doesn’t have an
equivalent positive impact (Patrick et al., 2007). Consistent with the notion that
losses loom larger than gains (Kahneman & Tversky, 1984), people spend a lot
more time thinking about why an affective experience didn’t live up to their expectations,
but simply accept a more positive affective experience without further
elaboration (Gilovich, 1983; Hastie, 1984).
The downsides of expectations are especially evident in large choice sets, since the
large number of options can create the expectation that the perfect option is actually
available (Diehl & Poynor, 2010). This expectation certainly seems reasonable—
how could you not find exactly the right television for you from the hundreds of
models available? Having such high expectations can lead to a more extensive
search if that perfect option does not present itself quickly, further encouraging a
maximizing approach. Plus, as described above, the more extensive your search,
the more you learn about nonalignable features (Griffin & Broniarczyk, 2010). That is,
as you browse through the available television sets, you will start with a certain
number of features that you know you should be checking and comparing, such as
price, screen size, picture quality, and energy consumption. When you encounter a
set that has a smart dimming feature, something you didn’t previously realize you
might want, you now must add it to the list. Each new attribute that you encounter
teaches you something about the possibilities, and changes your expectations about
what it means to be a good choice. The longer you search, the more you learn, the
higher your expectations, and the less likely you are to ultimately end up being
satisfied with your choice (Griffin & Broniarczyk, 2010).
High expectations can influence not just the search and decision-making process
but also what people end up choosing. When the choice is difficult, as it typically is
from large choice sets, many people feel a greater pressure to make a decision that
is justifiable to others, and the justifiable choice isn’t necessarily the best choice, at
least in terms of happiness. For instance, people are more likely to select a utilitarian
option than a hedonic option, since it’s easier to justify buying something that’s useful
than something that could be considered indulgent (Sela, Berger, & Liu, 2009).
People also place a greater emphasis on alignable features than nonalignable features
because they are easier to compare and therefore easier to justify (Markman &
Medin, 1995). In fact, the negative effects of choice overload may only occur when
decision-makers have some expectation of needing to justify their choice, since the
strategy most likely to produce a justifiable choice is maximizing; in the absence of
that pressure, large choice sets might not be detrimental at all (Scheibehenne,
Greifeneder, & Todd, 2009; Scheibehenne et al., 2010; see also Botti & McGill,
2006; Tsiros, Mittal, & Ross, 2004). With the mere act of engaging in an extensive search
and comparison process, and with expectations for a good outcome high, the pressure to
get a really good option may be quite high. After all, if you’ve put in a great deal of
effort to find a good option and it doesn’t turn out well, then you can blame yourself
for not doing just a little bit more searching or comparing.
For all the reasons outlined above, it may be no surprise that the kind of exten-
sive search process that maximizers engage in, with all its comparisons and effort,
might provide an objectively better outcome, but might actually produce less
enjoyment (Iyengar et al., 2006). Thus, whenever possible, you should avoid large
choice sets, engage in relatively few comparisons, keep the pressure to get the
very best option low, and try to keep in mind whether the relative differences
between options will actually produce a meaningful gain in enjoyment. To be
sure, many choice contexts are set up in ways that make it difficult to take that
advice. Plus, much of that advice is of the “thou shalt not” variety, which isn’t
always particularly helpful. To provide more positive approaches, the next section
specifically discusses purchases that, by their very nature, eliminate (or at least
lower) many of the roadblocks between the act of spending money and the expected
hedonic payout.
On What, and on Whom, Should You Spend Money?
The sections above defined and described the act of spending in terms of the
psychological processes involved, with a special emphasis on issues that prevent a
purchase from achieving its intended outcome: happiness. This section focuses on
specific types of purchases that tap more directly into the psychological processes
most likely to yield satisfaction and increase overall well-being. To start, the distinction
between material possessions (tangible objects like jewelry, clothes, and electronic
gadgets) and experiences (intangible purchases like vacations, meals at restaurants,
and concerts) has proven quite useful (Van Boven & Gilovich, 2003). Generally,
research suggests that for the same amount of money, experiences tend to be more
satisfying, and make people happier, than possessions (Carter & Gilovich, 2010,
2012; Howell & Hill, 2009; Howell, Pchelin, & Iyer, 2012; Nicolao, Irwin, &
Goodman, 2009; Van Boven & Gilovich, 2003; cf. Caprariello & Reis, 2013).
Although there are several specific reasons why experiences seem to offer
hedonic benefits, much of the explanation has to do with the features inherent to
each type of purchase. It’s worth stating, of course, that the defining features vary
by degree, and thus the distinction between experiences and possessions isn’t
always clear-cut. Although most experiences are indeed intangible, there are cer-
tainly physical objects that are highly experiential when they are being used—
allowing them to change states like ice melting and refreezing. Although a good
fiction book is a physical object, it is highly experiential while you are reading it:
mentally transporting you to other places, times, or even to other realities. Similarly,
a physical copy of your favorite movie is indeed a tangible object, but your
main interaction with it is through the experience of watching the film. Once that
experience is over, the object goes back on the shelf, just like any other material
possession. The existence of these purchases with ambiguous properties does not,
however, impugn the importance of the distinction between material and experien-
tial purchases. Even though some purchases might seem quite slushy, not easily
categorized as solid ice or fluid water, focusing attention on the ice or the water
makes different psychological processes salient, thus creating different psychologi-
cal outcomes—as if the mere act of focusing on the water melted all of the ice.
For instance, when the exact same purchase (e.g., a boxed set of music or a 3D TV)
is described in terms of its material or experiential qualities, it has the same beneficial
psychological effects as more canonical possessions or experiences (Carter &
Gilovich, 2010, 2012; Rosenzweig & Gilovich, 2012). Plus, people generally have
little trouble understanding the distinction and can readily identify examples that
observers agree fit the categories well, apparently interpreting a gradient as distinct
hues (Carter & Gilovich, 2010). Indeed, in the studies investigating that distinction,
recalling different types of purchases based on even the barest description of the
categories seems to have hedonic consequences for participants, suggesting that
the categories are both useful and consequential. Still, it might be better to think of the
distinction between experiences and possessions as a continuum, and the position of
any one purchase on that continuum as a function of not just its inherent properties,
but also which properties are psychologically salient at the moment (see Carter &
Gilovich, 2013).
So what is it about experiences that seems to make people happier? Although it is
undoubtedly multiply determined, there are several distinct reasons that have been
identified so far. The sections below will discuss several such reasons: the benefits of
experiences’ intangibility for issues of expectations and adaptation, the smaller role
that comparisons play in experiential decision-making and evaluation, the ability for
experiences to strengthen social bonds, and the greater contribution that experiential
purchases make to the self-concept.
Expectations and Adaptation
Prior to making the purchase, expectations can exert both a positive influence
(via positive anticipation) and a negative influence (when raised to unreasonable
levels) on satisfaction. How might you find the sweet spot—allowing positive anticipation
to increase your expectations so that they increase actual enjoyment, without
setting the bar so high that disappointment is the only possible result? Experiences
seem to offer some benefits over possessions in this regard, both in terms of allowing
high expectations to increase enjoyment and in terms of reducing disappointment
when the outcome isn’t as positive as expected.
For instance, in a study of spring break experiences, participants reported their
expectations for how their vacation would go, their enjoyment while actually on the
vacation, and their retrospective memories for the event weeks later (Wirtz, Kruger,
Scollon, & Diener, 2003). In this study, participants’ expectations were positively
related to both their online (in-the-moment) reports and their memories for the event,
suggesting that they were positively anticipating the event and that those increased
expectations actually improved both the experience itself and their memories of it.
Why might this be the case more so for experiences than possessions? Because an
experience is intangible, abstract, and fleeting, with a fair amount of uncertainty about
exactly how it will transpire. A small amount of uncertainty alone can make a positive
experience more enjoyable by encouraging a pleasant elaboration on potential explanations
(Wilson, Centerbar, Kermer, & Gilbert, 2005). And because experiences are more
abstract (in fact, merely taking time to think about a recent material or experiential
purchase puts people into a more concrete or an abstract mindset, respectively;
Carter, 2013), that positive elaboration can be more effective.
If your expectations for a vacation in Grand Cayman are particularly high—
indeed, it would be hard not to expect a week sipping drinks on a white sand beach
to be fantastic—even if that positive anticipation improved the experience, the odds
that the reality truly lives up to your expectation may be quite low, partly because
you won’t bother to imagine any potential downsides (Newby-Clark et al., 2000).
Chances are pretty good that you failed to foresee the frustration of constant sunscreen
application, the embittering effect of overpriced drinks, or the baffled
annoyance at a nearby couple’s decision to blast Jock Jams ’96 for the entire beach
to hear. Over time, however, the actual feeling of anger created by those nuisances
will fade and seem trivial, allowing you to see it as a learning experience, or a funny
story; the more positive aspects eventually dominate memories (Mitchell, Thompson,
Peterson, & Cronk, 1997). Indeed, in the spring break study mentioned above, it
was only memories of the experience, not the experience itself, that predicted how
likely participants were to want to repeat the experience (Wirtz et al., 2003). However,
because possessions are more concrete and physically endure through time, they are
not as easily reconstrued or reimagined. Thus, if your new couch turns out not to be
the paragon of comfort and style you’d expected, it will sit in your living room each
day as a constant reminder of your folly. That greater ability to reconstrue the negative
aspects of an experience is one reason why happiness with experiences seems to
hold steady or even improve over time, whereas happiness with possessions tends to
decline (Carter & Gilovich, 2010).
As described above, well before physical decline sets in, hedonic adaptation can
begin to leach away a purchase’s initial pleasure, so any disruption of adaptation
processes will help that initial pleasure endure. Here too, experiences offer a benefit,
since they seem to do a better job than possessions in resisting hedonic adaptation
(Nicolao et al., 2009). One reason is that, because experiences are by definition transient
states, it can be very difficult to get used to them. Possessions, being physical,
tangible objects that persist in space and time, are more prone to this sort of adapta-
tion. That initial thrill from owning a new dining room table will fade as it sits there,
unchanged, day after day. That is not to say that one cannot adapt to a transient state
if it is repeated too often. As mentioned in the example above, eating your favorite
meal too frequently can rob you of its pleasure. Adding variety, surprise, and uncer-
tainty can help prevent the natural process of affective adaptation to pleasurable
events (Wilson & Gilbert, 2008). For instance, adding short interruptions to experiences
can be sufficient to prevent them from getting old, to the point that commercials,
typically derided as unpleasant, may actually increase enjoyment of a
television show (Nelson & Meyvis, 2008). Applying a similar logic, frequent small
purchases may actually provide a greater hedonic benefit than a single large purchase
(Dunn et al., 2011; Dunn & Norton, 2013). Because pleasurable experiences
are subject to diminishing marginal utility (another insight of prospect theory;
Kahneman & Tversky, 1979), you can get a greater total amount of pleasure by
consuming several small experiences than one big one. Taking frequent small vacations
is likely to make a bigger impact on your well-being than taking one big one. This is
also likely true of possessions; frequently buying small material possessions may
make you happier than one extravagant purchase. Small frequent material purchases
suffer from one significant disadvantage, however: they accumulate over time and
clutter up your life.
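The arithmetic behind the diminishing marginal utility argument can be made concrete with a small sketch. It is only an illustration under stated assumptions: the square-root curve stands in for any concave utility function, the dollar figures are hypothetical, and each purchase is assumed to be enjoyed from a fresh starting point (that is, with no adaptation carrying over between purchases). None of these choices come from the studies cited above.

```python
import math

def total_pleasure(amounts, utility=math.sqrt):
    """Sum the pleasure from each purchase, evaluated separately.

    `utility` is an assumed concave curve (square root by default),
    chosen only to illustrate diminishing marginal utility.
    """
    return sum(utility(amount) for amount in amounts)

# The same hypothetical $900 budget, spent two ways.
one_big_vacation = total_pleasure([900])
three_small_trips = total_pleasure([300, 300, 300])

print(f"One $900 vacation: {one_big_vacation:.2f}")
print(f"Three $300 trips:  {three_small_trips:.2f}")
```

Under these assumptions the three smaller trips yield the larger total, because each additional dollar within a single purchase adds less pleasure than the one before it. Any other concave curve that starts at zero would give the same ordering, though the size of the gap would differ.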
Invidious Comparisons
As described above, large choice sets and decision-making strategies that empha-
size comparative information (i.e., maximizing) can have negative hedonic conse-
quences. However, many of these effects are much more true of possessions than
experiences. To start, maximizing appears to be the strategy that offers a more natural
fit for material possessions, in no small part because of the tangible nature of
possessions. It was no accident that many of the examples used to describe maxi-
mizing in the sections above were physical objects. Televisions, for instance, can
fairly easily be compared side by side, inviting comparisons that quite often don’t
matter after you’ve brought your purchase home. You might be able to see that one
television offers deeper blacks than another when they’re right next to each other
(in JE), but in your living room (in SE), that direct comparison will be impossible
and therefore will not impact your enjoyment (Hsee, 1996; Hsee et al., 1999;
Hsee & Leclerc, 1998; Hsee & Zhang, 2004). With possessions, because the
comparisons are so easy and prevalent, people seem inclined, perhaps even feel
obligated, to use the more comparison-oriented strategy of maximizing. Indeed,
when faced with a material purchase decision, people report that they’re more likely
to use a maximizing strategy (Carter & Gilovich, 2010).
Experiences, on the other hand, seem to offer a more natural fit with the satisficing
approach. For instance, imagine that you’re deciding where to go on vacation. There
is certainly no shortage of places to visit, meaning that the best decision will by no
means be obvious. There is also plenty of opportunity to compare all of the various
destinations, but those comparisons are much more difficult than comparing two
televisions—the attributes of experiential purchases tend to be much less alignable
than the attributes of possessions. Plus, the intangible nature of experiences makes
it impossible to truly compare two vacation destinations side by side, except on the
more tangible and concrete attributes, like price. Most of the comparisons will be
either entirely hypothetical—imagining yourself on a beach is very different than
actually being at one—or even completely incomparable—comparing the sun of
Aruba to the culture of Venice is very much an apples-to-oranges proposition. If one
cannot make such comparisons, then a maximizing approach is decidedly unsuit-
able, and it makes more sense to evaluate each option on its own merits. Indeed,
participants report that they are more likely to use a satisficing approach for experiential
purchase decisions (Carter & Gilovich, 2010).
The different decision-making strategies evoked by material and experiential
purchase decisions show downstream consequences in line with what you’d expect
from maximizing and satisficing, respectively. In one experiment, participants were
assigned to recall either a material or experiential purchase they had made from a
large array of options. Consistent with a more extensive decision process, participants
reported that making a material purchase decision was simply more difficult
than making an experiential purchase decision. If, because of the more extensive
comparison process involved in the material purchase decision, information about
the foregone options was retained, possessions might be particularly likely to
provoke the kind of negative counterfactuals that create feelings of regret and dissatisfaction
(see Rosenzweig & Gilovich, 2012). Indeed, participants who recalled
a possession were still being bothered by thoughts of the foregone options, and it
was these nagging thoughts that explained why possessions were less satisfying
than experiences in the present (Carter & Gilovich, 2010).
Although making comparisons between experiential options is certainly more
difficult, comparative information is also less important for experiences, forming a
smaller part of satisfaction judgments than is the case for possessions. When people
evaluate a possession, they need some frame of reference or point of comparison in
order to come up with a judgment; with experiences, the experience itself, on its
own merits, provides the lion’s share of the evaluation process (Carter & Gilovich,
2010; Hsee, Yang, Li, & Shen, 2008; Ma & Roese, 2013). Thus, even when negative
comparative information is salient, experiences are relatively immune to its influence.
For instance, in an experiment where participants were given either a material
prize (a good pen) or an experiential prize (chips) in the context of either much
better or much worse prizes, the context played a big role in how participants evaluated
the pen—rating it lower when it was worse than the other prizes—but had no impact
on how much they enjoyed the chips (Carter & Gilovich, 2010). Even when that
information is made quite salient, such as when participants in other experiments
were told that the price had dropped on a purchase they had made, or that new and
better options were now available, that information sapped participants’ satisfaction
with material purchases but not experiential purchases (Carter & Gilovich, 2010).
This evidence suggests two hedonic advantages experiences have when it comes
to the act of spending money. First, experiences nudge people into using decision
strategies that are less comparative, and thus more conducive to happiness. Second,
because they are relatively immune to potentially invidious comparisons, when
negative comparative information inevitably does arise, it has a much smaller detri-
mental impact on satisfaction. Of course, you cannot live on vacations and concerts
alone, so when you are making material purchase decisions, try to treat them more
like experiences: make your choices using something closer to a satisficing process,
use comparisons only when they’re most helpful—between alignable attributes
when actually making the decision, not after the decision is made—and do your best
to evaluate your purchase on its own merits.
Making Meaning
Some of the purchases that offer the most enduring satisfaction are those that become
personally meaningful, which make some contribution to our sense of self (see Belk,
1988). Experiences, more so than possessions, seem to embody this principle as well
(Carter & Gilovich, 2012). Why might this be the case? One reason has to do with
how the different types of purchases persist over time. As mentioned above, experiences
persist only as memories, and memories of an event tend to be rosier than the
actual experience (Mitchell et al., 1997). With a little temporal distance, you’ll forget
about the ravenous mosquitos and the overcooked eggs on your camping trip, but you
will retain the memory of the incredible starry sky and the sense of relaxation (even if
it didn’t feel all that relaxing at the time). Possessions, on the other hand, will be
ravaged by time just like any other physical object. Shoes get scuffed and wear out;
cell phones become obsolete. To be sure, that difference in tangibility is another rea-
son why experiences seem to retain, or even improve their value over time, whereas
satisfaction with possessions seems to decline (Carter & Gilovich, 2010).
But the intangibility of experiences also means that they are more directly con-
nected to the self-concept—memories being an essential component of the self
(e.g., Kihlstrom, Beer, & Klein, 2003; McAdams, 2001; Wilson & Ross, 2003)—
whereas possessions are more physically distant from the self. Experiments confirm
this intuition. For instance, participants in one study were first asked to recall a
number of both material and experiential purchases. Then, they were given an
example of the diagrams used in the independent–interdependent selves literature,
where circles representing family members are plotted around a central “self” circle,
with the proximity of each circle relative to the self-circle indicative of the degree to
which that family member contributes to the self-concept (see Markus & Kitayama,
1991). They were then given a blank self-circle and asked to use the same logic to
plot the circles representing the purchases they had recalled earlier—literally
diagramming the centrality of each purchase to their self-concept. As expected,
participants plotted their experiential purchases closer to the self-circle than their
material purchases. In another experiment, participants were more likely to include
experiential than material purchases in a narrative telling their life story. These two
experiments together suggest that people do consider their experiences more central
to the self-concept, but more importantly, is centrality to the self-concept part of the
reason why they are more satisfying? Participants in another experiment were asked
to recall either a material or an experiential purchase, and then were asked to imag-
ine that they could go back in time and make a different choice, selecting a different
option instead, but without changing their current circumstances—essentially swap-
ping out their memories for new ones. Participants were less willing to make that
memory swap for an experience than a possession, and that relative willingness did
indeed explain why the possessions were less satisfying than the experiences (Carter
& Gilovich, 2012). Experiences did more to create participants’ sense of self, so
changing an experience meant changing the very nature of their self-concept, something
people strongly resist (Gilovich, 1991). Indeed, it’s no accident that people
talk of “formative experiences” and not “formative possessions.”
Overall, it seems that money spent on purchases that are personally meaningful,
or contribute to our sense of self, is going to produce greater hedonic returns, and
choosing experiences over possessions is just one easy way to accomplish this. There
are certainly other types of purchases that are likely to be personally meaningful.
Other work suggests that purchasing products that are aligned with your own ethical
code, such as environmentally friendly products, can be associated with greater well-being
(Welsch & Kühling, 2010; Xiao & Li, 2010; cf. Griskevicius, Tybur, & Van den
Bergh, 2010). Purchases that require you to invest a bit of yourself into them, such as
self-assembled furniture, also seem to provide more enduring satisfaction, partly
because they create a feeling of competence, fulfilling another basic psychological
need (Mochon, Norton, & Ariely, 2012; Norton, Mochon, & Ariely, 2012). In fact,
people are willing to give up higher wages in exchange for the feeling that the work
they’re doing is meaningful (Ariely, Kamenica, & Prelec, 2008). Clearly, meaning
matters. When deciding how to spend your money, you should take into consider-
ation whether any given purchase is likely to provide meaning—to contribute to
your sense of self.
Social Relationships
Probably the single most robust predictor of well-being is having strong social
relationships (e.g., Diener & Seligman, 2002; Myers, 2000), so spending money in
service of nurturing your social relationships is nearly always going to be money
well spent. A difference in the social nature of purchases also helps to explain why
experiences seem to be so satisfying. First, experiences are simply more likely to
involve other people than possessions. After all, many experiential purchases are
expressly meant to foster social interaction or to spend time with loved ones,
whereas many possessions are meant to be enjoyed alone. If you go see the Rolling
Stones in concert, it’s likely that you’ll share the experience with a good friend or
spouse (not to mention 20,000 strangers), but it’s unlikely that a new sweater will be
used by more than one person (certainly at any given time). Indeed, many posses-
sions can do more to isolate us from, rather than connect us to, our social surround-
ings. Even though a smartphone’s primary use is ostensibly as a telephone—an
inherently social purpose—daily train commuters know just how common it is to
see the entire train car full of people sitting silently, staring at their phones, playing
games or attempting to keep up with their work email. Perhaps it’s no surprise that
when people are experimentally induced to leave their gadgets in their pockets and
actually talk to the other passengers, making even a fleeting social connection, their
commutes are considerably more pleasant. In a telling study, daily train commuters
in Chicago either were asked to do what they normally did during their commute
(which was almost universally solitary, reading or working, often on some kind of
electronic device) or were asked to start a conversation with a total stranger. But as
daunting as making small talk for 15–30 min might have seemed (and indeed the
commuters generally believed that this would not be pleasant), in fact it was those
participants who had a conversation who enjoyed their commutes the most, and
even considered it at least as productive as if they’d read or worked as they normally
did (Schroeder & Epley, 2013).
Participants in another study who reflected on an experiential purchase, compared
with participants who reflected on a material purchase, reported not only greater happiness
with the purchase that they had made but also greater satisfaction of the
higher-order psychological need of relatedness (Howell & Hill, 2009). Meeting this
need for relatedness may even be quite crucial to enduring satisfaction from a purchase;
social purchases, whether experiential or material, foster considerably more
happiness and satisfaction than solitary purchases (Caprariello & Reis, 2013). In fact,
spending money on other people has been shown to be more satisfying than spending a
larger amount of money on oneself. In one study, participants were given an envelope
with either $5 or $20 inside and were assigned to spend that money either on themselves
or on another person by 5 pm. Incredibly, participants who spent their money
on someone else were happier than participants who spent the money on themselves,
but how much money they were given didn’t make a difference (Dunn, Aknin, &
Norton, 2008). This basic phenomenon has been replicated in a variety of other countries
(Aknin et al., 2013), and even 2-year-old children are happier when giving their
own resources (in this case, Goldfish crackers) to others than when they receive the
treats themselves (Aknin, Hamlin, & Dunn, 2012). There’s even evidence that this
prosocial spending is self-reinforcing—the happier participants in one study were,
the more likely they were to spend a windfall on others (Aknin, Dunn, & Norton,
2011). Thus, if you are going to spend money on possessions instead of experiences,
you’re probably better off buying them for someone else.
Other work has shown that experiences confer a social benefit even further
downstream, when conversing with people who were not directly involved in the
purchase itself. For instance, participants in one experiment were asked to have a
conversation with a stranger (also a participant), but were limited in their conversation
topics. Half of the pairs were confined to talking about experiences they’d purchased,
and the other half were confined to talking about their possessions. After the
conversation was over, participants who had talked about experiences felt the conversation
went better and liked their conversation partner more (Van Boven,
Campbell, & Gilovich, 2010). In other words, while you might be excited to talk
about your shiny new laptop, people will be much more receptive to hearing the
stories from your recent trip to San Francisco. Part of the reason may be that experi-
ences are more resistant to social comparisons than possessions, so talking about
your experiences with others is less likely to incite feelings of jealousy (Carter &
Gilovich, 2010; see also Solnick & Hemenway, 1998). There’s also evidence that
people are more likely to spontaneously talk about their experiences than their possessions,
which not only provides the opportunity to make meaningful social connections
as described above but also helps people to “reconsume” that experience,
embellishing and improving the memory (Kumar & Gilovich, 2013). What’s more,
people seem to cherish that mechanism of sharing. In an experiment, after ranking
either a variety of beach vacations (experiential condition) or electronic gadgets
(material condition), participants were asked to imagine that they had to choose
between getting their top-ranked option, but with the caveat that they weren’t allowed
to talk about it with anyone, and their second-ranked option, which had no restrictions.
Participants in the material condition apparently didn’t care about sharing—they
simply wanted their top choice and were perfectly happy to forgo the social ele-
ment in order to get it, further illustrating the more solitary nature of possessions.
Not so with participants in the experiential condition: the ability to talk about their
experience with others was far more important, so they greatly preferred the
socially unrestricted second-ranked option (Kumar & Gilovich, 2013).
Thus, a big part of the reason why experiences end up being more satisfying ways
to spend money than possessions is that they confer greater social benefits both
during and long after the purchase itself. Given how important other people are to our
well-being, spending money that reinforces your social relationships, or helps you
feel a sense of connectedness to the world, is going to be money well spent—even if
you don’t get to consume it yourself.
Conclusion
The act of spending money is an emotional decision, with hedonic consequences
that can last far into the future. Greater attention to how we approach that act, and
especially the processes by which we make our decisions, can help one accomplish
the overarching goal of improving one’s well-being. The attention one pays need
not be exhausting, however. The approaches outlined above offer a few ways that
may help reduce the anxiety many people feel when pondering an act of spending—
worrying about the prospect of buyer’s remorse—that robs the moment of some of
its excitement. It may not be easy to make peace with the fact that spending money
is always going to involve a loss and focus instead on what you’ll gain, but perhaps
a good way to start is simply to choose to take a good friend out to share a nice meal,
savor each bite, and make a memory that you’ll cherish for a lifetime.
References
Ahuvia, A. (2008). If money doesn’t make us happy, why do we act as if it does? Journal of
Economic Psychology, 29 (4), 491–507. doi:
10.1016/j.joep.2007.11.005 .
Aknin, L. B., Barrington-Leigh, C. P., Dunn, E. W., Helliwell, J. F., Burns, J., Biswas-Diener, R.,
et al. (2013). Prosocial spending and well-being: Cross-cultural evidence for a psychological
universal. Journal of Personality and Social Psychology, 104 (4), 635–652. doi:
10.1037/
a0031578 .
Aknin, L. B., Dunn, E. W., & Norton, M. I. (2011). Happiness runs in a circular motion: Evidence
for a positive feedback loop between prosocial spending and happiness. Journal of Happiness
Studies, 13 (2), 347–355. doi:
10.1007/s10902-011-9267-5 .
Aknin, L. B., Hamlin, J. K., & Dunn, E. W. (2012). Giving leads to happiness in young children.
(A. H. Kemp, Ed.). PLoS One, 7 (6), 1–4. doi:
10.1371/journal.pone.0039211.g002 .
Aknin, L. B., Norton, M. I., & Dunn, E. W. (2009). From wealth to well-being? Money matters,
but less than people think. The Journal of Positive Psychology, 4 (6), 523–527.
doi:
10.1080/17439760903271421 .
Alba, J. W., & Williams, E. F. (2013). Pleasure principles: A review of research on hedonic con-
sumption. Journal of Consumer Psychology, 23 (1), 2–18. doi:
10.1016/j.jcps.2012.07.003 .
Andrade, E. B., & Ariely, D. (2009). The enduring impact of transient emotions on decision
making. Organizational Behavior and Human Decision Processes, 109 (1), 1–8. doi:
10.1016/j.obhdp.2009.02.003.
Ariely, D. (2000). Controlling the information flow: Effects on consumers’ decision making and
preferences. Journal of Consumer Research, 27 (2), 233–248. doi:
10.1086/314322 .
Ariely, D., Huber, J., & Wertenbroch, K. (2005). When do losses loom larger than gains? Journal
of Marketing Research, 42 (2), 134–138.
Ariely, D., Kamenica, E., & Prelec, D. (2008). Man’s search for meaning: The case of Legos. Journal
of Economic Behavior and Organization, 67 (3–4), 671–677. doi: 10.1016/j.jebo.2008.01.004.
Arkes, H., & Blumer, C. (1985). The psychology of sunk cost. Organizational Behavior and
Human Decision Processes, 35 , 124–140. doi:
10.1016/0749-5978(85)90049-4 .
Avnet, T., & Higgins, E. T. (2006). How regulatory fit affects value in consumer choices and
opinions. Journal of Marketing Research, 43 (1), 1–10. doi: 10.2307/30163364 .
Belk, R. W. (1988). Possessions and the extended self. Journal of Consumer Research, 15 ,
139–168.
Bell, D. E. (1985). Disappointment in decision making under uncertainty. Operations Research,
33 (1), 1–27. doi:
10.1287/opre.33.1.1 .
Berger, J., Draganska, M., & Simonson, I. (2007). The influence of product variety on brand
perception and choice. Marketing Science, 26 (4), 460–472. doi:
10.1287/mksc.1060.0253 .
Biswas-Diener, R., & Diener, E. (2001). Making the best of a bad situation: Satisfaction in the
slums of Calcutta. Social Indicators Research, 55 (3), 329–352.
Botti, S., & McGill, A. L. (2006). When choosing is not deciding: The effect of perceived respon-
sibility on satisfaction. Journal of Consumer Research, 33 (2), 211–219. doi:
10.1086/506302 .
Boyce, C. J., Wood, A. M., Banks, J., Clark, A. E., & Brown, G. D. A. (2013). Money, well-being,
and loss aversion: Does an income loss have a greater effect on well-being than an equivalent
income gain? Psychological Science, 24 (12), 2557–2562. doi:
10.1177/0956797613496436 .
Brown, S., Taylor, K., & Price, S. W. (2005). Debt and distress: Evaluating the psychological cost
of credit. Journal of Economic Psychology, 26 (5), 642–663. doi:
10.1016/j.joep.2005.01.002 .
Buehler, R., & McFarland, C. (2001). Intensity bias in affective forecasting: The role of temporal
focus. Personality and Social Psychology Bulletin, 27 (11), 1480–1493.
Caprariello, P. A., & Reis, H. T. (2013). To do, to have, or to share? Valuing experiences over
material possessions depends on the involvement of others. Journal of Personality and Social
Psychology, 104 (2), 199–215. doi:
10.1037/a0030953 .
Carmon, Z., & Ariely, D. (2000). Focusing on the forgone: How value can appear so different to
buyers and sellers. The Journal of Consumer Research, 27 (12), 360–370.
Carmon, Z., Wertenbroch, K., & Zeelenberg, M. (2003). Option attachment: When deliberating makes
choosing feel like losing. Journal of Consumer Research, 30 (1), 15–29. doi: 10.1086/374701.
Carter, T. J. (2013). The abstract and concrete nature of experiences and possessions . Unpublished
manuscript.
Carter, T. J., & Gilovich, T. (2010). The relative relativity of material and experiential purchases.
Journal of Personality and Social Psychology, 98 (1), 146–159. doi:
10.1037/a0017145 .
Carter, T. J., & Gilovich, T. (2012). I am what I do, not what I have: The differential centrality of
experiential and material purchases to the self. Journal of Personality and Social Psychology,
102 (6), 1304–1317. doi:
10.1037/a0027407 .
Carter, T. J., & Gilovich, T. (2013). Getting the most for the money: The hedonic return on experi-
ential and material purchases. In M. Tatzel (Ed.), Consumption and well-being in the material
world . New York, NY: Springer. doi:
10.1007/978-94-007-7368-4_3 .
Chancellor, J., & Lyubomirsky, S. (2011). Happiness and thrift: When (spending) less is (hedoni-
cally) more. Journal of Consumer Psychology, 21 (2), 131–138. doi:
10.1016/j.jcps.2011.02.004 .
Chernev, A. (2003). Product assortment and individual decision processes. Journal of Personality
and Social Psychology, 85 (1), 151–162. doi:
10.1037/0022-3514.85.1.151 .
Csikszentmihalyi, M. (2000). The costs and benefits of consuming. Journal of Consumer Research,
27 (2), 267–272. doi:
10.1086/314324 .
Dhar, R., Nowlis, S. M., & Sherman, S. J. (1999). Comparison effects on preference construction.
Journal of Consumer Research, 26 (3), 293–306. doi:
10.1086/209564 .
Diehl, K., & Poynor, C. (2010). Great expectations?! Assortment size, expectations and satisfac-
tion. Journal of Marketing Research, 47 (2), 312–322.
Diener, E., & Biswas-Diener, R. (2002). Will money increase subjective well-being?: A literature
review and guide to needed research. Social Indicators Research, 57 (2), 119–169.
Diener, E., & Fujita, F. (1995). Resources, personal strivings, and subjective well-being: A nomothetic
and idiographic approach. Journal of Personality and Social Psychology, 68 (5), 926–935.
doi:
10.1037/0022-3514.68.5.926 .
Diener, E., Lucas, R. E., & Scollon, C. N. (2006). Beyond the hedonic treadmill. American
Psychologist, 61 (4), 305–314.
Diener, E., & Seligman, M. E. P. (2002). Very happy people. Psychological Science, 13 (1), 81–84.
Diener, E., Tay, L., & Oishi, S. (2013). Rising income and the subjective well-being of nations.
Journal of Personality and Social Psychology, 104 (2), 267–276. doi:
10.1037/a0030487 .
Dunn, E. W., Aknin, L. B., & Norton, M. I. (2008). Spending money on others promotes happiness.
Science, 319 (5870), 1687–1688. doi:
10.1126/science.1150952 .
Dunn, E. W., Gilbert, D. T., & Wilson, T. D. (2011). If money doesn’t make you happy, then you
probably aren’t spending it right. Journal of Consumer Psychology, 21 (2), 115–125.
doi:
10.1016/j.jcps.2011.02.002 .
Dunn, E. W., & Norton, M. I. (2013). Happy money . New York, NY: Simon & Schuster.
Frederick, S., & Loewenstein, G. F. (1999). Hedonic adaptation. In D. Kahneman, E. Diener, &
N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 302–329).
New York, NY: Russell Sage.
Frederick, S., Novemsky, N., Wang, J., Dhar, R., & Nowlis, S. (2009). Opportunity cost neglect.
Journal of Consumer Research, 36 (4), 553–561. doi:
10.1086/599764 .
Gentner, D., & Markman, A. B. (1994). Structural alignment in comparison: No difference without
similarity. Psychological Science, 5 (3), 152–158. doi:
10.1111/j.1467-9280.1994.tb00652.x .
Gilbert, D. T., Giesler, R. B., & Morris, K. A. (1995). When comparisons arise. Journal of
Personality and Social Psychology, 69 (2), 227–236.
Gilbert, D. T., Pinel, E. C., Wilson, T. D., Blumberg, S. J., & Wheatley, T. P. (1998). Immune
neglect: A source of durability bias in affective forecasting. Journal of Personality and Social
Psychology, 75 (3), 617–638. doi:
10.1037/0022-3514.75.3.617 .
Gilovich, T. (1983). Biased evaluation and persistence in gambling. Journal of Personality and
Social Psychology, 44 (6), 1110–1126. doi:
10.1037/0022-3514.44.6.1110 .
Gilovich, T. (1991). How we know what isn’t so: The fallibility of human reason in everyday life .
New York, NY: The Free Press.
Gilovich, T., Kerr, M., & Medvec, V. H. (1993). Effect of temporal perspective on subjective
confidence. Journal of Personality and Social Psychology, 64 (4), 552–560.
doi:
10.1037/0022-3514.64.4.552 .
Goldstein, R., Almenberg, J., Dreber, A., Emerson, J. W., Herschkowitsch, A., & Katz, J. (2008).
Do more expensive wines taste better? Evidence from a large sample of blind tastings. Journal
of Wine Economics, 3 (1), 1–9.
Griffin, J. G., & Broniarczyk, S. M. (2010). The slippery slope: The impact of feature alignability
on search and satisfaction. Journal of Marketing Research, 47 (2), 323–334.
Griffin, D. W., Dunning, D., & Ross, L. D. (1990). The role of construal processes in overconfident
predictions about the self and others. Journal of Personality and Social Psychology, 59 (6),
1128–1139. doi:
10.1037/0022-3514.59.6.1128 .
Griskevicius, V., Tybur, J. M., & Van den Bergh, B. (2010). Going green to be seen: Status, reputa-
tion, and conspicuous conservation. Journal of Personality and Social Psychology, 98 (3), 392–
404. doi:
10.1037/a0017346 .
Hastie, R. (1984). Causes and effects of causal attribution. Journal of Personality and Social
Psychology, 46 (1), 44–56.
Haynes, G. A. (2009). Testing the boundaries of the choice overload phenomenon: The effect of
number of options and time pressure on decision difficulty and satisfaction. Psychology and
Marketing, 26 (3), 204–212. doi:
10.1002/mar.20269 .
Herrmann, A., Heitmann, M., Morgan, R., Henneberg, S. C., & Landwehr, J. (2009). Consumer
decision making and variety of offerings: The effect of attribute alignability. Psychology and
Marketing, 26 (4), 333–358. doi:
10.1002/mar.20276 .
Higgins, E. T. (1997). Beyond pleasure and pain. American Psychologist, 52 (12), 1280–1300.
doi:
10.1037/0003-066X.52.12.1280 .
Howell, R. T., & Hill, G. (2009). The mediators of experiential purchases: Determining the impact
of psychological needs satisfaction and social comparison. The Journal of Positive Psychology,
4 (6), 511–522. doi:
10.1080/17439760903270993 .
Howell, R. T., Pchelin, P., & Iyer, R. (2012). The preference for experiences over possessions:
Measurement and construct validation of the Experiential Buying Tendency Scale. The Journal
of Positive Psychology, 7 (1), 57–71. doi:
10.1080/17439760.2011.626791 .
Hsee, C. K. (1996). The evaluability hypothesis: An explanation for preference reversals between
joint and separate evaluations of alternatives. Organizational Behavior and Human Decision
Processes, 67 (3), 247–257. doi:
10.1006/obhd.1996.0077 .
Hsee, C. K. (1998). Less is better: When low-value options are valued more highly than high-value
options. Journal of Behavioral Decision Making, 11 (2), 107–121.
Hsee, C. K., & Leclerc, F. (1998). Will products look more attractive when presented separately or
together? The Journal of Consumer Research, 25 (2), 175–186.
Hsee, C. K., Loewenstein, G. F., Blount, S., & Bazerman, M. H. (1999). Preference reversals
between joint and separate evaluations of options: A review and theoretical analysis.
Psychological Bulletin, 125 (5), 576–590.
Hsee, C. K., Yang, Y., Li, N., & Shen, L. (2008). Wealth, warmth and wellbeing: Whether happiness is
relative or absolute depends on whether it is about money, acquisition or consumption. Journal of
Marketing Research, 46 (3), 396–409. doi:
10.1509/jmkr.46.3.396 .
Hsee, C. K., & Zhang, J. (2004). Distinction bias: Misprediction and mischoice due to joint evaluation.
Journal of Personality and Social Psychology, 86 (5), 680–695. doi: 10.1037/0022-3514.86.5.680.
Hsee, C. K., Zhang, J., Cai, C. F., & Zhang, S. (2013). Overearning. Psychological Science, 24 (6),
852–859. doi:
10.1177/0956797612464785 .
Isen, A. M. (2001). An influence of positive affect on decision making in complex situations:
Theoretical issues with practical implications. Journal of Consumer Psychology, 11 (2), 75–85.
Iyengar, S. S., & Lepper, M. R. (2000). When choice is demotivating: Can one desire too much of
a good thing? Journal of Personality and Social Psychology, 79 (6), 995–1006.
Iyengar, S. S., Wells, R. E., & Schwartz, B. (2006). Doing better but feeling worse: Looking for the
“best” job undermines satisfaction. Psychological Science, 17 (2), 143–150.
Johnson, E. J., & Payne, J. W. (1985). Effort and accuracy in choice. Management Science, 31 (4),
395–414.
Kahn, B. E., & Lehmann, D. R. (1991). Modeling choice among assortments. Journal of Retailing,
67 (3), 274–299.
Kahneman, D., Krueger, A. B., Schkade, D., Schwarz, N., & Stone, A. (2004). A survey method
for characterizing daily life experience: The day reconstruction method. Science, 306 ,
1776–1780.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk.
Econometrica, 47 (2), 263–292.
Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39 (4),
341–350.
Kassam, K. S., Gilbert, D. T., Boston, A., & Wilson, T. D. (2008). Future anhedonia and time
discounting. Journal of Experimental Social Psychology, 44 (6), 1533–1537. doi:
10.1016/j.jesp.2008.07.008.
Kasser, T. (2011). Can thrift bring well-being? A review of the research and a tentative theory. Social
and Personality Psychology Compass, 5 (11), 865–877. doi: 10.1111/j.1751-9004.2011.00396.x.
Kasser, T., Cohn, S., Kanner, A. D., & Ryan, R. M. (2007). Some costs of American corporate
capitalism: A psychological exploration of value and goal conflicts. Psychological Inquiry,
18 (1), 1–22.
Kasser, T., & Ryan, R. M. (1993). A dark side of the American dream: Correlates of financial
success as a central life aspiration. Journal of Personality and Social Psychology, 65 (2),
410–422.
Kihlstrom, J. F., Beer, J. S., & Klein, S. B. (2003). Self and identity as memory. In M. R. Leary &
J. P. Tangney (Eds.), Handbook of self and identity (pp. 68–90). New York, NY: Guilford Press.
Knutson, B., Rick, S., Wimmer, G. E., Prelec, D., & Loewenstein, G. (2007). Neural predictors of
purchases. Neuron, 53 (1), 147–156. doi:
10.1016/j.neuron.2006.11.010 .
Koob, G. F., & Le Moal, M. (2001). Drug addiction, dysregulation of reward, and allostasis.
Neuropsychopharmacology, 24 (2), 97–129. doi:
10.1016/S0893-133X(00)00195-0 .
Kumar, A., & Gilovich, T. (2013). We’ll always have Paris: Differential story utility from experi-
ential and material purchases . Manuscript under review.
Lee, B.-K., & Lee, W.-N. (2004). The effect of information overload on consumer choice quality
in an on-line environment. Psychology and Marketing, 21 (3), 159–183. doi:
10.1002/mar.20000 .
Lerner, J. S., Small, D. A., & Loewenstein, G. (2004). Heart strings and purse strings: Carryover
effects of emotions on economic decisions. Psychological Science, 15 (5), 337–341.
doi:
10.1111/j.0956-7976.2004.00679.x .
Loewenstein, G. (1996). Out of control: Visceral influences on behavior. Organizational Behavior
and Human Decision Processes, 65 (3), 272–292.
Loewenstein, G., & Lerner, J. S. (2003). The role of affect in decision making. In R. J. Davidson,
K. R. Scherer, & H. H. Goldsmith (Eds.), Handbook of affective sciences (pp. 619–642).
Oxford, UK: Oxford University Press.
Loewenstein, G. F., Weber, E., Hsee, C. K., & Welch, N. (2001). Risk as feelings. Psychological
Bulletin, 127 (2), 267–286.
Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reli-
ability and construct validation. Social Indicators Research, 46 (2), 137–155.
Ma, J., & Roese, N. J. (2013). The countability effect: Comparative versus experiential reactions
to reward distributions. Journal of Consumer Research, 39 (6), 1219–1233. doi:
10.1086/668087 .
Markman, A. B., & Medin, D. L. (1995). Similarity and alignment in choice. Organizational
Behavior and Human Decision Processes, 63 (2), 117–130.
Markus, H. R., & Kitayama, S. (1991). Culture and the self: Implications for cognition, emotion,
and motivation. Psychological Review, 98 (2), 224–253.
Martin, K. D., & Hill, R. P. (2012). Life satisfaction, self-determination, and consumption ade-
quacy at the bottom of the pyramid. Journal of Consumer Research, 38 (6), 1155–1168.
doi:
10.1086/661528 .
Mattila, A., & Wirtz, J. (2000). The role of preconsumption affect in postpurchase evaluation of
services. Psychology and Marketing, 17 (7), 587–605.
McAdams, D. P. (2001). The psychology of life stories. Review of General Psychology, 5 (2), 100–
122. doi:
10.1037//1089-2680.5.2.100 .
Mellers, B. A., & McGraw, A. P. (2001). Anticipated emotions as guides to choice. Current
Directions in Psychological Science, 10 (6), 210–214. doi:
10.1111/1467-8721.00151 .
Mellers, B., Schwartz, A., & Ritov, I. (1999). Emotion-based choice. Journal of Experimental
Psychology: General, 128 , 332–345.
Mitchell, T., Thompson, L., Peterson, E., & Cronk, R. (1997). Temporal adjustments in the evalu-
ation of events: The “rosy view”. Journal of Experimental Social Psychology, 33 (4), 421–448.
doi:
10.1006/jesp.1997.1333 .
Mochon, D., Norton, M. I., & Ariely, D. (2012). Bolstering and restoring feelings of competence
via the IKEA effect. International Journal of Research in Marketing, 29 (4), 363–369.
doi:
10.1016/j.ijresmar.2012.05.001 .
Myers, D. G. (2000). The funds, friends, and faith of happy people. American Psychologist, 55 (1),
56–67. doi:
10.1037//0003-066X.55.1.56 .
Nelson, L. D., & Meyvis, T. (2008). Interrupted consumption: Disrupting adaptation to hedonic
experiences. Journal of Marketing Research, 45 (6), 654–664.
Newby-Clark, I. R., Ross, M., Buehler, R., Koehler, D. J., & Griffin, D. (2000). People focus on opti-
mistic scenarios and disregard pessimistic scenarios while predicting task completion times.
Journal of Experimental Psychology. Applied, 6 (3), 171–182. doi: 10.1037//1076-898X.6.3.171.
Nickerson, C. C., Schwarz, N. N., Diener, E., & Kahneman, D. D. (2003). Zeroing in on the dark side
of the American dream: A closer look at the negative consequences of the goal for financial
success. Psychological Science, 14 (6), 531–536. doi: 10.1046/j.0956-7976.2003.psci_1461.x.
Nicolao, L., Irwin, J. R., & Goodman, J. K. (2009). Happiness for sale: Do experiential purchases
make consumers happier than material purchases? Journal of Consumer Research, 36 (2),
188–198. doi:
10.1086/597049 .
Norton, M. I., Mochon, D., & Ariely, D. (2012). The IKEA effect: When labor leads to love.
Journal of Consumer Psychology, 22 (3), 453–460. doi:
10.1016/j.jcps.2011.08.002 .
Novemsky, N., & Kahneman, D. (2005). The boundaries of loss aversion. Journal of Marketing
Research, 42 (2), 119–128.
Nowlis, S., Mandel, N., & McCabe, D. (2004). The effect of a delay between choice and consump-
tion on consumption enjoyment. The Journal of Consumer Research, 31 (3), 502–510.
Oliver, R. L. (1980). A cognitive model of the antecedents and consequences of satisfaction deci-
sions. Journal of Marketing Research, 17 , 460–469.
Patrick, V. M., Macinnis, D. J., & Park, C. W. (2007). Not as happy as I thought I’d be? Affective
misforecasting and product evaluations. Journal of Consumer Research, 33 (4), 479–489.
doi:
10.1086/510221 .
Pham, M. T. (1998). Representativeness, relevance, and the use of feelings in decision making.
Journal of Consumer Research, 25 (2), 144–159. doi:
10.1086/209532 .
Phillips, D. M., & Baumgartner, H. (2002). The role of consumption emotions in the satisfaction
response. Journal of Consumer Psychology, 12 (3), 243–252. doi: 10.1207/S15327663JCP1203_06.
Prelec, D., & Loewenstein, G. F. (1998). The red and the black: Mental accounting of savings and
debt. Marketing Science, 17 (1), 4–28.
Prelec, D., & Simester, D. (2001). Always leave home without it: A further investigation of the credit-
card effect on willingness to pay. Marketing Letters, 12 (1), 5–12. doi: 10.1023/A:1008196717017.
Quoidbach, J., Dunn, E. W., Petrides, K. V., & Mikolajczak, M. (2010). Money giveth, money
taketh away: The dual effect of wealth on happiness. Psychological Science, 21 (6), 759–763.
doi:
10.1177/0956797610371963 .
Reutskaja, E., & Hogarth, R. M. (2009). Satisfaction in choice as a function of the number of
alternatives: When “goods satiate”. (B. Scheibehenne & P. M. Todd, Eds.). Psychology and
Marketing, 26 (3), 197–203. doi:
10.1002/mar.20268 .
Rick, S. I., Cryder, C. E., & Loewenstein, G. F. (2008). Tightwads and spendthrifts. Journal of
Consumer Research, 34 (6), 767–782. doi:
10.1086/523285 .
Rosenzweig, E., & Gilovich, T. (2012). Buyer’s remorse or missed opportunity? Differential
regrets for material and experiential purchases. Journal of Personality and Social Psychology,
102 (2), 215–223. doi:
10.1037/a0024999 .
Sacks, D. W., Stevenson, B., & Wolfers, J. (2012). The new stylized facts about income and subjec-
tive well-being. Emotion, 12 (6), 1181–1187. doi:
10.1037/a0029873 .
Scheibehenne, B., Greifeneder, R., & Todd, P. M. (2009). What moderates the too-much-choice
effect? Psychology and Marketing, 26 (3), 229–253. doi:
10.1002/mar.20271 .
Scheibehenne, B., Greifeneder, R., & Todd, P. M. (2010). Can there ever be too many options? A
meta-analytic review of choice overload. Journal of Consumer Research, 37 (3), 409–425.
doi:
10.1086/651235 .
Schroeder, J., & Epley, N. (2013). Mistakenly seeking solitude. Manuscript under review.
Schwartz, B. (2004). The paradox of choice: Why more is less . New York, NY: Harper Perennial.
Schwartz, B., Ward, A., Monterosso, J., Lyubomirsky, S., White, K., & Lehman, D. R. (2002).
Maximizing versus satisficing: Happiness is a matter of choice. Journal of Personality and
Social Psychology, 83 (5), 1178–1197.
Sela, A., Berger, J., & Liu, W. (2009). Variety, vice, and virtue: How assortment size influences
option choice. Journal of Consumer Research, 35 (6), 941–951. doi:
10.1086/593692 .
Sevdalis, N., & Harvey, N. (2006). Determinants of willingness to pay in separate and joint evalu-
ations of options: Context matters. Journal of Economic Psychology, 27 (3), 377–385.
doi:
10.1016/j.joep.2005.07.001 .
Sheldon, K. M., & Lyubomirsky, S. (2012). The challenge of staying happier: Testing the hedonic
adaptation prevention model. Personality and Social Psychology Bulletin, 38 (5), 670–680.
doi:
10.1177/0146167212436400 .
Shiv, B., & Huber, J. (2000). The impact of anticipating satisfaction on consumer choice. The
Journal of Consumer Research, 27 , 202–216.
Shugan, S. M. (1980). The cost of thinking. Journal of Consumer Research, 7 (2), 99–111.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics,
69 (1), 99–118.
Solnick, S., & Hemenway, D. (1998). Is more always better?: A survey on positional concerns.
Journal of Economic Behavior and Organization, 37 , 373–383.
Soman, D. (2001). Effects of payment mechanism on spending behavior: The role of rehearsal and
immediacy of payments. Journal of Consumer Research, 27 (4), 460–474. doi:
10.1086/319621 .
Sparks, E. A., Ehrlinger, J., & Eibach, R. P. (2012). Failing to commit: Maximizers avoid commit-
ment in a way that contributes to reduced satisfaction. Personality and Individual Differences,
52 (1), 72–77. doi:
10.1016/j.paid.2011.09.002 .
Tian, K. T., Bearden, W. O., & Hunter, G. L. (2001). Consumers’ need for uniqueness: Scale devel-
opment and validation. The Journal of Consumer Research, 28 (1), 50–66.
Trope, Y., & Liberman, N. (2003). Temporal construal. Psychological Review, 110 (3), 403–421.
doi:
10.1037/0033-295X.110.3.403 .
Tsiros, M., Mittal, V., & Ross, W. T., Jr. (2004). The role of attributions in customer satisfaction:
A reexamination. Journal of Consumer Research, 31 (2), 476–483. doi:
10.1086/422124 .
Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent
model. The Quarterly Journal of Economics, 106 (4), 1039–1061.
Ubel, P., Loewenstein, G. F., & Jepson, C. (2005). Disability and sunshine: Can hedonic predic-
tions be improved by drawing attention to focusing illusions or emotional adaptation? Journal
of Experimental Psychology. Applied, 11 (2), 111–123. doi:
10.1037/1076-898X.11.2.111 .
Van Boven, L., & Ashworth, L. (2007). Looking forward, looking back: Anticipation is more
evocative than retrospection. Journal of Experimental Psychology: General, 136 , 289–300.
doi:
10.1037/0096-3445.136.2.289 .
Van Boven, L., Campbell, M. C., & Gilovich, T. (2010). Stigmatizing materialism: On stereotypes
and impressions of materialistic and experiential pursuits. Personality and Social Psychology
Bulletin, 36 (4), 551–563. doi:
10.1177/0146167210362790 .
Van Boven, L., & Gilovich, T. (2003). To do or to have? That is the question. Journal of Personality
and Social Psychology, 85 (6), 1193–1202. doi:
10.1037/0022-3514.85.6.1193 .
Van Praag, B. M., & Frijters, P. (1999). The measurement of welfare and well-being: The Leyden
approach. In Well-being: The foundations of hedonic psychology (pp. 413–433).
Wang, J., Novemsky, N., & Dhar, R. (2009). Anticipating adaptation to products. Journal of
Consumer Research, 36 (2), 149–159. doi:
10.1086/597050 .
Welsch, H., & Kühling, J. (2010). Pro-environmental behavior and rational consumer choice:
Evidence from surveys of life satisfaction. Journal of Economic Psychology, 31 (3), 405–420.
doi:
10.1016/j.joep.2010.01.009 .
Wilson, T. D., Centerbar, D., Kermer, D., & Gilbert, D. T. (2005). The pleasures of uncertainty:
Prolonging positive moods in ways people do not anticipate. Journal of Personality and Social
Psychology, 88 (1), 5–21. doi:
10.1037/0022-3514.88.1.5 .
Wilson, T. D., & Gilbert, D. T. (2003). Affective forecasting. Advances in Experimental Social
Psychology, 35 , 345–411.
Wilson, T. D., & Gilbert, D. T. (2008). Explaining away: A model of affective adaptation.
Perspectives on Psychological Science, 3 (5), 370–386. doi:
10.1111/j.1745-6924.2008.00085.x .
Wilson, T. D., Lisle, D. J., Kraft, D., & Wetzel, C. G. (1989). Preferences as expectation-driven
inferences: Effects of affective expectations on affective experience. Journal of Personality and
Social Psychology, 56 (4), 519–530. doi:
10.1037/0022-3514.56.4.519 .
Wilson, A., & Ross, M. (2003). The identity function of autobiographical memory: Time is on our
side. Memory, 11 (2), 137–149. doi:
10.1080/741938210 .
Wilson, T. D., Wheatley, T., Meyers, J., Gilbert, D. T., & Axsom, D. (2000). Focalism: A source of
durability bias in affective forecasting. Journal of Personality and Social Psychology, 78 (5),
821–836.
Wirtz, D., Kruger, J., Scollon, C., & Diener, E. (2003). What to do on spring break? The role of
predicted, on-line, and remembered experience in future choice. Psychological Science, 14 (5),
520–524.
Xiao, J. J., & Li, H. (2010). Sustainable consumption and life satisfaction. Social Indicators
Research, 104 (2), 323–329. doi:
10.1007/s11205-010-9746-9 .
Zhang, S., & Markman, A. B. (2001). Processing product unique features: Alignability and
involvement in preference construction. Journal of Consumer Psychology, 11 (1), 13–27.
Research Article
A Single Exposure to the American Flag
Shifts Support Toward Republicanism up
to 8 Months Later
Travis J. Carter1, Melissa J. Ferguson2, and Ran R. Hassin3
1Center for Decision Research, Booth School of Business, University of Chicago; 2Department of Psychology, Cornell University;
and 3Department of Psychology and The Center for the Study of Rationality, Hebrew University
Psychological Science
22(8) 1011–1018
© The Author(s) 2011
Reprints and permission:
sagepub.com/journalsPermissions.nav
DOI: 10.1177/0956797611414726
http://pss.sagepub.com
Abstract
There is scant evidence that incidental cues in the environment significantly alter people’s political judgments and behavior
in a durable way. We report that a brief exposure to the American flag led to a shift toward Republican beliefs, attitudes, and
voting behavior among both Republican and Democratic participants, despite their overwhelming belief that exposure to the
flag would not influence their behavior. In Experiment 1, which was conducted online during the 2008 U.S. presidential election,
a single exposure to an American flag resulted in a significant increase in participants’ Republican voting intentions, voting
behavior, political beliefs, and implicit and explicit attitudes, with some effects lasting 8 months after the exposure to the prime.
In Experiment 2, we replicated the findings more than a year into the current Democratic presidential term. These results
constitute the first evidence that nonconscious priming effects from exposure to a national flag can bias the citizenry toward
one political party and can have considerable durability.
Keywords
political psychology, priming, voting behavior, American flag
Received 12/13/10; Revision accepted 3/8/11
Corresponding Authors:
Travis J. Carter, Center for Decision Research, University of Chicago Booth
School of Business, C74 Harper Center, Chicago, IL 60637
E-mail: travis.carter@chicagobooth.edu
Melissa J. Ferguson, Department of Psychology, 230 Uris Hall, Cornell
University, Ithaca, NY 14853
E-mail: mjf44@cornell.edu
How do people decide which political candidate to support, or
whether their country goes to war? In the social science litera-
ture, it has traditionally been assumed that political behavior
reflects a thoughtful and rational analysis of the pros and cons
of the options (e.g., Baum & Jamison, 2006; Downs, 1959;
Lau & Redlawsk, 1997). Recent work in social and cognitive
psychology suggests, however, that political behavior can also
be unconsciously influenced by contextual cues, such as vot-
ing location (Berger, Meredith, & Wheeler, 2008) and the
facial characteristics of candidates (Todorov, Mandisodza,
Goren, & Hall, 2005).
But how robust and durable is the influence of such inci-
dental cues on political decisions and behavior? In the research
reported here, we examined one of the most iconic political
symbols of a nation—its flag—and tested the direction and
durability of its influence on political behavior, attitudes, and
judgment.
National flags are pervasive cues in the political landscapes
of many nations, appearing on houses, schools, government
buildings, and the lapels of political candidates (Gellner, 2005).
Flags constitute particularly powerful political cues because
they may reinforce national sentiments without being con-
sciously noticed by the citizenry (e.g., Billig, 1995). Although
social scientists have speculated that national flags might exert
an unnoticed influence on political thought and behavior, there
is little empirical evidence to support this claim.
How might a national flag influence the political behavior
of the citizenry? National flags have traditionally been seen as
rallying symbols that bring citizens together (Baker & O’Neal,
2001; Mueller, 1970). For instance, citizens and members of
government often intentionally display the national flag dur-
ing wartime in an effort to unify the populace behind the war
efforts (Skitka, 2005). Recent research has shown that even
subtle exposure to a national flag can have unifying effects.
Hassin, Ferguson, Shidlovski, and Gross (2007) found that
subliminal exposure to a national flag led citizens to vote in a
manner that reflected politically moderate views, such that
participants at each end of the political spectrum moved
toward the ideological center. This was the first evidence that
national flags can change people’s political behavior in a sub-
tle, nonconscious fashion.
Yet the psychological effects of exposure to a national flag
are likely to vary considerably according to a given country’s
characteristics, such as its culture, history, and political atmo-
sphere. Although there may be cases in which a national flag
unifies people by pushing them toward the center of the ideo-
logical spectrum, there may be other cases in which a national
flag instead moves people toward one end of the spectrum. We
argue that this possibility is particularly likely in a country in
which the political landscape is polarized by what is largely a
two-party system, and in which one of the two major parties
has come to be more associated with the flag. In these cases,
the flag may bias the citizenry toward a particular political
party, potentially without their awareness (Billig, 1995).
We tested this prediction in the United States, a country in
which the political system is sharply divided between Demo-
crats and Republicans. To examine the associations between
the flag and each political party, we conducted a pilot study
in which we asked 51 participants which party “tends to
brandish the American flag more often (e.g., by wearing it,
waving it, holding it, having it on their house).” Participants
in our sample strongly believed that the tendency to display
the flag was more common among Republicans; responses
differed significantly from the midpoint of the scale, t(50) =
6.50, p < .001 (see also Carney, Jost, Gosling, & Potter,
2008). The same sample of participants overwhelmingly
(90.2%) believed that their voting behavior would not be
influenced by the presence of a flag, and the few who thought
it might did not agree on the direction of its influence. Thus,
despite associating the American flag more strongly with one
political party than with the other, participants in our pilot
study did not believe that exposure to the flag would have
any effect on their behavior.
In contrast to the beliefs of the participants in the pilot study,
the results from the experiments reported here show that expo-
sure to the American flag introduces a bias toward the Republi-
can Party over the Democratic Party. In one experiment, we
tested whether subtle exposure to the American flag shifted
people’s beliefs, attitudes, and behaviors toward the Republican
end of the political continuum. We found that a single exposure
to a small American flag during deliberation about voting inten-
tions prior to a general election led to significant and robust
changes in participants’ voting intentions, voting behavior, and
political attitudes, all in the politically conservative direction. In
a separate experiment, we replicated these patterns more than a
year into a Democratic presidential term.
We also tested the longevity of this priming effect on
judgment and attitudes. Flag-priming effects may be espe-
cially potent if priming occurs while a person is consciously
deliberating about politics and voting intentions. We exposed
participants to the American flag once during such an argu-
ably critical psychological window and found that the effects
from this single exposure lasted up to 8 months later. This
prolonged influence represents one of the most durable prim-
ing effects in the cognitive sciences literature, and shows not
only that contextual effects can influence important political
decisions, but also that this influence can be robust and long
lasting.
Experiment 1
In this experiment, we tested whether a single exposure to the
American flag would lead participants to shift their attitudes,
beliefs, and behavior in the politically conservative direction.
We conducted a multisession study during the 2008 U.S. presi-
dential election. Starting in September 2008, we recruited
American adults across the United States to participate in a
paid online study of political beliefs and attitudes. We col-
lected measures from the same sample of participants at four
times over a period of 8 months.
Participants and recruitment
Between September 19 and October 10, 2008 (Session 1), 396
participants were recruited through advertising in online
social-networking sites (e.g., Facebook.com) to participate in
an online survey in exchange for a $10 Amazon.com gift cer-
tificate. In order to avoid the possibility that our priming
manipulation might alter the outcome of the election, we used
measurements from Session 1 to identify participants (n =
235) from the initial pool who planned to vote in a state where
polling indicated that a significant margin separated Obama
and McCain. These participants were randomly assigned to
either the flag-prime or the control condition.
The participants who were in solidly Republican or Demo-
cratic states were contacted to complete questionnaires for
Session 2 (starting on October 11, 2008, and ending on the day
before the election, November 3, 2008) and Session 3 (Novem-
ber 5 through November 12, 2008) in exchange for a $15
Amazon.com gift certificate. Of the participants contacted,
197 completed Session 2, and 191 completed Session 3. More
than 79% of participants completed Session 2 by October 21;
thus, the vast majority of participants voted at least 2 weeks
after their exposure to the prime. In early July of 2009, the
participants who had completed Session 3 were contacted to
complete Session 4 in exchange for a 1 in 20 chance to win a
$25 Amazon.com gift certificate. Seventy-one participants
completed this session (37.2%). We attribute this relatively
high rate of attrition to the use of a lottery rather than guaran-
teed payment.
There were no significant differences on any variables of
importance (e.g., political ideology, voting intentions, beliefs
about specific political issues, religiousness, nationalism, need
for cognition) between the participants who did and did not
complete the 8-month follow-up.
We excluded 8 participants (4 in each of the two condi-
tions) from the analyses because they completed the measures
in Session 1 in less than 10 min (median time = 36 min).
Long-Term Effects of U.S. Flag Exposure on Republicanism 1013
Materials and procedure
Session 1. Measures directly or potentially relevant to our
hypotheses were embedded within a larger set of personality
measures that participants completed in Session 1. Relevant
measures included the Patriotism and Nationalism subscales
of the Patriotism and Nationalism Scale (Kosterman &
Feshbach, 1989), a measure of warmth toward the candidates,
a demographics questionnaire, a measure of political orienta-
tion and exposure to news media, and a survey of attitudes
regarding specific political issues (to view the survey, see
Instrument Details in the Supplemental Material available
online). Participants also completed measures of intention to
vote for Barack Obama and Joseph Biden, and for John
McCain and Sarah Palin, using separate 11-point scales (from
1, definitely not, to 11, absolutely). Surveys were presented in
random order. None of these measures moderated the effects
observed in subsequent sessions.
Session 2. In Session 2, all participants first reported their
voting intentions, using the same 11-point scales used in
Session 1. For participants assigned to the flag-prime condition,
a small picture (72 × 45 pixels) of an American flag was present
in the top left corner of the survey. For participants in the control
condition, there was nothing in the corner of the survey (to view
the survey, see Experimental Manipulations in the Supplemen-
tal Material). Except for this single presentation of the Ameri-
can flag on this particular survey, the procedure and materials
in all sessions were identical for all participants.
Participants also answered several questions unrelated to the
present hypothesis. They then rated their warmth toward the
Democratic and Republican Parties, presidential candidates,
and vice presidential candidates (using 500-point analog sliding
scales); completed measures of political orientation, news-
consumption habits, and exposure to specific news sources;
answered the same questions about political issues asked in Ses-
sion 1; and rated the importance of those political issues.
After completing all of the surveys, participants completed
a number of Implicit Association Tests (IATs; Nosek, Green-
wald, & Banaji, 2007), presented in random order. The IAT
measures that were directly relevant to the current hypothesis
included a Barack Obama/John McCain IAT, a Joseph Biden/
Sarah Palin IAT, and a Democrat/Republican IAT. These tests
were presented and scored in accordance with the procedures
outlined by Nosek, Greenwald, and their colleagues (following
Lane, Banaji, Nosek, & Greenwald, 2007). Higher scores repre-
sent greater positivity toward the Republican candidate or party.
Session 3. In Session 3, participants were first asked to report
which candidate they voted for, selecting their choice from a
list that included the major- and minor-party candidates who
appeared on the ballots in most states, as well as “other” and
“did not vote.” Participants also answered questions about
their vote choice and the attributes of Barack Obama and John
McCain. They then rated how fairly they felt the media had
treated each presidential and vice presidential candidate, using
9-point scales (−4 = very unfairly negatively, −2 = somewhat
unfairly negatively, 0 = accurately, +2 = somewhat unfairly
positively, +4 = very unfairly positively).
Finally, participants completed measures about their news-
consumption habits and their exposure to specific television,
print, and radio news sources. After completing Session 3, par-
ticipants were referred to a Web site containing questions that
probed for suspicion about the experiment. Once participants
had answered these questions, they were debriefed on the
nature of the study. No participants expressed any suspicion
about the presence of the American flag during Session 2.
Session 4. In the final session, participants first answered
a number of questions about their current feelings about
President Obama and his job performance to date, using
11-point Likert scales. Next, participants indicated how
warmly they felt toward a variety of liberal and conservative
leaders using the same analog sliding scales used previously,
and answered the same questions about political beliefs used
in previous sessions. Participants were also asked to report
their personal political ideology, their religiousness, the impor-
tance of being an American to their identity, their media-
consumption habits, and their exposure to the same variety of
news sources asked about in Session 3.
Participants were then thanked and presented with further
debriefing information about the study.
Session 2 results
Voting intentions. We created composite measures of voting
intentions for both Sessions 1 and 2 by calculating the differ-
ence between intentions to vote for McCain and intentions to
vote for Obama; higher numbers indicate a greater intention to
vote for McCain than for Obama. We then regressed the cen-
tered Session 2 intentions on centered Session 1 intentions and
used the residuals from this analysis as our main measure of
voting intentions. Thus, we measured the impact of the flag
prime on voting intentions during Session 2 that could not be
explained by voting intentions from Session 1.
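The residualized measure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis script: the six participants and their ratings are hypothetical, and the helper name `residualize` is ours. It computes the McCain-minus-Obama difference score for each session, centers both, regresses Session 2 on Session 1 by ordinary least squares, and keeps the residuals.

```python
from statistics import mean

def residualize(session2, session1):
    """Residuals from an OLS regression of centered session2 on centered session1."""
    m1, m2 = mean(session1), mean(session2)
    x = [v - m1 for v in session1]  # centered predictor (Session 1)
    y = [v - m2 for v in session2]  # centered outcome (Session 2)
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    # With both variables centered, the intercept is zero.
    return [b - slope * a for a, b in zip(x, y)]

# Hypothetical ratings on the 11-point intention scales (not data from the study).
s1_mccain = [2, 9, 5, 1, 10, 6]
s1_obama = [10, 3, 6, 11, 2, 7]
s2_mccain = [3, 9, 6, 1, 11, 5]
s2_obama = [10, 2, 6, 11, 1, 8]

# Composite: higher numbers mean a greater intention to vote for McCain than Obama.
s1_diff = [m - o for m, o in zip(s1_mccain, s1_obama)]
s2_diff = [m - o for m, o in zip(s2_mccain, s2_obama)]

# Part of Session 2 intentions not explained by Session 1 intentions.
residuals = residualize(s2_diff, s1_diff)
print([round(r, 3) for r in residuals])
```

By construction the residuals average zero and are uncorrelated with Session 1 intentions, so a between-condition difference in them reflects change that occurred after the baseline session.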
As predicted, participants in the flag-prime condition (M =
0.072, SD = 0.47) reported a greater intention to vote for
McCain than did participants in the control condition (M =
−0.070, SD = 0.48), t(181) = 2.02, p = .04, d = 0.298 (see
Fig. 1).
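As a rough check on the reported effect size, Cohen's d can be recomputed from the summary statistics. The sketch below makes two assumptions that the text does not state: the control-condition mean is taken to be negative (−0.070), which is the sign consistent with the reported d, and the two standard deviations are pooled with equal weight because per-condition sample sizes are not given. The function name `cohens_d` is ours.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using an equal-weight pooled SD (assumes similar group sizes)."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Flag-prime: M = 0.072, SD = 0.47; control: M = -0.070 (assumed sign), SD = 0.48.
d = cohens_d(0.072, 0.47, -0.070, 0.48)
print(round(d, 3))
```

The result is about 0.30, in line with the reported d = 0.298; the small gap is what one would expect from rounding and from unequal group sizes.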
Explicit attitudes. We created a composite score of partici-
pants’ ratings of warmth toward the Republican and Demo-
cratic Parties, presidential candidates, and vice presidential
candidates, controlling for the same measures administered at
Session 1. Higher scores indicate more positive feelings
toward the Republican Party and candidates than toward the
Democratic Party and candidates. As predicted, participants in
the flag-prime condition (M = 0.424, SD = 2.73) felt relatively
more warmth toward the Republican Party and Republican
candidates than did participants in the control condition (M =
−0.410, SD = 2.37), t(181) = 2.21, p = .03, d = 0.354 (see
Fig. 2).
Implicit attitudes. We created a composite measure from
scores on the three political IATs to represent the aggregate
positivity toward the Republican Party and Republican candi-
dates relative to the Democratic Party and Democratic candi-
dates. Participants in the flag-prime condition (D = 0.006)
showed significantly more positivity toward the Republican
Party and candidates than did participants in the control condition
(D = −0.102), t(173) = 2.03, p < .05, d = 0.313, an effect
that was mirrored in each of the IATs separately.
Political beliefs. Participants’ responses were reverse-scored
when needed and then averaged into a composite measure of
political beliefs (α = .84). This index was correlated with self-
reported party affiliation and political ideology (r = .73, p <
.001), which confirmed that reported political beliefs did cor-
respond with participants’ reported political ideology.
Participants in the flag-prime condition reported margin-
ally more conservative beliefs (M = 3.25, SD = 0.82) than did
participants in the control condition (M = 3.03, SD = 0.79),
t(181) = 1.80, p = .07, d = 0.274. This result held, and even
increased slightly, when we controlled for responses to mea-
sures of political beliefs in Session 1, β = 0.141, t(180) = 1.84,
p = .06 (see Fig. 1).
Session 3 results
Voting behavior. To maximize statistical power in measuring
voting behavior, we analyzed data only from participants who
reported voting for McCain or Obama (n = 166). Although
participants in the control condition generally tended to vote
for Obama (83.5% for Obama, 16.5% for McCain), this ten-
dency was significantly reduced in the flag-prime condition
(72.8% for Obama, 27.2% for McCain), χ²(1, N = 166) = 2.81,
p < .05, one-tailed (see Fig. 3). This pattern held when we
analyzed the data from all participants, although the signifi-
cance level dropped. It is worth noting that voting behavior
was highly predicted by voting intentions reported in Session 2.
Indeed, when we included voting intentions and priming
condition as predictors of voting behavior in a regression anal-
ysis, voting intentions remained reliably predictive, β = 3.26,
[Figure 1 plot: Standardized Residual Score (axis from –0.15 to 0.15) for Voting Intentions and Political Beliefs, shown for the Control Condition and the Flag-Prime Condition.]
Fig. 1. Voting intentions and political attitudes at Session 2 in Experiment
1 as a function of condition (flag prime or control). The graph presents
standardized residual scores that control for responses to the same measures
administered at Session 1. Higher numbers indicate a greater intention to
vote for the Republican candidates relative to the Democratic candidates
and greater support for the politically conservative position relative to the
politically liberal position. Error bars indicate ±1 SEM.
[Figure 2 plot: Warmth Toward Candidates and Parties (axis from –1.5 to 1.5) at Session 2 and Session 4, shown for the Control Condition and the Flag-Prime Condition.]
Fig. 2. Relative preference for the Republican and Democratic Parties and
presidential and vice presidential candidates as a function of condition (flag
prime or control), at Sessions 2 and 4 in Experiment 1. The graph presents
standardized residual scores that control for responses to the same measures
administered at Session 1. Higher numbers indicate greater preference for
the Republican Party and candidates relative to the Democratic Party and
candidates. Error bars indicate ±1 SEM.
[Figure 3 plot: Control Condition: McCain 16.5%, Obama 83.5%. Flag-Prime Condition: McCain 27.2%, Obama 72.8%.]
Fig. 3. Percentage of participants in the control and flag-prime conditions
who reported voting for McCain and for Obama in Session 3 of Experiment
1 (n = 166).
χ²(1, N = 166) = 27.67, p < .0001, whereas the effect of priming
condition dropped to nonsignificance (p = .25). These
results suggest that the effect of priming condition on voting
behavior was mediated by voting intentions, rather than that
priming condition had an unmediated, direct effect on voting
behavior (see also Hassin et al., 2007).
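The chi-square test on the vote split reported above can be illustrated with a short worked example. The article gives only percentages and the total N = 166, not cell counts, so the counts below are an assumption: one reconstruction (85 control and 81 flag-prime voters) that reproduces the reported percentages after rounding. The helper names are ours, and halving the p value for a directional test is likewise shown as an assumption about how "one-tailed" was applied.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def p_two_tailed_1df(stat):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

# Assumed counts, rows = condition, columns = (McCain, Obama).
# 14/85 = 16.5% and 71/85 = 83.5%; 22/81 = 27.2% and 59/81 = 72.8%.
votes = [[14, 71],   # control condition
         [22, 59]]   # flag-prime condition

stat = chi_square_2x2(votes)
p_two = p_two_tailed_1df(stat)
p_one = p_two / 2  # directional (one-tailed) version

print(round(stat, 2), round(p_two, 3), round(p_one, 3))
```

Under these assumed counts the statistic comes out at roughly 2.8, close to the reported 2.81, with a two-tailed p above .05 and a one-tailed p just below it. That matches the article's report of significance only under a one-tailed test.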
Treatment in the media. We created a composite index of
how fairly participants believed the media treated the candi-
dates; on this index, positive values indicate the belief that the
media treated the Republican candidates better than they
treated the Democratic candidates, and negative numbers indi-
cate the opposite belief. Although participants in the control
condition generally believed that the media were unduly harsh
in their treatment of the Republican candidates (M = −1.39,
SD = 3.54), this tendency was significantly greater in the flag-prime
condition (M = −2.69, SD = 4.43), t(181) = 2.20, p =
.029, d = 0.370.
Session 4 results
Obama’s job performance. We averaged the ratings of
Obama’s job performance to create a composite measure (α =
.97). As predicted, participants in the flag-prime condition felt
less positively about Obama’s job performance at the 8-month
follow-up (M = 6.76, SD = 2.88) than did participants in the
control condition (M = 8.01, SD = 2.25), t(69) = 2.04, p < .05,
d = 0.44.
Explicit attitudes. We created a composite attitude index by
subtracting the average rating of warmth toward liberal lead-
ers from the average rating of warmth toward conservative
leaders. Participants in both conditions generally felt more
warmth toward the Democrats than toward the Republicans,
but participants in the flag-prime condition (M = −54.76, SD =
182.18) were less warm toward Democrats than were participants
in the control condition (M = −193.47, SD = 176.16),
t(69) = 3.26, p = .002, d = 0.80. We found the same pattern of
results using the composite measure used in Session 2 (partici-
pants’ ratings of warmth toward the political parties, presiden-
tial candidates, and vice presidential candidates, controlling
for the same measures administered at Session 1), t(69) = 2.77,
p < .01, d = 0.71 (see Fig. 2).
Political beliefs. As was the case in Session 2, participants in
the flag-prime condition exhibited significantly more conser-
vative beliefs (M = 3.35, SD = 0.85) than did participants in
the control condition (M = 2.85, SD = 0.88), t(68) = 2.43, p <
.02, d = 0.60.¹
Discussion
Our results demonstrate that a single exposure to an unobtrusive
American flag shifted participants’ voting intentions, voting
behavior, attitudes, and beliefs toward the Republican end of the
ideological spectrum. It is important to note that political ideol-
ogy and party affiliation did not moderate these effects. That is,
both liberal and conservative participants were influenced by
the flag prime, and in the same (conservative) direction. These
effects lasted 8 months after the initial exposure. Why did they
last so long? One possibility is that voting behavior (Session 3)
had an especially influential effect on beliefs and attitudes
reported in Session 4. Indeed, voting behavior did significantly
predict beliefs about policy and warmth toward political leaders
and parties at Session 4—beliefs: t(60) = 4.71, p < .001; warmth:
t(61) = 6.7, p < .001. This pattern raises the question of whether
the effects observed in Session 4 could be explained entirely by
a self-perception account, whereby participants at Session 4
merely recalled their voting choice. The data do not support this
account. Controlling for voting behavior at Session 3, priming
condition still significantly predicted warmth toward Demo-
crats and Republicans in Session 4 (p < .01), and marginally
significantly predicted attitudes regarding political issues (p <
.09). Moreover, analyses controlling for voting intentions as
measured in Session 2 also showed that priming condition still
significantly predicted warmth (p < .01) and marginally signifi-
cantly predicted attitudes regarding political issues (p < .08).
These results suggest that the flag prime’s initial influence was
not restricted to voting intentions but also extended to attitudes
and beliefs more broadly, and that it was the accumulation and
perhaps rolling influence of these effects that affected voting
behavior at Session 3 and attitudes and beliefs at Session 4.
It is noteworthy that the size of the priming effect was con-
siderably larger in Session 4 than in the earlier sessions. Might
this have been due to the selective attrition of participants? Of
the participants who completed Session 4, those in the flag-
prime and control conditions did not differ in their political ide-
ology or voting intentions as measured in Session 1; this
suggests that any between-condition differences in Session 4
were not the product of a particular coincidence of attrition of
liberal participants from the flag-prime condition and attrition
of conservative participants from the control condition. Further-
more, participants who chose to take part in Session 4 showed
no baseline differences (on more than 20 variables) from those
who did not. It is of course impossible to definitively rule out
the possibility of selective attrition, as participants may have
differed on some unmeasured variable. There is some evidence
that people who have been exposed to persuasive appeals show
increasingly strong effects of those appeals over time (i.e.,
“sleeper effects”; Kumkale & Albarracín, 2004; see also Cook
& Flay, 1978; Pratkanis, Greenwald, Leippe, & Baumgardner,
1988), although the applicability of that evidence to the current
findings remains speculative.
Experiment 2
Before concluding that exposure to the American flag pro-
duces a bias toward Republicanism, we tested whether the flag
creates a shift specifically toward Republicanism, rather than
toward whichever party currently controls the executive
branch of the government. Thus, we conducted Experiment 2
in the spring of 2010, more than a year after the election of
President Obama and while the Democrats still had the major-
ity in both houses of Congress.
Participants and recruitment
Seventy participants completed the experiment for either $5 or
extra credit in a psychology class. Four participants were
excluded from the analyses: 1 who had previously taken part
in a highly similar experiment, 1 who did not complete the part
of the experiment that contained the priming, and 2 who
guessed the hypothesis.
Materials and procedure
Once participants arrived at the lab, they completed a task that
they were told concerned the ability to discern the time of day
that a photograph had been taken. They were presented with
four photographs of buildings and asked to estimate whether
they thought each photograph had been taken during the morn-
ing, afternoon, or evening (for examples, see Experimental
Manipulations in the Supplemental Material). For participants
randomly assigned to the flag-prime condition, two of the four
photographs had American flags in them (on flag poles or
hanging from the front of the building). For participants in the
control condition, the flags were digitally removed. After this
task, participants completed a short (eight-item) version of the
political belief survey used in Experiment 1; responses were
made on a 7-point scale.
Results and discussion
The responses were reverse-coded when needed and averaged
together (α = .67). Attitudes of participants in the flag-prime
condition (M = 3.10) were significantly closer to the Republi-
can end of the scale than were attitudes of participants in the
control condition (M = 2.65), t(64) = 2.04, p < .05. This find-
ing suggests that the American flag introduced a shift toward
the Republican worldview, even during a Democratic admin-
istration. Again, the effect was not moderated by political ide-
ology or any other measured variable, which suggests that the
flag produced the same conservative shift for both liberal and
conservative participants.
General Discussion
Although the American flag is assumed to represent the entire
country, our findings suggest that the psychological processes
put in motion by flag priming yield increased support for the
beliefs of a particular political party. Subtle exposure to the
American flag significantly shifted both Democratic and
Republican participants’ beliefs, attitudes, and voting behavior
toward Republicanism.
These findings provide the first empirical evidence that a
national flag can push citizens toward a specific end of the
ideological spectrum, rather than having the unifying effect
documented extensively in the social sciences literature (Baker
& O’Neal, 2001; Hassin et al., 2007; Mueller, 1970). Why did
a national flag have an ideologically specific effect (i.e., creat-
ing a bias toward Republicanism) in our study, even though
previous research has shown a unifying effect? As we noted in
the introduction, the American flag seems to be perceived (at
least in our samples) as more closely linked with the Republi-
can than with the Democratic Party, and this “flag branding”
may be especially influential in a two-party system in which
there are typically only two viable voting choices. In other
words, the American flag conjures up Republican beliefs and
attitudes, and these primes collectively push people in the
Republican direction. By contrast, if any flag branding of a
particular party or viewpoint exists in a political system that
allows for multiple parties and viewpoints, such branding may
be relatively diluted and thus less influential.
It is possible that the American flag does indeed have a uni-
fying influence that can manifest itself as increased Republican-
ism. In other words, the flag might trigger concepts of unity or
political moderation that move people toward the center of the
ideological spectrum, but because the samples in our studies
were relatively Democratic and liberal, their movement toward
the center was a move toward Republicanism. The effects of the
American flag observed in our experiments are therefore con-
sistent with the flag either having a unifying effect or inducing a
movement toward conservative beliefs and attitudes. If the for-
mer explanation is correct, exposing a highly conservative sam-
ple to an American flag prime would lead to a shift toward the
Democratic end of the spectrum. If the latter explanation is cor-
rect, participants already located at the Republican end of the
ideological spectrum would show little movement toward the
center if exposed to an American flag prime.
The mechanism may be more nuanced than either of these
possibilities, however. As we have argued elsewhere (Hassin
et al., 2009), national flags may be strongly associated specifi-
cally with prototypes of national citizens and may influence
people by shifting their attitudes toward those of the (imagi-
nary) prototypical citizen. The direction of the shift for a given
sample of people would depend on whether those people
believe the prototypical citizen is more liberal or more conser-
vative than they are themselves. In a way, this would be a uni-
fying effect, because the flag would move people toward what
they perceive to be the typical or average citizen. And yet, as
long as people believe that the typical American is more con-
servative than they are, this “unifying” effect would result in a
shift toward Republicanism. We do have some evidence that
our participants generally believed that the prototypical Amer-
ican is more conservative than they are themselves. At the end
of Session 4, we asked participants in Experiment 1 about their
views of the “typical American.” Although participants gener-
ally anchored on their own beliefs in estimating those of the
typical American, they felt that the typical American would
feel more warmly toward Republican politicians than they did
themselves, paired t(68) = 2.34, p < .03, and that the typical
American would give more Republican answers to the specific
policy questions than they had themselves, paired t(68) = 7.07,
p < .001. Future research can test more directly how people’s
beliefs about the prototypical citizen predict the effect of flag
priming on political thought and behavior (see Hassin et al.,
2009, for a more detailed discussion).
Our results also demonstrate that a single exposure to a
national flag can have wide-ranging effects. Why did a single,
brief exposure to the American flag in Experiment 1 have such
an enduring impact? Indeed, considering how often Ameri-
cans are exposed to their flag, why would this one exposure
have any impact at all? In contrast with the vast majority of
instances in which people are exposed to the American flag,
this particular exposure occurred when participants were
reporting their voting intentions, an act that has been shown to
strongly predict and shape voting behavior (Greenwald, Car-
not, Beach, & Young, 1987). For some participants, explicitly
declaring voting intentions may have been a rare event that
further crystallized their stated intentions and attitudes, incor-
porating any bias introduced by the presence of the flag at that
critical moment. Indeed, when we controlled for participants’
voting intentions at Session 2, the effect of the flag exposure
on voting behavior dropped to nonsignificance (see also
Hassin et al., 2007). Thus, exposure to the American flag may
have an especially strong influence when it occurs immedi-
ately before or during a person’s consideration of political
issues or declaration of political decisions (e.g., in the voting
booth).
It is also important to note that exposure to the American
flag can have a range of short-term effects that are not depen-
dent on conscious declarations, and are not even overtly politi-
cal (Carter, Ferguson, & Hassin, 2011; Ferguson & Hassin,
2007). For example, Ferguson and Hassin (2007) found that
brief exposure to the American flag increased aggressive
thoughts and behavior, specifically among people who fol-
lowed news about politics.
Our data suggest that American people are not aware of
this effect: Participants in our pilot study erroneously believed
that exposure to the American flag would not influence their
political behavior or attitudes. This mistaken belief is in line
with the standard claim in psychology and political science
that important political behavior results from careful and
rational deliberation (Baum & Jamison, 2006; Downs, 1959;
Lau & Redlawsk, 1997). Thus, our findings challenge lay-
people’s assumptions as well as the standard claim in the lit-
erature, and extend recent research showing that subtle cues
in the environment—from polling locations (Berger et al.,
2008), to the facial characteristics of political candidates
(Greenwald, Smith, Sriram, Bar-Anan, & Nosek, 2009;
Rule et al., 2010; Todorov et al., 2005), to the presence of
national flags (Hassin et al., 2007)—can significantly influence
how people vote.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with
respect to their authorship or the publication of this article.
Supplemental Material
Additional supporting information may be found at
http://pss.sagepub.com/content/by/supplemental-data
Note
1. In Session 4, participants responded to an additional item about
extreme interrogation techniques that was not included in previous
sessions. Including this measure in the composite measure did not
change the results.
References
Baker, W. D., & O’Neal, J. R. (2001). Patriotism or opinion
leadership? The nature and origins of the “rally ’round the
flag” effect. Journal of Conflict Resolution, 45, 661–687. doi:
10.1177/0022002701045005006
Baum, M. A., & Jamison, A. S. (2006). The Oprah effect: How soft
news helps inattentive citizens vote consistently. The Journal of
Politics, 68, 946–959. doi:10.1111/j.1468-2508.2006.00482.x
Berger, J., Meredith, M., & Wheeler, S. C. (2008). Contextual
priming: Where people vote affects how they vote. Proceedings
of the National Academy of Sciences, USA, 105, 8846–8849.
doi:10.1073/pnas.0711988105
Billig, M. (1995). Banal nationalism. London, England: Sage.
Carney, D. R., Jost, J. T., Gosling, S. D., & Potter, J. (2008). The
secret lives of liberals and conservatives: Personality profiles,
interaction styles, and the things they leave behind. Political Psy-
chology, 29, 807–840.
Carter, T. J., Ferguson, M. J., & Hassin, R. R. (2011). Supporting
the American system: The relationship between implicit American
nationalism and system justification. Social Cognition, 29, 341–359.
Cook, T. D., & Flay, B. R. (1978). The persistence of experimentally-
induced attitude change. In L. Berkowitz (Ed.), Advances in
experimental social psychology (Vol. 11, pp. 1–57). New York, NY:
Academic Press.
Downs, A. (1959). An economic theory of democracy. New York,
NY: Harper and Row.
Ferguson, M. J., & Hassin, R. R. (2007). On the automatic associa-
tion between America and aggression for news watchers. Person-
ality and Social Psychology Bulletin, 33, 1632–1647.
Gellner, E. (2005). Nations and nationalism. Reno: University of
Nevada Press.
Greenwald, A. G., Carnot, C. G., Beach, R., & Young, B. (1987).
Increasing voting behavior by asking people if they expect to vote.
Journal of Applied Psychology, 72, 315–318. doi:10.1037/0021-
9010.72.2.315
Greenwald, A. G., Smith, C. T., Sriram, N., Bar-Anan, Y., & Nosek,
B. A. (2009). Implicit race attitudes predicted vote in the 2008
U.S. presidential election. Analyses of Social Issues and Public
Policy, 9, 241–253. doi:10.1111/j.1530-2415.2009.01195.x
Hassin, R. R., Ferguson, M. J., Kardosh, R., Porter, S. C., Carter,
T. J., & Dudareva, V. (2009). Précis of implicit nationalism. In
O. Vilarroya, S. Atran, A. Navarro, K. Ochsner, & A. Tobena
(Eds.), Values, empathy, and fairness across social barriers
(pp. 135–145). Hoboken, NJ: Wiley-Blackwell.
Hassin, R. R., Ferguson, M. J., Shidlovski, D., & Gross, L. (2007).
Subliminal exposure to national flags affects political thought
and behavior. Proceedings of the National Academy of Sciences,
USA, 104, 19757–19761. doi:10.1073/pnas.0704679104
Kosterman, R., & Feshbach, S. (1989). Toward a measure of patriotic
and nationalistic attitudes. Political Psychology, 10, 257–274.
Kumkale, G. T., & Albarracín, D. (2004). The sleeper effect in persua-
sion: A meta-analytic review. Psychological Bulletin, 130, 143–172.
Lane, K. A., Banaji, M. R., Nosek, B. A., & Greenwald, A. G. (2007).
Understanding and using the Implicit Association Test: IV.
What we know (so far) about the method. In B. Wittenbrink &
N. Schwarz (Eds.), Implicit measures of attitudes (pp. 59–102).
New York, NY: Guilford Press.
Lau, R. R., & Redlawsk, D. P. (1997). Voting correctly. American
Political Science Review, 91, 585–598.
Mueller, J. E. (1970). Presidential popularity from Truman to John-
son. American Political Science Review, 64, 18–34.
Nosek, B. A., Greenwald, A. G., & Banaji, M. R. (2007). The implicit
association test at age 7: A methodological and conceptual review.
In J. A. Bargh (Ed.), Social psychology and the unconscious: The
automaticity of higher mental processes (pp. 265–292). New
York, NY: Psychology Press.
Pratkanis, A. R., Greenwald, A. G., Leippe, M. R., & Baumgardner,
M. H. (1988). In search of reliable persuasion effects: III. The
sleeper effect is dead. Long live the sleeper effect. Journal of
Personality and Social Psychology, 54, 203–218.
Rule, N. O., Ambady, N., Adams, R. B., Jr., Ozono, H., Nakashima,
S., Yoshikawa, S., & Watabe, M. (2010). Polling the face: Predic-
tion and consensus across cultures. Journal of Personality and
Social Psychology, 98, 1–15. doi:10.1037/a0017673
Skitka, L. J. (2005). Patriotism or nationalism? Understanding
post-September 11, 2001, flag-display behavior. Journal of
Applied Social Psychology, 35, 1995–2011. doi:10.1111/j.1559-
1816.2005.tb02206.x
Todorov, A., Mandisodza, A. N., Goren, A., & Hall, C. C. (2005).
Inferences of competence from faces predict election outcomes.
Science, 308, 1623–1626. doi:10.1126/science.1110589