Running head: QUESTIONING THE END EFFECT
Word count: 11,417
Questioning the End Effect:
Endings Do Not Inherently Have a Disproportionate Impact on Evaluations of Experiences
Stephanie Tully
New York University
Tom Meyvis
New York University
Author Note
Stephanie Tully is a doctoral candidate in marketing at the Stern School of Business, New York
University. Tom Meyvis is Professor of Marketing and Peter Drucker Faculty Fellow at the Stern School
of Business, New York University. Correspondence concerning this article should be addressed to
Stephanie Tully, Stern School of Business, 40 W. 4th Street, Ste 822, New York University, New York,
NY 10012. E-mail: stully@stern.nyu.edu.
Electronic copy available at: http://ssrn.com/abstract=2498663
Abstract
The present research re-examines one of the most basic findings regarding the evaluation
of hedonic experiences: the end effect. The end effect suggests that people’s retrospective
evaluations of an experience are disproportionately influenced by the final moments of the
experience. The findings in this paper indicate that endings are not inherently over-weighted in
retrospective evaluations. That is, episodes do not disproportionately affect the evaluation of an
experience simply because they occur at the end. We replicate prior demonstrations of the end
effect, but provide additional evidence implicating other processes as driving factors of those
findings.
Keywords: retrospective evaluations, end effect, experiences
Questioning the End Effect:
Endings Do Not Inherently Have a Disproportionate Impact on Evaluations of Experiences
“Did you like the concert?” “How much did you enjoy that restaurant?” “How painful
was this medical procedure?” To answer common questions such as these, people need to
evaluate the experiences they live through. Since these evaluations in turn influence people’s
willingness to recommend or repeat an experience (e.g., Wirtz et al., 2003), it is essential to
understand how people form such retrospective evaluations of their past experiences. The current
research re-examines one of the most basic findings in this area: the end effect. The end effect
refers to the fact that people’s retrospective evaluations are disproportionately influenced by the
final moments of the experience (e.g., Kahneman et al., 1993; Fredrickson & Kahneman, 1993).
While there are many prior demonstrations of the end effect, previous research has also
documented several notable boundary conditions. In the current work, we do not explore such
boundary conditions, but instead revisit the basic effect, focusing on a simple, continuous
experience to test if the end of an experience does indeed inherently receive disproportionately
more weight. While we acknowledge that endings can have a disproportionate impact on
retrospective evaluations, our findings suggest that this is not due to an inherent over-emphasis
of the final moments of an experience, but rather because of specific additional properties of the
end in certain settings.
Prior research has proposed that, when retrospectively evaluating an experience, people
do not add or integrate their reactions across the experience, but rather recall the most
representative moments of the experience and then evaluate the experience based on these
selected moments (e.g., Ariely & Carmon, 2000; Kahneman, 2000a; Kahneman, 2000b; Varey &
Kahneman 1992). Furthermore, the most representative moments of an experience tend to consist
of the most extreme moment (the peak) and the final moment (the end). Thus, according to this
evaluation-by-moments principle, the peak and the end of the experience will disproportionately
affect the global evaluation of the experience (Kahneman, 2000a). The over-weighting of the end
of the experience, in particular, has received substantial attention and has led to a variety of
recommendations to restructure experiences to take advantage of this effect, including to
optimize customer experiences (Cusick, 2012; Shaw, Dibeehi, & Walden, 2010), to understand
American’s sentiment about the economy (Surowiecki, 2002), and to improve personal
happiness and well-being (Conniff, 2006).
The end effect has empirically been demonstrated across a variety of domains and using a
variety of procedures (e.g., Kahneman et al., 1993; Fredrickson & Kahneman, 1993; Redelmeier
& Kahneman, 1996; Ariely, 1998). Much of the empirical support for the end effect is based on
the analysis of online (moment-to-moment) ratings of affective experiences. More specifically,
previous research has demonstrated that the online rating of the end of the experience is often a
disproportionately effective predictor of the retrospective evaluation of the entire experience.
This has been shown for a wide range of stimuli including medical procedures (Redelmeier &
Kahneman, 1996), painful pressure from a vise (Ariely, 1998), annoying noises (Ariely &
Zauberman, 2000; Schreiber & Kahneman, 2000), advertisements (Baumgartner, Sujan &
Padgett, 1997), and television shows (Hui, Meyvis, & Assael, 2014).
Additional evidence for the end effect comes from studies documenting the effect of
“adding a better end.” Participants in those studies show an irrational preference for negative
experiences with an additional period of reduced discomfort over the same experience without
the “better” (i.e., less aversive) end. For instance, Schreiber & Kahneman (2000) asked
participants to listen to a series of annoying noises and observed that participants preferred a
longer sound profile with a less intense ending to a shorter sound profile that was identical but
lacked the additional, less aversive ending. As an example, participants preferred a sound profile
that consisted of 8 seconds of noise at 78 decibels followed by 16 seconds of noise at 66 decibels
over a sound profile that consisted only of 8 seconds of noise at 78 decibels. This beneficial
effect of adding a better (less aversive) ending has also been observed with other experiences,
such as submerging one’s hands in ice water (Kahneman et al., 1993), undergoing a colonoscopy
(Redelmeier, Katz, & Kahneman, 2003), and judgments of hypothetical pain profiles (Varey &
Kahneman, 1992).
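The arithmetic behind these "better end" designs can be made concrete with a toy peak-end calculation. The sketch below rests on our own assumptions (stylized discomfort profiles and a simple average of peak and end); it is an illustration of the logic, not the published model:

```python
import numpy as np

def peak_end(profile):
    """Peak-end summary: average of the most intense moment and the final moment."""
    peak = profile[np.argmax(np.abs(profile))]
    return (peak + profile[-1]) / 2.0

# Stylized per-second discomfort ratings (higher = worse); hypothetical values.
short = np.array([2, 5, 8])               # ends at its most intense moment
extended = np.array([2, 5, 8, 4, 3])      # same profile plus a milder tail

# The extended profile contains strictly more total discomfort (22 vs. 15),
# yet its peak-end summary is better (5.5 vs. 8.0), matching the observed
# preference for aversive experiences with an added, less aversive ending.
print(peak_end(short), peak_end(extended))  # 8.0 5.5
```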
Finally, the end effect has also received support from studies that systematically
manipulated the order in which different components of the experience were presented, and
generally found that participants reacted most favorably to experiences in which the best part
was positioned at the end. For instance, Ariely and Zauberman (2000) observed that participants
rated an annoying sound profile as more aversive when the most intense sound was positioned at
the end of the profile rather than at the beginning or the middle.
However, in spite of these many demonstrations of the end effect, prior research has also
documented several important boundary conditions. First, the end of an experience does not have
a disproportionate impact when that experience is expected to continue in the future (i.e., when
the end is seen as temporary). For instance, when participants in a social interaction expected to
further interact with the other person in the future, the most recent interaction did not receive
additional weight in the global evaluation of the personal relationship (Fredrickson, 1991).
Similarly, when people evaluated a series of aversive pictures that they had viewed and
anticipated seeing again in the near future, the peak, but not the end, dominated the evaluation of
that experience (Branigan et al., 1997). Second, the presence of the end effect also depends on
the type and structure of the experience. Breaking up simple experiences into segments
attenuates the end effect, whereas complex experiences consisting of qualitatively distinct
components often fail to show any end effect at all. For instance, segmenting aversive sounds
into discrete parts reduced the end effect (Ariely & Zauberman, 2000) and no end effect was
observed in evaluations of activities over the course of a day (Miron-Shatz, 2009), evaluations of
vacations (Kemp, Burt, & Furneaux, 2008), or evaluations of meals (Rode, Rozin, & Durlach,
2007). Finally, the end effect does not appear to be a basic evolutionary trait shared with other
animals as it does not extend to food sequence preferences of rhesus macaque monkeys (Xu,
Knight, & Kralik, 2011).
In sum, prior research includes both ample demonstrations of the end effect as well as
many studies documenting boundary conditions. What does this imply for the status of the end
effect? One possibility is that the end does inherently have a disproportionate influence, but
specific conditions can activate other processes that interfere with (or compensate for) the effect.
However, another possibility is that the end does not inherently have a disproportionate impact.
In that case, the prior demonstrations of the end effect may be driven by other mechanisms, with
the boundary conditions merely reflecting the absence of those mechanisms. Closer inspection of
the prior demonstrations of the end effect provides some initial support for this second
possibility. First, consider the prior demonstrations that adding a better end to an aversive
experience improves the overall evaluation of that experience. Although adding a better end does
indeed manipulate the end of that experience, it also reduces the average intensity of the
experience. Therefore, the improvement in the overall evaluation of the experience could be
driven by the change in average intensity, rather than the over-weighting of the end. As such,
these findings are more accurately classified as demonstrations of duration neglect, rather than
demonstrations of an end effect. Second, consider the finding that the online (i.e., moment-to-moment) rating of the final moments of an experience is a particularly good predictor of the
overall evaluation (relative to ratings of other parts of the experience). While this finding is
consistent with the overweighting of the end, it would also occur in the absence of an end effect
if participants’ online ratings incorporate information from past as well as current moments of
the experience. This would result in the final ratings being more informed than the initial ratings,
and therefore correlating more strongly with the overall evaluation. Moreover, providing explicit
online ratings may artificially enhance the salience of the final rating, leading participants to use
it as an anchor in their global evaluation. Finally, the finding that experiences are evaluated more
favorably when the best part is positioned at the end (rather than elsewhere in the experience) is
based on studies that systematically varied the position of the different components of the
experience within-subjects. Asking participants to evaluate multiple experiences that were
identical in all respects except for the order of the components likely encouraged participants to
rely on that order in their evaluations, as it was the only aspect that varied. As such, this
procedure may have led participants to rely on their lay beliefs about how experiences should
be optimally structured (e.g., “This was identical to the previous experience, with the exception that
the best part now came at the end. Ending on a high note is good, so I like that experience
more.”).
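This alternative account—final online ratings predict better simply because they integrate more of the experience—is easy to demonstrate in simulation. In the sketch below, every moment receives equal weight by construction, yet the last online rating still correlates far more strongly with the overall evaluation than the first one does. The setup and data are entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n_experiences, n_moments = 2000, 10

# Momentary utilities: i.i.d. draws, so no moment is privileged by construction.
u = rng.normal(0.0, 1.0, (n_experiences, n_moments))

# Online ratings that incorporate all moments experienced so far (running mean),
# and an overall evaluation that weights every moment equally.
online = np.cumsum(u, axis=1) / np.arange(1, n_moments + 1)
overall = u.mean(axis=1)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

r_first = corr(online[:, 0], overall)   # first rating: weak predictor
r_last = corr(online[:, -1], overall)   # last rating: near-perfect predictor
```

Here the final rating equals the equal-weighted overall evaluation, so it predicts it almost perfectly even though no end effect exists in the data-generating process.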
It should be noted that, even if endings do not inherently have a disproportionate impact,
this does not preclude that, under specific circumstances, they can in fact have a greater impact
than other parts of the experience. This would for instance be the case if the structure of the
experience is made salient and consumers rely on their lay beliefs about the desirability of
favorable endings (as may be the case in the within-subject designs mentioned earlier). Similarly,
past research has observed a strong impact of the end of an experience when that end is
particularly meaningful, as is the case with goal-directed experiences (Carmon & Kahneman,
1996), where endings determine whether a goal is met, and television shows (Hui et al., 2014),
where endings serve as meaningful conclusions of a storyline. However, in those cases, the end
does not have a disproportionate impact merely because it is the end, but rather because it has
additional properties that increase its significance relative to the rest of the experience.
Overview of the Current Research
In this paper, we focus on the inherent impact of the end, and test whether merely being
the end of an experience is sufficient for disproportionately affecting overall evaluations. To do
so, we do not examine novel situations, nor the complex experiences that have been previously
identified as boundaries of the effect. Instead, we examine the type of basic experience that has
traditionally been used in studies that have provided support for the effect: listening to short
fragments of simple auditory stimuli. In a first study, we observe that aversive sounds with either
a better beginning or a better ending are not rated differently, even when participants clearly
recall the ending as better or worse than the rest of the experience. The remaining studies
reconcile this lack of a discernable end effect with previous demonstrations of the effect. Studies
2 and 3 demonstrate that changing the ending does affect evaluations when the end changes the
experience’s overall average, but not when the average is unaffected. Next, in studies 4 and 5,
which use a repeated measures design, we observe that, while endings are not over-weighted in
evaluations of the first experience, they are over-weighted in evaluations of a subsequent
experience. That is, moving a distinct part of the experience to the end versus the beginning of an
experience only affects people’s evaluations when they can readily observe this shift (being the
only difference between both experiences). Finally, in a field study (Study 6), we examine the
relationship between the overall evaluation of an experience and ratings of distinct components
of the experience—and fail to observe any increased impact of the final rating.
Study 1: A Better Beginning versus a Better End
In this first study, we re-examine the end effect using a simple stimulus (an aversive
noise), which is unlike the complex stimuli from the boundary condition studies, but similar to
the stimuli used in the classic demonstrations of the effect (e.g., Ariely & Zauberman, 2000;
Schreiber & Kahneman, 2000). Our study did, however, differ from those demonstrations in that
it systematically manipulated the structure of the experience, both between participants and
without changing the average intensity of the experience. To achieve this, we presented
participants with one of two sound profiles, which were the inverse of each other, so that they
were identical in total volume but one sound clip began loudly and ended quietly, whereas the
other began quietly and ended loudly. Since the sound was an aversive noise, this implies that
some participants experienced a better (less aversive) ending, whereas others experienced a
worse ending. If the end of the experience has a greater impact on global evaluations than other
parts of the experience, then the sound clip with the better ending should be rated as less aversive
than the sound clip with the worse ending.
Method
Three hundred and three Mechanical Turk participants completed the study online in
exchange for monetary compensation. Participants were told they would be listening to a few
short irritating sounds. They were asked to listen to the sounds using their headphones, and told
that they would need to identify the sounds they heard later in the experiment (to ensure that
participants indeed listened to the sounds). Participants first listened to a sound clip of a dot
matrix printer and were asked to use this sound to calibrate the volume of their headphones.
Next, all participants listened to a short drill sound, and indicated how annoying the sound was
on a 9-point scale (1 = not annoying at all, 9 = very annoying). This measure was included to be
used as a covariate in the analyses and thus reduce error variance due to differences across
participants in headphone volume or in their general aversion to annoying sounds.
Participants then listened to one of two sound clips, depending on condition. Both clips
consisted of 24 seconds of vacuum cleaner sound. One clip (Better End condition) started at a
high volume which was sustained for 6 seconds, after which it gradually reduced in volume for
the remaining 18 seconds, resulting in a relatively quiet (i.e., less aversive) ending. The other clip
(Worse End condition) was identical, but reversed in time. That is, it started quietly and
increased in volume for the next 18 seconds, ending with 6 seconds of sustained high volume
noise. See Figure 1 for a visual depiction of the sound profiles.
Figure 1. Visual depiction of sound profiles used in Studies 1 and 4. The height of the waveform
represents the volume of the sound. Time is represented on the horizontal axis in seconds.
[Waveform panels for the Better End and Worse End profiles not reproduced.]
After listening to the sound clip, participants rated how annoying, unpleasant, and
irritating it was to listen to the clip (all on 9-point scales anchored by: 1 = not at all, 9 = very).
Further, to ensure that participants in the different conditions indeed noticed the difference in the
volume of the ending, participants were asked to indicate how the end of the sound clip
compared to the rest of the sound clip (9-point scale: -4 = end was much worse, 4 = end was
much better). Next, to verify that participants indeed listened to the sound clip, they were asked
to select the sound they listened to from three options (an ambulance, a car alarm, and a
vacuum). We then asked participants whether they had adjusted the volume of their headphones
at any point while listening to the sound clips. Finally, we collected demographic information.
Results
Six people failed to correctly recognize the sound clip and are thus excluded from the
analysis, leaving a sample of 297 participants (MAge = 32.3, SD = 11.12; 63.3% male).
Manipulation check. To verify that the manipulation of the ending was successful, we
first analyzed participants’ perception of how the end compared to the rest of the clip. As
intended, participants in the Better End condition rated the end of the clip as relatively better (M
= 1.93, SD = 1.64) than did participants in the Worse End condition (M = -2.47, SD = 1.94), F(1,
295) = 273.67, p < .001, ηp2 = 0.481.
Perceived aversiveness. The measures of annoyance, unpleasantness, and irritation were
standardized and combined to form an aversiveness index (α = .94). To test the end effect, we
then analyzed this index—while adjusting for the covariate (i.e., the aversiveness of the drill
sound) to increase the power of this test. If the final moments of an experience indeed have an
inherently disproportionate impact, participants in the Better End condition should rate their
listening experience as less aversive than those in the Worse End condition. However, the two
conditions did not substantially differ in the perceived aversiveness of the experience (M Better End
= 6.69, SD = 1.71; M Worse End = 6.96, SD = 1.81), F < 1, ηp2 = .002.[1]
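The analysis reported here—a standardized multi-item index compared across conditions while adjusting for a covariate—can be sketched as follows. The data below are simulated, and the variable names and data-generating process are our own assumptions, not the study's materials:

```python
import numpy as np

# Simulated stand-in for the Study 1 measures (hypothetical data).
rng = np.random.default_rng(0)
n = 297
cond = rng.integers(0, 2, n).astype(float)    # 0 = Better End, 1 = Worse End
drill = rng.integers(1, 10, n).astype(float)  # covariate: drill-sound rating (1-9)
base = 0.4 * drill + rng.normal(0.0, 1.0, n)
items = np.column_stack([base + rng.normal(0.0, 0.5, n) for _ in range(3)])

def cronbach_alpha(x):
    """Cronbach's alpha for an items matrix (rows = respondents)."""
    k = x.shape[1]
    return k / (k - 1) * (1 - x.var(axis=0, ddof=1).sum() / x.sum(axis=1).var(ddof=1))

# Standardize each item, then average into a single aversiveness index.
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
index = z.mean(axis=1)

def ancova_f(y, group, covariate):
    """F statistic for the group effect, adjusting for a covariate (one-way ANCOVA)."""
    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid
    ones = np.ones_like(y)
    full = np.column_stack([ones, group, covariate])
    reduced = np.column_stack([ones, covariate])
    df_error = len(y) - full.shape[1]
    return (rss(reduced) - rss(full)) / (rss(full) / df_error)

alpha = cronbach_alpha(items)
F = ancova_f(index, cond, drill)
```

Since the simulated condition assignment is unrelated to the ratings, the resulting F is small, mirroring the null result reported above.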
Discussion
Study 1 tested the end effect using a simple stimulus that was unlike the stimuli in the
boundary condition studies (e.g., complex experiences, experiences that are expected to
continue), but similar to the stimuli used in prior demonstrations of the end effect. Yet, in spite of
this, we did not observe an end effect: placing the better (less aversive) part of the sound at the
beginning versus the end did not substantially affect participants’ evaluation of the experience.
This null effect is quite informative given the large sample size and given that participants
readily recalled the ending as better or worse (consistent with the manipulation). In the following
studies, we will provide additional tests of the end effect, as well as attempt to reconcile previous
demonstrations with the absence of an effect in this study.
Study 2: A Better Average versus a Better End
In the second study, we revisit previous demonstrations that extending an aversive
experience with a less aversive (but still negative) ending tends to improve the overall evaluation
of the experience. Although this finding is consistent with an over-weighting of the end of the
experience, it could also be due to a decrease in the average intensity of the experience. To
distinguish between these two accounts, we exposed participants to one of three sound clips of an
irritating noise: (1) a clip with a softer (and thus better) middle section (Better Middle), (2) a clip
with a softer ending (Better End), or (3) a clip with a softer middle section and an additional
softer ending (Added End). The Better Middle and Better End clips had an identical average
volume and only differed in the timing of the softer section. The Added End clip consisted of the
Better Middle clip with an additional, softer extension of the noise.
[1] The means adjusted for the covariate: M Better End = 6.70, M Worse End = 6.97.
Thus, the Better Middle and Better End clips differed in the aversiveness of the ending,
but not in the average intensity of the experience, whereas the Added End clip differed from both
other clips in the average intensity of the experience. If the end effect holds and endings are
inherently over-weighted, then the Better End and Added End experiences should both be
perceived as less aversive than the Better Middle experience. However, if adding a less aversive
ending improves evaluations because it reduces the average intensity of the experience (and not
because endings are over-weighted), then the Added End experience should be perceived as less
aversive than both the Better Middle and Better End experiences, and there should be no
difference between the perceived aversiveness in the Better Middle and Better End conditions.
Method
Two hundred and sixty undergraduate students participated in the study for either partial
course credit or monetary compensation.
Participants were seated at a desktop computer and asked to wear headphones, the
volume of which was fixed and approximately equal across computers. All participants first
listened to a short drill sound, and rated their irritation with the sound on a 101-point sliding
scale (0 = not at all irritating, 100 = very irritating). As in Study 1, this measure was included to
be used as a covariate in the analyses and thus reduce error variance due to individual differences
in aversion to annoying sounds. Next, participants completed a short filler task before continuing
with the main study.
Participants were then asked to listen to the sound of a vacuum cleaner. They listened to
one of three sound profiles, depending on condition. All three sound profiles consisted of a
vacuum noise that fluctuated in volume between low and moderately high. In the Better Middle
condition, the clip contained a 30-second low-volume segment in the middle of the clip. Both of
the other conditions were based on the Better Middle condition, but in the Better End condition,
the low-volume segment was moved to the end of the clip (instead of the middle), and in the
Added End condition, an additional 30-second low-volume segment was added to the end of the
experience (together with a 5-second transition, resulting in a total clip time of 170 seconds).
Thus, the sound clips in the Better Middle and Better End conditions differed in ending, but not
in average volume, whereas the sound clip in the Added End condition differed in average
volume from the clips in both other conditions. See Figure 2 for a visual depiction of the sound
profiles.
Figure 2. Visual depiction of sound profiles used in Study 2. The height of the waveform
represents the volume of the sound. Time is represented on the horizontal axis in seconds.
[Waveform panels for the Better End, Better Middle, and Added End profiles not reproduced.]
After participants listened to the clip, they rated the extent to which they found the
experience of listening to the sound annoying (9-point scale: 1 = mildly annoying, 9 = extremely
annoying), unpleasant (9-point scale: 1 = mildly unpleasant, 9 = very unpleasant), or irritating
(measured on the same scale as the covariate: a 101-point slider scale anchored by: 0 = mildly
irritating, 100 = extremely irritating).
After the primary dependent measures were collected, participants were asked to again
listen to the drill sound that they listened to at the start of the study, and then indicated whether
this experience was more or less irritating than listening to the vacuum sound (9-point scale, 1 =
much less irritating, 9 = much more irritating). Participants then rated the volume of the vacuum
sound (1 = very quiet, 9 = very loud). Next, participants indicated how much money, out of $10,
they would give back to avoid repeating the experience, and how long (in seconds) they believed
the experience lasted. These four additional measures were included to test whether, if the end
effect would again not obtain in scale measures of the subjective experience, it might instead
manifest in alternative measures: a relative preference measure (which avoids scaling effects), an
evaluation of the objective experience (volume), valuation, or a downstream effect (on time
perception).
To verify that participants had noted the volume at the end of the clip, they were asked to
indicate how the end of the experience compared to the rest of the experience (by selecting one
of three options: the end was quieter, the end was about the same, the end was louder).
Finally, participants provided demographic information and completed an Instructional
Manipulation Check (Oppenheimer, Meyvis, & Davidenko, 2009), which consisted of a
paragraph of text explaining the importance of reading instructions and asking participants to
choose “none of the above” from a marital status dropdown list.
Results
Thirty-five people failed the Instructional Manipulation Check, leaving a sample of 224
participants (MAge = 20.2, SD = 2.17; 44.2% male).
Manipulation check. Participants were more likely to indicate that the end was quieter
than the rest of the sound clip in the Better End condition (P = 60.1%) than in the Better Middle
condition (P = 31.5%), χ2 (1) = 12.82, p < .001, indicating that the manipulation of the ending
was successful. Participants in the Added End condition were also more likely to indicate that the
end was quieter (P = 45.3%) than were participants in the Better Middle condition, but this effect
was only marginally significant, χ2 (1) = 2.95, p = .086 (possibly because this clip was longer and
therefore the perception of the end extended beyond the final low-volume segment).
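The χ² comparison above can be reproduced in outline. The per-condition cell counts are not reported in this excerpt, so the counts below are hypothetical values chosen only to roughly match the reported proportions:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 counts (~60% vs. ~32% answering "the end was quieter");
# 75 respondents per condition is an assumption, not a reported figure.
table = [[45, 30],   # Better End:    quieter, not quieter
         [24, 51]]   # Better Middle: quieter, not quieter

# correction=False yields the uncorrected Pearson chi-square statistic.
chi2, p, dof, expected = chi2_contingency(table, correction=False)
```

With these assumed counts, the statistic comes out near the reported value, and the association is significant at p < .001.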
Perceived aversiveness. The measures of annoyance, unpleasantness, and irritation were
standardized and combined to form an aversiveness index (α = .93). As in Study 1, we analyzed
this index while controlling for the aversiveness covariate (the rating of the drill sound at the
start of the study) to increase the power of the tests. First, we tested the end effect by comparing
the Better Middle and Better End conditions, which differed in ending, but not in average
intensity. A planned contrast showed that the experience was not perceived as less aversive in the
Better Middle condition (M = 0.00, SD = 0.92) than in the Better End condition (M = 0.11, SD =
0.96), F < 1, ηp2 < 0.001. Thus, as in Study 1, we again did not observe an end effect. Next, we
tested whether adding a better end (rather than moving the better part to the end) changes the
perceived aversiveness of the experience, by comparing the Added End condition to the other
two conditions, both of which had a higher average intensity. A planned contrast confirmed that
the experience was perceived as less aversive in the Added End condition (M = -0.10, SD = 0.94)
than in the other two conditions, F(1, 220) = 4.43, p = .036, ηp2 = 0.020 (means adjusted for the
covariate: M Added End = -0.16, M Better End = 0.06, and M Better Middle = 0.10). Thus, while we again
did not replicate the end effect, we did replicate the prior finding that adding a less aversive
ending to a negative experience reduces the overall aversiveness of the experience (in spite of
adding negative utility).
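A planned contrast of the kind used here—one condition against the average of the others, tested against the pooled ANOVA error term—can be sketched as follows. The three small groups are deterministic toy data, not the study's observations:

```python
import numpy as np
from scipy import stats

def planned_contrast(samples, weights):
    """t test of a linear contrast on independent group means, using pooled MSE."""
    means = np.array([s.mean() for s in samples])
    ns = np.array([len(s) for s in samples])
    sse = sum(((s - s.mean()) ** 2).sum() for s in samples)
    df = ns.sum() - len(samples)          # error degrees of freedom
    mse = sse / df                        # pooled error variance
    estimate = (weights * means).sum()
    se = np.sqrt(mse * (weights ** 2 / ns).sum())
    t = estimate / se
    p = 2 * stats.t.sf(abs(t), df)
    return t, p

# Toy groups standing in for Better Middle, Better End, and Added End.
groups = [np.array([1.0, 2.0, 3.0]),
          np.array([2.0, 3.0, 4.0]),
          np.array([0.0, 1.0, 2.0])]

# Contrast: Added End vs. the average of the other two conditions.
w = np.array([-0.5, -0.5, 1.0])
t, p = planned_contrast(groups, w)
```

The contrast weights sum to zero, so the test isolates the Added End condition's deviation from the mean of the other two, which is how the reported F(1, 220) comparisons are constructed (an F with one numerator df equals the squared t).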
Other measures. Similar to the aversiveness index, the additional measures did not show
any difference between the Better Middle and Better End conditions. The position of the low-volume segment did not affect the relative preference over the drill sound (F < 1), perceived
volume (F < 1), willingness to pay to avoid the experience (F < 1), or the perceived duration of
the clip (F(2, 220) = 1.63, NS).
The contrast comparing the Added End condition to the other two conditions also did not
show any reliable difference for relative preference over the drill sound (F(1, 220) = 1.18, NS) or
perceived volume (F(1, 220) = 1.88, NS). However, participants in the Added End condition
were willing to pay less to avoid the experience (M = $0.78, SD = 1.92) than were participants in
the Better End condition (M = $1.56, SD = 2.52) or Better Middle condition (M = $1.60, SD =
3.37), F(1, 220) = 5.25, p = .023, ηp2 = 0.023, consistent with the earlier finding that adding a
better end reduced the perceived aversiveness of the experience. Finally, participants in the
Added End condition also provided higher estimates of clip duration (M = 155 secs, SD = 90.71)
than those in the Better End condition (M = 116 secs, SD = 76.30) or the Better Middle condition
(M = 101.09 secs, SD = 50.56), F(1, 220) = 18.56, p < .001, ηp2 = 0.078, which was consistent with
the actual longer duration of the clip in that condition.
Discussion
Moving the less aversive part of an irritating noise to the end versus the middle did not
affect the perceived aversiveness of the experience, casting further doubt on the existence of an
inherent end effect. However, extending the irritating noise with an additional, less aversive part
did lead participants to perceive the overall experience as less aversive. Thus, Study 2 replicates
prior findings of the beneficial effects of “adding a better end,” but also indicates that this effect
is driven by a lowering of the average rather than a disproportionate impact of the end. In the
next study, we conceptually replicate this finding in the positive domain.
Study 3: Adding a Worse Middle versus a Worse End
In Study 2, we examined the effect of adding a less aversive (i.e., better) segment to an
aversive experience. In this next study, we examine the effect of adding a less enjoyable (i.e.,
worse) segment to an enjoyable experience: listening to pleasant music clips. Furthermore,
unlike in Study 2, we now vary whether the segment is added to the middle of the experience or
to the end of the experience. In other words, we compare the effect of adding a worse middle to
the effect of adding a worse end. If evaluations of the experience are disproportionately based on
the end of the experience, then adding a worse end should have a greater (negative) effect than
adding a worse middle. However, if evaluations are based on the average intensity of the
experience rather than the end of the experience, then adding a worse segment should have a
similar (negative) effect, regardless of whether it is added to the end or to the middle.
Method
Pretest. For the main study, we constructed three different music clips: one music clip
consisting of four enjoyable pieces of instrumental music (for the control condition) and two
music clips consisting of those same four enjoyable pieces of music and one additional, less
enjoyable piece of instrumental music (either inserted in the middle or at the end). To select
these music fragments, we first pretested a wide range of instrumental music fragments using a
sample of 121 participants drawn from the same population as used for the main study
(Mechanical Turk). Each participant listened to a selection of ten 30-second clips of instrumental music (out of a total set of 30 clips) and rated each clip on a 9-point scale. Based on this pretest,
we selected four clips that were enjoyed by most participants, namely 30-second fragments from
“Herd Reunion” (from the Ice Age: Continental Drift Soundtrack, M = 6.84, SD = 1.91), “Heart
Song” (performed by Gosha Mataradze, M = 6.29, SD = 2.09), Bach’s “Goldberg Variations” (M = 6.38, SD = 1.55), and Mozart’s “Rondo Alla Turca” (M = 6.05, SD = 2.03). We also selected one
sound clip that was significantly less enjoyable than each of the four other clips: “Reanimator”
(performed by Amon Tobin, M = 4.71, SD = 2.12), all t’s(79) > 2.92, p’s < .002. To further
ensure that this last clip was clearly less enjoyable than the others, we increased its repetitiveness by extending it to 45 seconds and applied a minor pitch shift.
Main study. Nine hundred and twelve Mechanical Turk participants completed the study
online in exchange for monetary compensation.
Analogous to the previous studies, we first obtained a measure of participants’ propensity
to like instrumental music, to be used as a covariate in the analysis (and thus increase the power
of our tests). Specifically, participants first listened to a 10-second instrumental music clip (a
segment from “On the Right Track,” performed by Zhanna Hamilton) and indicated how much
they enjoyed listening to the clip on a 9-point scale (1 = not at all, 9 = very much). Participants
were then told that they would next listen to a music compilation. They were reminded that the
study was on the enjoyment of music and to simply sit back, relax, and listen to the music
compilation which would be the length of a short song. Participants then heard one of three
music clips, depending on condition. In the Control condition, the music clip consisted of the
four enjoyable 30-second fragments identified in the pretest. The fragments were combined into
one 148-second clip by gradually phasing out of one fragment and into the next (thus preserving
the unity of the experience). In the two experimental conditions, the clips consisted of the
Control condition clip with the addition of the less enjoyable fragment identified in the pretest.
This fragment was either inserted in the middle of the clip (Worse Middle condition) or at the
end of the clip (Worse End condition). The order of the four enjoyable fragments was
counterbalanced. Next, participants indicated how much they enjoyed listening to the clip on the
same 9-point scale as used for the covariate measure.
As manipulation checks, participants were asked to indicate how the middle compared to
the rest of the clip (-4 = middle was much worse, 4 = middle was much better) and how the end
compared to the rest of the clip (-4 = end was much worse, 4 = end was much better).
Participants then listened to a 10-second version of the less enjoyable fragment and were asked
to categorize this fragment as either pleasant, neither pleasant nor unpleasant, or unpleasant.
Finally, to verify that participants had indeed listened to the music compilation, we asked them
to listen to three short music fragments and to identify which one of these fragments had been
played as part of the music compilation.
Results
Twenty-eight people failed to recognize the fragment used in the compilation and are
thus excluded from all analyses, leaving 884 participants (MAge = 29.4, SD = 9.64; 65.3% male).
Manipulation checks. The majority of participants rated the less enjoyable fragment as
either unpleasant (35.7%) or neither pleasant nor unpleasant (35.4%), confirming that this
fragment was not particularly enjoyable, as intended. More important, participants in the
experimental conditions reported that the middle section (or the end section, depending on where
this less enjoyable fragment was placed) was indeed relatively less enjoyable, indicating that the
manipulation was successful. Specifically, participants in the Worse End condition rated the end
as worse than the rest of the compilation (M = -1.27, SD = 2.20), compared to participants in the
Worse Middle condition (M = 1.22, SD = 2.06), F(1, 880) = 197.11, p < .001, and those in the
Control condition (M = 0.79, SD = 1.99), F(1, 880) = 125.13, p < .001. In addition, participants
in the Worse Middle condition rated the middle as worse than the rest of the compilation (M = -0.43, SD = 2.09), compared to participants in the Worse End condition (M = 1.02, SD = 1.96),
F(1, 880) = 79.78, p < .001, and those in the Control condition (M = 0.50, SD = 1.99), F(1, 880)
= 30.54, p < .001.
Enjoyment of the experience. Similar to the previous studies, the analysis of the
enjoyment measure was adjusted for the covariate (i.e., the enjoyment of the clip at the start of
the study) to increase the power of the test. However, once again, we did not obtain any evidence
of an end effect. Participants did not enjoy the clip less when the worse fragment was placed at
the end of the clip (M = 6.50, SD = 1.65) rather than in the middle of the clip (M = 6.58, SD =
1.69), F < 1, ηp2 = 0.001. However, adding the worse fragment (regardless of its position) did
decrease the enjoyment of the clip relative to the Control condition (M = 6.76, SD = 1.56), F(1,
880) = 4.37, p = .037, ηp2 = 0.005. This conceptually replicates the results of Study 2 and
suggests that participants’ enjoyment relied on the average of the experience rather than the
ending.
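The covariate-adjusted between-groups comparison used throughout these studies amounts to an ANCOVA; a minimal sketch on simulated data follows (all variable names, sample sizes, and parameter values below are illustrative assumptions, not the actual data or analysis code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative only): baseline liking of instrumental music
# (the covariate) and enjoyment of the compilation, for three conditions.
n = 300
group = rng.integers(0, 3, n)            # 0=Control, 1=Worse Middle, 2=Worse End
covariate = rng.normal(6.5, 1.5, n)
effect = np.array([0.0, -0.2, -0.2])     # worse fragment lowers the average
enjoyment = 2.0 + 0.4 * covariate + effect[group] + rng.normal(0, 1.2, n)

def ols_rss(X, y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

# Reduced model: intercept + covariate. Full model adds condition dummies.
intercept = np.ones(n)
d1 = (group == 1).astype(float)
d2 = (group == 2).astype(float)
X_reduced = np.column_stack([intercept, covariate])
X_full = np.column_stack([intercept, covariate, d1, d2])

rss_r, rss_f = ols_rss(X_reduced, enjoyment), ols_rss(X_full, enjoyment)
df1, df2 = 2, n - X_full.shape[1]        # 2 extra parameters; error df
F = ((rss_r - rss_f) / df1) / (rss_f / df2)
eta_p2 = (rss_r - rss_f) / rss_r         # partial eta squared for condition
print(f"F({df1}, {df2}) = {F:.2f}, partial eta^2 = {eta_p2:.3f}")
```

Including a highly correlated covariate shrinks the error term, which is why it increases the power of the condition test without changing the condition means.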
Discussion
The results of Study 3 replicate those of Study 2 in the positive domain, and provide
additional evidence that the effect of adding a less intense ending on the overall evaluation is
driven by changes in average intensity, rather than over-weighting of the end. Adding a less
enjoyable music fragment reduced overall enjoyment of the music compilation, but it did not
matter whether this fragment was inserted at the end or in the middle of the experience. The fact
that the positioning of the less enjoyable fragment did not affect overall evaluations provided
particularly compelling evidence against the existence of a substantial, inherent end effect, given
3 The means adjusted for the covariate: M Control = 6.76, M Worse End = 6.51, M Worse Middle = 6.58.
that (1) the manipulation check showed that participants in the respective conditions could
clearly identify the end (or middle) as less enjoyable than the rest of the experience, (2) adding
the less enjoyable fragment did affect overall evaluations, and (3) the test of the end effect in this
study was particularly powerful given the large sample size and the use of a highly correlated
covariate (r = .43). In fact, the procedure of this study allowed for the detection of a small effect
(Cohen’s f2 = 0.01) with a probability of 90.8%.
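A sensitivity analysis of this kind can be sketched by simulation. The sketch below assumes alpha = .05 and the common noncentrality convention lambda = f² × N; these are our assumptions, so the estimate illustrates the computation but need not reproduce the exact 90.8% figure reported above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo power estimate for an F(1, 880) test of a small effect.
f2 = 0.01                    # Cohen's f^2 for a "small" effect
N = 884                      # analyzed sample size in Study 3
df1, df2 = 1, 880            # degrees of freedom of the reported test
alpha = 0.05
lam = f2 * N                 # assumed noncentrality parameter, lambda = f^2 * N

draws = 200_000
# Critical value of the central F distribution, also estimated by simulation
# to keep the sketch free of dependencies beyond NumPy.
f_central = (rng.chisquare(df1, draws) / df1) / (rng.chisquare(df2, draws) / df2)
crit = np.quantile(f_central, 1 - alpha)

# Under the alternative, the numerator is noncentral chi-square; for df1 = 1
# this is (Z + sqrt(lambda))^2.
numerator = (rng.normal(0.0, 1.0, draws) + np.sqrt(lam)) ** 2
f_noncentral = (numerator / df1) / (rng.chisquare(df2, draws) / df2)
power = float((f_noncentral > crit).mean())
print(f"critical F = {crit:.2f}, approximate power = {power:.2f}")
```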
In other words, studies 2 and 3 suggest that the often documented “adding a better end”
effect should be interpreted solely as a demonstration of duration neglect, rather than evidence
for the over-weighting of the end. However, the end effect has also received support from studies
using other paradigms. In particular, studies that have systematically manipulated the structure of
experiences have provided more direct evidence of the end effect (e.g., Ariely, 1998; Ariely &
Zauberman, 2000). These studies have commonly found that experiences with a better (or less
aversive) ending are evaluated more favorably (or less negatively) than experiences with a better
beginning or a better middle, even when the average intensity is held constant. However, as
mentioned earlier, these studies tend to employ within-subject designs, which expose each
participant to anywhere between 8 and 64 different experiences. Since these experiences are
identical except for their structure (e.g., a loud noise that ends softly versus the same noise that
starts softly), this structure would be particularly salient for participants, who may infer that they
need to use this structure in their evaluations. As such, end effects observed in these studies may
reflect people’s lay beliefs that it is better to end on a high note (or preferences for improving
sequences, Loewenstein & Prelec, 1993), rather than a spontaneous reaction to an experience that
ends well versus poorly. Even if participants are not relying on lay beliefs, the increased salience
of experience structure in within-subject designs could still be a requirement for end effects to
manifest (suggesting that the end is not inherently over-weighted). The last two studies were
designed to test whether endings are indeed over-weighted in the context of repeated
experiences, but not when experiences are judged in isolation.
Study 4: Single versus Repeated Negative Experiences
To examine whether exposure to repeated experiences (with variations in structure) leads
people to increase their evaluations of experiences that end well, we used a repeated measures
design. Specifically, participants were asked to listen to two aversive sounds that were identical,
but reversed in sequence, such that one sound ended well and one sound ended poorly. The order
of the sounds was manipulated between conditions. Consistent with the lack of an end effect in
the first studies, we expected that the difference in structure would not affect participants’ rating
of the first sound they heard: participants will rate the experience as equally aversive, regardless
of whether they were assigned to a noise that ends well or to a noise that ends poorly. However,
consistent with prior demonstrations of end effects in within-subject designs, we expected that
the difference in structure would affect the rating of the second sound: after listening to a noise
that ends well, participants will rate a noise that ends poorly as more aversive (and vice versa).
Method
Two hundred and four Mechanical Turk participants completed the study online in
exchange for monetary compensation.
The procedure was similar to that of Study 1. All participants first listened to the printer
sound clip (to calibrate the volume) and then rated their irritation with the drill sound clip (to be
used as covariate). Participants then listened to one of two versions of the main stimulus: 24
seconds of vacuum cleaner noise. As in Study 1, the two sound clips were identical but reversed,
so that one clip started with 6 seconds of high volume noise, followed by 18 seconds that
gradually tapered off in volume (Better End), whereas the other clip started quietly and ended at
a high volume (Worse End). See Figure 1for a visual depiction of the sound profiles.
a high volume (Worse End). See Figure 1 for a visual depiction of the sound profiles. Immediately after listening to the sound clip, participants rated how annoying, unpleasant, and
irritating it was to listen to the clip (on 9-point scales: 1 = not at all, 9 = very).
Unlike in Study 1, participants next listened to the other clip (i.e., those who listened to
the Better End clip then listened to the Worse End clip and vice versa), and rated that clip as
well. After rating the second sound clip, participants were asked to indicate which of the two
sound clips they would choose if they had to listen to one of the clips again (9-point scale: -4 = I
would definitely choose the first clip, 0 = No preference, 4 = I would definitely choose the
second clip). As a manipulation check, we next asked participants, for each clip, how the end of
the experience compared to the rest of the experience (9-point scale: -4 = End was much worse, 4
= End was much better). To verify that participants had indeed listened to the clips, we then
asked them to select the sound they listened to from three options: an ambulance, a car alarm,
and a vacuum. Finally, we asked participants whether they had adjusted the volume of their
headphones at any point, before collecting demographic information.
Results
Three people failed to recognize the sound clip used in the study and are thus excluded
from the analysis, leaving a sample of 201 participants (MAge = 32.8, SD = 11.12; 63.7% male).
Manipulation checks. The end of the first clip was rated as significantly better by
participants who listened to the Better End clip first (M = 6.73, SD = 1.92) than by participants
who listened to the Worse End clip first (M = 3.15, SD = 2.15), F(1, 199) = 155.06, p < .001.
Similarly, the end of the second clip was rated as significantly better by participants who listened
to the Better End clip second (M = 7.16, SD = 2.02) than by participants who listened to the
Worse End clip second (M = 3.17, SD = 2.06), F(1, 199) = 191.86, p < .001. These results
confirm that the manipulation of the structure of the experience was successful.
Perceived aversiveness. The measures of annoyance, unpleasantness, and irritation were
standardized and combined to form an aversiveness index for each sound clip (α clip 1 = .95, α clip 2
= .96). As in the previous studies, the between-subjects analysis of this index was adjusted for
the covariate (the irritation with the drill sound) to increase the power of those tests. Consistent
with prior demonstrations of the end effect, the within-subject analysis showed that participants
perceived their Better End experience as less aversive than their Worse End experience, F(1,
199) = 73.45, p < .001, ηp2 = 0.270. Although this effect is quite sizeable, the between-subjects
analysis (i.e., the comparison of the two order conditions) adds an important nuance to the
interpretation of this effect. Indeed, consistent with our previous studies, the perceived
aversiveness of the first clip did not differ between participants who listened to the Better End
clip first (M = 6.56, SD = 1.90) and those who listened to the Worse End clip first (M = 6.69, SD
= 1.82), F < 1, ηp2 < .001. It was only for the second clip that participants who listened to the
Better End clip reported less aversiveness (M = 5.88, SD = 1.98) than those who listened to the
Worse End clip (M = 7.42, SD = 1.60), F(1, 198) = 53.44, p < .001, ηp2 = .212. These results are
graphically depicted in Figure 3. Thus, although the within-subject analysis replicated previous
demonstrations of the end effect, the effect once again failed to obtain when participants
evaluated a single experience.
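Forming the standardized aversiveness index and checking its reliability, as described above, can be sketched as follows (simulated ratings, not the actual data; Cronbach's alpha implemented from its definition):

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative ratings of one clip on the three 9-point items
# (annoying, unpleasant, irritating); simulated around a shared latent
# aversiveness so the items are highly correlated, as in the studies.
n = 200
latent = rng.normal(6.5, 1.5, n)
items = np.column_stack([latent + rng.normal(0, 0.5, n) for _ in range(3)])

def cronbach_alpha(X):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the total)."""
    k = X.shape[1]
    item_var = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

alpha = cronbach_alpha(items)

# Standardize each item, then average into a single aversiveness index.
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
index = z.mean(axis=1)

print(f"alpha = {alpha:.2f}")
```

With items this strongly correlated, alpha lands in the mid-.90s, comparable to the reliabilities reported for the aversiveness index.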
Figure 3. Perceived Aversiveness of Sound Clips by Condition (Study 4)
[Bar chart: perceived aversiveness (9-point scale) of the first and second sound clips in the Worse End and Better End conditions. Error bars denote standard errors.]
4 The means adjusted for the covariate, Clip 1: M Worse End = 6.59, M Better End = 6.66; Clip 2: M Worse End = 7.45, M Better End = 5.85.
Preference. Whether participants preferred to repeat the first or the second clip depended
on which clip they listened to first, F(1, 199) = 140.26, p < .001. Participants who listened to the Better End clip followed by the Worse End clip preferred listening to the first clip again (M = -0.63, SD = 2.59), t(98) = -2.44, p = .016, whereas participants who listened to the clips in the opposite order preferred listening to the second clip again (M = 3.36, SD = 2.59), t(101) = 14.08,
p < .001. These results indicate that participants consciously preferred a noise that ended well
over an equivalent noise that started well.
Discussion
At first glance, the results of this study provide strong evidence for an end effect,
consistent with prior research. When participants listened to a noise that started loudly but ended
better (Better End) and one that started better but ended loudly (Worse End), they rated the noise
that ended better as less aversive, and strongly preferred that noise to the one with a worse
ending. Yet, the between-subjects analysis reveals that this advantage for the better ending
experience only emerges after participants have been exposed to multiple experiences (that are
identical to each other except for their structure). Indeed, for the first sound clip, the end effect is
as conspicuously absent in this study as it was in the previous studies: participants rated the first
clip as equally aversive, regardless of whether it ended well or poorly. The end effect only
emerged for the second sound clip, which was identical to the first sound clip except for its
structure. Thus, in this study, the end effect only emerged when the repetition of experiences
made the structure of the experience salient, indicating that although people do not
spontaneously over-weight the end of an experience, they may do so when they are encouraged
to base their evaluations on differences in structure. Note that it is not sufficient that participants
note that the end of the experience is clearly better or worse than the rest of the experience (as
our manipulation checks indicate this is generally the case in our studies), but rather that
differences in structure are made salient as a criterion for evaluation.
Study 5: Single versus Repeated Positive Experiences
Study 5 aimed to conceptually replicate Study 4 with positive stimuli. We used pleasant
music compilations that varied in the position of a less enjoyable segment (as in Study 3) and
presented all participants with both versions, in counterbalanced order (as in Study 4). Similar to
the results of Study 4, we expected that the position of the less enjoyable segment would not
affect participants’ rating of the first music compilation, but would affect the rating of the second
compilation: after listening to a music compilation with a mediocre middle (ending), participants
will rate a clip with a mediocre ending (middle) as less (more) enjoyable.
Method
Five hundred and two Mechanical Turk participants completed the study online in
exchange for monetary compensation.
As in Study 3, participants first listened to a 10-second instrumental music clip (“On the
Right Track”) and rated their enjoyment on a 9-point scale (1 = not at all, 9 = very much), to be
used as a covariate in the analysis. Next, participants read that they would listen to two music
compilations. Both music compilations were composed of three of the five fragments used in
Study 3: two of the very enjoyable fragments (“Herd Reunion” and “Heart Song”) as well as the
less enjoyable fragment (“Reanimator”). The fragments lasted thirty seconds each and were
tapered and integrated to create a more continuous experience, resulting in a music compilation
of 80 seconds. The two compilations differed only in the position of the less enjoyable fragment:
it was either positioned in the middle (Worse Middle) or at the end (Worse End). The order in
which participants heard each compilation was counterbalanced: half of participants heard the
Worse Middle clip first, while the other half heard the Worse End clip first.
After each compilation, participants were asked to indicate how enjoyable and pleasant it
was to listen to the music, both on 9-point scales (1 = not at all, 9 = very much). Similar to Study
2, we also added a relative preference measure after the primary measures: participants were
asked to indicate how much they enjoyed listening to the experience relative to listening to music
on the radio (9-point scale: -4 = much less than listening to the radio, 4 = much more than
listening to the radio). After participants completed these measures for the second music
compilation, they were asked to indicate which of the two music experiences they enjoyed more
(9-point scale: -4 = definitely the first experience, 4 = definitely the second experience).
As a manipulation check, we next asked participants to indicate, for each music
compilation, how the middle compared to the rest of the compilation (9-point scale: -4 = middle
was much worse, 4 = middle was much better) and how the end compared to the rest of the
compilation (9-point scale: -4 = end was much worse, 4 = end was much better). Participants
then listened to a 10-second version of the less enjoyable fragment and were asked to categorize
this fragment as either pleasant, neither pleasant nor unpleasant, or unpleasant. Finally, to verify
that participants had indeed listened to the music compilation, we asked them to listen to three
short music fragments and to identify the fragment that was part of the music compilation.
Results
Twelve people failed to recognize the song used in the compilation and are thus excluded
from all analyses, leaving 490 participants (MAge = 21.8, SD = 10.3; 57.8% male).
Manipulation checks. The majority of participants rated the less enjoyable fragment as
either unpleasant (66.3%) or neither pleasant nor unpleasant (22.2%), indicating that it was not
particularly enjoyable. More important, the manipulation of the placement of this fragment
within the music clip had the intended effect on participants’ perceptions. This was true for
ratings of the first music clip: participants who listened to the Worse Middle clip first rated the
middle of the clip as worse and the end of the clip as better (MMiddle = -1.37, SD = 2.36; MEnd =
1.39, SD = 2.09) than did participants who listened to the Worse End clip first (MMiddle = 0.81,
SD = 2.32; MEnd = -1.00, SD = 2.53), FMiddle(1, 488) = 106.89, p < .001; FEnd(1, 488) = 129.57, p
< .001. This was also true for ratings of the second music clip: participants who listened to the
Worse Middle clip second rated the middle of the clip as worse and the end of the clip as better
(MMiddle = -1.20, SD = 2.44; MEnd = 1.87, SD = 1.93) than did participants who listened to the
Worse End clip second (MMiddle = 1.41, SD = 1.97; MEnd = -1.41, SD = 2.49), FMiddle(1, 488) =
170.02, p < .001; FEnd(1, 488) = 265.86, p < .001. In short, the manipulation was successful:
participants perceived the middle of the Worse Middle clip and the end of the Worse End clip as
relatively less enjoyable.
Enjoyment. The measures of enjoyment and pleasantness were averaged to form an
enjoyment index (α clip 1 = .95, α clip 2 = .94). The between-subjects analysis of this index was
again adjusted for the covariate (the enjoyment of the clip at the start of the study) to increase the
power of those tests. As in Study 4, the within-subject analysis of this index is consistent with
prior demonstrations of the end effect: participants rated their Worse End experience as less
enjoyable than their Worse Middle experience, F(1, 488) = 15.52, p < .001, ηp2 = 0.031.
However, the between-subjects analysis again adds an important nuance to the interpretation of
this result. Consistent with the absence of an end effect in our prior studies, the enjoyment of the
first music clip did not differ between participants who listened to the Worse End clip (M = 6.28,
SD = 1.61) and those who listened to the Worse Middle clip (M = 6.20, SD = 1.64), F < 1, ηp2 <
0.001. Mirroring the results of Study 4, it was only for the second music clip that participants
who listened to the Worse End clip rated their experience as less enjoyable (M = 5.97, SD =
1.63) than participants who listened to the Worse Middle clip (M = 6.27, SD = 1.61), F(1, 487) =
5.10, p = .024, ηp2 = 0.010. These results are graphically depicted in Figure 4.
5 The means adjusted for the covariate, Clip 1: M Worse End = 6.21, M Worse Middle = 6.27; Clip 2: M Worse End = 5.96, M Worse Middle = 6.27.
Figure 4. Enjoyment of Sound Clips by Condition (Study 5)
[Bar chart: enjoyment (9-point scale) of the first and second music clips in the Worse End and Worse Middle conditions. Error bars denote standard errors.]
Other measures. We next analyzed participants’ relative preference between listening to
the clip and listening to a song on the radio. Consistent with the enjoyment index (and with prior
demonstrations of the end effect), a within-subjects analysis of this measure indicated that participants showed a greater preference for listening to the music clip (rather than the radio) when rating the Worse Middle clip (M = 5.48, SD = 2.17) than when rating the Worse End clip (M = 5.35, SD = 2.16), F(1, 488) = 8.17, p = .004, ηp2 = 0.016. However, the between-subjects
analysis of these relative preference ratings did not show any reliable difference between people
who listened to the Worse Middle clip and those who listened to the Worse End clip, neither for
the first clip, F(1, 486) = 2.18, NS, nor for the second clip, F < 1. Thus, similar to the analysis of
the enjoyment index, we did not observe an end effect for the first clip, but unlike for the
enjoyment index, we also did not observe an end effect for the second clip, suggesting that this
particular measure may not be sufficiently sensitive to provide a strong test of the end effect.
Finally, participants’ stated preference between sound clips 1 and 2 showed that they
were more likely to prefer the second clip over the first one when that second clip was the Worse
Middle clip (M = 1.14, SD = 2.69) rather than the Worse End clip (M = 0.11, SD = 2.70), F(1,
488) = 17.64, p < .001. Thus, participants showed a conscious preference for music with a poor
middle over music that ends poorly.
Discussion
Study 5 conceptually replicated the effect of Study 4 with positive experiences.
Consistent with prior research, participants reported enjoying the same music compilation less
when the less enjoyable segment appeared at the end, rather than in the middle. However, this
finding only held when participants were asked to directly compare the two arrangements, either
implicitly (when they were asked to evaluate the second clip after evaluating a clip that was
identical except for the position of the less enjoyable segment), or explicitly (when asked which
of the two clips they preferred). When participants simply listened to and rated the first music
compilation, their enjoyment was completely unaffected by the position of the less enjoyable
segment—even though participants could clearly tell that the middle (or end) of the clip was
worse than the rest of the clip, as revealed by the manipulation check measures for the first clip.
This is consistent with the absence of an end effect observed in the previous four studies, and suggests that people do not spontaneously overweight the end of an experience. Instead, the structure of the experience has to be made salient as a possibly relevant evaluation criterion (e.g., by exposure to repeated experiences that differ only in structure, holding the average constant).
Study 6: Relating Overall Evaluations to Ratings of the End of the Experience
So far, we have addressed two sources of support for the end effect in prior research:
demonstrations of the positive effect of “adding a better end” and the more favorable evaluations
of experiences that end well in within-subject designs. However, as we mentioned in the
introduction, there is a third type of prior support for the end effect. Specifically, several studies
have demonstrated that, when overall evaluations are regressed on moment-to-moment ratings of
the experience, the rating of the final moments of the experience is a particularly effective
predictor of the overall evaluation. Yet, as discussed earlier, this does not necessarily imply that
the final moments are being over-weighted. If moment-to-moment ratings do not only reflect
people’s isolated reaction to the current moment, but are also influenced by past moments, then
final ratings would be more effective predictors because they incorporate more information.
In Study 6, we aimed to examine this issue by focusing on an experience with distinct
components that can be evaluated separately, thus reducing any possible confusion or
contamination by prior impressions (as may be the case with a continuous noise). Specifically,
we used field study data from participants in an obstacle course fun run. After the run,
participants were asked to rate their satisfaction with the race in addition to rating each
individual obstacle, as well as providing an overall rating of the obstacles. If participants’
impression of the experience was disproportionately affected by the end, then one would expect
that the rating of the final obstacle would be a better predictor of participants’ satisfaction than
the ratings of the other obstacles. Further, one would expect that, when controlling for the overall
rating of the obstacles, the rating of the final obstacle would improve the prediction of
participants’ satisfaction with the race.
Method
Seven hundred and fifty participants in an obstacle course fun run completed the study
online in the days following the completion of the race.
Participants completed a fun run consisting of 12 large obstacles. The night following the
race, participants received an email from the race company which included a link to a race
evaluation survey. Participants first indicated their satisfaction with the race on a 10-point scale
(1 = not satisfied, 10 = very satisfied). Later in the survey, participants provided their overall
evaluation of all the obstacles in the run on a 10-point scale (1 = lame, 10 = awesome). Next,
they rated each individual obstacle they completed on a five-point scale (1 = lame, 5 =
awesome). Other items measured in this survey, not relevant to the current research, are available
upon request.
Results and Discussion
We first regressed participants’ satisfaction with the race on the ratings of each of the
twelve obstacles. Although the final obstacle was a significant predictor (β = 0.24, t(737) = 6.61,
p < .001), out of the eleven other obstacles, nine were better predictors of participants’
satisfaction than the rating of the final obstacle (see Table 1).
Table 1. Results of Separate Regressions of Satisfaction with the Race on the Rating of each Obstacle (in Chronological Order).

Obstacle Order    β        Obstacle Order    β        Obstacle Order    β
      1         0.294            5         0.278            9         0.257
      2         0.292            6         0.181           10         0.241
      3         0.293            7         0.289           11         0.252
      4         0.209            8         0.238           12         0.235

Note: All betas are reliably different from 0 (all t’s(737) > 5.03, p’s < .001).
As an alternative test of the special status of the final event, we also regressed
participants’ satisfaction with the race on both the overall rating of the obstacles and the
individual rating of the final obstacle. The overall rating of the obstacles significantly predicted
satisfaction with the race, β = 0.55, t(747) = 17.01, p < .001. However, once the overall rating of
the obstacles was taken into account, the rating of the final obstacle did not contribute
significantly to the prediction of overall satisfaction with the race, β = 0.05, t(747) = 1.54, NS.
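The incremental test reported here, i.e., whether the final obstacle's rating improves prediction beyond the overall rating, can be sketched as a nested-model F-test on simulated data (all values below are illustrative assumptions, not the field data):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data in the spirit of Study 6: the overall obstacle rating
# drives satisfaction, and the final obstacle's rating is correlated with
# it but adds nothing beyond it.
n = 750
overall = rng.normal(8.0, 1.2, n)                       # 10-point overall rating
final_obstacle = 0.6 * overall + rng.normal(0, 1.0, n)  # correlated final-obstacle rating
satisfaction = 1.0 + 0.9 * overall + rng.normal(0, 1.5, n)

def incremental_F(y, X_reduced, X_full):
    """F-test for the extra predictors in the full (nested) model."""
    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return r @ r
    rss_r, rss_f = rss(X_reduced), rss(X_full)
    df1 = X_full.shape[1] - X_reduced.shape[1]
    df2 = len(y) - X_full.shape[1]
    return ((rss_r - rss_f) / df1) / (rss_f / df2), df1, df2

ones = np.ones(n)
X_r = np.column_stack([ones, overall])                   # overall rating only
X_f = np.column_stack([ones, overall, final_obstacle])   # + final obstacle
F, df1, df2 = incremental_F(satisfaction, X_r, X_f)
print(f"incremental F({df1}, {df2}) = {F:.2f}")
```

Under an end effect, the final obstacle should carry incremental predictive weight in such a test; the null result in the field data is what speaks against over-weighting of the end.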
These results suggest that, when people can cleanly separate the different components of
an experience, then the rating of the final component is not a privileged determinant of the
overall evaluation. Of course, since this study was a field experiment, it suffered from several
limitations, the most important of which is that the order of the obstacles was not
counterbalanced. We therefore cannot rule out that an idiosyncratic property of the final obstacle may have reduced its relationship with satisfaction and thus counteracted the end effect.
General Discussion
Prior research has argued that evaluations of experiences are disproportionately
influenced by the final moments of the experience, since endings have a privileged status as a
prototypical moment of the experience. Although past research has documented several notable
boundary conditions in which this effect does not obtain, there exists an impressive body of
evidence supporting the existence of an end effect for simple, continuous experiences. In this
paper, we did not set out to identify additional boundary conditions, but rather re-examined the
basic end effect, starting with the type of experience that was used in the initial demonstrations
of the effect: a simple, short, meaningless, continuous, aversive sensation (listening to an
irritating noise). Yet, in spite of meeting those conditions, this experience did not produce an end
effect in our studies: the noise was not rated as more aversive when the loudest part was placed
at the end rather than in the beginning (Studies 1 and 4) and was not rated as less aversive when
a softer section was placed at the end rather than in the middle (Study 2). Other studies with a
positive experience also failed to document an end effect: listening to a short music compilation
was not rated as less enjoyable when a weaker music segment was placed at the end of the
compilation rather than in the middle (Studies 3 and 5). These null effects obtained even though
participants could readily recall whether the end or middle of the experience was particularly
good or bad, and even though the studies were properly powered to detect a small effect with a
reasonable probability. Finally, results from a correlational study further corroborate the pattern
observed in the experimental studies. In a large-scale field study with obstacle race participants,
we failed to observe a privileged relationship between the rating of the last obstacle of the race
and the overall satisfaction with the race (Study 6). As such, these results question the
assumption that the final moments of an experience have an inherent, substantial advantage in
determining the overall evaluation of the experience.
Although each of our studies documented a failure to obtain the end effect, our results are
not inconsistent with past demonstrations. Specifically, consistent with prior research, we found
that extending an experience with a less intense ending results in less extreme global evaluations
of that experience. However, adding the less intense segment in the middle rather than at the end
produced the same results. Thus, our findings indicate that the effect of adding a less intense
ending is driven by a reduction in the average intensity of the experience rather than a
disproportionate impact of the ending. Similarly, consistent with prior research, we observed that
when each participant evaluated multiple experiences that only differed in the ordering of its
components, participants did prefer experiences that ended well over experiences that ended
poorly. Yet, the structure of the experience did not affect the evaluation of the first experience
participants encountered. Therefore, our results suggest that people do not spontaneously assign
greater weight to the ending, but instead rely on the structure of the experience when it is a
salient basis for evaluation because it is the only aspect that differs between experiences.
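The average-intensity account lends itself to simple arithmetic. The per-segment aversiveness values below are hypothetical, chosen only to illustrate why appending a milder segment at the end and inserting it in the middle produce the same drop in mean intensity:

```python
# Hypothetical per-segment aversiveness ratings (illustrative values only).
original = [8, 8, 8]            # the unextended experience
extended_end = [8, 8, 8, 4]     # milder segment appended at the end
extended_middle = [8, 8, 4, 8]  # same milder segment inserted in the middle

def mean(xs):
    return sum(xs) / len(xs)

# Both extensions lower the average intensity by the same amount, so an
# evaluator averaging over segments rates them equally less aversive.
assert mean(original) == 8.0
assert mean(extended_end) == mean(extended_middle) == 7.0
```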
Our experimental studies thus empirically addressed two types of prior support for the
end effect: the fact that extending an experience with a less intense ending weakens the overall
evaluation, and the fact that people prefer experiences that end well over equivalent experiences
that end poorly. Additionally, in Study 6, we addressed a third type of empirical support for the
end effect: the fact that the moment-to-moment rating of the end of an experience can be a
particularly good predictor of the global evaluation of that experience. We have proposed that
this privileged relationship may be explained by mechanisms other than the over-weighting of
the end. For instance, if moment-to-moment ratings are also influenced by past moments in the
experience, then final ratings would be more effective predictors because they incorporate more
information. Alternatively, explicit final moment-to-moment ratings may simply serve as salient
anchors for the immediately subsequent overall evaluation of the experience. In Study 6, we
examined an experience that consisted of easily identifiable parts (thus reducing confusion and
contamination of the ratings) and we asked participants to rate those parts after the overall
evaluation (thus avoiding anchoring of the overall evaluation on the final rating). Under these
circumstances, we did not observe any privileged relationship between the rating of the final part
of the experience and participants’ overall satisfaction with the experience—which is consistent
with our alternative interpretations of those prior findings.
Although we propose, based on our results, that endings are not inherently over-weighted
in retrospective evaluations of experiences, this certainly does not imply that endings cannot
have a disproportionate impact when additional conditions are fulfilled. As Studies 4 and 5
already indicate, when differences in structure are highly salient, people may rely on their lay
beliefs about the desirability of good endings and prefer experiences with better endings over
other, equivalent experiences. Moreover, we can identify at least two other circumstances under
which the final moments of experiences are likely over-weighted.
First, when the last part of an experience is particularly meaningful, and colors the
perception of everything that preceded it, we would naturally expect it to disproportionately
impact the overall evaluation. For instance, evaluations of goal-directed experiences may be
particularly affected by the end of the experience (Carmon & Kahneman, 1996) since the end
often determines whether the goal has been met (and thus whether the preceding effort was in
vain or not). Similar to goal-directed experiences, the end may also be particularly meaningful
(and influential) for narrative experiences, such as watching television shows (Hui et al., 2014),
since the end of an episode often provides some type of resolution. The evaluation of a murder
mystery strongly depends on how the mystery is being resolved, just as the evaluation of a
romantic comedy depends on whether the couple ends up together, and the evaluation of a
baseball game depends on which team wins.
Aside from being particularly meaningful, endings can also have a disproportionate
impact through a second mechanism: a recency effect. Specifically, for experiences that are long
and varied (e.g., a year-long trip around the world), people may simply be unable to remember
many parts of the experience due to memory constraints. In that case, the overall evaluation may
be disproportionately influenced by the beginning and end of the experience since research on
list memorization finds that these components are recalled more easily than items in the middle
(Ebbinghaus, 1913). For instance, the observation of an end effect for hypothetical experiences
presented in list format has been attributed to such recency effects due to memory constraints
(Montgomery & Unnava, 2009).
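A memory-constrained evaluator of this kind can be sketched in a few lines; the moment-by-moment enjoyment values are hypothetical and serve only to show how serial-position-limited recall shifts the evaluation toward the beginning and end:

```python
# Hypothetical moment-by-moment enjoyment of a long, varied experience.
moments = [3, 7, 5, 2, 8, 6, 4, 9]

# A full-memory evaluator averages every moment.
full_memory = sum(moments) / len(moments)

# A memory-constrained evaluator recalls mainly the first and last parts
# (the serial-position pattern), so those moments dominate the evaluation.
recalled = [moments[0], moments[-1]]
memory_limited = sum(recalled) / len(recalled)
```

Here the memory-limited evaluation (6.0) exceeds the full average (5.5) only because this particular sequence happens to end on a high note, which is how a recency mechanism can mimic an end effect.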
It should be noted that, for many experiences, the end does not benefit from either
recency effects or being particularly meaningful—in which case we would not expect the end to
have a disproportionate impact on evaluations. Even TV shows or narratives do not always offer
meaningful endings that provide a resolution. For instance, in contrast to a romantic comedy, the
ending of a nature documentary may not be more meaningful than what preceded it. Similarly,
for the experiences studied in this paper, neither the final seconds of the noise nor the final
fragment of the music compilation were more meaningful than the rest of the experience. The
same holds for the situations identified as boundary conditions of the effect: the last part of a
meal (Rode, Rozin, & Durlach, 2007), the final activity over the course of a day (Miron-Shatz,
2009), and the last moments of a vacation (Kemp, Burt, & Furneaux, 2008) do not commonly
convey any special meaning.
In sum, the current research cautions against the common recommendation to restructure
experiences to end on a high note. Although this improved ending may disproportionately impact
evaluations in specific cases, our studies suggest that this would not occur merely because it is
the ending. That is, rather than positing the existence of an inherent end effect that is disabled in
specific circumstances (i.e., those identified by the boundary condition studies), we propose that
it is more accurate to state that there is no inherent end effect, but that endings can have a
disproportionate impact on evaluations through other processes under specific circumstances.
References
Ariely, D. (1998). Combining experiences over time: The effects of duration, intensity changes
and on-line measurements on retrospective pain evaluations. Journal of Behavioral
Decision Making, 11, 19-45.
Ariely, D., & Zauberman, G. (2000). On the Making of an Experience: The Effects of Breaking
and Combining Experiences on their Overall Evaluation. Journal of Behavioral Decision
Making, 13(2), 219-232.
Ariely, D., & Carmon, Z. (2000). Gestalt Characteristics of Experiences: The Defining Features
of Summarized Events. Journal of Behavioral Decision Making, 13, 191-201.
Baumgartner, H., Sujan, M., & Padgett, D. (1997). Patterns of Affective Reactions to
Advertisements: The Integration of Moment-to-Moment Responses. Journal of Marketing
Research, 34(2), 219-232.
Branigan, C., Moise, J., Fredrickson, B., & Kahneman, D. (1997). Peak (but not end) ANS
reactivity to aversive episodes predicts bracing for anticipated re-experience. Poster
presented at Society for Psychophysiological Research, Cape Cod, MA. Abstract
retrieved from https://www.sprweb.org/meeting/past_mtng/1997/97posters1.html.
Carmon, Z., & Kahneman, D. (1996). The Experienced Utility of Queuing: Experience Profiles
and Retrospective Evaluations of Simulated Queues. Retrieved from
http://faculty.insead.edu/carmon/pdffiles/The%20Experienced%20Utility%20of%20Queuing.pdf
Conniff, R. (2006). What Modern Science Can Teach You About Turning That Frown Upside
Down. Men's Health. January, 118-123.
Cusick, B. (2012). The Peak-End Rule: A way to improve every customer experience. Retail
Customer Experience. Newsletter, Networld Media Group, September 19, 2012.
Ebbinghaus, H. (1913). On memory: A contribution to experimental psychology. New York:
Teachers College.
Fredrickson, B. L. (1991). Anticipating endings: An explanation for selective social interaction
(Doctoral dissertation, Stanford University, 1990). Dissertation Abstracts.
Fredrickson, B. L., & Kahneman, D. (1993). Duration neglect in retrospective evaluations of
affective episodes. Journal of Personality and Social Psychology, 65, 45-55.
Hui, S. K., Meyvis, T., & Assael, H. (2014). Analyzing Moment-to-Moment Data Using a
Bayesian Functional Linear Model: Application to TV Show Pilot Testing. Marketing
Science, 33(2), 222-240.
Kahneman, D. (2000a). Evaluation by moments: Past and future. In D. Kahneman & A. Tversky
(Eds.), Choices, values, and frames (pp. 693-708). Cambridge: Cambridge University
Press.
Kahneman, D. (2000b). Experienced utility and objective happiness: A moment-based
approach. In D. Kahneman & A. Tversky (Eds.), Choices, values, and frames (pp. 673-692).
Cambridge: Cambridge University Press.
Kahneman, D., Fredrickson, B. L., Schreiber, C. A., & Redelmeier, D. A. (1993). When more
pain is preferred to less: Adding a better end. Psychological Science, 4, 401-405.
Kemp, S., Burt, C. D. B., & Furneaux L. (2008). A Test of the Peak-End Rule with Extended
Autobiographical Events. Memory & Cognition, 36, 132-138.
Loewenstein, G. F., & Prelec, D. (1993). Preferences for sequences of outcomes. Psychological
Review, 100(1), 91-108.
Miron-Shatz, T. (2009). Evaluating multi-episode events: boundary conditions for the peak-end
rule. Emotion, 9(2), 206-213.
Montgomery, N. V., & Unnava, H. R. (2009). Temporal sequence effects: A memory
framework. Journal of Consumer Research, 36(1), 83-92.
Oppenheimer, D. M., Meyvis, T., & Davidenko, N. (2009). Instructional manipulation checks:
Detecting satisficing to increase statistical power. Journal of Experimental Social
Psychology, 45, 867–872.
Pine, J. B., & Gilmore, J. H. (1998). Welcome to the experience economy. Harvard Business
Review, 76, 97-105.
Redelmeier, D. A., & Kahneman, D. (1996). Patients' memories of painful medical treatments:
Real-time and retrospective evaluations of two minimally invasive procedures. Pain, 66,
3-8.
Redelmeier, D. A., Katz, J., & Kahneman, D. (2003). Memories of colonoscopy: a randomized
trial. Pain, 104(1), 187-194.
Rode, E., Rozin, P., & Durlach, P. (2007). Experienced and remembered pleasure for meals:
Duration neglect but minimal peak, end (recency) or primacy effects. Appetite, 49(1), 18–
29.
Schreiber, C. A., & Kahneman, D. (2000). Determinants of the remembered utility of aversive
sounds. Journal of Experimental Psychology: General, 129(1), 27-42.
Shaw, C., Dibeehi, Q., & Walden, S. (2010). Customer Experience: Future Trends and Insights.
Great Britain: Palgrave Macmillan. Retrieved August 29, 2014, from
http://books.google.com
Surowiecki, J. (2002, November 11). Boom and Gloom. The New Yorker. Retrieved August 29,
2014, from www.newyorker.com
Varey, C. A., & Kahneman, D. (1992). Experiences extended across time: Evaluation of moments
and episodes. Journal of Behavioral Decision Making, 5, 169-186.
Wirtz, D., Kruger, J., Scollon, C. N., & Diener, E. (2003). What to do on spring break? The role
of predicted, on-line and remembered experience in future choice. Psychological Science,
14, 520-524.
Xu, E. R., Knight, E. J., & Kralik, J. D. (2011). Rhesus monkeys lack a consistent peak-end
effect. The Quarterly Journal of Experimental Psychology, 64(12), 2301-2315.