Performance under pressure and psychological momentum are well-documented topics in
sports psychology, but most research focuses on “in-game” pressure. This study views
pressure more broadly to examine how the external pressure of fans, quantified using the
sentiment of tweets mentioning the players, can affect how MLB players perform.
Although external pressure is intangible, it can impact a player’s psyche and performance.
This investigation focuses on players Chris Sale and David Price. A new process was
developed that uses the VADER package in Python to generate tweet sentiment scores, which
were then compared with several performance metrics from Baseball Reference.
Results were promising, with correlation analysis pointing to some association
between sentiment and performance. There was also an observed difference in how the two
players handled the pressure depending on whether they played for a small- or large-market
team. An anecdotal study of the 2018 season showed even more interesting differences
between Sale’s and Price’s performance and Twitter sentiment. Price’s performance and
Twitter’s sentiment moved in a cyclical manner throughout the season whereas Sale’s results
were much more consistent and less sensitive to change. Finally, a study focused on the
impact of both pressure and past performance on future outings showed results consistent with
past studies on the subject. For example, Sale was most likely to perform well under pressure
when the start followed a very good or very bad outing rather than an average one.
Information like this could be useful for front offices and managers.
More analysis should be conducted to confirm and expand on the findings of this project.
However, this case study can be used as a foundation for a new and innovative approach to
player evaluation, ultimately complementing existing methods and informing decisions
regarding otherwise intangible factors.
Serving one of the largest markets in the MLB, Boston Red Sox players are heavily
scrutinized by scores of media, talk-show hosts, Twitter trolls, and dedicated fans. Players
who come to Boston can be beloved (David Ortiz, Brock Holt, Chris Sale) or hated (Pablo
Sandoval, Carl Crawford, David Price) with equal intensity depending on their attitude and
performance here. Some players seem to thrive under the pressure whereas others crack. This
study aims to determine whether some players react differently to external pressure by
comparing performance metrics to the sentiment of fan tweets.
Professional baseball is a multibillion-dollar industry. Some players are given contracts of
300 to 400 million dollars. With this much money involved, it is important
that front offices make the right decision when it comes to choosing and supporting players.
David Price ($217M), Carl Crawford ($142M), and Pablo Sandoval ($95M) are three recent
Red Sox contracts that are widely viewed as mistakes despite the success all three players had
on other teams. Conversely, David Ortiz initially signed with the Red Sox for $1.25M after
being released and went on to have a Hall of Fame-caliber career. If the Red Sox front office
were able to understand the relationship between external pressure and performance, perhaps
they could make better decisions initially or provide some sort of intervention to struggling
players with bloated contracts.
Yogi Berra said that “baseball is ninety percent mental and the other half is physical”. Since
the time he retired in 1965, the field of sports psychology has continuously advanced. It is
from this field that most of the literature for this project will be drawn. Some of the key
findings from past studies relate to how performance is affected by pressure and
psychological momentum.
Performance Under Pressure
It is generally accepted that a graph with pressure or physiological arousal on the x-axis
and performance on the y-axis takes an inverted-U (parabolic) shape (Luiselli & Reed, 2011) (see
Figure 1). With minimal pressure, performance suffers due to indifference or disinterest.
As the amount of pressure increases, performance would improve until a maximum point is
reached where an athlete is “locked in” at the top of their game and in a state of mind called
“flow state”. Eventually, more pressure will lead to feelings of anxiety and nervousness that
will ultimately deteriorate performance (Williams, 2010).
Figure 1: There is a parabolic relationship between performance and stress (Bradberry, 2014).
Martin (2019) provides several reasons why nervousness might interfere with performance.
First, nervousness consumes energy that could be more efficiently utilized. Second, anxiety
narrows an athlete’s attention, potentially preventing them from picking up on external cues.
Lastly, the rush of adrenaline that accompanies anxiety can destroy the timing of skilled
routines that are an important part of sports.
Masters et al. (1993) developed a “Reinvestment Scale” aimed at predicting who would
perform the worst under stress. They theorized that those who scored higher on the
reinvestment scale disrupted the “automaticity of skilled performance” by investing too much
of their attention in their own actions when under pressure. This supports the idea that “thinking too
much” will hurt one’s performance. The scale included three elements that would indicate
whether a person is more susceptible to “deautomatization”. They include rehearsal (a
tendency to mentally rehearse emotional events), private self-consciousness (the amount of
attention one pays to their thought processes), and public self-consciousness (the amount that
a person is concerned with how others view them). The higher a person rates on any of these
factors, the more likely they are to succumb to pressure.
Psychological Momentum
Several studies have concluded that past performance is a significant factor when predicting
whether a player will succeed under pressure. One study analyzed NFL games and found that
the probability of failure was highest when the pressure was high and the play was preceded
by a failure (Harris et al., 2019).
In a separate study that focused on individual baseball players completing simulated at-bats,
findings were similar, showing an interaction effect between past performance and pressure.
Interestingly, the study found that players on either a cold or hot streak were more likely to
succeed under pressure than those who were not on a streak. The researchers attributed
this to the fact that, during a cold streak, a player shifts their attentional focus inward and concentrates
more on fundamental skills than on external pressure. Conversely, those on a hot streak have
heightened self-confidence or perceived control that improves their external attentional focus
and makes them more resistant to the pressure. Additionally, those who succeeded under
pressure performed much better after the pressure was removed (Gray et al., 2013).
Another relevant finding came from a study by Golding et al. (2017). Athletes and
non-athletes were asked to pitch in front of a crowd offering both positive and negative criticism.
Athletes were more affected by the criticism, presumably because they care more about their
performance, but the type and magnitude of the reaction were highly dependent on the person.
This seems to suggest that some athletes are affected more by external pressure than others.
Other Findings
One article discussed different ways to measure fan passion including the extent to which a
team occupies the fan’s heart, mind, body, and soul. Fan passion could be a measure to
determine how much pressure a player will be under and/or how likely they will be to interact
with fans on social media (Wakefield, 2016).
The idea of clutch, an innate ability to perform in high-leverage situations, is a
controversial one in baseball. Many studies of clutch performance have shown that, given a
large enough sample, the players who perform best in high-pressure situations are the same ones who
perform best in low-pressure situations. However, players like David Ortiz make it hard to believe
that clutch doesn’t exist, and many managers do believe that some players are more capable in
high-leverage situations (Perrotto, 2018).
This Investigation
Most of the aforementioned literature focuses on short-term, in-game pressure while this
project hopes to expand the concept of pressure to include the continuous off-field pressure a
player faces when playing in large markets like Boston, New York, or Los Angeles.
Additionally, many of these articles were controlled experiments that did not include
professional baseball players. This investigation will use Twitter sentiment analysis as a proxy
for external pressure on professional baseball players and use actual game performance
metrics to determine the interaction between pressure and performance.
The fundamental questions that this study will seek to answer are the following:
1. Can external fan pressure be quantified using tweets and is this measure related to
player performance in a meaningful way?
2. Do some baseball players perform differently under high external pressure than
others?
3. How can front offices predict how players will react to varying levels of external
pressure?
Player Selection
To answer the research questions presented, a case study was completed comparing Chris
Sale and David Price. These players were selected because they had similar
careers prior to joining the Red Sox (see Appendix A) except for one important difference:
their perceived reaction to external pressure. Both are left-handed pitchers who joined the
Boston Red Sox between 2016 and 2017 after playing in smaller markets. They are within 3
years of age and, before joining the Red Sox, had very comparable statistics (David Price Stats,
n.d.; Chris Sale Stats, n.d.).
When the two came to Boston, however, their first few seasons were very different (See
Appendix A). In general, David Price did not perform as expected and fans were very critical
of him. In contrast, Sale began dominating and fans embraced him with open arms. It is
possible that the difference stemmed from their reaction to the increased external pressure.
Furthermore, this could be related to their social media habits. The largest difference
between Price and Sale when they joined the Red Sox was that Price was an avid Twitter user
who had well-publicized Twitter battles with others, including an umpire while in Tampa Bay
(Rymer, 2013). Sale, on the other hand, did not have any social media accounts and appeared
to block out the noise (It Doesn’t Sound Like Chris Sale Will Be On Twitter, Social Media
Any Time Soon, 2016).
Compiling Data
The methodology for this case study used tweets mentioning Sale and Price as a
proxy for external pressure from fans. Using the Python package GetOldTweets3, a process
was developed for compiling tweet text, retweets, favorites, dates, and more for a
user-defined time period and query (Mottl, 2018/2020). All tweets mentioning “David Price” and
“Chris Sale” were collected from the time of their debuts as starting pitchers (2010 and 2012,
respectively) until the end of 2019. Word clouds were created from the resulting text using
the wordcloud package and the data was cleaned to remove advertisements and other
irrelevant tweets (See Figure 2) (Mueller, 2020).
Figure 2: Word Clouds produced from tweets mentioning David Price and Chris Sale
Additionally, performance data from the same dates were collected from Baseball Reference
game logs (MLB Stats, Scores, History, & Records, n.d.). The date, innings pitched (IP),
earned run average (ERA), win probability added (WPA), and game score (GSc) from each
pitcher’s outings were compiled. WPA is a measure of how much a player’s performance
increased their team’s chance of winning. A higher WPA indicates better performance,
especially in high-leverage situations (Slowinski, 2010). GSc is a metric developed by Bill
James to quantify a starting pitcher’s overall performance. A GSc of 50 to 60 is considered
about average whereas 80 plus is considered very impressive (What Is a Game Score?, 2020).
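For readers unfamiliar with the metric, Bill James's original Game Score can be computed from a pitcher's box-score line. The sketch below is illustrative only: this study took GSc values directly from Baseball Reference game logs, and the function name, argument names, and example line are assumptions introduced here.

```python
def game_score(outs, hits, earned_runs, unearned_runs, walks, strikeouts):
    """Bill James's original Game Score for a starting pitcher.

    `outs` is the number of outs recorded (innings pitched times three).
    """
    innings_completed = outs // 3
    score = 50                                  # every start begins at 50
    score += outs                               # +1 per out recorded
    score += 2 * max(0, innings_completed - 4)  # +2 per inning completed after the 4th
    score += strikeouts                         # +1 per strikeout
    score -= 2 * hits                           # -2 per hit allowed
    score -= 4 * earned_runs                    # -4 per earned run
    score -= 2 * unearned_runs                  # -2 per unearned run
    score -= walks                              # -1 per walk
    return score


# Hypothetical line: 7 IP, 5 H, 2 ER, 0 unearned runs, 1 BB, 8 K
example_gsc = game_score(outs=21, hits=5, earned_runs=2,
                         unearned_runs=0, walks=1, strikeouts=8)
print(example_gsc)
```

The hypothetical seven-inning line above lands somewhat above the 50 to 60 range described as average.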
Sentiment Analysis
To determine the sentiment of the tweet text, the VADER Python package was employed
(Hutto, 2020). This package is intended for sentiment analysis of social media posts. The
package breaks tweets down into individual words and rates each as positive,
negative, or neutral. Based on these ratings, a normalized, weighted compound score is
computed on a scale from -1 (most negative) to 1 (most positive). In this study, “sentiment”
refers to this compound score.
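The scoring step described above might be sketched as follows. This is a minimal illustration, not the study's actual code: the function name `summarize_sentiment` is hypothetical, and the cutoffs of 0.05 and -0.05 are the conventional thresholds suggested in the VADER documentation, since the cutoffs used in this study are not stated. The scorer is passed in as a callable so the sketch runs without the package installed.

```python
def summarize_sentiment(tweets, scorer, pos_cut=0.05, neg_cut=-0.05):
    """Label each tweet and summarize a batch of tweets.

    `scorer` is any callable that returns a dict with a "compound" key in [-1, 1].
    """
    labels = []
    compounds = []
    for text in tweets:
        compound = scorer(text)["compound"]
        compounds.append(compound)
        if compound >= pos_cut:
            labels.append("positive")
        elif compound <= neg_cut:
            labels.append("negative")
        else:
            labels.append("neutral")
    n = len(labels)
    return {
        "mean_compound": sum(compounds) / n if n else None,
        "pct_negative": labels.count("negative") / n if n else None,
        "labels": labels,
    }


# With the vaderSentiment package installed, the scorer could be:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   scorer = SentimentIntensityAnalyzer().polarity_scores
# A stand-in scorer with made-up compound values is used here instead.
fake_scores = {"great outing": 0.6, "he is a bum": -0.5, "starts tonight": 0.0}
summary = summarize_sentiment(list(fake_scores), lambda t: {"compound": fake_scores[t]})
print(summary)
```

Returning both the mean compound score and the share of negative tweets mirrors the two sentiment measures used later in the correlation analysis.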
Grouping Data
Once the data was compiled, it was then grouped for analysis. Twitter sentiment data was
grouped by date and again by the time periods between starts for Price and Sale. Sentiment
and performance metrics were merged into one final data set that included various metrics
about a particular start and the Twitter activity leading up to that start. A visualization of the
data flow described can be found in Figure 3.
Figure 3: This figure shows the transformation of the source data from Twitter and Baseball Reference into a final data set
for analysis.
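The grouping of tweets into the periods between starts could be sketched as below. The function name, the dates, and the scores are all made up for illustration. One assumption is worth flagging: tweets dated on the day of a start are counted toward the following start, because the text does not say how game-day tweets were handled.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


def group_by_upcoming_start(tweets, start_dates):
    """Average tweet sentiment in the window leading up to each start.

    tweets: iterable of (tweet_date, compound_score) pairs.
    start_dates: sorted list of dates on which the pitcher started.
    """
    windows = defaultdict(list)
    for tweet_date, compound in tweets:
        # Index of the first start strictly after the tweet date.
        i = bisect_right(start_dates, tweet_date)
        if i < len(start_dates):  # tweets after the final start are dropped
            windows[start_dates[i]].append(compound)
    return {start: sum(v) / len(v) for start, v in windows.items()}


# Hypothetical start dates and tweet scores
starts = [date(2018, 4, 5), date(2018, 4, 11), date(2018, 4, 17)]
tweets = [
    (date(2018, 4, 3), 0.4),
    (date(2018, 4, 5), -0.2),   # game day: counted toward the 4/11 start
    (date(2018, 4, 9), -0.6),
    (date(2018, 4, 12), 0.1),
    (date(2018, 4, 20), 0.9),   # after the last start: dropped
]
pre_start_sentiment = group_by_upcoming_start(tweets, starts)
print(pre_start_sentiment)
```

The resulting dictionary can then be joined to the game log on the start date to produce one row per start.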
Correlation Analysis
To confirm that sentiment relates to a player’s performance, correlations between
performance and sentiment following each game were examined. Interestingly, after Price joined
the Red Sox, the correlation between his performance and Twitter’s reaction increased
substantially, especially when the performance included high-leverage situations. As seen in
Figure 4, the correlation between WPA and subsequent sentiment increased from .21 when
playing for small-market teams to .43 on the Red Sox. Conversely, the same statistic
decreased from .43 to .31 for Chris Sale. Put differently, Price’s good and bad performances
in Boston drew a stronger Twitter response than either his own performances with small-market
teams or Sale’s performances with the Red Sox. This could translate to more
pressure being placed on his performance in subsequent outings. It is also evidence that
Twitter sentiment is associated with performance and could be a suitable surrogate for
external pressure.
Figure 4: Correlation between WPA and Sentiment for Price and Sale before and after joining the Red Sox. While in Boston,
Price experienced higher degrees of Twitter responses for good and bad performances which could lead to more pressure in
future outings.
Sentiment preceding games and subsequent performance were then examined. The results
show that there is a relationship between the two. Both pitchers exhibited some correlation
between the percent of tweets with negative sentiment leading up to a game and ERA of that
game (.30 and .47 respectively).
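A correlation of this kind could be computed as in the sketch below. It assumes the Pearson coefficient, which the text does not name explicitly, and the input values are invented for illustration rather than taken from the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical values: share of negative tweets before each start, and ERA in that start
pct_negative = [0.10, 0.25, 0.15, 0.40, 0.30]
game_era = [1.50, 4.50, 3.00, 6.75, 2.25]
r = pearson_r(pct_negative, game_era)
print(round(r, 2))
```

Running the same function separately on each pitcher's small-market and Red Sox starts would give the kind of before-and-after comparison shown in Figure 4.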
This brief analysis shows that there may be some association between tweet sentiment and
various performance metrics. Additionally, there is some evidence that this association could
differ depending on the player and whether they play for a small- or large-market team.
2018 Season
The next component of exploratory analysis included a detailed study of the 2018 season.
Figures 5 and 6 show average daily tweet sentiment and Game Scores for every start during
the 2018 season that ended in a Red Sox World Series title. This study was consistent with the
findings of the correlation analysis that showed higher degrees of Twitter response for David
Price than for Chris Sale.
Figure 5: David Price 2018 Average Sentiment and Performance. Price’s worst performances (usually against the Yankees)
were often preceded by decreasing sentiment, potentially a signal of increased pressure.
Price’s season appears cyclical, with prolonged periods of overall increasing or decreasing
sentiment that appear roughly consistent with his performance. The graph shows that Price’s
worst performances (usually against the Yankees) were often preceded by decreasing sentiment,
potentially a signal of increased pressure.
Figure 6: Chris Sale 2018 Average Sentiment and Performance. Sale’s 2018 season was much less cyclical in sentiment and
performance than Price’s with brief spikes for extraordinary events.
Conversely, the sentiment of tweets mentioning Sale held relatively constant over time and did
not display the same cyclical pattern as Price’s. Both performance and sentiment were relatively
stable outside of brief spikes for extraordinary events such as a poor outing against the
Braves, a shutout against the Yankees, and a stomach illness during the ALCS. This could
potentially be related to the differing social media habits of the two players. Price is very
active on Twitter so it makes sense that his performance and Twitter sentiment could be
linked whereas Sale does not even have an account which could explain the lack of Twitter
Overall, this element of the case study showed compelling evidence that Twitter sentiment
and performance are somehow related and that the relationship can be different depending on
the player.
Psychological Momentum and Pressure Study
The final element of this case study aimed to test the results of three studies mentioned in the
literature review above on psychological momentum and pressure. Psychological momentum
can be defined as a person’s propensity to perform better or worse based on their most recent
performance. Several studies have concluded that past performance and pressure have some
sort of interaction effect that can impact future performance.
There were three main conclusions from the literature review that were examined in the
context of the David Price and Chris Sale study:
1. Harris et al. concluded that the probability of poor performance is highest when the
pressure is high and there was a preceding poor performance (2019).
2. Gray et al. concluded that players on either a hot or cold streak were more likely to
perform well under pressure than an average player (2013).
3. Golding et al. concluded that athletes are more likely to be affected by criticism, but
the type and magnitude of their reaction is highly dependent on the person (2017).
To examine these conclusions, each outing for David Price and Chris Sale was flagged as a
Good, Bad, or Average performance based on Game Score. Game Score was used as the metric
because it summarizes a starting pitcher’s overall performance in a single number. In
addition, Twitter sentiment leading up to a start was classified as largely positive, largely
negative, or neutral. Nine groups of starts were then compiled based on the combination of
previous performance and Twitter sentiment. Figure 7 shows box plots for both pitchers’
Additionally, a single-factor Analysis of Variance (ANOVA) hypothesis test was conducted to
determine whether the means of any of the groups were significantly different from one another.
The ANOVA output for both pitchers is located in Appendix B. The p-values for both tests
are below .20, meaning that differences in mean game score this large would arise less than
20% of the time if the nine groups truly shared the same mean. At that threshold, the evidence
of differences among the nine groups referenced above and displayed in the box plots of
Figure 7 is suggestive rather than conclusive.
Figure 8 shows ANOVA plots for both pitchers for further analysis.
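The F statistic behind a single-factor ANOVA can be sketched in a few lines. The example below uses three small made-up groups instead of the study's nine, and the function name and group labels are hypothetical. It returns the F statistic and degrees of freedom only; converting F to a p-value requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA.

    groups: dict mapping a group label to a list of game scores.
    """
    samples = [g for g in groups.values() if g]  # ignore empty groups
    k = len(samples)
    n = sum(len(g) for g in samples)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in samples) / n
    means = [sum(g) / len(g) for g in samples]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(samples, means))
    # Variation of individual observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(samples, means) for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up game scores for three of the nine possible groups
toy_groups = {
    "GoodGSc, Neg": [61, 62, 63],
    "AvgGSc, Neg": [52, 53, 54],
    "BadGSc, Neg": [65, 66, 67],
}
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f_stat, df_b, df_w)
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one group mean differs from the others; it does not identify which one.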
Figure 7: David Price and Chris Sale Performance after various levels of pressure and past performance.
Figure 8: Analysis of Variance plots for David Price and Chris Sale that show the mean GSc of starts after various
combinations of prior performance and Twitter sentiment. Consistent with the literature review, both pitchers perform best
under neutral sentiment; Chris Sale performs best under the pressure of negative sentiment when entering the game on a
“cold streak” or “hot streak”; and Price generally performs best after poor starts, highlighting the differences between his
and Sale’s reactions to pressure and psychological momentum.
[Figure 8 panels: “Chris Sale ANOVA Plot” and “David Price ANOVA Plot”. x-axis: Twitter Sentiment Preceding Game (Positive, Neutral, Negative); y-axis: Game Score; series: Good GSc, Avg GSc, Bad GSc.]
The above ANOVA plots display the mean game score for each group that contains outings
flagged as following positive, negative, or neutral sentiment and good, bad, or average
previous performance. Twitter sentiment is meant to be a proxy for pressure and previous
performance indicates whether the player is entering the game on a “cold streak”, “hot
streak”, or neither. Viewing these plots, one can not only learn about the individual
relationships between future performance and past performance or pressure, but one can also
visualize the interaction effect that could be occurring between the two independent variables.
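The group means plotted in Figure 8 could be assembled along the following lines. The cutoffs are assumptions: starts below 50 are flagged Bad and above 60 Good (based on the 50 to 60 range described earlier as average), and sentiment uses cutoffs of 0.05 and -0.05. The study's actual thresholds are not stated, and the function names and data are hypothetical.

```python
from collections import defaultdict


def flag_performance(gsc, bad_below=50, good_above=60):
    """Flag a Game Score as Good, Avg, or Bad (cutoffs are assumptions)."""
    if gsc > good_above:
        return "Good"
    if gsc < bad_below:
        return "Bad"
    return "Avg"


def flag_sentiment(mean_compound, cut=0.05):
    """Flag pre-start sentiment as Positive, Neutral, or Negative."""
    if mean_compound >= cut:
        return "Positive"
    if mean_compound <= -cut:
        return "Negative"
    return "Neutral"


def cell_means(starts):
    """Mean GSc keyed by (previous performance flag, pre-start sentiment flag).

    starts: chronological list of (game_score, mean_compound_before_start).
    """
    cells = defaultdict(list)
    # Pair each start with the one before it; the first start has no predecessor.
    for (prev_gsc, _), (gsc, sentiment) in zip(starts, starts[1:]):
        key = (flag_performance(prev_gsc), flag_sentiment(sentiment))
        cells[key].append(gsc)
    return {key: sum(v) / len(v) for key, v in cells.items()}


# Made-up sequence of starts
starts = [(72, 0.10), (45, 0.20), (64, -0.30), (55, 0.00), (38, -0.20), (70, -0.15)]
means = cell_means(starts)
for key in sorted(means):
    print(key, means[key])
```

This sketch treats the whole list as one continuous sequence; a fuller version would likely reset the previous-start flag at season boundaries.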
This analysis yielded several key insights that were consistent with the literature review. First,
as seen in Figure 8, both pitchers had their highest mean game scores after a period of neutral
sentiment. In other words, the pitchers performed better when there was limited pressure from
fans. Additionally, performance was generally the worst when there was negative sentiment
from fans. This is consistent with the conclusion of Harris et al. that higher pressure was
associated with worse performance.
Secondly, analysis of Chris Sale shows remarkable consistency with the findings of Gray et
al. When viewing Sale’s ANOVA plot in Figure 8, the spread between starts following an average GSc
and starts following a good or bad GSc widens substantially under negative sentiment. This indicates
that when Sale experiences large amounts of external pressure, quantified by large volumes of
negative tweets, he performs better when on a “cold streak” or a “hot streak” compared to
starts following an average performance. This phenomenon is exactly what Gray et al
Finally, viewing David Price’s results supports the conclusions of Golding et al. Although
there are some similarities between trends in Price’s data and that of Sale, such as both
pitchers performing relatively well after neutral sentiment, overall, it appears Price reacts
quite differently than Sale does in similar conditions. For example, Price’s highest average
game scores followed bad performances. Furthermore, Price’s worst average game
score followed a good performance and a period of negative sentiment. These findings
support the idea that players react differently to pressure and psychological momentum. It is
easy to see how this concept could be extended to front offices so that they can find players
who will react in desirable ways depending on the pressure associated with their team.
Overall, this case study serves as a compelling proof of concept for future research. It is
impossible to draw far-reaching conclusions from a study with only two players; however, the
findings presented can be used as building blocks for an entirely new area of player
evaluation. Too often teams sign free agents or trade for players with impressive numbers
without determining whether the player would be a good fit for their organization and their
market. Twitter sentiment analysis could be a tool to complement traditional methods and give
front offices an edge.
The first research question presented in this analysis was whether external fan pressure could
be quantified using tweets. It was further questioned if this proxy for pressure would have any
relationship with player performance. Based on this study, the answer to both questions is yes.
The new process for collecting, cleaning, and analyzing tweets mentioning players was
successful and could be scaled to collect information on any modern player. In addition,
correlation results were promising, showing at least some association between pressure and
performance for both Chris Sale and David Price and differences in that correlation depending
on the player and the team they played for. ANOVA tests that included
sentiment and psychological momentum produced results similar to what would be predicted
by other scholars in the field. Finally, anecdotal information from the 2018 season showed
compelling evidence that there could be a meaningful relationship between Twitter and player
performance.
The second research question asked whether some players would perform differently under
high external pressure than others. Again, this answer is yes based on the comparison of
David Price and Chris Sale. The two lefty pitchers had strikingly similar resumes before
joining the Red Sox, where external pressure undoubtedly increased. It is not difficult to
imagine that their reaction to this external pressure could be the source of subsequent
differences in performance. Correlation analysis revealed that the association between tweet
sentiment and metrics like WPA could be different depending on the player and whether they
play for a big or small market team. For example, Price experienced higher degrees of Twitter
responses based on his play with the Red Sox, which could have led to more pressure and
worse performance in subsequent outings. Additionally, analysis of the 2018 season revealed
that Price’s performance and tweet sentiment were much more cyclical than those of Sale.
Lastly, Sale performed in ways that were consistent with the findings of scholars
mentioned in the literature review in the context of performance under pressure and
psychological momentum, whereas Price deviated from this trend, potentially due to his
fundamentally different reactions to pressure.
The final research question mentioned in this exploration related to how front offices could
predict how players will react to various levels of external pressure. This question will
unfortunately remain largely unanswered until a broader analysis is conducted expanding this
study’s initial findings. One cannot assume that the results of Chris Sale and David Price will
apply to every Major League ballplayer. Future studies should include expansions to other
players, teams, and positions.
Using data such as that which is presented here, front offices could profile players based on
their performance under pressure. Large market teams could find players that will thrive
under increased pressure and avoid players that will crumble. Conversely, small-market teams
could find value in players who failed under the bright lights of New York or Boston not due to lack
of talent, but due to their psychology. This study suggests there is value in considering Twitter
data when making baseball decisions, but macrotrends that allow front offices to improve their
decision making will only emerge once more information is compiled and analyzed.
Eventually, this study could be the basis of a movement towards front offices adding
new players to their team who are better fits and providing improved support to current
players throughout the season.
Appendix A – Price vs. Sale Comparison before and after Red Sox Debut
Price vs. Sale Comparison at time of Red Sox Debut
David Price
Chris Sale
Starting Pitcher
Starting Pitcher
Red Sox Debut
Age at Debut
Previous Teams
Career ERA at Debut
Career FIP at Debut
Career WHIP at Debut
Career WAR at Debut
Cy Young Awards
Top 10 in Cy Young Voting
Top 25 MVP Voting
Previous All-Star Appearances
Acquired by Red Sox Through
Free Agency
Annual Salary Year 1 with Red Sox
David Price
Chris Sale
Starting Pitcher
Starting Pitcher
Length of Tenure
4 seasons
3 seasons (and
2020 Season
Traded to Dodgers and
opted out of season
Tommy John Surgery
Cy Young Awards
Top 10 in Cy Young Voting
Top 25 MVP Voting
All Star Appearances
2020 Expected Salary
$32M (50% LA, 50% BOS)
Appendix B – ANOVA Output
David Price ANOVA: Single Factor
GoodGSc, Pos
GoodGSc, Neutral
GoodGSc, Neg
AvgGSc, Pos
AvgGSc, Neutral
AvgGSc, Neg
BadGSc, Pos
BadGSc, Neutral
BadGSc, Neg
Source of Variation
F crit
Between Groups
Within Groups
Chris Sale ANOVA: Single Factor
GoodGSc, Pos
GoodGSc, Neutral
GoodGSc, Neg
AvgGSc, Pos
AvgGSc, Neutral
AvgGSc, Neg
BadGSc, Pos
BadGSc, Neutral
BadGSc, Neg
Source of Variation
F crit
Between Groups
Within Groups
