Lal Bahadur Shastri Institute of Management, Delhi
LBSIM Working Paper Series
LBSIM/WP/2020/06
Predicting Success of
Bollywood Movies
Shivani Bali
August 2020
LBSIM Working Papers indicate research in progress by the author(s) and are brought out to elicit
ideas, comments, insights and to encourage debate. The views expressed in LBSIM Working Papers
are those of the author(s) and do not necessarily represent the views of the LBSIM nor its Board of
Governors.
WP/August2020/06
LBSIM Working Paper
Research Cell
Abstract
India’s entertainment economy is growing rapidly, and with half a billion people under the age of twenty-five,
the world's Media and Entertainment (M&E) companies are taking note. With favourable
demographics and rising disposable incomes, the propensity to spend on leisure and entertainment is growing
faster than the economy itself. The Indian Hindi movie industry, popularly known as Bollywood, has reached
staggering proportions in terms of volume of business (220 billion), employment, movies produced (more than
100 a year) and reach (more than 100 countries worldwide). The Indian film industry is the most prolific in
the world, with more than 1,000 films produced every year in more than 20 languages. Almost 90 percent of
revenues are derived from local films, leaving English and foreign films with a minority 10 percent market share.
With 3.3 billion tickets sold annually, India also has the highest number of theatre admissions. With so much at
stake and returns so uncertain, it is of commercial interest to develop a model that can predict
the success of a movie. Movies have been described as experience goods with a very short shelf life, so
demand for a movie is difficult to forecast. A number of parameters may influence the success of a
movie, including the time of its release, marketing, lead actors, director, producer, writer and music
director. The present study aims to develop a model based on Multiple Linear Regression that may help
predict the success of a movie in advance, thereby reducing some of this uncertainty.
Keywords: Predictive modeling, regression, success
Introduction
The Indian movie industry produces more movies per year than any other country’s movie
industry. However, very few of these movies achieve commercial success.
Given the low success rate, models and mechanisms that reliably predict the ranking and/or
box office collections of a movie can help de-risk the business significantly and increase
average returns. Stakeholders such as actors, financiers and directors can use these
predictors to make more informed decisions. The study is restricted to Bollywood movies because of
time and financial constraints. Some of the questions that can be answered
using prediction models are given below.
1. What are the major factors that determine the success or ranking of a Bollywood movie?
2. Is the Budget of the Indian movie a key determinant of rank or success?
3. Do social media ratings drive the success of a movie?
Further, a DVD rental agency or a distribution house could use these predictions to determine
which titles to stock or promote respectively.
Data were taken from reliable sources on the Internet: the Internet Movie Database (IMDb),
Bollywoodhungama.com ratings, critics’ ratings and social media (Twitter, Facebook etc.).
Data mining and prediction techniques such as multiple linear regression were then
used to devise a model that can predict a Bollywood movie’s key success factors.
1. Literature review
Researchers studying Hollywood have conducted several studies on the box office or financial
performance of movies, primarily adopting psychological or economic approaches (Litman
& Ahn, 1998). The psychological approach focused on the individual moviegoer’s decision to
choose a movie over other entertainment options and to select particular movies, whereas the
economic approach was based mainly on the economic and industrial factors that are
determined by the supply side (Litman & Ahn, 1998). Methodologically, psychological
studies primarily used survey methods, whereas the economic approach used secondary data.
Chang and Ki (2005) observed that little research had been done on the inherent
experience-good property of movies, even though this property is closely related to a
movie audience’s decision-making process.
Jack Valenti, president and CEO of the Motion Picture Association of America, once
mentioned that “…No one can tell you how a movie is going to do in the marketplace …not
until the film opens in darkened theatre and sparks fly up between the screen and the
audience” (Valenti, 1978). Movies, then, are something to be experienced, and most trade
journals and magazines of the motion picture industry report experience that supports
Valenti's claim. The experience good property has two aspects. First, movies are an
experience good because individuals choose and use movies solely for the experience and
enjoyment (Hirschman & Holbrook, 1982; Holbrook & Hirschman, 1982). This means that for
moviegoers the consumption experience is an end in itself (Reddy et al., 1998). Second,
movies are an experience good because individuals do not know what the value of the movie
will be to them until they experience it (Shapiro & Varian, 1999).
Based on work by Litman (1983) and Sochay (1994), creative sphere, scheduling, release
pattern and marketing effort were used as categories of independent variables. However, an
explicit criterion was not suggested for this categorization. Historically, neither the creators
nor the distributors of “cultural products” have used analytics (data, statistics, predictive
modeling) to determine the likely success of their offerings. It is easy to see why most people
see the prediction of taste as an art (Thomas et al., 2009). But the balance between art and
science is shifting. Today companies have unprecedented access to data and sophisticated
technology that allows even the best known experts to weigh factors and consider evidence
that was unobtainable just a few years ago. As a result, the entertainment industry is looking
for prominent features for the prediction of consumer taste.
Gaikar et al. (2015) used Twitter data to predict the success of a movie. Bhave et al. (2015)
discussed various factors that can impact the success of a movie. We are living in the era of
the internet, where people place growing trust in sentiments expressed on social media. Mundra et al.
(2019) performed sentiment analysis on tweets to predict the success of a movie.
Since not much work has been carried out in India in this area, an attempt has been made to
develop a model that can predict the success factors of a movie. Our study differs from
earlier work in that no reported study has predicted Bollywood movie success using the variables
we have taken; moreover, the study is longitudinal, covering two years, namely 2014 and
2016. The research attempts to review and empirically test predictors that have been adopted
by field practitioners for the purpose of predicting movie success in Bollywood.
2. Research Methodology
The objectives of the study were to answer the following research questions.
1. Determine the key variables that predict the success of a Bollywood movie.
2. Determine whether budget is an important factor for the success of a Bollywood movie.
Secondary data for two years (2014 and 2016) were collected from reliable sources: IMDb
ratings, Bollywood Hungama ratings, social media (Twitter ratings) and critics' ratings.
Major parameters that were taken in the study were:
Dependent Variables
1. Total Revenue (in crores) - This was the most frequently used variable in the literature (e.g.,
Litman & Ahn, 1998). It is the collection over the movie's entire run in
theatres.
2. First week box office collection - The first-week box office is considered to
be highly correlated with the total domestic box office collection. In this study the
variable is adopted to check whether some independent variables affect the two
dependent variables to different degrees (Thomas et al., 2005).
Independent Variables
1. Release date – Release date is considered important in India because many box office
hits have been movies released on holidays or festive occasions. In fact,
major production houses form a pact on releases for the occasions of
Diwali/Eid/Christmas so that the release dates do not clash. The variable is included in the
study to see how it affects the overall success of a movie. The rationale
is that a high-attendance-period release (e.g., Christmas/Eid/Diwali) attracts a
bigger audience, which leads to higher box office performance. In reviewing the
high-attendance periods, only a few periods, such as summer, were empirically
supported by previous literature (Litman, 1998; Sochay, 1994; Wyatt, 1991).
2. Budget of Movie (in crores) - Production costs have been considered an important
predictor because big budgets manifest as lavish sets, locales, costumes, special
effects etc. Most previous studies (Basuroy et al., 2003; Litman, 1982; Litman &
Ahn, 1998; Wyatt, 1991; Lee, 2009) have also supported the importance of budget.
The production budget data were taken from IMDb. Accurate information
on production budgets is very difficult to obtain because it is considered
confidential. Therefore, some caution is required in using the production budget
data because they are usually based on press releases by the studios or estimates
by insiders in the industry.
3. Genre - Based on previous research, the movies are categorized into seven
genres: action/adventure, children/family, comedy, drama, horror,
mystery/suspense, and sci-fi/fantasy. This listing is complete and mutually
exclusive. To code genre, this study consulted IMDb and Bollywood Hungama,
which coded genre for all the sample movies. The comedy genre was significant
in several studies, and the sci-fi/fantasy and horror genres were supported in other
literature; however, the studies were not unanimous.
4. Lead Actor / Actress - The effects of actors or star power have received
considerable attention in the literature (Basuroy, Chatterjee & Ravid, 2003; De Silva,
1998; Holbrook, 1999; Prag & Casavant, 1994; Wyatt, 1991). Here we have
categorized lead actors into 4 categories, the first three depending on the number of movies
the actor has done:
a. Debut (<5 movies)
b. Star (6 - 50 movies)
c. Super Star (more than 50 movies)
d. Female Actor
According to brand theories, the use of superstars is strongly recommended because
the use of stars in movies is actually ingredient branding, which involves creating brand
equity for materials, components and parts that are necessarily within other branded products
(Keller, 1998; Levin et al., 1997). Empirical studies of the actor effect, however, report conflicting
results. Although some researchers (Litman & Kohl, 1989; Sochay, 1994; Wallace et al.,
1993; Wyatt, 1991) found significant contributions from star power, several other studies (De
Silva, 1998; De Vany & Walls, 1999; Litman, 1983; Litman & Ahn, 1998; Prag & Casavant,
1994) did not find any significant effect.
5. No. of movies made by Director – Just as superstars are ingredient brands
according to brand theories, so are directors. The director becomes a brand
depending upon the total number of movies he or she has directed during his or
her career, and the more successes, the bigger the brand. Studies in
Hollywood, however, report that the effect of directors was not
significant (Litman, 1982; Litman & Ahn, 1998; Litman & Kohl, 1989;
Sochay, 1994; Wyatt, 1991).
6. Sequel – A sequel uses an established brand name to introduce a new product and is
conceptualized as a brand extension (Keller, 1998). The use of an established parent
brand allows brand extensions to get customer attention and thus reduces
marketing costs. In this study we have recorded whether or not a movie is a
sequel.
7. Production House – The production house is an important variable in the Indian film
industry and has been taken as one of the variables. Production houses like
Yash Raj Films, Dharma Productions, UTV Motion Pictures and Balaji Films are
considered some of the most successful production houses. These successful
houses can also bear the cost of lavish sets and costumes and expensive digital
manipulations, as they have budgets for all pre-production and post-production
activities.
8. Critics’ ratings – These are ratings given by famous movie critics. As with
superstars and production houses, the effect of critics' ratings has been widely
tested by previous research (De Silva, 1998; Litman & Ahn, 1998; Sochay, 1994;
Prag & Casavant, 1994). Critics assist individuals in making a movie choice,
understanding the content of the movie, developing an initial opinion of the film,
and communicating movie information to others (Austin, 1983).
9. IMDb ratings – IMDb is the world’s largest and best-known movie database website, with
ratings on a 1 – 10 scale.
10. Bollywood Hungama ratings – Ratings given to the movies on the Bollywood Hungama
website (1 – 10 scale).
11. Social Media - Average of tweets for the movie on a scale of 1 - 10.
Data on all the above-mentioned factors were collected for 140 movies from
reliable secondary sources.
3. Result and Analysis
The collected data were raw, and a few missing values left the dataset
incomplete. The data also contained some outliers. To make the data ready for
analysis, data cleaning and data transformation were performed. Dummy variables
were created for all the categorical variables in the data set, as described
below:
A. Release Date – A release date dummy was formed, where 1 meant the movie was
released on a festival or holiday and 0 meant no occasion.
B. No. of movies the Actor has done - There were 4 categories: Debut, Actor,
Superstar and Female Actor. With 4 categories, 3 dummy
variables were made, namely Debut, Actor and Superstar.
C. Genre – There were 7 genre categories, so 6 dummy variables were made,
namely DummyAction, DummyFamily, DummyComedy, DummyDrama,
DummyHorror, and DummyMystery.
D. Production House – This variable was divided into 5 categories, where 4 were
the famed production houses and the last category was "others". The 4 dummy
variables were DummyYashraj, DummyDharma, DummyUtv and DummyBalaji.
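The dummy coding described in items A to D can be sketched in a few lines. This is only an illustration of the "k categories, k - 1 dummies" rule; the helper name `dummy_code`, the sample movies and the label spellings are hypothetical, and the baselines (sci-fi for genre, "Others" for production house) are inferred from the dummy names listed above.

```python
# Category labels; the omitted (baseline) category gets no dummy column.
GENRES = ["Action", "Family", "Comedy", "Drama", "Horror", "Mystery", "SciFi"]
HOUSES = ["Yashraj", "Dharma", "Utv", "Balaji", "Others"]


def dummy_code(values, categories, baseline):
    """Return one 0/1 column per non-baseline category (k categories -> k - 1 dummies)."""
    kept = [c for c in categories if c != baseline]
    return {f"Dummy{c}": [1 if v == c else 0 for v in values] for c in kept}


# Four hypothetical movies, for illustration only.
genres = ["Drama", "Comedy", "SciFi", "Action"]
houses = ["Dharma", "Others", "Others", "Yashraj"]
on_festival = [True, False, False, True]

# Release date: 1 = released on a festival or holiday, 0 = no occasion.
release_dummy = [1 if flag else 0 for flag in on_festival]
genre_dummies = dummy_code(genres, GENRES, baseline="SciFi")
house_dummies = dummy_code(houses, HOUSES, baseline="Others")

print(release_dummy)
print(genre_dummies)
print(house_dummies)
```

A movie in the baseline category (the third movie here) has 0 in every dummy column, which is why one category can be dropped without losing information.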
The sampled movies earned an average revenue of ₹ 63.53 crores. The mean of the first
week box office collection was ₹ 28.92 crores, showing that the performance of the first
week accounted for approximately 46% of the total box office revenue. The average
budget of the sampled movies was ₹ 33.60 crores, which accounted for approximately
53% of the average box office revenue. The genre distribution was: action (n = 36, 26%),
family (n = 12, 9%), comedy (n = 30, 21%), drama (n = 45, 32%), horror (n = 8, 6%),
mystery (n = 8, 6%) and sci-fi (n = 1, 1%).
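The shares quoted in this paragraph follow directly from the reported means and genre counts. A quick arithmetic check, using only figures stated in the text (the variable names are illustrative):

```python
# Means reported in the text, in crore rupees.
mean_total_revenue = 63.53
mean_first_week = 28.92
mean_budget = 33.60

first_week_share = mean_first_week / mean_total_revenue  # first week as a share of total
budget_share = mean_budget / mean_total_revenue          # budget as a share of total

# Genre counts reported in the text.
genre_counts = {
    "action": 36, "family": 12, "comedy": 30, "drama": 45,
    "horror": 8, "mystery": 8, "sci-fi": 1,
}
total_movies = sum(genre_counts.values())
genre_share = {g: round(100 * n / total_movies) for g, n in genre_counts.items()}

print(f"first week / total: {first_week_share:.1%}")
print(f"budget / total:     {budget_share:.1%}")
print(total_movies, genre_share)
```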
Regarding the information-source parameters, the average critics’ rating was 4.70 (1
– 10 scale), the average IMDb rating was 5.76 (1 – 10 scale) and the average Bollywood
Hungama rating was 7.02 (1 – 10 scale), considerably higher than the IMDb rating. In
the era of social media, tweets played a critical role in predicting
the success of a movie. The average tweet rating was 6.18 (1 – 10 scale). Regarding
release time, 12 of the sampled movies (9%) were released during the festival season or
during holidays. Of these 12 movies, 8 earned total revenue of
more than ₹ 200 crores. The average values of all quantitative data are presented in
Table 1.
Table 1. Descriptive Analysis

Variables                            Mean
First Week Box-Office Collection     28.9179
Total Revenue (in Crores)            63.53
Budget (in Crores)                   33.6
Actor No. of Movies                  35.97
Tweet ratings                        6.0921
Director No. of Movies               6.18
Critics Rating                       4.6939
IMDb Rating                          5.7598
Bollywood Hungama Rating             7.02
Regression Analysis: The results of the regression analysis, with total box office collection and
first week box office collection as the dependent variables, are shown in Table 2. Both
models explained a significantly high proportion of the variance, with R² values of
0.703 and 0.673 respectively.
Table 2. Summary of the outcome of the regression analysis
Independent Variable
First week
Total
Dummy Release date
Drama
Dharma
Budget
Bollywood Hungama
Tweets
30.027**
78.955**
13.698*
2.446**
1.638**
4.258*
3.499**
N
R
R2
Adjusted R2
140
0.838
0.703
0.69
140
0.821
0.673
0.661
13.481*
0.843**
F
Probability > F
78.689
53.593
0.000
0.000
*p<.05, **p<.01
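The kind of fit summarised in Table 2 can be sketched as follows. This is a minimal illustration of multiple linear regression by ordinary least squares, not the software or procedure used in the study, which the paper does not name. The data rows, the three chosen predictors and the function names `solve`, `ols_coefficients`, `predict` and `fit_summary` are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_coefficients(rows, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    p = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(p)] for i in range(p)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, rows):
    """Fitted values for each row of predictors."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]


def fit_summary(y, fitted, k):
    """Return (R^2, adjusted R^2, F) for a model with k predictors plus an intercept."""
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    return r2, adj_r2, f_stat


# Made-up movies: [budget in crores, festival-release dummy, tweet rating].
rows = [
    [10, 0, 5.0], [20, 0, 6.0], [20, 0, 6.0], [30, 1, 7.0],
    [40, 0, 5.5], [50, 1, 8.0], [60, 0, 6.5], [25, 1, 6.0],
]
revenue = [15.0, 35.0, 45.0, 95.0, 70.0, 150.0, 110.0, 80.0]  # made-up totals in crores

beta = ols_coefficients(rows, revenue)
r2, adj_r2, f_stat = fit_summary(revenue, predict(beta, rows), k=3)
print("coefficients:", [round(b, 3) for b in beta])
print(f"R2={r2:.3f}  adjusted R2={adj_r2:.3f}  F={f_stat:.2f}")
```

Each coefficient is read as the change in predicted revenue for a one-unit change in that predictor with the others held fixed; for a dummy such as festival release, it is the shift relative to the baseline category.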
The first model, with total box office collection as the dependent variable, shows that release date,
drama, budget, Bollywood Hungama rating and tweets were the significant predictors. Release
date contributed significantly to total box office collection:
movies released during festivals and holidays performed better. One interesting finding is that
the genre ‘drama’ was positively related to total box office collection. No significant effect
was found for the other genres. Budget also contributes significantly to the success of a movie.
Because budget includes expenses on advertising and promotion, a larger budget is associated with
a higher chance of success. Bollywood Hungama ratings and tweet ratings were both found to be
significant. The study thus highlights that social media sentiment, such as tweet ratings, has
emerged as an important factor in predicting the success of a movie.
The second model, predicting first week box office collection, found fewer
significant variables: release date, Dharma Productions, budget and tweets. For total
box office collection, Bollywood Hungama ratings and tweets were both significant,
but for first week box office collection only tweet ratings were significant. This
shows that the Bollywood Hungama rating is a predictor rather than an influencer of the box
office success of a movie. Similarly, Dharma Productions is an influencer and drama is a
predictor. For first week collection, Dharma Productions was significant irrespective of the
genre of the movie, whereas for total collection drama was significant irrespective of the production
house.
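The comparison between the two models comes down to which predictors they share and which are specific to one of them. A small sketch using the significant predictors named in the text (the variable names are illustrative):

```python
# Significant predictors reported for each model.
total_predictors = {"Release date", "Drama", "Budget", "Bollywood Hungama", "Tweets"}
first_week_predictors = {"Release date", "Dharma Productions", "Budget", "Tweets"}

shared = total_predictors & first_week_predictors           # significant in both models
only_total = total_predictors - first_week_predictors       # significant for total only
only_first_week = first_week_predictors - total_predictors  # significant for first week only

print("both models:    ", sorted(shared))
print("total only:     ", sorted(only_total))
print("first week only:", sorted(only_first_week))
```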
4. Discussion
Based on the results of the regression analysis, we are in a position to answer the following
research questions:
RQ1: Determine the key variables that predict the success of a Bollywood
movie.
Two models were developed to measure the success of a movie in terms of:
a) Total box-office collection.
b) First week box-office collection.
The variables found significant in predicting success are given below:
a) Total box office collection (Release date, Drama, Budget, Bollywood Hungama, Tweets)
b) First week box office collection (Release date, Dharma Productions, Budget, Tweets)
RQ2: Determine whether budget is an important factor for the success of a movie.
The regression analysis showed that budget is significant in predicting box office
collection, both for the first week and in total.
5. Practical Implication
This study carries implications for both movie makers and the audience. First, the study
revealed that the release date of a movie is highly significant in predicting box office
collection, both for the first week and in total. People are more willing to take time out to go to
the theatre during holidays or the festival season. Second, an interesting
finding of the study is that tweets play a significant role in predicting box office success.
The results suggest that the decision to watch a movie is guided by Twitter ratings, and that people
regard Twitter ratings as a more reliable source for judging the performance of a movie
than other rating sources. First week box office success is explained by Twitter
ratings, whereas total box office success is explained by both Twitter ratings and
Bollywood Hungama ratings. As expected, budget has a positive impact on the
success of a movie. For this study, budget includes both movie-making expenses and
promotion expenses. The study also revealed that, among the various production houses,
Dharma Productions has a positive impact on the audience. It is one of the significant
predictors of first week box-office success: during the first week people are willing to
watch a movie under the Dharma Productions banner irrespective of the star cast or genre.
Last but not least, among all the genres ‘drama’ is the choice of the audience. It has a
positive impact on total box office collection.
6. Conclusion & Future Scope
This research study proposes models for measuring the success of Bollywood
movies. Two models were proposed, one to predict first week box office collection and one to
predict total box office collection. Regression analysis was performed on the data set. The study revealed
that release date, Dharma Productions, budget and tweets significantly impacted first week
box-office success. For total box office collection, the significant variables were release
date, drama, budget, Bollywood Hungama ratings and tweet ratings. The key contribution
of this research is to give direction to film makers regarding the significant drivers of the
success of a movie. In future work, data for more recent years can be added to the study. Advanced
machine learning techniques can also be applied to the data set for better predictability of the
results.
References
1. Austin, B. A. (1983). A longitudinal test of the taste culture and elitist hypotheses. Journal of Popular
Film and Television, 11, 157–167.
2. Basuroy, S., Chatterjee, S., & Ravid, S. A. (2003). How critical are critical reviews? The box office
effects of film critics, star power, and budget. Journal of Marketing, 67(4), 103–117.
3. Bhave, A., Kulkarni, H., Biramane, V., & Kosamkar, P. (2015). Role of different
factors in predicting movie success. Pervasive Computing (ICPC), 2015 International Conference, IEEE.
DOI: 10.1109/PERVASIVE.2015.7087152
4. Chang, B. H., & Ki, E. J. (2005). Devising a practical model for predicting theatrical movie success:
Focusing on the experience good property. Journal of Media Economics, 18(4), 247–269.
5. De Silva, I. (1998). Consumer selection of motion pictures. In B. R. Litman (Ed.), The motion picture
mega-industry (pp. 144–171). Needham Heights, MA: Allyn & Bacon.
6. Gaikar, D. D., Marakarkandy, B., & Dasgupta, C. (2015). Using Twitter data to predict
the performance of Bollywood movies. Industrial Management & Data Systems, 115(9),
1604–1621.
7. Hirschman, E. C., & Holbrook, M. B. (1982). Hedonic consumption: Emerging concepts, methods and
propositions. Journal of Marketing, 46, 92–101.
8. Holbrook, M. B., & Hirschman, E. C. (1982). The experiential aspects of consumption: Consumer
fantasies, feelings and fun. Journal of Consumer Research, 9, 132–140.
9. http://www.hollywoodreporter.com/news/indian-entertainment-biz-revenues-reach-270337 accessed on
8th July 2016 at 4.53 p.m.
10. http://www.pwc.in/en_IN/in/assets/pdfs/PwC- Indian-Entertainment-and-Media-Outlook- 2009.pdf.
11. Keller, K. L. (1998). Strategic brand management. Upper Saddle River, NJ: Prentice Hall.
12. Lee, K. J., & Chang, W. (2009). Bayesian belief network for box office performance: A case study of
Korean movies. Expert Systems with Applications, 36(1), 280–291.
13. Levin, A. M., Levin, I. P., & Heath, C. E. (1997). Movie stars and authors as brand names: Measuring
brand equity in experiential products. Advances in Consumer Research, 24, 175–181.
14. Litman, B. R. (1982). Decision-making in the film industry: The influence of the TV market. Journal of
Communication, 32, 33–52.
15. Litman, B. R. (1983). Predicting success of theatrical movies: An empirical study. Journal of Popular
Culture, 16, 159–175.
16. Litman, B. R. (Ed.). (1998). Motion picture mega-industry. Needham Heights, MA: Allyn & Bacon.
17. Litman, B. R., & Kohl, L. S. (1989). Predicting financial success of motion pictures: The ’80s
experience. Journal of Media Economics, 2, 35–50.
18. Litman, B. R., & Ahn, H. (1998). Predicting financial success of motion pictures: The early ’90s
experience. In B. R. Litman (Ed.), Motion picture mega-industry (pp. 172–197). Needham Heights,
MA: Allyn & Bacon.
19. Mundra S., Dhingra A., Kapur A., Joshi D. (2019) Prediction of a Movie’s Success Using Data
Mining Techniques. In: Satapathy S., Joshi A. (eds) Information and Communication Technology
for Intelligent Systems. Smart Innovation, Systems and Technologies, vol 106. Springer,
Singapore
20. Prag, J., & Casavant, J. (1994). An empirical study of the determinants of revenues and marketing
expenditures in the motion picture industry. Journal of Cultural Economics, 18, 217–235.
21. Shapiro, C., & Varian, H. R. (1999). Information rules. Boston: Harvard Business School Press.
22. Sochay, S. (1994). Predicting performance of motion pictures. Journal of Media Economics, 7, 1–20.
23. De Vany, A., & Walls, W. D. (1999). Uncertainty in the movies: Can star power reduce the terror of the box
office? Journal of Cultural Economics, 23, 285–318.
24. Wyatt, J. (1991). High concept, product differentiation, and the contemporary U.S. film industry. Texas
Film and Media Study.
LAL BAHADUR SHASTRI INSTITUTE OF MANAGEMENT, DELHI
PLOT NO. 11/7, SECTOR-11, DWARKA, NEW DELHI-110075
Ph.: 011-25307700, www.lbsim.ac.in