Project proposal

Ejebagom John Ojogbo
Omotoyosi Taiwo
Tara Welytok



The success of a movie is usually predicted from its opening weekend sales. Our application project aims to test the claim that the opening weekend is the highest-grossing weekend for a movie and, in general, a strong indicator of the movie’s future success. Thus, the key question this project will try to answer is: “Is the first weekend the most successful of a movie’s theatrical run?”

To test this claim, we shall use the data from the file Weekend Movie Box Office Receipts (movieweekend.dat), acquired from the Journal of Statistics Education’s data archive. The dataset contains 49 movies that opened in theaters from 1977 to 2007. The movies were drawn from a variety of sources and include Academy Award Best Picture winners, movies in series such as the Harry Potter collection, highest-grossing movies, and pictures from the Sundance Festival. For each movie in the set, the data records the following properties: the name of the movie; the week observed, e.g. the first week or the sixth; the weekend gross per theater (in dollars); and the weekend date. The weekend date used in the data refers to the Friday at the start of the weekend. Weekday receipts are not included in the dataset, but they are not needed to test our claim.

To find answers to the question posed above, we shall apply hypothesis testing. Our null and alternative hypotheses are defined as follows:

H0: “the first weekend is the most successful”

Ha: “the first weekend is not the most successful”

We define the most successful weekend as the weekend that accounts for the highest percentage of a movie’s gross box office earnings in dollars. We shall therefore analyse the contribution of each movie’s first weekend as a proportion of its entire box office gross. From the calculated values we expect to construct confidence intervals, with special focus on the 90%, 95% and 99% levels. These would give a clear picture of how strong the results of our hypothesis tests are. It has been observed that after a certain number of weeks the viewership (and by extension, the box office receipts) drops considerably. Determining this point for each movie would simplify the analysis by reducing the number of weeks to be examined.
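The proportion described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the records are invented (they are not values from movieweekend.dat), the "entire box office gross" is taken to be the sum of the observed weekend grosses, and the interval uses a normal approximation for the mean share. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical records: (movie, week number, weekend gross in dollars).
# Invented for illustration; not taken from movieweekend.dat.
SAMPLE = [
    ("Movie A", 1, 50.0), ("Movie A", 2, 30.0), ("Movie A", 3, 20.0),
    ("Movie B", 1, 10.0), ("Movie B", 2, 20.0), ("Movie B", 3, 10.0),
    ("Movie C", 1, 60.0), ("Movie C", 2, 25.0), ("Movie C", 3, 15.0),
]

def first_weekend_shares(records):
    """Return each movie's week-1 gross as a fraction of its summed weekend gross."""
    totals, firsts = {}, {}
    for movie, week, gross in records:
        totals[movie] = totals.get(movie, 0.0) + gross
        if week == 1:
            firsts[movie] = gross
    # Movies with no week-1 record are skipped (assumption).
    return {m: firsts[m] / totals[m] for m in totals if m in firsts}

def mean_confidence_interval(values, level):
    """Normal-approximation confidence interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width

shares = first_weekend_shares(SAMPLE)
for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(list(shares.values()), level)
    print(f"{level:.0%} interval for mean first-weekend share: ({low:.3f}, {high:.3f})")
```

With only 49 movies, a t-based interval may be more appropriate than the normal approximation used in this sketch.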

We shall count how many movies in the set have their opening weekend as their highest-grossing weekend and, with appropriate weighting and analysis tools, use this count to decide whether or not to reject our null hypothesis. In the course of testing and analysis we expect to discover whether there is any correlation between a given weekend’s gross and the total box office receipts, and which weekend (if not the opening) is in fact the most successful in the theatrical life of a movie. We also expect to see trends concerning the week in a movie’s theatrical life at which viewership declines considerably, and whether such a trend applies to all or most movies in the dataset.
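As a rough sketch of the counting step, the Python below flags the movies whose opening weekend is their highest-grossing weekend and places a Wilson score interval around the resulting proportion. The sample records are invented, the function names are hypothetical, and the Wilson interval is one possible choice of analysis tool, not a method fixed by this proposal.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical records: (movie, week number, weekend gross in dollars).
# Invented for illustration; not taken from movieweekend.dat.
SAMPLE = [
    ("Movie A", 1, 50.0), ("Movie A", 2, 30.0), ("Movie A", 3, 20.0),
    ("Movie B", 1, 10.0), ("Movie B", 2, 20.0), ("Movie B", 3, 10.0),
    ("Movie C", 1, 60.0), ("Movie C", 2, 25.0), ("Movie C", 3, 15.0),
]

def opening_is_peak(records):
    """Map each movie to True if week 1 is its highest-grossing weekend."""
    best = {}  # movie -> (gross, -week); ties resolve to the earlier week
    for movie, week, gross in records:
        key = (gross, -week)
        if movie not in best or key > best[movie]:
            best[movie] = key
    return {movie: -neg_week == 1 for movie, (_, neg_week) in best.items()}

def wilson_interval(successes, n, level):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width

flags = opening_is_peak(SAMPLE)
count = sum(flags.values())
for level in (0.90, 0.95, 0.99):
    low, high = wilson_interval(count, len(flags), level)
    print(f"{count}/{len(flags)} movies peak in week 1; "
          f"{level:.0%} interval: ({low:.3f}, {high:.3f})")
```

If the resulting interval sat well below 1, that would suggest that a non-trivial share of movies peak after their opening weekend; how to turn this into a formal rejection rule for the null hypothesis is left to the analysis itself.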

1. McLaren, C., DePaolo, C. (2009). Movie Data. Journal of Statistics Education Volume 17. Retrieved from