REGRESSION AND CORRELATION
Twenty movies from 1999 and 2000 are listed in the attached Excel sheet with data entered.
You should have four values (variables) for each movie. Two of the values are found through the Rotten Tomatoes website
(www.rottentomatoes.com). The ‘Tomatometer’ is the percentage of movie critics who liked the movie and the ‘Audience’ value
is what website users thought of the movie. Finally, there are values for Worldwide Gross (amount the movie made) and the
Production Budget (how much the movie cost) for each movie.
Now do the following:
1) Plot four different regression graphs (X v Y): Tomatometer v Audience, Budget v Gross, Budget v Tomatometer, Audience v
Gross and find the best-fit line equation and correlation coefficient (r-value) for each one. X represents the independent
variable and Y represents the dependent variable.
2) Use the r-value and Table I (pg. 797 of your textbook) to determine if your best-fit lines are significant at the 0.05 significance level.
(use 18 degrees of freedom here)
3) Calculate and interpret two 95% confidence intervals to estimate the true mean Budget value. The first will assume
normality and use a z-interval; the second will use a t-interval.
4) Finally do a write-up on your results including your regression graphs, your confidence intervals and hypothesis test, and
answers to the following questions using your graphs:
A) Using regression, does it appear that critics and audience members agree on what a movie should be rated? Why or why not?
B) Does it appear that large budgets lead to high-grossing movies? That is, are larger budgets associated with higher grossing
movies? Why or why not?
C) Does it appear that movie critics like high-budget movies? In other words, is there a relationship between higher ratings and
higher budgets? Why or why not?
D) Are higher ratings by audience members associated with higher gross amounts earned by the movie? Why or why not?
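
If you want to check the best-fit line and r-value from step 1 outside of Excel, the calculation can be sketched in Python as below. This is only an illustration: the function name fit_line and the toy numbers are assumptions made for the example, not part of the assignment and not the movie data. Pass in X (independent variable) and Y (dependent variable) as two lists of equal length.

```python
from math import sqrt

def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Toy placeholder values for illustration only, NOT the spreadsheet data.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
slope, intercept, r = fit_line(toy_x, toy_y)
print(f"y = {slope:.2f}x + {intercept:.2f}, r = {r:.3f}")
```

For the assignment you would call fit_line four times, once per X v Y pair listed in step 1.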
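
The significance check in step 2 compares |r| with a critical value for 18 degrees of freedom. The sketch below assumes a two-tailed test and derives the critical r from the standard two-tailed t critical value (2.101 for 18 df at the 0.05 level). Your textbook's Table I may be laid out or rounded differently, so treat the table as the authority. The names critical_r and is_significant are made up for this example.

```python
from math import sqrt

# Two-tailed t critical value for alpha = 0.05 with 18 degrees of freedom
# (standard t table). Assumption: the test is two-tailed.
T_CRIT_18 = 2.101

def critical_r(t_crit, df):
    """Convert a t critical value into the matching critical value of r."""
    return t_crit / sqrt(t_crit ** 2 + df)

def is_significant(r, df=18, t_crit=T_CRIT_18):
    """True when |r| is larger than the critical r, i.e. the correlation is significant."""
    return abs(r) > critical_r(t_crit, df)

print(f"critical r for 18 df: {critical_r(T_CRIT_18, 18):.3f}")
```

A correlation is judged significant when its absolute value is larger than the critical value, so a strong negative r counts just as much as a strong positive one.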
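
Both intervals in step 3 have the form: sample mean plus or minus (critical value) x (standard deviation) / sqrt(n). A minimal sketch follows, assuming the sample standard deviation is used in both intervals (the assignment does not give a population value) and that n = 20, so the t-interval has 19 degrees of freedom. The function name mean_interval is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.960       # z critical value for a 95% interval
T_95_DF19 = 2.093  # t critical value for a 95% interval, 19 df (valid only when n = 20)

def mean_interval(values, critical):
    """Return (lower, upper) bounds: mean +/- critical * s / sqrt(n)."""
    n = len(values)
    center = mean(values)
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    margin = critical * stdev(values) / sqrt(n)
    return center - margin, center + margin
```

With the 20 Budget values in a list, mean_interval(budgets, Z_95) gives the z-interval and mean_interval(budgets, T_95_DF19) gives the t-interval. The t-interval will be slightly wider because its critical value is larger.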