will look at normal versus abnormal distributions. A normal distribution is symmetric about the mean and bell-shaped, so values near the middle of the curve occur more frequently than values further from the mean. The normal distribution is one of the most important probability distributions in statistics, as it fits many natural phenomena. See the image below for a normal distribution of dice rolling.
Rolling dice is a good way to illustrate a normal distribution. When one fair die is rolled 100 times, the chance of getting ‘1’ on any given roll is 1/6, or about 16.7%. If we roll the die 1000 times, the chance of getting ‘1’ on each roll is, again, the same: about 16.7%. If we roll two dice simultaneously, there are 36 possible combinations. The probability that a given die shows ‘1’ (six of the 36 combinations) is again about 16.7% (6/36). The bell shape appears when the dice are added together: totals near the middle (such as 7 for two dice) occur more often than totals at the extremes (2 or 12). The more dice you roll and add up, the more closely the distribution of totals resembles a normal curve.
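One way to see where the bell shape comes from is to count every possible outcome and add the dice together. The sketch below is illustrative only: the function name `sum_distribution` is a made-up helper, and it assumes fair six-sided dice. It lists the exact probability of each total and prints a rough text histogram for two dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sum_distribution(num_dice, faces=(1, 2, 3, 4, 5, 6)):
    """Exact probability of each total when num_dice dice are rolled and added."""
    # Enumerate every equally likely combination of faces.
    outcomes = list(product(faces, repeat=num_dice))
    counts = Counter(sum(combo) for combo in outcomes)
    return {total: Fraction(count, len(outcomes))
            for total, count in sorted(counts.items())}


one_die = sum_distribution(1)   # flat: every face has probability 1/6
two_dice = sum_distribution(2)  # peaked in the middle, at a total of 7

# Text histogram for two dice: one '#' per combination out of 36.
for total, prob in two_dice.items():
    print(f"{total:2d} {'#' * int(prob * 36):<6} {float(prob):.3f}")
```

Calling `sum_distribution(3)` or higher gives a smoother, more bell-like shape, which is the sense in which dice totals approximate a normal distribution.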
Let’s say that you want ‘1’ to come up more often than 16.7% of the time. So, you alter the sides on each die, changing the faces numbered 2 and 3 to 1. Now three of the six faces show ‘1’, so each die rolls ‘1’ about 50% of the time. This would be an abnormal distribution and would skew the results. A normal distribution is one in which the mean, median, and mode coincide with each other and make the curve we see in the dice image above. By altering the dice, the curve shifts as seen in the image below, and the mode, median, and mean no longer align.
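The effect of altering the dice can be checked by comparing the mean, median, and mode of the two-dice totals. This is a minimal sketch under stated assumptions: the helper `two_dice_totals` is hypothetical, and the altered die is taken to have faces 1, 1, 1, 4, 5, 6, as described above.

```python
from itertools import product
from statistics import mean, median, mode

FAIR_FACES = (1, 2, 3, 4, 5, 6)
ALTERED_FACES = (1, 1, 1, 4, 5, 6)  # faces 2 and 3 changed to 1


def two_dice_totals(faces):
    """All 36 equally likely totals for two dice with the given faces."""
    return [a + b for a, b in product(faces, repeat=2)]


fair = two_dice_totals(FAIR_FACES)
altered = two_dice_totals(ALTERED_FACES)

for label, totals in (("fair", fair), ("altered", altered)):
    print(label, "mean:", mean(totals),
          "median:", median(totals),
          "mode:", mode(totals))
```

For the fair dice, all three measures equal 7. For the altered dice, the mean and median drop to 6 while the mode falls to 2, so the three measures no longer coincide.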
As a leader at E-City’s university, you decide to run a regression analysis to see if workforce education level is associated with annual sales volume in the region. Then, you decide to run a regression analysis to see if workforce skills are related to annual sales volume. You find that one of these analyses produced abnormal results.
First, download the data that you will need for this week’s project:
Regression Analysis Data for E-City.xlsx
Then answer the questions below.
- Out of the two regression analyses, which outcome is normal?
- What is the R² value of the regression that you ran with workforce education level as the X variable and annual sales volume as the Y variable?
- Describe how you would use these regression results for further analyses. Describe what you think the outcomes mean.
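For readers who want to see how an R² value is produced, the sketch below fits a simple one-variable regression by least squares. It is an illustration, not the assignment's method: the function `simple_regression` is a hypothetical helper, and the numbers are made-up placeholders, not values from the E-City workbook. Replace them with the columns from the downloaded file.

```python
def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, plus R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (unexplained variation / total variation in y).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Placeholder data (assumed for illustration only, NOT the E-City data).
education_level = [10, 12, 12, 14, 16, 16, 18]   # X variable
annual_sales = [210, 250, 240, 300, 330, 345, 380]  # Y variable

slope, intercept, r_squared = simple_regression(education_level, annual_sales)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.3f}")
```

An R² close to 1 means the X variable accounts for most of the variation in Y; a value close to 0 means it accounts for very little.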