[Stats] Homework

No problem. I've taught statistics classes, so I understand that these concepts can be very hard to grasp at first. Here are a few pointers for the first two questions.

For #1, I was a little confused by the question, but I assume the answer is yes, it is appropriate. The correlation coefficient shows how tightly the points cluster around the regression line. It might also be worth noting that there is a cluster of outliers in the top right corner, and that a logarithmic regression might provide a better fit than a linear one in this case.

For #2, remember that the correlation is the square root of R², which gives .352 (assuming it's a positive relationship, i.e. higher drop = longer duration). I'm not sure what is meant by #2 B and C, but I think it might be related to this. So keep in mind that one standard deviation below the mean is the 15.9th percentile, and three standard deviations above the mean is the 99.9th percentile.

Here is how you would do the calculation for B and C, first for one standard deviation below the mean and then for three standard deviations above:

Step 1: Convert the 15.9th percentile (.159) into a z score for x. This is a z score of -0.998576 (roughly -1, as you'd expect for one standard deviation below the mean).

Step 2: Multiply the correlation .352 by -0.998576 to get -0.351499 (the z score for y).

Step 3: Given a z score of -0.351499, the percentile for y is 0.362607, or the 36.3rd percentile. Or, phrased as in the question, you would predict the duration to be in the 36th percentile if the drop is one standard deviation below the drop mean.

Step 1: Convert the 99.9th percentile (.999) into a z score for x. This is a z score of 3.090232 (roughly 3, as you'd expect for three standard deviations above the mean).

Step 2: Multiply the correlation .352 by 3.090232 to get 1.08776 (the z score for y).

Step 3: Given a z score of 1.08776, the percentile for y is 0.86165, or the 86.2nd percentile. Or, phrased as in the question, you would predict the duration to be in the 86th percentile if the drop is three standard deviations above the drop mean.
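
If you want to check the three steps yourself, here is a minimal Python sketch using only the standard library. The function name predicted_percentile is just something I made up for illustration, and it assumes both variables are roughly normally distributed and that the correlation is the .352 used above.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def predicted_percentile(x_percentile, r):
    """Predict y's percentile from x's percentile (hypothetical helper).

    Assumes x and y are both roughly normal and r is their correlation.
    """
    z_x = STD_NORMAL.inv_cdf(x_percentile)  # Step 1: percentile -> z score for x
    z_y = r * z_x                           # Step 2: multiply by the correlation
    return STD_NORMAL.cdf(z_y)              # Step 3: z score for y -> percentile


r = 0.352  # correlation taken from the steps above
for x_pct in (0.159, 0.999):
    y_pct = predicted_percentile(x_pct, r)
    print(f"x percentile {x_pct:.3f} -> predicted y percentile {y_pct:.3f}")
```

The results should agree with the hand calculation above up to rounding. Notice that both predictions land closer to the 50th percentile than the x values did, which is what you would expect whenever the correlation is less than 1.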

/r/HomeworkHelp Thread