
Calculation of the Pearson Correlation Coefficient (r)

Pearson r tells us how closely the relationship between two variables approximates a straight line. The number that is calculated is called a correlation coefficient, and it ranges from -1 to +1.

In class we used the following example: We have a small sample of 8 graduate students from SUNY Albany. We know their GRE scores and their GPA for their first year of grad school. We can use this GPA measure as a measure of success in grad school. The obtained data looks like this:

Indiv   GRE (X)   Grad GPA (Y)   X²        Y²      XY
1       1200      3.60           1440000   12.96   4320
2       1250      3.70           1562500   13.69   4625
3       1300      3.80           1690000   14.44   4940
4       1400      4.00           1960000   16.00   5600
5       1450      3.60           2102500   12.96   5220
6       1450      3.70           2102500   13.69   5365
7       1475      3.50           2175625   12.25   5162.5
8       1550      3.90           2402500   15.21   6045

 

- If we just look over the GRE and GPA columns, GPA does not obviously increase as GRE scores increase. Calculating the Pearson correlation coefficient tells us how much these two variables are actually related (i.e., how closely they approximate a linear relationship).

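Computational formula for r:

\[ r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}} \]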

- As can be seen in this formula, in the denominator, we have the SS for our X variable and the SS for our Y variable.

- In the numerator, we have the sum of the cross products of X and Y.
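As a worked sketch of the calculation, the column totals from the table (n = 8) can be plugged in step by step. The labels SP (sum of cross products), SS_X and SS_Y are shorthand assumed here for the numerator and the two sums of squares in the denominator.

```latex
\begin{align*}
\sum X &= 11075, \quad \sum Y = 29.8, \quad \sum X^2 = 15435625, \quad \sum Y^2 = 111.20, \quad \sum XY = 41277.5 \\
SP &= \sum XY - \frac{(\sum X)(\sum Y)}{n} = 41277.5 - \frac{(11075)(29.8)}{8} = 41277.5 - 41254.375 = 23.125 \\
SS_X &= \sum X^2 - \frac{(\sum X)^2}{n} = 15435625 - \frac{11075^2}{8} = 15435625 - 15331953.125 = 103671.875 \\
SS_Y &= \sum Y^2 - \frac{(\sum Y)^2}{n} = 111.20 - \frac{29.8^2}{8} = 111.20 - 111.005 = 0.195 \\
r &= \frac{SP}{\sqrt{SS_X \cdot SS_Y}} = \frac{23.125}{\sqrt{(103671.875)(0.195)}} \approx \frac{23.125}{142.18} \approx 0.16
\end{align*}
```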

 

 

- So, r = +.16. What does this mean? The sign is positive, so there is a positive relationship between the two variables: as one increases, the other tends to increase as well.

 

- But what about the magnitude? .16 is pretty close to zero, so we would conclude that there is only a weak linear relationship between these two variables.
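The value r = +.16 can be checked with a short script. This is a minimal sketch in Python (not something used in class); the function name pearson_r and the variable names are illustrative choices. It divides the sum of cross products by the square root of the product of the two sums of squares, as described above.

```python
from math import sqrt

# Data copied from the table: GRE scores (X) and first-year grad GPA (Y).
gre = [1200, 1250, 1300, 1400, 1450, 1450, 1475, 1550]
gpa = [3.60, 3.70, 3.80, 4.00, 3.60, 3.70, 3.50, 3.90]


def pearson_r(x, y):
    """Pearson r from raw scores (computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)             # sum of squared X values
    sum_y2 = sum(v * v for v in y)             # sum of squared Y values
    sum_xy = sum(a * b for a, b in zip(x, y))  # sum of XY products

    sp = sum_xy - sum_x * sum_y / n            # sum of cross products
    ss_x = sum_x2 - sum_x ** 2 / n             # SS for X
    ss_y = sum_y2 - sum_y ** 2 / n             # SS for Y
    return sp / sqrt(ss_x * ss_y)


r = pearson_r(gre, gpa)
print(f"r = {r:+.2f}")  # prints r = +0.16
```

The same function works for any two equal-length lists of scores, as long as neither variable is constant (a constant variable has SS = 0, which would make the denominator zero).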

 

 

 

CORRELATION DOES NOT MEAN CAUSATION!!!

Even if you find that two variables are strongly correlated, this does not mean that one variable causes the other. The correlation merely tells you that the two variables are somehow related.

 

 

It could be that:

(1) X causes Y

(2) Y causes X

(3) Some other variable (e.g., motivation) influences both X and Y, which makes them appear related.
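Possibility (3) can be illustrated with a small simulation. This is a sketch on made-up data, not on the GRE/GPA sample: a hypothetical "motivation" score is generated at random, and X and Y are each built from motivation plus their own independent noise. Neither X nor Y enters the other's formula, yet the two come out strongly correlated. The sample size, noise level, and seed are arbitrary assumptions.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson r from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)


random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical third variable that drives both X and Y.
motivation = [random.gauss(0, 1) for _ in range(n)]

# X and Y each depend on motivation plus independent noise;
# X never appears in the formula for Y, and vice versa.
x = [m + random.gauss(0, 0.5) for m in motivation]
y = [m + random.gauss(0, 0.5) for m in motivation]

r_xy = pearson_r(x, y)
print(f"r between X and Y: {r_xy:+.2f}")
```

A large r here reflects the shared cause only. The correlation coefficient by itself cannot tell this situation apart from (1) or (2).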