Statistics: Decisions Through Data

UNIT 13: Correlation

Summary of the Video

The documentary story concerns the Minnesota Twins Study. We open with a look at a twins convention, then move to data. A scatterplot of the heights of pairs of identical twins shows a strong linear relationship. How can we describe the strength and direction of such a relation numerically?

The correlation coefficient r does this. Animated graphics present the main facts about correlation: r is always a number between -1 and 1; positive r means positive association, and the closer r is to 1, the closer to a straight line the scatterplot is; r = +1 is perfect positive linear association, the case in which all the points lie exactly on a straight line; negative r similarly measures negative linear association.

The scatterplot of twins' heights has r = 0.92, a strong, positive, linear association. (Actually, because we could plot either twin on either axis, there's a special r for this setting that is not the same as the basic r whose recipe we will learn. But the properties and interpretation are the same.) These twins were separated shortly after birth and raised apart, so the high correlation suggests that inheritance has a lot to do with determining height. A major purpose of the Minnesota study is to use twins to examine the role of heredity.

We meet Mark and Jerry, identical twins raised apart, who describe their remarkable similarities in appearance and personality. Is their similarity just a coincidence? We follow another pair of twins through a battery of tests. Thomas Bouchard, the psychologist who directs the Minnesota Twins Study, explains how they use correlation. Compare the correlation between twins raised apart with that between twins raised together. Both heredity and environment are similar for twins raised together, so they should have higher correlations for most characteristics than do twins raised apart. Comparing the correlations helps show the role of heredity.

A scatterplot of a personality test score for identical twins raised apart has r = 0.49, not as strong as the correlation of heights. Twins have somewhat similar personalities, but the relation is not as close as the relation between their heights. For twins raised together, the personality correlation is r = 0.52. Because the two correlations are very similar, it appears that heredity plays a substantial role in personality and that a common home environment contributes little. (Some skeptics think that separated twins tend to be adopted by similar homes, so that some environment effects are included. But heredity appears to play a larger role in personality than most psychologists believed a few years ago.)

How can we calculate r from data? An animated graphic shows the recipe.
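
The recipe itself does not appear in this summary. Assuming the video shows the usual textbook formula, built from standardized values of x and y, it would read as follows. The symbols are assumptions rather than taken from the video: n is the number of (x, y) pairs, \bar{x} and s_x are the mean and standard deviation of the x values, and \bar{y} and s_y are the same for the y values.

```latex
\[
  r = \frac{1}{n-1} \sum_{i=1}^{n}
      \left( \frac{x_i - \bar{x}}{s_x} \right)
      \left( \frac{y_i - \bar{y}}{s_y} \right)
\]
```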


The narrator notes that the formula uses standardized values of x and y (Unit 6), so that r doesn't change when we change the unit of measurement from (for example) pounds to kilograms. No numerical example appears, because the video concentrates on knowing the properties of r.
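
As a rough sketch of why standardizing makes r independent of units, the Python below assumes the recipe is the usual one: add up the products of the standardized x and y values, then divide by n - 1. The function name `correlation` and every data value are hypothetical, invented only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from products of standardized values (assumed recipe)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical heights (inches) and weights (pounds), invented for illustration
heights_in = [60.0, 62.0, 65.0, 68.0, 71.0, 74.0]
weights_lb = [110.0, 125.0, 140.0, 150.0, 172.0, 190.0]
weights_kg = [w * 0.45359237 for w in weights_lb]  # same weights in kilograms

r_lb = correlation(heights_in, weights_lb)
r_kg = correlation(heights_in, weights_kg)
print(round(r_lb, 4), round(r_kg, 4))  # the two values agree
```

Converting pounds to kilograms rescales each deviation from the mean and the standard deviation by the same factor, so every standardized value, and therefore r, is unchanged apart from floating-point rounding.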

Another graphic shows how r helps interpret a scatterplot: the same set of points looks more linear when the plot leaves lots of empty space around them, even though r is unchanged. So graphs and numerical descriptions of data work together.