UNIT 11:
Scatterplots
Summary of the Video
The video opens with views of manatees, impressively large sea creatures that live on the Gulf coast of Florida. Each year, a number of manatees are killed by power boats. We look at data that show the relation between the number of power boats registered in Florida and the number of manatees killed.
These are both quantitative variables; that is, they are measured in meaningful numerical units for which arithmetic operations make sense. Moreover, the number of power boats helps explain manatee deaths, so power boat registrations is the explanatory variable and manatees killed is the response variable. To show the relationship between the two variables, make a scatterplot with the explanatory variable on the horizontal axis and the response variable on the vertical axis. The relationship shown in the graph has led Florida to take measures to protect manatees from boats.
In looking at a scatterplot, first seek an overall pattern. Our scatterplot shows positive association because both variables increase together. A graph of hypothetical data on time to make a pie versus number of tries illustrates a negative association; as the number of tries increases, the time required decreases because of learning. The manatee scatterplot has a linear pattern; the points lie roughly in a straight line. Other scatterplots can have a curved pattern. In addition to the overall pattern, look for outliers, individual points that lie clearly outside the pattern. For example, in a plot of lake temperature against time in a volcanic region, one point that is suddenly much hotter than usual would trigger an investigation.
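The direction of an association can also be checked numerically. The sketch below is an illustration and not something the video does: it labels an association positive or negative by the sign of the summed products of deviations from the means, which is one common way to formalize "both variables increase together." The function name association_direction and the pie-making numbers are made up for this example.

```python
def association_direction(xs, ys):
    """Classify the overall direction of association between two variables.

    Uses the sign of the sum of products of deviations from the means
    (an assumption of this sketch, not a method taken from the video).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "none"


# Hypothetical pie-making data, invented for illustration only:
# explanatory variable = number of tries, response = minutes needed.
tries = [1, 2, 3, 4, 5]
minutes = [60, 48, 41, 37, 35]

print(association_direction(tries, minutes))  # negative
```

A single number like this summarizes direction only. It says nothing about whether the pattern is linear or curved, or whether outliers are present, so it supplements the scatterplot rather than replacing it.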
Not all scatterplots have an obvious pattern. One that does not arises from the 1970 draft lottery. We revisit the Vietnam-era draft of young men. Congress instituted a lottery in 1970, so that a random drawing of birth dates would determine the draft number of all men, who would be drafted in the order of their draft numbers. The 1970 lottery turned out to be unfair: men born late in the year tended to get lower draft numbers than those with earlier birth dates. To see this, we combine graphs with calculations. Plot draft number against birth date (both numbered 1 to 366). Now look at each month's draft numbers separately. Make a boxplot for each month and put the boxes side by side on the scatterplot. Connect the monthly medians with a line. This is called a median trace. The line trends down; the later months did get lower median draft numbers. For the 1971 lottery, as the video shows, a more careful random selection was made. The median trace shows no signs of unfairness in 1971.
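The median trace calculation can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions: it does not use the real 1970 or 1971 lottery results. It simulates a fair lottery by shuffling the numbers 1 to 366, then computes the median draft number for each month. The names DAYS_IN_MONTH and median_trace are chosen for this sketch.

```python
import random
from statistics import median

# Days per month in a leap-year calendar, giving 366 birth dates.
DAYS_IN_MONTH = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def median_trace(draft_numbers):
    """Return the median draft number for each of the 12 months.

    draft_numbers[i] is the draft number assigned to birth date i + 1,
    with birth dates numbered 1 to 366 in calendar order.
    """
    if len(draft_numbers) != sum(DAYS_IN_MONTH):
        raise ValueError("expected one draft number per birth date (366)")
    trace = []
    start = 0
    for days in DAYS_IN_MONTH:
        # Slice out this month's draft numbers and take their median.
        trace.append(median(draft_numbers[start:start + days]))
        start += days
    return trace


# Simulated fair lottery (NOT the real 1970 or 1971 results): a random
# assignment of the numbers 1..366 to the 366 birth dates.
rng = random.Random(0)
simulated = list(range(1, 367))
rng.shuffle(simulated)

for month, med in enumerate(median_trace(simulated), start=1):
    print(f"month {month:2d}: median draft number {med}")
```

In a fair drawing the twelve medians should scatter around the middle of the range (about 183) with no steady drift. A trace that falls steadily from January to December is the kind of pattern that exposed the 1970 lottery.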
