# Lesson 2: Constructing and Analyzing Scatterplots

In this "Cultivating Data" lesson, students will learn to construct and interpret scatterplots with lines of best fit, and then use them to find outliers and answer critical-thinking questions.

**STANDARDS**

**Grade 8:** CCSS.Math.Content.8.SP.A.1

**Grades 6–8:** NCTM Data Analysis and Probability

**OBJECTIVE**

Students will be able to:

- construct scatterplots;
- find outliers; and
- draw a line of best fit.

**TIME REQUIRED:** 40 minutes, plus additional time for worksheets (may be split over two or more days)

**MATERIALS**

- Study Time vs. Test Score: Table and Scatterplot
- Worksheets 2.1, 2.2, and 2.3: PDF or whiteboard-friendly
- Graph paper
- Worksheet Answer Key (PDF)
- Classroom Poster (PDF)

**DIRECTIONS**

1. Pose the following problem to your class: Do you think there is a relationship between the time spent studying for a test and the score earned? Discuss as a class.

2. Ask how one could determine whether or not the relationship exists (e.g., conduct a study in which data on the amount of study time and the resulting grade are captured).

3. Show the Study Time vs. Test Score table to the class. Ask if the table clearly shows the relationship between the two variables. Ask if there might be a clearer way to show the relationship (a **scatterplot**).

4. Show the Study Time vs. Test Score scatterplot to the class. Point out that the **independent variable** (which could be thought of as the cause) is study time and is on the X-axis. The **dependent variable** (the effect), in this case the test score, is on the Y-axis.

5. If the class is comfortable creating and reading scatterplots, take a few data points from the table and show where they appear on the scatterplot. If the class has less experience with scatterplots, show how every point in the table is represented on the scatterplot.

6. Ask if the scatterplot shows a relationship between study time and test score and, if so, how. Point out that the general shape the points make slopes upward to the right, showing that, in general, scores increase as study time increases. This pattern is known as a **positive correlation**. If the points sloped downward to the right, we would say the scatterplot shows a **negative correlation**. A negative correlation might appear, for example, in a scatterplot of hours of television watched versus test scores. Sometimes no relationship is apparent between the variables, for example, if we plotted the number of letters in a person's first name versus the number of letters in his or her last name.

7. Call the class's attention to the points on the scatterplot for Sloane (studied 25 minutes and received a score of 98) and Rick (studied 125 minutes and received a score of 77). Ask if these two points appear to follow the observed trend that more study time goes with a higher score. Explain that these points are called "**outliers**" because they fall outside the pattern formed by the other points. Ask if these two points represent possible data collection errors or if there could be a reasonable explanation for why they fall outside the pattern.

8. Ask if it would be possible to use the scatterplot to predict what a person would score if he or she studied for, say, 40 or 100 minutes. Show how to draw a **line of best fit** by "eyeballing" the points on the scatterplot. Inform the class that the line doesn't have to go through any of the points, but it's easiest to draw the line by identifying two representative points and connecting them. The resulting line can be used to make predictions, either by reading a point off the line or by determining the equation of the line and plugging in values. If the class is sufficiently advanced, demonstrate how to determine the line of best fit with the least squares method or on a graphing calculator.

9. Distribute Worksheets 2.1–2.3 to students over 1–3 days, then review answers with the class.
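
For classes ready for the least squares method mentioned in step 8, the calculation can be sketched in a few lines of Python. This is only an illustrative sketch: the data points below are hypothetical and are not the values from the Study Time vs. Test Score table, and the function names (`least_squares_line`, `predict`, `residual`) are made up for this example. Only the Sloane and Rick points come from step 7.

```python
# Hypothetical (study minutes, test score) pairs for illustration only.
# They are NOT the values from the lesson's Study Time vs. Test Score table.
data = [(20, 62), (35, 68), (50, 71), (60, 75), (75, 80), (90, 84), (110, 90)]


def least_squares_line(points):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, minutes):
    """Predicted score for a given study time, read off the fitted line."""
    return slope * minutes + intercept


def residual(slope, intercept, point):
    """Actual score minus predicted score; a large value suggests an outlier."""
    minutes, score = point
    return score - predict(slope, intercept, minutes)


slope, intercept = least_squares_line(data)

# Predictions for the study times asked about in step 8.
for minutes in (40, 100):
    print(f"{minutes} minutes -> predicted score {predict(slope, intercept, minutes):.1f}")

# The two outliers named in step 7, compared with the line fitted to the
# hypothetical data above.
for name, point in (("Sloane", (25, 98)), ("Rick", (125, 77))):
    print(f"{name}: residual {residual(slope, intercept, point):+.1f}")
```

A positive slope corresponds to the positive correlation discussed in step 6. The residuals show one way to make "falls outside the pattern" concrete: a point whose actual score is far from the line's prediction is a candidate outlier.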