# Reading Quizzes, Clicker Questions, and Lurking Variables in #Math216

Here’s another clicker question from the statistics course I’m teaching this spring:

Suppose an observational study indicates a positive association between the average global temperature and the National Science Foundation budget.  Which of the following are possible lurking variables in this study?

1. US population size
2. Public interest levels in alternative energy
3. Both 1 and 2
4. Neither 1 nor 2

It’s useful to ask clicker questions that target particular student misconceptions. Often, we instructors have to make educated guesses about what confuses our students. That wasn’t the case with this question, since I based the answer choices entirely on responses to a reading quiz the students completed the night before class. (See my March 2010 ProfHacker guest post for some info on my use of pre-class reading quizzes.)

Here’s one of the reading quiz questions from the night before this class session:

These charts from Businessweek make the point that correlation does not imply causation. Identify a possible lurking variable for one of the charts.

One student floated an interesting idea for the “M. Night Shyamalan Movies Scores vs. Newspaper Ad Sales” chart. Short version: Twist endings are easily spoiled by the Internet, as are newspaper ad budgets. As I read through student responses, I noticed that most of the lurking variables they identified fell into two types, as seen in these examples for the “NSF Budget vs. Global Temperature” chart:

Example 1: A lurking variable in the National science foundation may be the amount of CO2 emission that may affect both the research funding that will attempt to reduce the emissions and it will also affect the temperature of the planet.

Example 2: You could say that the increase in average global temperature has effected the public’s interest in discovering new energy sources, which then might have caused the increase in the National Science Foundation’s budget.

In Example 1, the student identifies a third variable (CO2 production) that’s a possible cause of the increases in the original two variables (the NSF budget and global temperatures). This notion of a lurking variable matches the example of a lurking variable in the textbook:

(Use of sunscreen and skin cancer rates are correlated, but increases in both are explained by the lurking variable, sun exposure rates.)

In Example 2, the student identifies a third variable (public interest in alternative energy) that’s a possible reason why an increase in one of the original two variables (global temperature) is leading to an increase in the other (the NSF budget). This notion of a lurking variable doesn’t match the book’s example, but it does satisfy the book’s definition of a lurking variable, “a variable that is correlated with both the explanatory and response variables.”

I found the book’s diagram helpful (<cough>visual learner<cough>), so let’s diagram Examples 1 and 2:

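Example 1: CO2 production → NSF budget, and CO2 production → global temperature

Example 2: global temperature → public interest in alternative energy → NSF budget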

See the difference? In Example 1, arrows point from the lurking variable to the original two variables. In Example 2, one arrow points toward the lurking variable and one arrow points away.

As I read the students’ responses and doodled diagrams like these, I guessed that some of my students keyed in on the textbook’s example and mistakenly thought that a lurking variable needed to explain changes in the original two variables. Although those are usually the lurking variables that we care about, the definition of a lurking variable is more general: the third variable need only be correlated with the original two variables. Thus, the variables identified in both Examples 1 and 2 can be considered lurking variables.
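A quick simulation can make that definition concrete. The sketch below is not from the course or the textbook; it uses made-up, synthetic data, and the function names (`common_cause`, `intermediate`, `pearson`), the unit-variance noise, and the sample size are all assumptions chosen for illustration. It builds one data set with the Example 1 structure (the third variable drives both of the others) and one with the Example 2 structure (the third variable sits between them), then checks that the third variable is correlated with both original variables in each case.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def common_cause(n, rng):
    """Example 1 structure: z -> x and z -> y (synthetic data)."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


def intermediate(n, rng):
    """Example 2 structure: x -> z -> y (synthetic data)."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


rng = random.Random(216)  # fixed seed so reruns give the same numbers
results = {}
for name, make in [("common cause", common_cause),
                   ("intermediate", intermediate)]:
    x, y, z = make(5000, rng)
    # Correlation of the third variable with each original variable,
    # plus the correlation between the two original variables.
    results[name] = (pearson(x, z), pearson(y, z), pearson(x, y))
    r_xz, r_yz, r_xy = results[name]
    print(f"{name}: r(x,z)={r_xz:.2f}  r(y,z)={r_yz:.2f}  r(x,y)={r_xy:.2f}")
```

In both structures the third variable comes out clearly correlated with each of the original two, which is all the book’s definition asks for. The correlations alone can’t tell the two diagrams apart; that takes an argument about which way the arrows point.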

I also guessed that some students thought a lurking variable must explain why increases in one of the original variables lead to increases in the other. That is, they mistakenly thought a lurking variable justifies the causation we want to see in a correlation. I’m not sure where students got this idea, but there were enough responses like Example 2 to lead me to believe this was a second misconception students had about lurking variables.

How could I determine if some of my students did indeed hold these misconceptions about lurking variables? That’s where the clicker question at the top of this blog post came in. Here it is again:

Suppose an observational study indicates a positive association between the average global temperature and the National Science Foundation budget.  Which of the following are possible lurking variables in this study?

1. US population size
2. Public interest levels in alternative energy
3. Both 1 and 2
4. Neither 1 nor 2

(I felt that the idea that increases in CO2 production cause increases in the NSF budget was a little shaky, so I swapped out “CO2 production” for “US population.” It’s easier to argue that increases in the US population lead to bigger government budgets, as well as to higher global temperatures. One student questioned the “US population causes global warming” argument, but I discovered that he wasn’t a global warming skeptic; he just didn’t realize how much the US contributes to the global carbon footprint.)

It took me a few tries, but I finally figured out answer choices that target the two misconceptions I had identified. Students who (mistakenly) think that lurking variables need to explain the original two variables would go with option 1. Students who (mistakenly) think that lurking variables should justify the causation we see in a correlation would go with option 2. And students who (correctly) understand that lurking variables need only be correlated with the original two variables would go with option 3. (What does option 4 say about a student? I have no idea, but it seemed appropriate for this question!)

How did the question play out in class? When I asked my students to respond before talking with their neighbor, option 3 (the correct answer) was most popular, but only with 32% of the vote. Option 2 was close behind with 30%, option 1 had 21%, and option 4 was on the board with 17%. That’s not far off a four-way tie, which told me that this was a useful question to ask.

Here are the results after small-group discussion:

There was some convergence to the correct answer, but the misconceptions I identified in the students’ pre-class reading quiz responses were still present in over 40% of the class. That meant we needed to spend more time on this topic, so I had a few students volunteer reasons for options 1 and 2. I diagrammed these on the board using the boxes-and-arrows approach, and then offered my take on the question: option 3 was the correct option because any variable correlated with the original two variables can be considered a lurking variable.

This question seemed to work very well during class as a way to surface and confront student misconceptions about lurking variables, and it only exists because I took a close look at student responses to that pre-class reading question. I suspect that I’ll find other great clicker questions “lurking” in student responses to future reading quizzes…