One more post on Week 2 of “The History and Future of (Mostly) Higher Education,” in which I attempt to defend the use of statistics in education. Take this post with a larger grain of salt than usual; it’s mostly just thinking out loud.
In this week’s videos, Cathy Davidson provides some information on Sir Francis Galton, an English scholar who helped found the field of statistics. She notes that Galton was a eugenicist and wanted to reduce a human being’s value to a deviation from a norm on a single linear scale. That’s reductionist, not to mention ethically repugnant. It’s also pretty much why Galton developed notions of normal distributions and standard deviations. Davidson says in her video, “Every time I see statistics, I think about that origin,” and now I probably will, too, at least when Galton’s name comes up or when I watch someone play Plinko on The Price Is Right.
Davidson goes on to argue against using a similar practice in education, that is, reducing a student’s learning to a deviation from a norm on a single linear scale. That’s reductionist, too, because it hides (if not ignores) the complexities of student learning. Excellent point.
It seems to me, however, that the problem isn’t with statistics (normal distributions and standard deviations), as Davidson makes it out to be. The problem is with the use of a single linear scale to measure something as complex as student learning. It would be more palatable and more accurate (in the sense of representing complexity) to use multiple linear scales.
Example: I can imagine rating calculus students independently on their understanding of concepts, computational ability, and creative problem solving. One student might get a 40, 90, and 50 (each out of 100). Another might get an 80, 30, and 70. That’s certainly a more nuanced assessment of student learning than a single scale. Assuming student ratings were normally distributed (not a guarantee when using criterion-referenced grading), statistical tools like standard deviations would indeed be useful for making sense of such data.
Of course, we use multiple linear scales in various ways already in education. My daughter just brought home a progress report that included 0-100 scores in reading, writing, math, science, and social studies. At some point in her high school career, those multiple ratings will be averaged into a GPA, but for now, they are assessed and interpreted independently. The more assessments of student learning are rolled up into aggregate measures, the more they hide the complexity.
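To make that last point concrete, here is a minimal Python sketch using the two hypothetical calculus students from earlier. The unweighted average and the helper names (`aggregate`, `largest_gap`) are my own assumptions for illustration, not a real grading scheme.

```python
from statistics import mean

# Hypothetical ratings (out of 100) for the two calculus students described earlier.
scales = ("concepts", "computation", "problem_solving")
student_a = dict(zip(scales, (40, 90, 50)))
student_b = dict(zip(scales, (80, 30, 70)))


def aggregate(ratings):
    """Collapse a multi-scale profile into one number (assumed: unweighted mean)."""
    return mean(ratings.values())


def largest_gap(a, b):
    """Return the scale on which two profiles differ most, and the size of that gap."""
    scale = max(a, key=lambda s: abs(a[s] - b[s]))
    return scale, abs(a[scale] - b[scale])


print(aggregate(student_a), aggregate(student_b))  # 60 60
print(largest_gap(student_a, student_b))           # ('computation', 60)
```

Both students roll up to the same 60, even though they differ by 60 points on computational ability. The single number says they are interchangeable; the three separate ratings say nothing of the sort.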
Upshot: It’s the aggregation across multiple kinds of learning that’s the problem, not the use of statistics.
Image: “Gaussian?”, mjm, Flickr (CC)