The Myth of the Slipping Math Student?


I've been teaching in Colorado for six years, and there's always been a troubling pattern in our state standardized math scores: as students progress from 3rd to 10th grade, the percentage who score proficient or advanced declines dramatically. Here are the percentages of students scoring proficient or advanced (P+A) by grade level, averaged over all the years the test has been given (typically 2002-2008):












Grade   Avg. % P+A
3       69
4       69
5       61
6       55
7       44
8       43
9       34
10      29


The easiest explanation (and the one I've tended to believe) is that students' abilities are, in fact, slipping as they get older. That would be a good assumption if the test at each grade level were equally difficult. But what if the test questions are, on average (and adjusted for grade level), more difficult as students get older? Is it fair to assume a test with increasingly difficult questions would result in lower scores, even with sophisticated score scaling systems that take question difficulty into account?

Fortunately, the state releases "item maps" that describe the difficulty of each item on every test. Using 4 points for an advanced item, 3 points for a proficient item, 2 points for a partially proficient item, and 1 point for an unsatisfactory item, we can come up with an average difficulty for the CSAP at each grade level. Let's add that column to our table:












Grade   Avg. Difficulty   Avg. % P+A
3       2.43              69.25
4       2.43              68.5
5       2.53              61.14
6       2.69              55.43
7       2.96              44
8       3.04              42.86
9       3.13              34
10      2.96              28.86
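
For anyone who wants to see how the average difficulty column works, here's a minimal sketch in Python. The point values come from the scheme described above, but the item counts are made up for illustration (they are not from a real item map), and the function name is my own.

```python
# Point values for each item level, as described in the post.
POINTS = {
    "advanced": 4,
    "proficient": 3,
    "partially proficient": 2,
    "unsatisfactory": 1,
}

def average_difficulty(item_counts):
    """Weighted average difficulty for one test.

    item_counts maps each item level to the number of items at that level.
    """
    total_items = sum(item_counts.values())
    total_points = sum(POINTS[level] * count for level, count in item_counts.items())
    return total_points / total_items

# Hypothetical 50-item test, NOT taken from an actual item map.
example_counts = {
    "advanced": 10,
    "proficient": 20,
    "partially proficient": 15,
    "unsatisfactory": 5,
}
example_average = average_difficulty(example_counts)
print(f"Average difficulty: {example_average:.2f}")
```

A test made up entirely of proficient-level items would score exactly 3 on this scale, so the averages in the table can be read as sitting somewhere between "partially proficient" and "advanced" items.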


This begs for regression analysis. How strong is the correlation between the difficulty of the questions and the scores?


The correlation is surprisingly strong: the coefficient of determination (R squared) is 0.88, meaning that average item difficulty statistically accounts for 88% of the variance in the average scores across grade levels. 88%? That's big. Statistics rarely tell the whole story, but 88% raises serious doubts that it's just a matter of slipping math students. Why wouldn't the state want to maintain a steady average difficulty year-to-year? Wouldn't that make year-to-year performance comparisons more reliable?
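
If you'd like to check the R squared figure yourself, here's a rough sketch of a simple least-squares fit in plain Python, using the eight grade-level rows from the table above. I'm assuming an ordinary linear regression of average % P+A on average difficulty; the function and variable names are just for illustration.

```python
# Grade-level data from the table above (grades 3 through 10).
avg_difficulty = [2.43, 2.43, 2.53, 2.69, 2.96, 3.04, 3.13, 2.96]
avg_pct_pa = [69.25, 68.5, 61.14, 55.43, 44, 42.86, 34, 28.86]

def linear_fit(xs, ys):
    """Return (slope, intercept, r_squared) for an ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R squared is the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

slope, intercept, r_squared = linear_fit(avg_difficulty, avg_pct_pa)
print(f"slope: {slope:.1f}, intercept: {intercept:.1f}, R squared: {r_squared:.2f}")
```

The slope comes out negative, which is the whole point: as average difficulty goes up, the average percentage of proficient and advanced scores goes down. Keep in mind this is a fit through only eight points, one per grade level.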
