Monday, April 22, 2013

NY State Tests and the Technical Limits of Paper

Reading the feedback on New York State's first administration of its new Common Core-aligned tests, what jumps out most is how many reports there are of kids running out of time on certain days. As predicted.

Here's a typical sample:

What I didn't prepare them for, I guess, was the fact that so many of them would run out of time. Students who read The Kite Runner and Freakonomics for fun are going to get scores that indicate they are not reading on grade level simply because they did not have time to finish reading and answering questions. I was willing to give the new assessments a chance, but all they are measuring is whether you can work at a pace that is, quite simply, wildly inappropriate for deep readers and thinkers.

This seems like an incredible unforced error by Pearson and NYSED. Of all the technical problems that a test could have, simply not allowing enough time -- particularly when all your rhetoric leading up to the test has been about careful, close reading, deep analysis and writing thorough arguments -- is one of the few that just about everyone can understand. Start talking about rigor, and nobody really knows if you're just whining. Nobody understands the intricacies of standards alignment. But if many good students don't have time to write the essay, everyone knows that's a problem.

Setting aside incompetence and hubris, both of which are clearly in play given the track records of both Pearson and NYSED, why did this happen?

A few thoughts:

  • Since we've established that an increasing number of NY parents will opt their kids out of separate field tests used to pilot new questions (I'm entirely sympathetic, btw), all the research on new items has to be embedded in the real tests, which makes them longer.
  • They've pretty much maxed out the reasonable length of one round of tests.
  • "...passages in some of the forms given some children at multiple grade levels most likely disadvantaged those 3rd or 4th graders who had to struggle with inappropriately difficult material..." Giving the same passages at multiple grade levels is presumably so the tests can be vertically aligned, which is particularly important if you want at least a hypothetically accurate growth number, especially if you need to show more than one year's worth of growth. But it means more questions.
  • The ELA standards are not actually compact, and to the extent they are shorter than their predecessors, it may be in their omission of standards that weren't tested anyhow (e.g., participating in a literate community, having a habit of reading, etc.). Completely repeating all the reading standards for different text types is not going to make for short tests.

Adaptive computer-based tests should help with the vertical alignment issue. That way at least you can avoid, say, giving seventh graders who read at a sixth-grade level an eighth-grade-level passage. Of course, the real "solution" is to just continually blur the distinction between high-stakes testing and everyday work, with everything done online and going into one big database.

