- The greater the consequence we attach to test results, the less "predictable" the questions need to be. If we're going to attach high stakes to tests, we need to make it hard to predict what will be tested, so that schools can't narrow their curriculum to the "tested" content at the expense of the full range of knowledge and skills laid out in the standards.
- The greater the consequence we attach to evaluations, the more we need to diversify the indicators. We need to balance numerical data with other information, including qualitative data—which paints a clearer picture of how well a school is doing and how much or how little its students are learning.
- The more we focus on accountability with consequences, the more we need to independently check the data. States could, for instance, invest in inspectorates whose focus is on site visits and other measures that could serve as a “reality check” on the data.
One of the less-emphasized "shifts" of the Common Core ELA standards is to narrow the scope of the curriculum to make it easier to test the whole thing, although the purity of the initial design was compromised somewhat in the final implementation. But look at a standard like this one:
> Demonstrate knowledge of eighteenth-, nineteenth-, and early-twentieth-century foundational works of American literature, including how two or more texts from the same period treat similar themes or topics.
The first thing you have to point out is that it is very strange compared to the way standards are generally written, especially outside the US. It is a tight little lump of content (e.g., not late twentieth century; how many eighteenth-century foundational works are there?) entangled with a specific task. I think the explanation is that this is meant to make clear, at the level of the standard itself, that it is supposed to spawn a very specific and predictable assessment. A predictable assessment which cannot be criticized for narrowing the curriculum, because the standards are doing the narrowing work.
Regardless, predictability and reliability are not things that can just be wished away. It seems like an easy question until you listen to experts talk about it for about 10 minutes, and then you realize what a nightmare it truly is. If every five years you throw in an eighteenth-century question for the above standard and everyone's scores that year go down 10 points, exactly what have you measured? Especially in a "high stakes" context? Throwing more surprises into higher-stakes tests is an idea only someone living in a wonk bubble could love.
Regarding Porter-Magee's second and third points: those are things we used to do here, but stopped doing in order to follow the agenda of Porter-Magee, Fordham, and their allies, so... maybe she would have preferred Linda Darling-Hammond as Secretary of Education?