Tuesday, March 26, 2013

Random Nugget from the NECAP Weeds

2007-2008 NECAP Math Reading Writing Technical Report with Appendices, Appendix F, Page 54:

3. A psychometrician will present and explain the average bookmark placement for the whole group based on the Round 2 ratings. Again, based on their Round 2 ratings, panelists will know where they fall relative to the group average. The psychometrician may also present impact data, showing the approximate percentage of students across the three states that would be classified into each achievement level category based on the room average bookmark placements from Round 2.

So basically the panel set, discussed, and revised the 11th grade math cut scores twice. Only then might someone have pointed out that, based on their work so far, probably about 40-50% of the students in their state would receive the lowest score and less than two percent would get the highest (of 4). After that, the panel went through one more revision.
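For anyone wondering what "impact data" amounts to in practice, it is just a tally: take a set of cut scores, sort every student's score into a level, and report the percentages. Here is a minimal sketch in Python. The function name, the generic level labels, the cut scores, and the score list are all made up for illustration; they are not the NECAP's actual levels, cuts, or results, and this is not the report's procedure.

```python
from bisect import bisect_right
from collections import Counter

# Generic labels, lowest to highest (illustrative, not the official names).
LEVELS = ["Level 1", "Level 2", "Level 3", "Level 4"]


def impact_data(scores, cuts, levels=LEVELS):
    """Return the percentage of students falling into each achievement level.

    `cuts` lists, in ascending order, the minimum score needed to reach
    each level above the lowest one.
    """
    if len(cuts) != len(levels) - 1:
        raise ValueError("need exactly one cut score between adjacent levels")
    if not scores:
        raise ValueError("need at least one score")
    # bisect_right counts how many cuts a score meets or exceeds,
    # which is the index of the level that score lands in.
    counts = Counter(bisect_right(cuts, s) for s in scores)
    total = len(scores)
    return {level: 100.0 * counts.get(i, 0) / total
            for i, level in enumerate(levels)}


# Made-up scores for 100 students and made-up cut scores.
example_scores = [10] * 45 + [20] * 35 + [30] * 18 + [40] * 2
example_cuts = [15, 25, 35]

for level, pct in impact_data(example_scores, example_cuts).items():
    print(f"{level}: {pct:.1f}%")
```

Nudge a cut score up or down and rerun it, and you can watch the percentages shift. That feedback is what the panel would have been seeing, if they were shown it at all.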

The panel thought the impact data (if they got it) was the least important factor in their decision (average rating of 2.7).

The most important factors to them were their own experience and the items themselves. That makes sense, except that the range of item difficulty was very high compared to other NECAP and NAEP tests, which would skew the whole standard-setting process from the start.

That's all fine for just comparing schools, districts, or even students, but it is not at all how you'd look at setting the cut scores for a graduation test. A single graduation test that would put 45% of all juniors at direct risk of failure and retention would demand more serious consideration of "impact analysis."

