A famous story in physics education is about how concepts are more complex and have more facets than we realize. David Hestenes developed sophisticated, multi-faceted assessments for concepts like “force” — a whole test, just addressing “force.” Eric Mazur at Harvard scoffed at these assessments (as he said at an AAAS meeting I attended a couple of years ago, and as quoted in a 2007 piece by Dreifus). His Harvard students would blow these assessments away! Gutsy man that he is, he actually tried them in his classes. His students did no better than the averages Hestenes was publishing. Mazur was aghast and became an outspoken proponent of better forms of teaching and assessment.
Building up these kinds of assessments takes huge effort, but it is critically important for measuring what learning is really going on. For the most part in Computing Education, we have not done this yet. Grades are a gross measure of learning; to progress the field, we need fine-grained measures.
As far as I can tell, at the secondary level we just don't have these kinds of fine-grained measures for any subject, particularly ones focusing on conceptual understanding. Lots of good stuff in the original paper. If you convince me you've got a good test first (and admittedly, wtf do I know about Physics), I'm a lot more interested in your data. Also, low-stakes is essential here. Drill kids on these questions, and the value of the assessment goes out the window.
Quite frankly, I don't believe that we have the will or capacity to get this right on a large scale. I'm just not convinced, and nobody is trying to convince me, outside of Duncan throwing $350,000,000 at the problem. I don't believe we have excess test-creation capacity sitting on the sidelines, holding out for more money. I think we're already doing the best we can, and it is inadequate. We're just going to end up paying more money for the same crap.
This can be fixed, but on a 10-20 year timeline, not as stimulus.