These were simple studies. We asked classroom teachers in grades two to six to rank order the students in their classes in terms of how they would do on the state’s No Child Left Behind accountability test. Following is some information obtained from only the two lowest grades.
In grade two, 36 teachers participated, with class sizes ranging from 17 to 30; in grade three, 30 teachers participated, with class sizes ranging from 22 to 32 students. The correlations between the teachers’ rankings of their students and the students’ ranks on the state test were all strongly positive. In third grade reading and mathematics, teachers’ rankings of their students correlated about .84 with the ranks the students obtained on the test, about as high as the reliability of the tests themselves. Many teachers exhibited correlations greater than .90, indicating that teachers are quite capable of telling the state who needs help and who does not in about 10 minutes, and at a savings of millions of dollars.
In second grade, we expected lower correlations because, as described above, the test scores of children at this age are less reliable. Yet we still found correlations between the teachers’ rankings and the children’s ranks on the test to be about .70 in both reading and mathematics. This correlation is probably as high as the test would correlate with itself a week later (its one-week stability reliability), and at the extremes, the rankings by the teachers of the highest and lowest performing students were remarkably accurate.
These results once again indicate that if the state’s interest is identifying students who need help, teachers can do this as well as the test.
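For readers who want to see what such a comparison involves, here is a minimal sketch of a rank correlation between one teacher's ranking and the same students' test results. The eight-student class and its scores are hypothetical, invented only for illustration, and the function names are ours; the studies described above may have computed their coefficients differently.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 (highest) downward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(teacher_ranks, test_scores):
    """Rank correlation: Pearson correlation between teacher ranks and test ranks."""
    return pearson(teacher_ranks, average_ranks(test_scores))

# Hypothetical class: the teacher's predicted rank (1 = best) and each student's test score.
teacher_ranks = [1, 2, 3, 4, 5, 6, 7, 8]
test_scores = [480, 455, 470, 430, 440, 400, 410, 380]

rho = spearman(teacher_ranks, test_scores)
print(round(rho, 2))  # 0.93
```

In this made-up class the teacher swaps three adjacent pairs of students and still reaches a correlation above .90, which gives a sense of how closely a ranking must track the test to produce the figures reported above.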
Calls for standardized testing are an indirect admission that we do not trust teachers as ethical people or as professionals. But this distrust must be based on something other than evidence. The claim that objective data must be used in college entrance decisions, graduation, and scholarships, for example, because without those objective measures teachers would just give away high grades, does not match the evidence on standardized tests coming from the College Board itself. And consider this, also from Kobrin, et al. (2008):
The correlation of HSGPA and FYGPA is 0.36 (Adj. r = 0.54), which is slightly higher than the multiple correlation of the SAT (critical reading, math, and writing combined) with FYGPA (r = 0.35, Adj. r = 0.53). (Kobrin, et al., 2008)
Table 5
Unadjusted and Adjusted Correlations of Predictors with FYGPA
Predictor(s)/ Raw R/ Adj. R
1. HSGPA/ 0.36/ 0.54
2. SAT-CR/ 0.29/ 0.48
3. SAT-M/ 0.26/ 0.47
4. SAT-W/ 0.33/ 0.51
5. SAT-M, SAT-CR/ 0.32/ 0.51
6. HSGPA, SAT-M, SAT-CR/ 0.44/ 0.61
7. SAT-CR, SAT-M, SAT-W/ 0.35/ 0.53
8. HSGPA, SAT-CR, SAT-M, SAT-W/ 0.46/ 0.62
Note: N for all correlations = 151,316. Pooled within-institution
GPA remains slightly better than the SAT at the single job the SAT is designed to do, predicting freshman college success, despite GPA being entirely the product of teacher assessments. As the table above shows, the so-called objective SAT does add predictive power when combined with GPA (see rows 6, 7, and 8 above), but the sacred test is less predictive than teacher assessment.
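To put rough numbers on that comparison, the sketch below squares the raw correlations from rows 1, 7, and 8 of Table 5 to estimate the share of variance in FYGPA each predictor set accounts for. Squaring a correlation this way is a standard reading of R, but it is our illustration and not an analysis taken from Kobrin, et al.; the variable names are ours.

```python
# Raw multiple correlations with FYGPA, copied from Table 5 (rows 1, 7, and 8).
raw_r = {
    "HSGPA": 0.36,
    "SAT-CR, SAT-M, SAT-W": 0.35,
    "HSGPA, SAT-CR, SAT-M, SAT-W": 0.46,
}

# R squared estimates the share of variance in FYGPA accounted for by each predictor set.
variance_explained = {name: round(r ** 2, 4) for name, r in raw_r.items()}

# Additional share accounted for when the SAT is added to HSGPA.
gain_from_sat = round(
    variance_explained["HSGPA, SAT-CR, SAT-M, SAT-W"] - variance_explained["HSGPA"], 4
)

for name, share in variance_explained.items():
    print(f"{name}: {share:.4f}")
print(f"Gain from adding SAT to HSGPA: {gain_from_sat:.4f}")
```

On the raw figures, HSGPA alone accounts for about 13 percent of the variance in first-year grades, the three SAT sections together for about 12 percent, and the two combined for about 21 percent.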