Claiming testing measures are the problem is a little like suggesting the scale is broken because we got too fat.
Actually, it is like suggesting that weight alone is of limited value in assessing a person's health and that an over-emphasis on weight loss is often misguided.
If this were cancer research, you’d see a bunch of studies use this one as a jumping off point for more experiments. If we change dosage, can we move the 0.22 SD gain to 0.32? What happens if we combine this intervention with, say, some sort of merit pay incentive? What if teachers get this coaching as part of grade-level teams, not as individuals?
But the more direct analogy is not:
teacher coaching : education :: drug trials : health care
but rather:
teacher coaching : education :: doctor coaching : health care
And I doubt that health care has much more of the complex research Goldstein is referring to (e.g., does merit pay for doctors increase the effectiveness of coaching?) than education does.