Last week I wrote about an interesting lawsuit over in New York State, where Long Island teacher Sheri Lederman is suing the New York State Education Department on the grounds that her value-added assessment, which measures her effectiveness as a teacher based on her students' standardized test score gains over the course of a year, is "arbitrary and capricious and an abuse of discretion."
The judge in the case, it turns out, had some pretty interesting questions for the state. According to an article at the Washington Post:
The exasperated New York Supreme Court judge, Roger McDonough, tried to get Assistant Attorney General [Colleen] Galligan to answer his questions. He was looking for clarity and instead got circuitous responses about bell curves, “outliers” and adjustments. Fourth-grade teacher Sheri Lederman’s VAM score of “ineffective” was on trial.
The more Ms. Galligan tried to defend the bell curve of growth scores as science, the more the judge pushed back with common sense. It was clear that he did his homework. He understood that the New York State Education Department’s VAM system artificially set the percentage of “ineffective” teachers at 7 percent. That arbitrary decision clearly troubled him. “Doesn’t the bell curve make it subjective? There has to be failures,” he asked.
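To make the judge's point concrete, here's a toy simulation (my own sketch, nothing to do with the state's actual code): if ratings are handed out by rank on a curve, the bottom 7 percent get labeled "ineffective" no matter how well everyone actually taught.

```python
import numpy as np

# Toy sketch of a forced bell curve: everyone's students show solid growth,
# but ratings are assigned by rank, so 7% must land in the bottom bucket.
rng = np.random.default_rng(0)
growth_scores = rng.normal(loc=75, scale=5, size=1000)  # a good year for everyone

cutoff = np.percentile(growth_scores, 7)  # the bottom 7% by fiat
ratings = np.where(growth_scores <= cutoff, "ineffective", "not ineffective")

print((ratings == "ineffective").mean())  # ~0.07, whatever the raw scores say
```

There has to be failures, as the judge put it, because the failures are built into the scoring, not discovered by it.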
The judge seemed confused by what was going on here. "In 2012-13," according to the article, "68.75 percent of [Lederman's] New York students met or exceeded state standards in both English and math," and she was deemed "effective." The next year about the same share of her students were proficient, but because she narrowly missed the computer model's predicted rate of student growth, her rating dropped to "ineffective."
Here's the equation used to measure effectiveness under the value-added assessment model:

[Image: the value-added assessment model equation]
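The full formula is dense, but as a rough sketch of the general value-added idea (my own simplification, not New York's exact specification): each student's current score is predicted from prior scores and other covariates, and a teacher's "effect" is whatever her students do relative to those predictions.

$$y_{ij} = \beta_0 + \beta_1\, y_{ij}^{\text{prior}} + \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + \tau_j + \varepsilon_{ij}$$

Here $y_{ij}$ is student $i$'s current score in teacher $j$'s class, $y_{ij}^{\text{prior}}$ is the same student's score from the year before, $\mathbf{x}_{ij}$ collects other student covariates, $\tau_j$ is teacher $j$'s estimated effect, and $\varepsilon_{ij}$ is noise. The estimated $\hat{\tau}_j$'s are then ranked against each other and converted into points on the bell curve, which is how raw estimates become scores like Lederman's 14, or 1, out of 20.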
Wait, McDonough said: "How could it be that she went from 14 out of 20 points to 1 out of 20 points in one year?" Was she or was she not a good teacher?
Back behind the bell curve Ms. Galligan ran. As she tried to explain once again, the judge said, “Therein lies the imprecise nature of this measure.”
And that's the trouble with value-added scores. It's useful to be able to predict a teacher's score given the previous standardized test performance of her students, sure. But the current policy, at least as practiced in New York, just isn't doing a good job of measuring anything that matters.
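Here's one more toy sketch of that imprecision (again my own illustration, not the state's model): when teachers' raw growth numbers are tightly bunched, a small wobble in one teacher's raw number moves her a long way in the rankings, and the rankings are what the points are based on.

```python
import numpy as np

# Toy sketch: tightly bunched peer growth numbers mean a small change in a
# teacher's raw growth produces a huge swing in her percentile rank.
rng = np.random.default_rng(1)
peer_growth = rng.normal(loc=0.0, scale=0.5, size=10_000)  # bunched peers

def percentile_rank(x):
    """Share of peers this teacher outperformed, as a percentage."""
    return 100 * np.mean(peer_growth < x)

print(percentile_rank(0.3))   # one year: around the 73rd percentile
print(percentile_rank(-0.3))  # next year: down around the 27th percentile
```

Rank teachers in the fat middle of a bell curve and small raw differences get amplified into big rating differences, which is exactly the kind of year-to-year whiplash Lederman experienced.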
Lederman was not the only teacher in her school to get a poor score. In 2014, 21 percent of the staff at E.M. Baker School were rated "ineffective," 21 percent "developing," and 57 percent "effective." Just the year before, not one teacher had received an "ineffective" rating.
Is that because the teachers got worse, or because the rating system used to evaluate them really isn't very good?