Here’s the response from NCES to the Achieve report I wrote about earlier this week, which suggests that the organization behind the NAEP tests and scores isn’t entirely comfortable with the uses to which the Achieve folks put their work.

As you may recall from this post (What to Make of Last Week’s Big “Honesty Gap” Report & Coverage?), there were at least a few questions about the Achieve report and the coverage it received. But I couldn’t get any clear response from NCES, which produced the NAEP test that Achieve’s report is built on.

All that changed yesterday afternoon, when I received a response from NCES that points to several issues with the Achieve report, particularly when it is set beside the similar but methodologically more sophisticated reports that NCES itself produces comparing state tests and NAEP results:

“Since 2003, NCES has published a series of reports that support Achieve’s overall conclusion that state standards vary widely,” said NCES’ Acting Commissioner, Dr. Peggy Carr, in an email via Arnold Goldstein. (Click here for a 2011 NCES mapping report using 2009 results.)

So far, so good, right?

However, Carr notes that the NCES reports “map state proficiency standards onto the NAEP scale, rather than comparing percentages of students who attain proficient on the two tests.” And NCES does not characterize states as “truth tellers” or with “biggest gaps,” or necessarily assume that all states “should have the same proficient standard as NAEP.”

According to Carr, NAEP’s proficient standard could be considered aspirational, and NCES “does not assume every state should aspire to NAEP’s proficient standard as Achieve does.”

In essence, NCES is saying that Achieve shouldn’t have tried to compare state proficiency percentages to NAEP proficiency percentages and thus make it seem like the comparison is apples to apples.

But that’s not the only problem. According to Carr, Achieve’s report goes beyond what its methodology can support, comparing state results from 2013-2014 to NAEP results from a different year. Had Achieve matched up the years, the outcome would have been “different sets of states falling into their two categories.”

It’s also not clear to NCES whether the state data Achieve used included results from alternate tests, which NAEP does not administer. And NCES dings Achieve for ranking states according to percentage proficient rather than standardizing the state proficient cutpoints “so states can be compared when placed on the NAEP scale, a common metric.”
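To see why that last distinction matters, here is a rough sketch of the two approaches side by side. Everything in it is an assumption made for illustration: the two states, their percentages, the NAEP means and spread, and the `NAEP_PROFICIENT_CUT` value are invented, and treating NAEP scores as normally distributed is a simplification. It is not NCES’s actual mapping procedure, which by its own description works from state results for the same schools that took NAEP. The idea is only that a state’s cutpoint can be placed on the NAEP scale by asking what NAEP score the same share of its students would clear.

```python
from statistics import NormalDist

# All numbers below are invented for illustration; none are real results.
NAEP_PROFICIENT_CUT = 249.0  # hypothetical NAEP "Proficient" cut score

# Share proficient on each state's own test, plus an assumed normal
# distribution of that state's NAEP scores (mean, standard deviation).
STATES = {
    "State A": {"state_pct": 0.63, "naep_mean": 240.0, "naep_sd": 30.0},
    "State B": {"state_pct": 0.20, "naep_mean": 200.0, "naep_sd": 30.0},
}


def naep_pct_proficient(naep_mean, naep_sd):
    """Share of the state's students at or above the NAEP cut."""
    return 1.0 - NormalDist(naep_mean, naep_sd).cdf(NAEP_PROFICIENT_CUT)


def percentage_gap(state_pct, naep_mean, naep_sd):
    """Simple comparison: state percent proficient minus NAEP percent
    proficient, in percentage points."""
    return (state_pct - naep_pct_proficient(naep_mean, naep_sd)) * 100.0


def naep_equivalent_cut(state_pct, naep_mean, naep_sd):
    """Mapping idea: the NAEP score that the same share of students
    (state_pct) would score at or above."""
    return NormalDist(naep_mean, naep_sd).inv_cdf(1.0 - state_pct)


results = {
    name: {
        "gap": percentage_gap(**values),
        "cut": naep_equivalent_cut(**values),
    }
    for name, values in STATES.items()
}

for name, r in results.items():
    print(f"{name}: gap = {r['gap']:.1f} points, "
          f"NAEP-equivalent cut = {r['cut']:.1f}")
```

With these made-up inputs, State B shows the smaller percentage gap and would look like the more honest of the two, yet its cutpoint lands lower on the NAEP scale than State A’s. The gap depends on where a state’s students sit relative to both cutpoints, not just on how demanding the state standard is, which is roughly the objection NCES is raising.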

Read the full response below. As I read it, NCES isn’t entirely endorsing the FairTest critique of the Achieve report, but it’s bringing up some of the same issues. It’s also suggesting that reporters who covered the study might not have fully understood the limitations of the appealingly simple comparisons that Achieve was making. In the meantime, I’ll see what Achieve has to say about this.

NCES’ Acting Commissioner, Dr. Peggy Carr, commented on the Achieve report on May 15 at the National Assessment Governing Board’s quarterly meeting. She summarized the main conclusions of their report as follows:

State standards differ widely;

Some states were identified as “truth tellers,” i.e., the difference between the percent proficient on the state test was close to the percentage proficient on NAEP;

Some states were placed in a “biggest gaps” category, as having relatively large gaps between those percentages.

Since 2003, NCES has published a series of reports that support Achieve’s overall conclusion that state standards vary widely. The NCES reports map state proficiency standards onto the NAEP scale, rather than comparing percentages of students who attain proficient on the two tests.

NCES does not characterize states as “truth tellers” or with “biggest gaps.” Achieve assumes all states should have the same proficient standard as NAEP. Although NCES reports show the state proficient standards in relation to NAEP’s three achievement standards (Basic, Proficient, Advanced), we recognize that states have the option of setting their own standards, and they are not necessarily intended to be the same as NAEP’s. In fact, NAEP’s proficient standard is often called aspirational, as attaining it requires mastery over challenging subject matter. NCES recognizes that state standards are often designed to measure competency at the specific grade, and does not assume every state should aspire to NAEP’s proficient standard as Achieve does.

Achieve’s conclusions go beyond what their methodology can fully support. Achieve compares state test results from one year, 2013-14, to NAEP results from a different year, 2012-13 (NAEP did not produce results at the state level in 2013-14). A better alignment would have resulted if they had used data from both sources for the same school year. That procedure would actually have resulted in different sets of states falling into their two categories. (NCES, in their reports, does align the years, and in addition uses state test results from the same schools that participated in NAEP.) Also, Achieve may have used inconsistent data for the states, as it is not clear from their report whether the state test data they used included results for students taking alternate tests in some states but not others. Including alternate test results would also introduce an inconsistency with NAEP, which does not administer an alternate test. Finally, Achieve simply ranked states according to the point difference between the percentages proficient on the state test and NAEP, even though each state uses different tests and different scales. NCES, in its methodology, standardizes the state proficient cutpoints so states can be compared when placed on the NAEP scale, a common metric.