Tuesday, March 31, 2015

Foregone conclusion

Here’s a little-reported story about the state assessment task force’s process. To evaluate different possible assessments, the task force created a rubric, asked vendors to respond to a request for information, then planned to score the responses. What happened, though, surprised them: No vendor submitted the Smarter Balanced Assessments for review. Of the proposals that were submitted, the Next Generation Iowa Assessments received by far the highest score. The other proposals received sufficiently low scores that the task force eliminated them from consideration.

At that point, the Next Generation Iowa Assessments became the only proposal under consideration. From the point of view of the task force, that was a problem that had to be solved. At the time, Iowa was still a member of the Smarter Balanced Assessment Consortium; in becoming a member, the state had agreed to adopt the Smarter Balanced Assessments.

The task force decided to issue another Request for Information and “to reach out to specific vendors to ask them to submit the Smarter Balanced Assessments for our review.” (Details here.) Lo and behold, a vendor submitted the Smarter Balanced Assessments for review.

Soon afterward, the state decided to withdraw from the Smarter Balanced Consortium, as a way of “respecting the Assessment Task Force’s independence and ensuring an impartial process.” A few months later, the task force recommended that the state adopt the Smarter Balanced tests.

If it had been the Iowa Testing Programs that had failed to submit a proposal in response to the Request for Information, would the task force have issued a second request? Would it have “reached out” to ask for a submission of the Next Generation Iowa Assessments for review? Or was the task force determined from the outset to recommend Smarter Balanced?


Mary Murphy said...

Did the vendor submitting the Smarter Balanced Assessment proposal have access to the proposal submitted by the Next Generation Iowa Assessments?

Matt Townsley said...

Hey, Chris. I've enjoyed reading your ATF commentary. In the spirit of nuanced dialogue, are there any aspects of the Smarter Balanced Assessments you think are valuable when compared to other options?

Chris said...

Matt – Thanks. My answer is a qualified no. Qualified because I’m not dead-set against everything about Smarter Balanced—certainly if they were free of charge, and took much less time, I’d be much readier to live with their possible imperfections—but because I have yet to be persuaded of any particular advantage they have over, say, the much less expensive Next Generation Iowa Assessments (and I haven’t even been convinced of the value of those). Maybe I could be convinced that computer adaptivity is a plus, but not by the kind of empty platitudes that appear in the task force report. And even then I’d need to be convinced that it’s such a great benefit that it’s worth the much higher price tag.

When it comes to Smarter Balanced, I admit that I am focusing on criticisms, mainly because I think there needs to be a counterbalance to the establishment-driven, heavily-funded lobbying effort to push the tests through (and my little blog makes barely a dent in creating that counterbalance).

I also think that when someone is trying to sell you something for tens of millions of dollars (and won’t even disclose the full price!), the burden should be on that seller to make a convincing case for the sale, and that any public official who might approve the purchase had better do more than just nod uncritically along with the sales pitch. (I know that there is no single entity that will reap the full cost of these tests. Still, the campaign for these tests is a type of sales job, and the price is very high.)

I don’t start with the proposition that we need an annual standardized testing regime, or that such a regime is worth whatever price tag is on it. I need to be convinced of both propositions. How could I be convinced? To start, I’d like someone to explain—in a concrete way, not just with vague generalities of the kind that appear in the task force report—just how the data is going to be used in a way that will improve kids’ educations. If a kid gets a low “problem-solving” score in the last month of the school year, what’s the response? Anything? Is the teacher going to suddenly re-teach that kid what they covered that year, in the hopes that somehow it will fix whatever problem resulted in that score? And then do that for each student in the class, individually, in the last month of school? How will it work?

Or is it just that the aggregate scores will give us some idea of where the kids stand relative to kids elsewhere? If it does, what will we do with the information? How will we know what to fix when the class is taught the following year? Focus more on “problem-solving”? How? At the expense of less focus on what? How will it work?

Or is it that we’ll use the scores to evaluate teachers, based on how their students do? What if the scores correlate with family income levels (as they will)? Does that mean teachers will be penalized for choosing to work in schools that serve poorer students? How will that work?

(I don’t mean to put you on the spot, but I’m genuinely curious to hear what a school administrator would say about how a school would use these test scores to improve a kid’s education.)


Chris said...


What about the concerns about the validity of the tests? How can we evaluate the tests’ validity if we can’t examine the questions? Even if the questions are well-designed, won’t the validity depend not just on a student’s ability, but on whether the student made a genuine effort? Won’t a seven- or eight-hour-long test mean that some students will become so fatigued that they won’t be making a genuine effort? If that risk is real, then what good are the scores?

If the test results are not going to be used in an individualized way, then why wouldn’t sampling a much smaller number of students fulfill the same function at a much lower cost?

What about concerns about the privacy of the data collected on our kids?

What is the real cost of these tests, including what it will take to ensure a tech infrastructure that won’t make the test administration a debacle? Is the task force serious that a school will be considered tech-ready if it has 30 computers on which to give a 7.5-hour test to 600 students?

What will get cut to pay for these tests? If no one will say, how can we possibly decide if the tests are worth the cost?

Maybe there are answers to those questions. If so, none of them appear in the task force’s report. I wouldn’t write an eight- or nine-figure check if I didn’t have good answers to them. What is the point of talking about the nuanced differences between Smarter Balanced and other options if those larger questions aren’t answered?

Chris said...

Mary -- Good question.

Matt Townsley said...

Chris, you said: "I don’t start with the proposition that we need an annual standardized testing regime, or that such a regime is worth whatever price tag is on it. I need to be convinced of both propositions."


"I’m genuinely curious to hear what a school administrator would say about how a school would use these test scores to improve a kid’s education."

Good questions. I've shared a few thoughts on the proposed changes to our state assessment.


I probably fall under the "if we're going to have tests, let's try to make them as meaningful as possible" category. My understanding is that NCLB (a federal mandate) requires testing and that it requires a good faith effort to assess all students in certain grades, but I could be mistaken in that it may be Iowa's implementation of NCLB rather than a requirement across all states. This may rule out the random sampling idea. If I could wave my magic wand and change our current assessments, there would be two meaningful changes:
1) Ensure the tests align with the state standards. If we're going to have tests and standards, it makes sense for these two to be aligned. I haven't read too many (any?) people suggesting our current assessments meet this criterion.
2) Utilize a criterion-referenced test rather than one designed to be norm-referenced. I would love to see schools celebrating a bunch of their kids "getting it" on the state assessment and then moving on. Our current tests are not criterion-referenced and are therefore created to sort students rather than celebrate them.

If these two changes were in place, I think schools could better use the test data for instructional program evaluation (i.e. How many of our students didn't demonstrate an understanding of x? How did we teach x? What materials did we use to teach x? Do these need to change?) It could also tell us which students are in need of additional time and support (i.e. Suzy does not have a firm grasp of x, so let's get her some extra supports to start the next school year).

Does any of this make sense?

Chris said...

Matt – Thanks for the reply. I can only reply quickly now, but will add to it later. But when you say, “If we’re going to have tests, let’s try to make them as meaningful as possible,” I think about how I could say that about so many aspects of school. If we’re going to have fine arts buildings, we should make sure we can afford orchestra teachers. If we’re going to send kids to school for six-and-a-half hours a day, we should make sure they have decent class sizes, and opportunities for curricular programs like foreign language instruction. My kids’ educations have been compromised in probably a thousand ways by the fact that resources are limited. Why is standardized testing the *only* area where it is assumed that, if we’re going to do it, we must have the “best” program, no matter how insanely expensive it is? That strikes me as the kind of argument that would have no chance of success if it weren’t backed up by an extensive lobbying campaign of the kind we’re seeing for Smarter Balanced.

Unknown said...

Mary--I don't believe any of the vendors saw the other vendors' responses prior to submitting their own. You can see the list of questions vendors were asked here.

I can say that some vendors had representatives attending many or all of the meetings, so they certainly could hear whatever we discussed about the other vendors. I might also add, for what it's worth, that these were requests for information and not requests for proposals.

Chris--amen. I'm sure the other task force members grew tired of hearing from me about orchestra, football, and German language cuts, but the reality is that public schools have limited resources and we need to prioritize.