I learned a new term this week: “face validity.” Generally, face validity means that a test “looks like” it will work, as opposed to “has been shown to work.” Some people use the term only to refer to a test’s validity in the eyes of observers who are not expert in testing methodologies; others have equated it with “pandering to stakeholders.”
The idea of “face validity” seems relevant to the state task force’s recommendation that we adopt the very expensive Smarter Balanced tests. The reason the Smarter Balanced tests cost so much more than the tests we’ve been using is that they use computer adaptive technology—varying the questions based on the student’s responses as the test goes along—and include time-consuming “performance tasks,” which purport to “require students to apply their learning to a real-world problem” (in a classroom, on a standardized test).
Not everyone agrees that these expensive question types measure “higher-order thinking” and “real-world problem-solving” appreciably better than multiple-choice questions do. The task force’s dissenting member, Karen Woltman, examines some criticisms of performance task assessments here. Twenty years ago, Iowa City’s H.D. Hoover wrote that “People who think that multiple-choice tests measure trivial facts and performance assessments measure higher-order processes don’t know much about measurement.” I wonder how much has changed in the interim. His talk critiquing performance tasks is a great read.
One thing is certain, though: “performance tasks” and “computer adaptivity” do sound so twenty-first century! Adding them to our tests enables the state to point to impressive-looking innovations in assessment. How much of the proposed eight-fold (or more) increase in cost is just paying for that kind of “face validity”?