The widening gulf between large-scale and classroom assessment
Gregory J. Cizek
Next year marks the 10-year anniversary of the Common Core State Standards; the Next Generation Science Standards have been around nearly as long. The majority of states have officially adopted one or both of these sets of standards; most other states have adopted close cousins. Given this history—and the fact that next-gen standards for social science and other subjects may be looming—it seems important to consider a pernicious unintended consequence of these adoptions.
Today’s standards are meatier than their predecessors. More challenging. Demanding deeper and more complex learning. They are perhaps great standards for developing a curriculum or guiding classroom instruction, but in many respects they are proving vexing for assessment.
I know this because I am involved as a technical adviser for some of the large-scale assessment programs that are attempting to create tests aligned to these standards. Beginning in 2012, with the Smarter Balanced Assessment Consortium, I witnessed the difficult work of developing high-quality test questions and performance tasks aligned to the Common Core. An early pattern emerged: People with deep content expertise, sophisticated question-writing skill, and clear specifications had a very poor initial success rate in writing acceptable items to measure some of the new standards. Why?
Just look at the English/language arts standards. Here’s one for 7th graders: “Compare and contrast a fictional portrayal of a time, place, or character and a historical account of the same period as a means of understanding how authors of fiction use or alter history.”
And here’s one from grade 8: “Analyze a case in which two or more texts provide conflicting information on the same topic and identify where the texts disagree on matters of fact or interpretation.”
I like these standards; they seem like great skills for students to acquire. One thing that makes them challenging for assessment is that the test makers first have to find two (or three!) different reading selections. For 7th grade, one of the reading passages has to be a “fictional portrayal of a time, place, or character” and the other has to be “a historical account of the same period.” It’s possible—hard, but possible—to find such a pair of passages. However, after finding those two needles in two haystacks, the ELA experts had to write acceptable questions to test that students could actually analyze how the authors of the passages “use or alter history.”
With several years and significant resources, the success rate of the ELA experts—people who are devoted full time to this task—in developing aligned assessments has increased. It is no small accomplishment that the large-scale assessments designed to measure more rigorous and cognitively challenging standards have succeeded in doing so.
Don’t even get me started on the NGSS, which require the simultaneous consideration of three different, interrelated aspects: disciplinary core ideas, science and engineering practices, and crosscutting concepts. Yes, they represent standards that are meatier, more rigorous, and so on. No, they aren’t easy to assess. I’m not aware of any assessment program that has demonstrated it can build a science test that measures with fidelity the three-dimensional learning required. Some have had modest success with two dimensions.
But the heart of my discontent, and a concern that should unite educators and policymakers, is the gulf between how standards are assessed on large-scale tests and how they are assessed in regular classroom instruction.
Gulf. Chasm. No, I think abyss is most accurate.
Here’s the big problem: If it takes full-time, highly specialized expertise to generate a handful of standards-aligned items and tasks (with modest success), what prayer does a classroom teacher have of regularly producing them for routine classroom assessments?
In the recent past, many educators could develop classroom assessments that effectively measured student learning against the existing content standards, provided accurate information about student learning, and aided in preparing their students for their annual summative assessments. However, today’s standards have effectively put standards-aligned classroom assessments out of the reach of practicing educators. We have stolen from ELA, math, and science teachers an important tool they need to best serve their students. Social studies, foreign languages, or even gym class could be next!
Realizing the difficulty of developing adequately aligned classroom assessments, schools are likely to: 1) rely on purchasing commercially prepared classroom assessments of dubious alignment; 2) ignore the problem, and get (mis)information about student learning from (mis)aligned locally produced assessments; or 3) simply avoid classroom assessments in those content areas and get no formative information.
Given the consensus among folks in the relevant content areas that next-gen standards are an improvement, reversion is not a reasonable response. So what is needed? Three things, from different stakeholders.
As a first step, state legislatures must mandate assessment literacy for educators. Providing professional development in standards and assessment is too late; only effective integration into preservice training will give all teachers a chance to actually develop aligned assessments for classroom use. It is a national educational embarrassment that, in the majority of states, no formal training in assessment is required for licensure as a teacher, principal, or superintendent.
Second, those who develop summative assessments aligned to more-challenging standards should consider it a professional obligation to also develop tools that educators can use to mirror those challenging approaches in classroom and formative assessments. This does not mean simply offering another product for purchase, but assuming responsibility for fostering linkages between tests for accountability and accountability to those who do the daily work.
Third, researchers must investigate whether the instructional changes deemed necessary for the effectiveness of the new standards are actually taking place. If classroom practices aren’t evolving to align with new standards, any assessment information will lack meaning.
Bonus (and this is perhaps my most controversial recommendation): Policymakers may need to explore the notion that good instructional/curriculum standards do not necessarily make good assessment standards.
Taken together, these ideas may help prevent the corruption of classroom assessment. If effective classroom assessment is to be more than a fleeting catchphrase, we must work to actually empower educators in this area.
Vol. 38, Issue 29, Page 24
Published in Print: April 16, 2019, as Common Core-ruption: The Gulf Between Large-Scale and Classroom Assessment