We know very little about what goes into standardized tests, who really designs them, and how they're scored. iStockphoto
Standardized tests tied to the Common Core are under fire in lots of places for lots of reasons. But who makes them and how they're scored is a mystery.
For a peek behind the curtain, I traveled to the home of the nation's largest test-scoring facility: San Antonio.
The facility belongs to Pearson, the British-owned company that dominates the testing industry in the U.S. and is one of the largest publishing houses behind these mysterious standardized tests.
Scorers look over PARCC tests at a Pearson facility in San Antonio, Texas. Claudio Sanchez/NPR
The company scores its test results in 21 centers across the country. The one in San Antonio is the largest.
The building is located in an office park right off a long stretch of highway northeast of the city.
Inside the cavernous 58,000-square-foot space, folding tables connected end to end fill the room. About 100 scorers from all walks of life work in silence, two per table.
They're here scoring PARCC tests, which are aligned with the Common Core Standards and meant to replace old state exams. PARCC, short for Partnership for Assessment of Readiness for College and Careers, is one of two testing consortia that have helped states develop these new tests with $360 million in federal funding.
More than 5 million elementary and high school students took the PARCC test this school year in math, reading and writing.
In the facility, scorers' eyes are glued to computer screens displaying students' work from 10 states and the District of Columbia. These states belong to a consortium that pays Pearson $129 million to write the test questions and score the results.
Donna Vickers is one of the Pearson employees scoring PARCC tests. Claudio Sanchez/NPR
Donna Vickers, a retired elementary school teacher who has worked for Pearson for eight years now, says the writing portion of this test must be scored by humans, not machines.
"I'm scoring third-grade compositions, probably four questions out of maybe 50 questions, a very small portion of an entire test," Vickers says.
She looks for evidence that students understood what they read, that their writing is coherent and that they used proper grammar. But it's actually not up to Vickers to decide what score a student deserves. Instead, she relies on a three-ring binder filled with "anchor papers." These are samples of students' writing that show what a low-score or a high-score response looks like.
"I compare the composition to the anchors and see which score does the composition match more closely," Vickers says.
Pearson does not allow reporters to describe or provide examples of what students wrote because otherwise, company officials say, everybody would know what's on the test.
So here's a writing exercise Pearson did approve:
It's from a book titled Eliza's Cherry Trees: Japan's Gift to America. The task is for third-graders to describe how Eliza faced challenges to change something in America. Students must identify the main idea, draw evidence from the text and provide supporting details in what they write.
"Scoring supervisors" then make sure that the final scores are not out of whack with the so-called "true" scores from those anchor papers we mentioned earlier. Speed is also a concern, says Bob Sanders, Pearson's director of performance scoring.
"We monitor to make sure they're not scoring too fast, or too slow."
Sanders says some people need more training than others, but if scorers repeatedly fall short of the company's performance guidelines, they're fired. Since April, this scoring center has let 51 scorers go. They've not been hard to replace, though. Pay isn't bad — $12 to $15 an hour if you include bonuses. People without a four-year college degree need not apply.
Pearson officials say that last year, most of the 14,000 people it hired to score tests had at least one year of teaching experience.
But it's not required. So the job has attracted all kinds of folks. Many are stay-at-home parents and retired military who are allowed to work from home.
Then there are people like the ones I met: a former lawyer, a retired longshoreman and a bouncer who handles crowd control at concerts.
There are also people like Pat Squires, a college professor. She's been scoring tests for Pearson and other companies since 2002. She says some people approach the job thinking that scoring and grading are the same. They're not.
"In grading, oftentimes what we're doing is looking at what a student is doing and maybe marking them off for what they're not doing correctly," Squires says.
"But in scoring, you're working against a standard and you're looking to see what students have done correctly."
Still, some scorers point out that what test-makers say a third- or fifth-grader should be able to do sometimes doesn't seem right.
"We don't know how they decided whether this is a third-grade capable response," says Joe Bowker.
He did student-teaching in college and has been scoring tests on and off for several years.
"You have to leave your opinion outside the door," he says.
David Connerty-Marin, a spokesman for PARCC, says it's not up to a scorer or Pearson or PARCC to say, "Gee, we think this is too hard for a fourth-grader."
What is or is not developmentally appropriate, he says, is not an issue because the states have already made that decision based on the Common Core Standards.
"The states, with lots of educators, have reviewed the material and said, 'This is appropriate or not appropriate to the standards,' " says Connerty-Marin. "Our job is to write the test questions that measure whether the student is meeting those standards."
This week Pearson is supposed to wrap up its work on this batch of reading and writing tests. The client states will then get the raw scores, and together they must all agree on the same cut scores to determine which students are at grade level and which ones are not.
Andrew Thompson, the Pearson official who oversees the delivery of these raw scores to the states, says the crucial question is this: Will educators, parents and the public at large trust the results?
"They don't know what we're doing, so there's a lot of misconception about what we do," he says, "and we don't have a way right now to refute that [misconception] and show this is really what we're doing."
Most Americans have been in the dark, says Thompson. So the risk for Pearson, PARCC and the states is that by trying to be more transparent this late in the game, people may very well end up with more questions than answers.
It was the year 2000 and Maine's governor at the time, Angus King, was excited about the Internet. The World Wide Web was still relatively young but King wanted every student in the state to have access to it.
"Go into history class and the teacher says, 'Open your computer. We're going to go to rome.com and we're going to watch an archaeologist explore the Catacombs this morning in real time.' What a learning tool that is!"
Fast-forward a couple of years and that dream became a reality. Maine became the first, and still only, state to offer a statewide laptop program to certain grade levels.
Alison King, no relation, was just a toddler when the program launched. Back then, kids lugged big, bulky iBooks around all day. In her senior year at Gorham High School, she says she uses her laptop, now much smaller, for most of the day: "We hardly ever use paper."
Her American politics class is totally paperless. Alison's teacher, James Welsch, says when he arrived in Gorham seven years ago, he'd never seen so many computers in one classroom. Welsch says it turned the class into an interactive discussion: "It's like, we can put the world on the desk of each kid." His students write blog posts, read each other's work, and share videos and articles, all online.
Then he started to notice that when some students turned in their essays, the writing wasn't as fluid as it was when the students were putting pen to paper. "You could also see an increase in copy-and-paste," he says. "Whether it's from another student, whether it's from a piece online, digital sharing is what these guys do."
Because of that, he says, in some courses he requires his students to write out their essays by hand.
Welsch learned what a lot of teachers, researchers and policymakers in Maine have come to realize over the past 15 years: You can't just put a computer in a kid's hand and expect it to change learning.
Research has shown that "one-to-one" programs, meaning one computer for each student, increase student learning in subjects like writing, math and science when implemented the right way. Those results have prompted other states, like Utah and Nevada, to look at implementing their own one-to-one programs in recent years.
Yet, after a decade and a half, and at a cost of about $12 million annually (around 1 percent of the state's education budget), Maine has yet to see any measurable increases in statewide standardized test scores. That's part of why Maine's current governor, Paul LePage, has called the program a "massive failure."
"The fact that we're not seeing large-scale increases in student learning leads us to suspect we still need to do some work with helping schools and teachers understand and keep up with the best ways to use technology for student learning," says Amy Johnson, who researches education policy at the University of Southern Maine.
She says it's tough to measure the effects using a simple standardized test, for example, and that teacher training is necessary to get results, but the state de-emphasized some of that training in recent years.
Johnson says this has created a new kind of divide in Maine. Students in larger schools, with more resources, have learned how to use their laptops in more creative ways.
But in Maine's higher poverty and more rural schools, many students are still just using programs like PowerPoint and Microsoft Word.
Some educators also worry that new funding cuts and changes to the program's structure could leave those rural schools even further behind.
However, officials say these challenges shouldn't make people forget about the original goal of this program 15 years ago: to give every student, in every part of Maine, access to the same digital tools.
Nikolas Sharon, another student at Gorham High School, says he couldn't imagine social studies class without his laptop and Internet connection.
"I don't think I could do it, honestly. I probably would have dropped the class," he says. "I don't want to look at a newspaper. I don't even know where to get a newspaper!"
Sharon says he gets all that information right on his laptop.