Students and experts say the formula the International Baccalaureate program used to generate grades may be discriminatory.
By Avi Asher-Schapiro
NEW YORK, July 21 (Thomson Reuters Foundation) - When Colorado high school student Isabel Castaneda checked her final grades for the International Baccalaureate program in July, she was shocked.
Despite being one of the top-ranking students in her public school, she failed a number of courses - including high-level Spanish, her native language.
The International Baccalaureate (IB) program - a global standard of educational testing that also allows U.S. high-school students to obtain college credit - cancelled its exams in May, due to the coronavirus pandemic.
Instead of sitting final exams, which usually account for the majority of students' scores, students were assigned their marks based on a mathematical "awarding model", as described by the IB program.
"I come from a low-income family - and my entire last two years were driven by the goal of getting as many college credits as I could to save money on school," Castaneda said in a phone interview. "When I saw those scores, my heart sank."
The COVID-19 pandemic has disrupted exams all over the world, and educational institutions have adapted in a range of ways, from moving tests online to asking students to wear protective gear during testing.
Relying on an algorithm to help determine results comes with its own specific risks, researchers warn.
Depending on the kinds of data the model considers, and how it makes predictions, it has the potential to reproduce - or even exacerbate - existing patterns of inequality for low-income and minority students, they say.
About 160,000 students take IB courses every year, including nearly 90,000 in the United States - and almost 60% of public schools that offer IB in the U.S. are "Title I" schools, with significant low-income student populations, according to the program.
"The choice to use a statistical model in place of a traditional examination warrants several concerns," said Esther Rolf, a PhD candidate at the University of California-Berkeley, who studies algorithmic fairness.
"Using historical records ... often leads to bias against individuals from historical underprivileged groups."
IB spokesman Dan Rene shared with the Thomson Reuters Foundation an explanation of the model, which relied on three main components.
They were coursework, predictions teachers made about how students would perform on the exam, and the "school context", which included historical data on predicted results and on performance in past coursework for each subject.
"This process was subjected to rigorous testing by educational statistic specialists," the spokesman said in an emailed statement.
The IB also released a May statistical bulletin showing that average scores in 2020 were in line with previous years, and said it had a process to "review extraordinary cases".
LOST COLLEGE OFFERS
In previous years, students' grades have been generated by combining final exams graded by IB and coursework marked by their teachers - which the IB spot-checked, according to its website.
Teachers also make predictions about their students' final grades, which students can use to secure provisional college admissions before taking their final exams.
"[A] school's own record was built into the model" by using "historical data to model predicted grade accuracy, as well as the record of the school to do better or worse on examinations compared with coursework," the IB's statement noted.
Although the IB insists its model is not an algorithm, experts say it is.
Joe Lumsden, secondary principal at Stonehill International School in Bangalore, India, worried that an entire school's record might not be an accurate indicator of an individual student's performance or potential.
"If there are bright students at a struggling school that's never performed well before, the algorithm could have pulled their scores down - we just don't know," said Lumsden.
Several students, as well as parents and teachers, have told the Thomson Reuters Foundation that they have had university offers contingent on certain scores rescinded since the final exam results were published.
TESTING IN A CRISIS
Many testing services have been forced to change their procedures as a result of the coronavirus pandemic.
The College Board, the U.S. non-profit that runs the Advanced Placement (AP) exams - which allow high-school students to earn credit for some U.S. college courses - moved the process online.
The ACT, another exam used in U.S. college admissions, has postponed its testing. Other tests - including a number of state bar exams - have also been moved online.
Iris Palmer, a senior advisor with the Education Policy program at New America, a Washington-based think tank, said she had never heard of a statistical model being used to assign grades.
"The way we use algorithms in education can be especially problematic if there is bias," she said. "The results can determine the course of the rest of your life."
She was particularly worried about how the algorithm may have weighed the historical performance of a school when assigning this year's students' grades.
"In schools with a lot of turnover, or without a lot of resources, this could really not work well," she said.
"Students ... who are black or low-income are probably at a disadvantage from the algorithm," Palmer added.
Nicol Turner Lee, director of the Center for Technology Innovation at the Brookings Institution think tank, agreed it can be hard to build a fair model out of past educational data, given the inequality already baked into the educational system.
"You have to start with the assumption that the algorithm is flawed," she said.
"By default, it has a problem because the data is generated by the discriminatory outcomes our educational system already produces."
More than 20,000 students have signed a petition to the IB, protesting the algorithm.
"I think this is discrimination," said one IB teacher at a U.S. public school, who asked not to be named because they were not authorized to speak to the press.
"They are applying what other teachers and other groups of students did and projecting it on to these kids."
Grace Abuhamad, a public policy advisor at the Canadian artificial intelligence firm Element AI who has studied bias in credit score algorithms, said the IB's decision to try to build a school's past performance into the model makes some sense.
"School context could help balance out fraud," she said, explaining that certain schools could inflate their students' coursework scores or predict grades that were unrealistic, and that the model needed to take that into account.
For Castaneda, the final IB results mean she will not receive the college credits she was expecting when she attends Colorado State University in the fall. Those credits would have allowed her to skip some lower-level university classes and graduate faster.
"It's going to cost me thousands of dollars," she said.
(Reporting by Avi Asher-Schapiro @AASchapiro; Editing by Jumana Farouky and Zoe Tabary. Please credit the Thomson Reuters Foundation, the charitable arm of Thomson Reuters, that covers the lives of people around the world who struggle to live freely or fairly. Visit http://news.trust.org)