Effects of Algorithmic Flagging on Fairness: Quasi-experimental Evidence from Wikipedia

Many online platforms are adopting machine learning as a tool to maintain order and high-quality information in the face of massive influxes of user-generated content. Of course, machine learning algorithms can be inaccurate, biased, or unfair. How do signals from machine learning predictions shape the fairness of online content moderation? How can we measure an algorithmic flagging system’s effects?

In our paper published at CSCW 2021, I (Nate TeBlunthuis) together with Benjamin Mako Hill and Aaron Halfaker analyzed the RCFilters system: an add-on to Wikipedia that highlights and filters edits that a machine learning algorithm called ORES identifies as likely to be damaging to Wikipedia. This system has been deployed on large Wikipedia language editions and is similar to other algorithmic flagging systems that are becoming increasingly widespread. Our work measures the causal effect of being flagged in the RCFilters user interface.

Screenshot of Wikipedia edit metadata on Special:RecentChanges with RCFilters enabled. Edits flagged by ORES are highlighted and marked with a colored circle to the left of the other metadata. Different circle and highlight colors (white, yellow, orange, and red in the figure) correspond to different levels of confidence that the edit is damaging. RCFilters does not specifically flag edits by new accounts or unregistered editors, but does support filtering changes by editor type.

Our work takes advantage of the fact that RCFilters, like many algorithmic flagging systems, creates discontinuities in the relationship between the probability that a moderator should take action and whether a moderator actually does. This happens because the output of machine learning systems like ORES is typically a continuous score (in RCFilters, an estimated probability that a Wikipedia edit is damaging), while the flags (in RCFilters, the yellow, orange, or red highlights) are either on or off and are triggered when the score crosses some arbitrary threshold. As a result, edits slightly above the threshold are both more visible to moderators and appear more likely to be damaging than edits slightly below. Even though edits on either side of the threshold have virtually the same likelihood of truly being damaging, the flagged edits are substantially more likely to be reverted. This fact lets us use a method called regression discontinuity to make causal estimates of the effect of being flagged in RCFilters.
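To make the regression discontinuity idea concrete, here is a minimal sketch in Python (an illustration, not the code from our replication materials). It assumes a hypothetical data frame of edits with a continuous ORES damage `score`, a binary `reverted` outcome, and an illustrative flagging threshold; the coefficient on the flag indicator in a local linear fit near the cutoff estimates the jump in reversion probability associated with being flagged.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative values: the real thresholds are configured per wiki and per
# flag level, and the real analysis lives in our replication materials.
THRESHOLD = 0.5   # hypothetical ORES damage score that triggers a flag
BANDWIDTH = 0.05  # only use edits whose scores fall close to the cutoff

def rd_estimate(edits: pd.DataFrame) -> float:
    """Estimate the jump in reversion probability at the flagging threshold."""
    window = edits[(edits["score"] - THRESHOLD).abs() < BANDWIDTH].copy()
    window["centered"] = window["score"] - THRESHOLD
    window["flagged"] = (window["centered"] >= 0).astype(int)
    # Local linear fit with separate slopes on each side of the cutoff;
    # the coefficient on `flagged` is the estimated discontinuity.
    model = smf.ols("reverted ~ centered * flagged", data=window).fit()
    return model.params["flagged"]
```

Bandwidth choice and functional form matter in practice; this sketch only illustrates the basic logic of comparing edits just above and just below the cutoff.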

Charts showing the probability that an edit will be reverted as a function of ORES scores in the neighborhood of the discontinuous threshold that triggers the RCFilters flag. The jump in reversion probability at the threshold is larger for registered editors than for unregistered editors at both thresholds.

To understand how this system may affect the fairness of Wikipedia moderation, we estimate the effects of flagging on edits by different groups of editors. Comparing the magnitude of these estimates lets us measure how flagging is associated with several different definitions of fairness. Surprisingly, we found evidence that these flags improved fairness for categories of editors that have been widely perceived as troublesome, particularly unregistered (anonymous) editors. This occurred because flagging has a much stronger effect on edits by registered editors than on edits by unregistered editors.
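The group comparison can be sketched in the same hypothetical terms (reusing the imports, THRESHOLD, and BANDWIDTH from the sketch above): adding an interaction with a registration indicator yields the difference between the two groups’ discontinuities, which is one way to quantify how differently flagging affects registered and unregistered editors.

```python
def rd_group_gap(edits: pd.DataFrame) -> float:
    """Estimate how much larger the flagging effect is for registered editors."""
    window = edits[(edits["score"] - THRESHOLD).abs() < BANDWIDTH].copy()
    window["centered"] = window["score"] - THRESHOLD
    window["flagged"] = (window["centered"] >= 0).astype(int)
    # `registered` is a hypothetical 0/1 indicator of account registration.
    model = smf.ols("reverted ~ centered * flagged * registered",
                    data=window).fit()
    # The flagged:registered coefficient is the difference in discontinuities
    # between registered and unregistered editors.
    return model.params["flagged:registered"]
```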

We believe that our results are driven by the fact that algorithmic flags are especially helpful for finding damage that can’t be easily detected otherwise. Wikipedia moderators can see an editor’s registration status in recent changes, watchlists, and edit histories. Because unregistered editors are often troublesome, Wikipedia moderators’ attention is often focused on their contributions, with or without algorithmic flags. Algorithmic flags make damage by registered editors (in addition to unregistered editors) much more detectable to moderators and so help moderators focus on damage overall, not just damage by suspicious editors. As a result, the algorithmic flagging system decreases the bias that moderators have against unregistered editors.

This finding is particularly surprising because the ORES algorithm we analyzed was itself demonstrably biased against unregistered editors (i.e., the algorithm tended to greatly overestimate the probability that edits by these editors were damaging). Despite the fact that the algorithm was biased, its introduction could still lead to less biased outcomes overall.

Our work shows that although it is important to design predictive algorithms not to have such biases, it is equally important to study fairness at the level of the broader sociotechnical system. Since we first published a preprint of our paper, a follow-up piece by Leijie Wang and Haiyi Zhu replicated much of our work and showed that differences between Wikipedia communities may be another important factor driving the effect of the system. Overall, this work suggests that social signals and social context can interact with algorithmic signals, and together these can influence behavior in important and unexpected ways.


The full citation for the paper is: TeBlunthuis, Nathan, Benjamin Mako Hill, and Aaron Halfaker. 2021. “Effects of Algorithmic Flagging on Fairness: Quasi-Experimental Evidence from Wikipedia.” Proceedings of the ACM on Human-Computer Interaction 5 (CSCW): 56:1-56:27. https://doi.org/10.1145/3449130.

We have also released replication materials for the paper, including all the data and code used to conduct the analysis and compile the paper itself.

Jacobs Fellowship to study new frontier in tech education

This article is reposted from Doug Parry’s article on the UW iSchool News website. The project is being driven by Stefania Druga, who is part of the Community Data Science learning team, together with Mako. Jason Yip is a friend of the group.

Partners in the AI Literacy Project funded by the Jacobs Fellowship (from top left to right): Jason Yip, Assistant Professor, iSchool, University of Washington; Stefania Druga, doctoral student, iSchool, University of Washington; Benjamin Mako Hill, Assistant Professor, Department of Communication, University of Washington; Indra Kubicek, CFO at Kids Code Jeunesse; David Moinina Sengeh, Minister of Basic and Senior Secondary Education, Government of Sierra Leone; Kate Arthur, Founder & CEO at Kids Code Jeunesse; Michael Preston, co-founder of CSforALL and Executive Director of the Joan Ganz Cooney Center at Sesame Workshop.

A decade ago, teaching kids to code might have seemed far-fetched to some, but now coding curriculum is being widely adopted across the country. Recently researchers have turned their eye to the next wave of technology: artificial intelligence. As AI makes a growing impact on our lives, can kids benefit from learning how it works?

A three-year, $150,000 award from the Jacobs Foundation Research Fellowship Program will help answer that question. The fellowship awarded to Jason Yip, an assistant professor at the University of Washington Information School, will allow a team of researchers to investigate ways to educate kids about AI.

Stefania Druga, a first-year Ph.D. student advised by Yip, is among the researchers spearheading the effort. Druga came to the iSchool after earning her master’s at the Massachusetts Institute of Technology, where she launched Cognimates, a platform that teaches children how to train AI models and interact with them.

Druga’s desire to take Cognimates to the next level brought her to the University of Washington Information School and to her advisor, Yip, whose KidsTeam UW works with children to design technology. KidsTeam treats children as equal partners in the design process, ensuring the technology meets their needs — an approach known as co-design.

At MIT, “I realized there was only so far we could go,” Druga said. “In order for us to imagine what the future interfaces of AI learning for kids would look like, we need to have this longer-term relationship and partnership with kids, and co-design with kids, which is something Jason and the team here have done very well.”

Built on the widely used Scratch programming language, Cognimates is an open-source platform that gives kids the tools to teach computers how to recognize images and text and play games. Druga hopes the next iteration will help children truly understand the concepts behind AI — what is the robot “thinking” and who taught it to think that way? Even if they don’t grow up to be programmers or software engineers, the generation of “AI natives” will need to understand how technology works in order to be critical users.

“It matters as a new literacy,” Druga said, “especially for new generations who are growing up with technologies that become so embedded in things we use on a regular basis.”

Over the course of the fellowship, the research team will work with international partners to develop an AI literacy educational platform and curriculum in multiple languages for use in different settings, in both more- and less-developed parts of the world.

Partners include Kate Arthur, CEO of Kids Code Jeunesse in Montreal; Michael Preston, executive director of the Joan Ganz Cooney Center at Sesame Workshop; David Sengeh, the minister of basic and senior secondary education for the government of Sierra Leone; and Benjamin Mako Hill, an assistant professor in the UW Department of Communication.

For Yip, the project brings the work of his Ph.D. student together with his work with KidsTeam and other recent research he has conducted on how families interact with AI.

“For me, it’s a proud moment when an advisee has a really cool vision that we can build together as a team,” Yip said. “This is a nice intersection of all of us coming together and thinking about what families need to understand artificial intelligence.”

The Jacobs Foundation fellowship program is open to early- and mid-career researchers from all scholarly disciplines around the world whose work contributes to the development and living conditions of children and youth. It’s highly competitive, with 10-15 fellowships chosen from hundreds of submissions each year.

If you are interested in getting involved with this project or supporting it in any way, you may contact us at cognimates[a]gmail.com.

Further information about this project is available here: http://cognimates.me/research