Tor users: An untapped resource for Wikipedia?

An image displaying the message that Tor users typically receive when trying to make edits on Wikipedia, stating that the user’s IP address has been identified as a Tor exit node, and that “editing through Tor is blocked to prevent abuse.”

Like everyone else, Internet users who protect their privacy by using the anonymous browsing software Tor are welcome to read Wikipedia. However, when Tor users try to contribute to the self-described “encyclopedia that anybody can edit,” they typically come face-to-face with a notice explaining that their participation is not welcome.

Our new paper, led by Chau Tran at NYU and authored by a group of researchers from the University of Washington, the Community Data Science Collective, Drexel, and New York University, was published and presented this week at the IEEE Symposium on Security & Privacy. It provides insight into what Wikipedia might be missing out on by blocking Tor. By comparing contributions from Tor that slip past Wikipedia’s ban to edits made by other types of contributors, we found that Tor users make contributions to Wikipedia that are just as valuable as those made by new and unregistered Wikipedia editors. We also found that Tor users are more likely to engage with certain controversial topics.

One-minute “Trailer” for our paper and talk at the IEEE Symposium on Security & Privacy. Video was produced by Tommy Ferguson at the UW Department of Communication.

To conduct our study, we first identified more than 11,000 Wikipedia edits made by Tor users who were able to bypass Wikipedia’s ban on contributions from Tor between 2007 and 2018. We then used a series of quantitative techniques to evaluate the quality of these contributions. We found that Tor users made contributions that were similar in quality to, and in some senses even better than, contributions made by other users without accounts and newcomers making their first edits.

An image from the study showing the differences in topics edited by Tor users and other Wikipedia users. The image suggests that Tor users are more likely to edit pages discussing topics such as politics, religion, and technology. Other types of users, including IP, First-time, and Registered editors, are more likely to edit pages discussing topics such as music and sports.

We used a range of analytical techniques including direct parsing of article histories, manual inspections of article changes, and a machine learning platform called ORES to analyze contributions. We also used a machine learning technique called topic modeling to analyze Tor users’ areas of interest by checking their edits against clusters of keywords. We found that Tor-based editors are more likely than other users to focus on topics that may be considered controversial, such as politics, technology, and religion.
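To give a rough sense of what "checking edits against clusters of keywords" can look like, here is a toy sketch in Python. It is not the pipeline used in the paper: real topic models learn their keyword clusters from the data, whereas the clusters below are hand-written, and the names `TOPIC_KEYWORDS`, `best_topic`, and `topic_shares` are invented purely for illustration.

```python
import re
from collections import Counter

# Hypothetical, hand-written keyword clusters (assumption for illustration only).
# A real topic model would learn clusters like these from the edit text itself.
TOPIC_KEYWORDS = {
    "politics": {"election", "senate", "government", "policy"},
    "religion": {"church", "faith", "scripture", "temple"},
    "technology": {"software", "encryption", "network", "browser"},
    "music": {"album", "band", "guitar", "song"},
    "sports": {"league", "goal", "season", "team"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def best_topic(text):
    """Return the topic whose keywords appear most often, or None if none appear."""
    counts = Counter(tokenize(text))
    scores = {
        topic: sum(counts[word] for word in words)
        for topic, words in TOPIC_KEYWORDS.items()
    }
    topic, score = max(scores.items(), key=lambda item: item[1])
    return topic if score > 0 else None


def topic_shares(edits):
    """Return the fraction of labeled edits that fall under each topic."""
    labels = [best_topic(edit) for edit in edits]
    labeled = [label for label in labels if label is not None]
    if not labeled:
        return {}
    return {topic: n / len(labeled) for topic, n in Counter(labeled).items()}
```

Running `topic_shares` separately on the text of edits from each group of editors would give per-group topic proportions that could then be compared, which is the general shape of the comparison shown in the figure above.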

In a closely connected study led by Kaylea Champion and published several months ago in the Proceedings of the ACM on Human-Computer Interaction (CSCW), we conducted a forensic qualitative analysis of contributions in the same dataset. Our results in that study are described in a separate blog post about that project and paint a complementary picture of Tor users engaged, in large part, in uncontroversial and quotidian types of editing behavior.

Across the two papers, our results are consistent with other work suggesting that Tor users closely resemble other internet users. For example, one previous study has shown that Tor users frequently visit websites in the Alexa top one million.

Much of the discourse about anonymity online tends toward extreme claims backed up by very little in the way of empirical evidence or systematic study. Our work is a step toward remedying this gap and has implications for many websites that limit participation by users of anonymous browsing software like Tor. In the future, we hope to conduct similar systematic studies in contexts beyond Wikipedia.

Video of the conference presentation at the IEEE Symposium on Security & Privacy 2020 by Chau Tran.

In terms of Wikipedia’s own policy decisions about anonymous participation, we believe that our paper suggests that, given the value of their contributions, the benefits of a “pathway to legitimacy” for Tor contributors to Wikipedia might exceed the potential harm. We are particularly excited about exploring ways to allow contributions from anonymity-seeking users under certain conditions: for example, requiring review prior to changes going live. Of course, these are questions for the Wikipedia community to decide, but it’s a conversation that we hope our research can inform and that we look forward to participating in.

Authors of the paper, “Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor,” include Chau Tran (NYU), Kaylea Champion (UW & CDSC), Andrea Forte (Drexel), Benjamin Mako Hill (UW & CDSC), and Rachel Greenstadt (NYU). The paper was published at the 2020 IEEE Symposium on Security & Privacy, held May 18 to 20. Originally to be held in San Francisco, the event was held digitally due to the COVID-19 pandemic. This blog post borrows with permission from this news release by Andrew Laurent at NYU.

Paper Citation: Tran, Chau, Kaylea Champion, Andrea Forte, Benjamin Mako Hill, and Rachel Greenstadt. “Are Anonymity-Seekers Just like Everybody Else? An Analysis of Contributions to Wikipedia from Tor.” In 2020 IEEE Symposium on Security and Privacy (SP), 1:974–90. San Francisco, California: IEEE Computer Society, 2020.

The research was funded by the National Science Foundation.
