Sources of Underproduction in Open Source Software

Although the world relies on free/libre open source software (FLOSS) for essential digital infrastructure such as the web and cloud, the software that supports that infrastructure are not always as high quality as we might hope, given our level of reliance on them. How can we find this misalignment of quality and importance (or underproduction) before it causes major failures?

How can we find misalignment of quality and importance (underproduction) before it causes major failures?

In previous work, we found that underproduction is widespread in packages maintained by the Debian community, and when we shared this work in the Debian and FLOSS community, developers suggested that the age and language of the packages might be a factor, and tech managers suggested looking at the teams doing the maintenance work. Software engineering literature had found some support for these suspicions as well, and we embarked on a study to dig deeper into some of the factors associated with underproduction.

Our study was able to partially confirm this perspective using the underproduction analysis dataset from our previous study: software risk due to underproduction increases with age of both the package and its language, although many older packages and those written in older languages are and continue to be very well-maintained.

In this plot, dots represent software packages and their age, with higher underproduction factor indicating higher risk. The blue line is a smoothed average: note that we see an increase over time initially, but the trend flattens out for older packages.

This plot shows the spread of the data across the range of underproduction factor, grouped by language, where higher values are indications of higher risk. Languages are sorted from oldest on the left (Lisp) to youngest on the right (Java). Although newer languages overall are associated with lower risk, we see a great deal of variation.

However, we found the resource question more complex: additional contributors were associated with higher risk instead of decreasing it as we hypothesized. We also found that underproduction is associated with higher eigenvector centrality in the network formed if we take packages as nodes and edges by having shared maintainers; that is, underproduced packages were likely to be maintained by the same people maintaining other parts of Debian, and not isolated efforts. This suggests that these high-risk packages are drawing from the same resource pool as those which are performing well. A lack of turnover in maintainership and being maintained by a team were not statistically significant once we included maintainer network structure and age in our model.

How should software communities respond? Underproduction appears in part to be associated with age, meaning that all communities sooner or later may need to confront it, and new projects should be thoughtful about using older languages. Distributions and upstream project developers are all part of the supply chain and have a role to play in the work of preventing and countering underproduction. Our findings about resources and organizational structure suggest that “more eyeballs” alone are not the answer: supporting key resources may be of particular value as a means to counter underproduction.

This paper will be presented as part of the International Conference on Software Analysis, Evolution and Reengineering (SANER) 2024 in Rovaniemi, Finland. Preprint available HERE; code and data released HERE.

This work would not have been possible without the generosity of the Debian community. We are indebted to these volunteers who, in addition to producing Free/Libre Open Source Software software, have also made their records available to the public. We also gratefully acknowledge support from the Sloan Foundation through the Ford/Sloan Digital Infrastructure Initiative, Sloan Award 2018-11356 as well as the National Science Foundation (Grant IIS-2045055). This work was conducted using the Hyak supercomputer at the University of Washington as well as research computing resources at Northwestern University.

FLOSS project risk and community formality

What structure and rules are best for communities producing high-quality free/libre and open source software (FLOSS)? The stakes are high: cybersecurity researchers are raising the alarm about cybersecurity risk due to undermaintained components in the global software supply chain—much of which is FLOSS. In work that’s just been accepted to the IEEE International Conference on Software Analysis, Evolution and Reengineering (‘SANER’), we studied 182 Python-language packages in the GNU/Linux Debian distribution, examining the relationship between their levels of engineering formality and software risk. We found that more formal developer organization is associated with higher levels of software risk, and more widely spread developer responsibility is associated with lower levels of software risk.

We studied software risk through the underproduction metric initially developed by Champion and Hill (2021). Underproduction is a measurement of misalignment between the usage demands of a software project and the contributions of the project’s developer community. As such, underproduction measures the risk that software will be undermaintained, possibly including a security bug.

Our work examines the relationship between risk due to underproduction and governance formality. We employed measures initially developed by Tamburri et al. (2013) and later re-implemented in Tamburri et al. (2019). These metrics use multiple measures of software project formality — such as the average contributor type, usage of GitHub milestones, and age — to evaluate how formally structured a given project is.

Plot of the relationship between mean underproduction factor and mean membership type (MMT), a metric encapsulating the diffusion of merge responsibility across a project’s developer community.



We used linear regression to conclude that more formal project structures are associated with higher levels of underproduction and thus, increased project risk. We also found that the share of community-members who have merged code into the main development branch is also related to underproduction, with lower levels of underproduction correlated with larger shares of community mergers.

Evaluated together, these two conclusions suggest that operating less formally and sharing power more equally is associated with lower underproduction risk. The development of FLOSS project engineering is a process laden with tradeoffs, we hope that our conclusions can help better inform community decision making and organization.

For more details, visualizations, statistics, and more, we hope you’ll take a look at our paper. If you are attending SANER in March 2024, we hope you’ll talk to us in Rovaniemi, Finland!

—————

The full citation for the paper is:

Gaughan, Matthew, Champion, Kaylea, and & Hwang, Sohyeon. (2024) “Engineering Formality and Software Risk in Debian Python Packages.” In 31st IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER2024) (Short Paper and Posters Track). Rovaniemi, Finland.

We have also released replication materials for the paper, including all the data and code used to conduct the analyses.

This blog post and the paper it describes are collaborative work by Matt Gaughan, Kaylea Champion, and Sohyeon Hwang.

A new paper on the risk of nationalist governance capture in self-governed Wikipedia projects

Wikipedia is one of the most visited websites in the world and the largest online repository of human knowledge. It is also both a target of and a defense against misinformation, disinformation, and other forms of online information manipulation. Importantly, its 300 language editions are self-governed—i.e., they set most of their rules and policies. Our new paper asks: What types of governance arrangements make some self-governed online groups more vulnerable to disinformation campaigns? We answer this question by comparing two Wikipedia language editions—Croatian and Serbian Wikipedia. Despite relying on common software and being situated in a common sociolinguistic environment, these communities differed in how successfully they responded to disinformation-related threats.

For nearly a decade, the Croatian language version of Wikipedia was run by a cabal of far-right nationalists who edited articles in ways that promoted fringe political ideas and involved cases of historical revisionism related to the Ustaše regime, a fascist movement that ruled the Nazi puppet state called the Independent State of Croatia during World War II. This cabal seized complete control of the governance of the encyclopedia, banned and blocked those who disagreed with them, and operated a network of fake accounts to give the appearance of grassroots support for their policies.

Thankfully, Croatian Wikipedia appears to be an outlier. Though both the Croatian and Serbian language editions have been documented to contain nationalist bias and historical revisionism, Croatian Wikipedia alone seems to have succumbed to governance capture: a takeover of the project’s mechanisms and institutions of governance by a small group of users.

The situation in Croatian Wikipedia was well-documented and is now largely fixed, but still know very little about why Croatian Wikipedia was taken over, while other language editions seem to have rebuffed similar capture attempts. In a new paper that is accepted for publication in the Proceedings of the ACM: Human-Computer Interaction (CSCW), we present an interview-based study that tries to explain why Croatian was captured while several other editions facing similar contexts and threats fared better.

Short video presentation of the work given at Wikimania in August 2023.

We interviewed 15 participants from both the Croatian and Serbian Wikipedia projects, as well as the broader Wikimedia movement. Based on insights from these interviews, we arrived at three propositions that, together, help explain why Croatian Wikipedia succumbed to capture while Serbian Wikipedia did not: 

  1. Perceived Value as a Target. Is the project worth expending the effort to capture?
  2. Bureaucratic Openness. How easy is it for contributors outside the core founding team to ascend to local governance positions?
  3. Institutional Formalization. To what degree does the project prefer personalistic, informal forms of organization over formal ones?
The conceptual model from our paper, visualizing possible institutional configurations among Wikipedia projects that affect the risk of governance capture. 

We found that both Croatian Wikipedia and Serbian Wikipedia were attractive targets for far-right nationalist capture due to their sizable readership and resonance with a national identity. However, we also found that the two projects diverged early on in their trajectories in terms of how open they remained to new contributors ascending to local governance positions and the degree to which they privileged informal relationships over formal rules and processes as organizing principles of the project. Ultimately, Croatian’s relative lack of bureaucratic openness and rules constraining administrator behavior created a window of opportunity for a motivated contingent of editors to seize control of the governance mechanisms of the project. 

Though our empirical setting was Wikipedia, our theoretical model may offer insight into the challenges faced by self-governed online communities more broadly. As interest in decentralized alternatives to Facebook and X (formerly Twitter) grows, communities on these sites will likely face similar threats from motivated actors. Understanding the vulnerabilities inherent in these self-governing systems is crucial to building resilient defenses against threats like disinformation. 

For more details on our findings, take a look at the preprint of our paper.


Preprint on arxiv.org: https://arxiv.org/abs/2311.03616. The paper has been accepted for publication in Proceedings of the ACM on Human-Computer Interaction (CSCW) and will be presented at CSCW in 2024. This blog post and the paper it describes are collaborative work by Zarine Kharazian, Benjamin Mako Hill, and Kate Starbird.

Let’s talk about taboo! A new paper on how taboo shapes activity on Wikipedia

Taboo subjects—such as sexuality and mental health—are as important to discuss as they are difficult to raise in conversation. Although many people turn to online resources for information on taboo subjects, censorship and low quality information are common in search results. In work that has just been published at CSCW this week, we present a series of analyses that describe how taboo shapes the process of collaborative knowledge building on English Wikipedia. Our work shows that articles on taboo subjects are much more popular and the subject of more vandalism than articles on non-taboo topics. In surprising news, we also found that they were edited more often and were of higher quality! We also found that contributors to taboo articles did less to hide their identity than we expected.

Short video of a our presentation of the work given at Wikimania in August 2023.

The first challenge we faced in conducting our study was building a list of Wikipedia articles on taboo topics. This was challenging because while taboo is deeply cultural and can seem natural, our individual perspectives of what is and isn’t taboo is privileged and limited. In building our list, we wanted to avoid relying on our own intuition about what qualifies as taboo. Our approach was to make use of an insight from linguistics: people develop euphemisms as ways to talk about taboos. Think about all the euphemisms we’ve devised for death, or sex, or menstruation, or mental health. Using figurative languages lets us distance ourselves from the pollution of a taboo.

We used this insight to build a new machine learning classifier based on dictionary definitions in English Wiktionary. If a ‘sense’ of a word was tagged as a euphemism, we treated the words in the definition as indicators of taboo. The end result of this analysis is a series of words and phrases that most powerfully differentiate taboo from non-taboo. We then did a simple match between those words and phrases and Wikipedia article titles. We built a comparison sample of articles whose titles are words that, like our taboo articles, appear in Wiktionary definitions.

We used this new dataset to test a series of hypotheses about how taboo shapes collaborative production in Wikipedia. Our initial hypotheses were based on the idea that taboo information is often in high demand but that Wikipedians might be reluctant to associate their names (or usernames) with taboo topics. The result, we argued, would be articles that were in high demand but of low quality. What we found was that taboo articles are thriving on Wikipedia! In summary, we found in comparison to non-taboo articles:

  • Taboo articles are more popular (as expected).
  • Taboo articles receive more contributions (contrary to expectations).
  • Taboo articles receive more low-quality contributions (as expected).
  • Taboo articles are higher quality (contrary to expectations).
  • Taboo article contributors are more likely to contribute without an account (as expected), and have less experience (as expected), but that accountholders are more likely to make themselves more identifiable by having a user page, disclosing their gender, and making themselves emailable (all three of these are contrary to expectation!).

For more details, visualizations, statistics, and more, we hope you’ll take a look at our paper. If you are attending CSCW in October 2023, we also hope and come to our CSCW presentation in Minneapolis!


The full citation for the paper is: Champion, Kaylea, and Benjamin Mako Hill. 2023. “Taboo and Collaborative Knowledge Production: Evidence from Wikipedia.” Proceedings of the ACM on Human-Computer Interaction 7 (CSCW2): 299:1-299:25. https://doi.org/10.1145/3610090.

We have also released replication materials for the paper, including all the data and code used to conduct the analyses.

This blog post and the paper it describes are collaborative work by Kaylea Champion and Benjamin Mako Hill.

The social structure of new wiki communities

A new paper that our that our group has published seeks to test whether the kind of communication patterns associated with successful offline teams also predict success in online collaborative settings. Surprisingly, we find that it does not. In the rest of this blog post, we summarize that research and unpack that result.

Many of us have been part of a work team where everyone clicked. Everyone liked and respected each other, maybe you even hung out together outside of work. In a team like that, when someone asks you to cover a shift, or asks you to stay late to help them finish a project, you do it.

This anecdotal experience that many of us have is borne out by research. When members of work groups in corporate settings feel integrated into a group, and particularly when their identity is connected to their group membership, they are more willing to contribute to the group’s goals. Integrative groups (where there isn’t a strong hierarchy and where very few people are on the periphery) are also able to communicate and coordinate their work better.

One way to measure whether a group is “integrative” is to look at the group’s conversation networks, as shown in the figure below. Groups where few people are on the periphery (like on the left) usually perform better along a number of dimensions, such as creativity and productivity.

Examples of two possible configurations of a work group. The work group on the left is much more “integrative,” and we would expect it to be more creative and productive.

In our new paper, we set out to look for evidence that early online wiki communities at Fandom.com work the same way as work groups. When communities are getting started, there are lots of reasons to think that they would also benefit from integrative networks. Their members typically don’t know each other and communicate mostly via text—conditions that should make building a shared identity tough. In addition, they are volunteers who can easily leave at any time. The research on work groups made us think that integrative social structures would be especially important in making new wikis successful.

Communication network of the Spongebob wiki after 700 edits

In order to measure the social structure of these communities, we created communication networks for almost 1,000 wikis for the talk that happened during their firs 700 main page edits. Connections between people were based on who talked to whom on Talk pages. These are wiki pages connected to each page and each registered user on a wiki. We connected users who talked to each other at least a few times on the same talk pages, and looked at whether how integrative a communication network was predicted 1) how much people contributed and 2) how long a wiki remained active.

Surprisingly, we found that no matter how we measured communication networks, and no matter how we measured success, integrative network measures were not good at predicting that a wiki would survive or be productive. While a few of our control variables helped to predict productivity and survival, none of the network measures (nor all of them taken together) helped much to predict either of our success measures, as shown in Figures 5 and 6 from the paper.

Figure 5. Estimated coefficients predicting the productivity of a wiki.
Figure 6. Estimated coefficients predicting how quickly a wiki will become inactive.

So, what is going on here?

We have a few possible explanations for why communication network structures don’t seem to matter. One is that group identity for wiki members may not be influenced much by network structure. In a work group, it can be painfully obvious if you are on the periphery and not included in conversations or activities. Even though wiki conversations are technically all public and visible, in practice it’s very easy for group members to be unaware of conversations happening in other parts of the site. This idea is supported by research led by Sohyeon Hwang, which showed that people can build identity in an online community even without personal relationships.

Another complementary explanation for how groups coordinate work without integrative communication networks is that wiki software helps to organize what needs to be done without explicit communication. Much of this happens just because the central artifact of the community—the wiki—is continuously updated, so it is (relatively) clear what has been done and what needs to be done. In addition, there are opportunities for stigmergy. Stigmergy occurs when actors modifying the environment as a way of communicating. Then, others make decisions based on observing the environment. The canonical example is ants who leave pheremone trails for other ants to find and follow.

In wikis, this can be accomplished in a few ways. For example, contributors can create a link to a page that doesn’t yet exist. By default, these show up as red links, suggesting to others that a page needs to be created.

A final possible explanation for our results is based on how easy it is to join and leave online communities. It may be that integrative structures are so important because they help groups to overcome and navigate conflicts; in online communities contributors may be more likely to simply disengage instead of trying to resolve a conflict.

As we conclude in the paper:

Why do communication networks—important predictors of group performance outcomes across diverse domains—not predict productivity or survival in peer production? Our findings suggest that the relationship of communication structure to effective collaboration and organization is not universal but contingent. While all groups require coordination and undergo social influence, groups composed of different types of people or working in different technological contexts may have different communicative needs. Wikis provide a context where coordination via stigmergy may suffice and where the role of cheap exit as well as the difficulty of group-level conversation may lead to consensus-by-attrition.

We hope that others will help us to study some of these mechanisms more directly, and look forward to talking more with researchers and others interested in how and why online groups succeed.


The full citation for the paper is: Foote, Jeremy, Aaron Shaw, and Benjamin Mako Hill. 2023. “Communication Networks Do Not Predict Success in Attempts at Peer Production.” Journal of Computer-Mediated Communication 28 (3): zmad002. https://doi.org/10.1093/jcmc/zmad002.

We have also released replication materials for the paper, including all the data and code used to conduct the analyses.