Uncategorized – Page 4 – Community Data Science Collective

January 11, 2025January 31, 2025

Thinking about AI harms to communities? Submit to our upcoming 4S Panel!

We advocate for consideration of harms to communities (including online communities) as they respond to, are used for, and incorporate generative AI algorithms. One area of risk is one we call ‘Social Model Collapse,’ the notion that changes to community dynamics — social and technical — can disrupt the fundamental processes they rely on to sustain themselves and flourish. We see this as a clear point of shared concern among STS, HCI, Sociology, and Communication scholars, and are hosting an open panel at the upcoming meeting of the Society for Social Studies of Science (4S), September 3-7 in Seattle. This panel is being coordinated by CDSC members Kaylea Champion (University of Washington) and Sohyeon Hwang (Princeton University) along with our colleague Hanlin Li (University of Texas).

From our call for submissions:

Model collapse in machine learning refers to the deterioration such a model faces if it is re-fed with its own output, removing variation and generating poor output; in this panel, we extend this notion to ask in what ways the use of algorithmic output in place of human participation in social systems places those social systems at risk. Recent research findings in the generation of synthetic text using large language models have fed and been fed by a rush to extract value from, and engage with, online communities. Such communities include the discussion forum Reddit, the software development communities producing open source, the participants in the question and answer forum StackExchange, and the contributors to the online knowledge base Wikipedia.

The success of these communities depends on a range of social phenomena threatened by adoption of synthetic text generation as a modality replacing human authors. Newcomers who ask naive questions are a source of members and leaders, but may shift their inquiries to LLMs and never join the community as contributors. Software communities are to some extent reliant on a sense of generalized reciprocity to turn users into contributors; such appreciation may falter if their apparent benefactor is a tireless bot. Knowledge communities are dependent on human curation, inquiry, and effort to create new knowledge, which may be systemically diluted by the presence of purported participants who are only algorithms echoing back reconstructions of the others. Meanwhile, extractive technology firms profit from anyone still engaging in a genuine manner or following their own inquiries.

In this panel, we invite consideration of current forms of social model collapse driven by a rush of scientific-industrial activity, as well as reflection on past examples of social model collapse to better contextualize and understand our present moment.

Submissions are 250-word abstracts due February 2nd; our panel is #223, “Risks of ‘Social Model Collapse’ in the Face of Scientific and Technological Advances” [Submission site link].

December 17, 2024

Puget Sound Python community hosting research talk by Kaylea Champion 12/18

Kaylea Champion will be presenting a practitioner-focused talk to the Puget Sound Python user group (PuPPY) on December 18th at 5:30 p.m. PST at the GitHub offices in Bellevue, Washington. Details on how to attend this free event are available here. This practitioner-oriented talk will draw from her own research as well findings from across the academy.

November 20, 2024November 23, 2024

On The Challenges of Governing the Online Commons

“Elinor Ostrom and the eight principles of governing the commons.” Picture: Inkylab. License: CC-BY-SA 4.0

Over the past several months (post-general exam!), I have been thinking and reading about organizational and institutional perspectives on the governance of platforms and the online communities that populate them. While much of the research on the emerging area of “platform governance”¹ draws from legal traditions or socio-technical approaches, there is also a smaller subset of scholars drawing from political science and democratic theory, thinking about designing governance structures at the level of groups, organizations, and institutions that prove resilient to various collective threats.

I think these approaches hold a lot of promise. As far as addressing one collective threat I am interested in – the strategic manipulation of information environments – most interventions I have seen have either focused on empowering individuals to be more discerning of the information they encounter online or proposing structural changes to features of platforms, such as algorithmic ranking, that dampen the virality of false or misleading information. These are, respectively, micro and macro-level interventions. The integration of participatory and distributed self-governance approaches into existing and emerging platforms is distinct: it is a meso-level intervention, and meso-level approaches remain both theoretically and empirically under-explored in discussions of platform governance.

I recently read three works that do explore this meso layer, however: Paul Gowder’s The Networked Leviathan, Nathan Schneider’s Governable Spaces, and Jennifer Forestal’s Beyond Gatekeeping. All three draw on the work of scholars that look at governance dynamics in offline spaces – in particular, the ideas of political economist Elinor Ostrom and philosopher John Dewey feature prominently – to argue that centralized platforms that practice top-down content moderation are fundamentally hostile to democratic inquiry and practice. Gowder, for example, describes this condition as a “democratic deficit” in the form of governance structures that are fundamentally unaccountable to their users. Naturally, this democratic deficit leads to negative outcomes – online spaces are easily manipulated and degraded by motivated actors. To guard against this, Gowder, Schneider, and Forestal offer various proposals for the integration of participatory structures into these platforms –ones composed of workers, civil society members, and everyday users — into platform governance and decision-marking.

I am on board with these approaches’ diagnosis of the problem, but I think the proposed solutions require more iteration. One thing I worry about is that proposals for integrating participatory and distributed governance into online platforms do not sufficiently take into account the qualitative differences between online spaces and the offline settings researchers have previously studied. When I was reading Ostrom’s Governing the Commons, for example, from which many of these interventions take at least some inspiration, I was struck by the three similarities that she noted virtually all of the common-pool resource settings she analyzed shared:

They had stable populations over long periods of time. Here’s how Ostrom describes it: “Individuals have shared a past and expect to share a future. It is important for individuals to maintain their reputations as reliable members of the community. These individuals live side by side and farm the same plots year after year. They expect their children and their grandchildren to inherit their land. In other words, their discount rates are low. If costly investments in provision are made at one point in time, the proprietors – or their families – are likely to reap the benefits.”
Norms of reciprocity and interdependence evolved in these settings among a largely similar group of individuals with shared interests. Ostrom explains: “Many of these norms make it feasible for individuals to live in close interdependence on many fronts without excessive conflict. Further, a reputation for keeping promises, honest dealings, and reliability in one arena is a valuable asset. Prudent, long-term self-interest reinforces the acceptance of the norms of proper behavior. None of these situations involves participants who vary greatly in regard to ownership of assets, skills, knowledge. ethnicity, race, or other variables that could strongly divide a group of individuals (R.Johnson and Libecap 1982).”
These cases were the success stories! Ostrom clarifies that the cases she analyzed “were specifically selected because they have endured while others have failed.” In other words, they already had sustainable resource systems and robust institutions in place.

Most (virtually all?) online platforms, and the communities that inhabit them, do not share these properties. In online spaces, individuals tend to be geographically scattered across the globe, and there’s no incentive to sustainably maintain the community for future generations to inherit, like there is with a plot of land. Moreover, members of online communities tend to have varying levels of commitment, and the anonymity and distance offered by technology makes norms of social reciprocity and interdependence harder (although not impossible) to cultivate.

The CPRs Ostrom studied were already facing uncertain and complex background conditions — but they also possessed distinct qualities conducive for success. I generally think online spaces, and the digital institutions that govern them, do not possess these qualities, and are thus even more vulnerable to threats like appropriation, pollution, or capture than the CPRs Ostrom studied. Because of this, I think a direct porting of most of Ostrom’s design principles to online governing institutions is probably insufficient. But I see an evolved set of these principles that explicitly addresses the power differentials and adversarial incentives baked into the design of social software as one way forward. What these principles could look like should be the subject of future empirical research, and maybe a future post on this blog. I am excited that researchers are exploring these meso-level interventions, which is where I think a lot of the solution lies.

Gorwa (2019) offers a definition of platform governance: “a concept intended to capture the layers of governance relationships structuring interactions between key parties in today’s platform society, including platform companies, users, advertisers, governments, and other political actors.” ↩︎

November 10, 2024

CDSC at CSCW 2024: Moderation, Bots, Taboos, and Governance Capture!

If you are attending the ACM conference on Computer-supported Cooperative Work and Social Computing (CSCW) this year CSCW in San José, Costa Rica. You are warmly invited to join CDSC members during our talks and other scheduled events. Please come say hi!

This CDSC has four papers at CSCW, which we will be presenting over the next three days:

Monday: At 11:00 am in Talamanca, Kaylea Champion will be presenting “Life Histories of Taboo Knowledge Artifacts” (full paper)

Tuesday: At 11:00 am in Central 3, Zarine Kharazian will be presenting “Governance Capture in a Self-Governing Community: A Qualitative Comparison of the Croatian, Serbian, Bosnian, and Serbo-Croatian Wikipedias” (full paper, blog post), followed by Sohyeon Hwang presenting “Adopting third-party bots for managing online communities” (full paper, blog post)

Wednesday: At 2:30 pm in Guanacaste 3, Kaylea Champion will be co-presenting “Challenges in Restructuring Community-based Moderation” (full paper, preprint)

If you’re at CSCW, feel free to get in touch in person or via Discord!

November 9, 2024

Dr. Yoel Roth: Online Safety and Security

On Oct. 23, 2024, Dr. Yoel Roth gave a lecture titled as “Decentralizing online safety and security: The promises and perils of federated social media” hosted by the Department of Human-Centered Design and Engineering at University of Washington, and a number of CDSC faculty and students were present and discussed issues of digital governance with Dr. Roth.

Years of dedicated research, activism, regulatory efforts, and investment by civil society and technology companies have increased awareness and established some control over the harmful impacts of social media platforms. Today, as social media undergoes its most significant transformation in over a decade—with alternative platforms like Mastodon, Bluesky, and Threads gaining traction in the wake of Twitter’s decline—there is optimism about these new platforms and their potential for alternative governance models that offer users greater empowerment. Dr. Roth has highlighted how self-organizing practices in these emerging communities are constructing norms for content moderation in decentralized platforms, which resonates with a number of research conducted by our research group (see Colglazier, TeBlunthuis, & Shaw, 2024).

However, these platforms retain many of the same design features and risks for misuse as mainstream platforms, yet lack the robust moderation and detection systems that have been painstakingly developed elsewhere. Additionally, significant technological, governance, and financial challenges hinder the development of these essential safeguards. Drawing on empirical research into platform moderation capacities, Dr. Roth examines the complex outcomes of this transformation in social media and suggests the following potential solutions for collective safety and addressing security risks identified: (1) institutionalize shared responses to critical harms, (2) build transparent governance into the system, (3) invest in open-source tooling, and (4) enable data sharing across instances (Roth & Lai, 2024).

References

Colglazier, C., TeBlunthuis, N., & Shaw, A. (2024, May). The Effects of Group Sanctions on Participation and Toxicity: Quasi-experimental Evidence from the Fediverse. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 18, pp. 315-328).Roth, Y., & Lai, S. (2024). Securing federated platforms: Collective risks and responses. Journal of Online Trust and Safety, 2(2).

October 23, 2024October 15, 2024

FOSSY 2024 Wrap Up: Sophia Vargas on “A review of valuation models and their application to open source models”

In the seventh talk of the Science of Community track we organized for FOSSY, Google FOSS researcher Sophia Vargas offered an overview of different strategies for measuring the value of open source (particularly in the context of a company thinking about how to engage with FOSS).

Some of Sophia’s key insights are: models for measuring one-time cost are relatively widespread (but depend on outcome metrics like lines of code rather than difficulty); understanding the cost of maintenance and community is still in formative stages; and that business leaders can make use of research-grounded models developed to measure value and risk in an academic context into decisionmaking tools within a business context.

This is part 7 of an 8-part series sharing highlights from the Science of Community track at FOSSY. Visit the FOSSY site for more bio details and an abstract of the talk.

October 23, 2024

Check Out the PhD Q&A Session!

Missed the prospective student Q&A session? Fear not, you can still hear from our faculty members, see a few examples of current students research, and listen to answers for our prospective student audience.

You can find more resources about the Community Data Science Collective below:

Still have questions for our group? Check out our people page to learn more about the faculty, students, and other members of the collective.

October 22, 2024October 14, 2024

FOSSY 2024 Wrap Up: Darius Kazemi on “Community governance models on small-to-mid-size Mastodon servers

In the sixth talk of the Science of Community track we organized for FOSSY, independent FOSS researcher Darius Kazemi described the results of an interview study to learn from the moderation teams of decentralized social network servers. One of Darius’ key observations is the extensive compliance and legally-required work that running such a server requires.

This is part 6 of an 8-part series sharing highlights from the Science of Community track at FOSSY. Visit the FOSSY site for more bio details and an abstract of the talk.

October 21, 2024October 21, 2024

FOSSY 2024 Wrap Up: Bogdan Vasilescu on “Navigating Dependency Abandonment”

In the final talk of the Science of Community track we organized for FOSSY, Computer Science professor and FOSS researcher Dr. Bogdan Vasilescu described his team’s work to understand how developers think about abandoned dependencies. One of the key insights from this work is that abandonment of dependencies is quite common, but that updating a package to remove the abandoned dependencies is very slow — and that one of the factors that drives faster replacement is when projects explicitly announce that they are ending maintenance.

This is part 8 of an 8-part series sharing highlights from the Science of Community track at FOSSY. Visit the FOSSY site for more bio details and an abstract of the talk.

October 21, 2024October 15, 2024

FOSSY 2024 Wrap Up: Kaylea Champion on “Research Says…..Insights on Building, Leading, and Sustaining Open Source”

In the fifth talk of the Science of Community track we organized for FOSSY, Dr. Kaylea Champion describes a series of research results on both how to build high-quality FOSS and how to sustain a community alongside it. One of her key insights is that a great community is no guarantee of a high-quality project — and to serve the public, we need both.

This is part 5 of an 8-part series sharing highlights from the Science of Community track at FOSSY. Visit the FOSSY site for more bio details and an abstract of the talk.