Community Data Science Collective

March 19, 2025

Community Dialogue on The Role of Community Governance

Join the Community Data Science Collective (CDSC) for our 11th Science of Community Dialogue! This Community Dialogue will take place on April 4th at 12:00 pm CT. This Dialogue focuses on resisting online information manipulation and the role of community governance. Professor Paul Gowder (Northwestern University) will join Zarine Kharazian (University of Washington) to present recent research on topics including:

Exploring threats like misinformation and propaganda in online communities.
Limitations of approaches that neglect community governance.
Tradeoffs in governance models, such as those of Facebook, Bluesky, and Wikipedia.
Strategies to protect information commons.
Participatory governance for platforms.
Insights on democratizing platforms and society.

A full session description is on our website. Register online.

What is a Dialogue?

The Science of Community Dialogue Series is a series of conversations between researchers, experts, community organizers, and other people who are interested in how communities work, collaborate, and succeed. You can watch this short introduction video with Aaron Shaw.

What is the CDSC?

The Community Data Science Collective (CDSC) is an interdisciplinary research group made up of faculty and students at the University of Washington Department of Communication, the Northwestern University Department of Communication Studies, the Carleton College Computer Science Department, the School of Information at UT Austin, and the Purdue University School of Communication.

Learn more

If you’d like to learn more or get future updates about the Science of Community Dialogues, please join the low volume announcement list.

March 17, 2025

The Relationship Between Facebook and Tradition or How a Peacock Invaded a Garden

A question that’s been bugging me for the past few months is how a digital medium can amplify remnants of traditional pre-modern life.

I’m only a recent addition to the Collective and, despite thinking a lot about digitality and how it affects social processes, I’ve only marginally been part of online communities myself.

However, living back and forth, in a perpetual state of limbo, between Greece and the US for the past two and a half years pursuing graduate school, I’ve inevitably become more active online.

That is when I noticed all these Facebook groups popping up in my feed (yes, I am one of those people who find comfort in the platform’s slower mechanics and less engaging feed).

Groups about life in one’s Greek village, groups about homemade traditional delicacies (as grotesque as lamb intestines and brains), groups about tsipouro (the Greek eau de vie, rakı, grappa, moonshine, …).

Those groups seem to be composed of active middle-aged members who barely know how to use the medium in the socially acceptable ways devised by my generation (one of the first ones in Greece to go online), yet spend hours posting facts about how to make cheese or definitions of words no longer in use, commenting on each other’s drinking habits, discussing the “proper” way to make moussaka, or arguing about the “proper” meze (small dishes, appetizers, tapas, antipasti, …) for tsipouro. On top of that, youngsters invade them as bystanders, looking, smiling, laughing, making fun of the surreal discussions, uppercase comments, text-to-speech mistakes; chaos.

I’m afraid that these groups are part of the general Zeitgeist in the country: disappointment and longing for long-lost glory, calmness, or simplicity. It is evident in parts of its cultural production. Books about 19th century Greece, movies and TV series about times of old. A remaining obsession with ancient times. Music referring to the country’s folk traditions.

Relatedly, and perhaps subconsciously associated, I recently found myself at a concert by the promising Themos Skandamis. Skandamis produces neo-folk (might I say avantgardish [?]) songs, which he writes himself. During his concert, comedic elements imitating the local Cretan accent entered his performance. At the end, as a farewell, he beautifully sang a cappella a traditional song I had never heard before.

Its lyrics follow (original sourced from here, and freely translated by myself with the benevolent help and suggestions of GPT-4):

Για δες περβό… για δες περβόλιν όμορφο,
για δες κατάκρυα βρύση το περιβόλι μας
για δες κατάκρυα βρύση το περβό… τ’ όριο περβόλι μας τ’ όμορφο.

Κι όσα δέντρα κι όσα δέντρα ‘πεμψεν ο Θιος,
μέσα είναι φυτεμένα το περιβόλι μας
μέσα είναι φυτεμένα το περβό… τ’ όριο περβόλι μας τ’ όμορφο.

Κι όσα πουλιά κι όσα πουλιά πετούμενα,
μέσα είναι φωλεμένά το περιβόλι μας
μέσα είναι φωλεμένα το περβό… τ’ όριο περβόλι μας τ’ όμορφο.

Μέσα σε ‘κεί… μέσα σε ‘κείνα (ν)τα πουλιά,
εβρέθη ένα παγώνι το παγωνάκι μας
εβρέθη ένα παγώνι το παγώ… τ’ όριο παγώνι μας τ’ όμορφο.

Και χτίζει τη και χτίζει τη φωλίτσα του,
σε μιας μηλιάς κλωνάρί το παγωνάκι μας
σε μιας μηλιάς κλωνάρι το παγώ… τ’ όριο παγώνι μας τ’ όμορφο.

Behold the garden… behold our garden’s beauty,
behold the cool spring, our garden
behold the cool spring, our gard… our garden’s beauty.

And all the trees, all the trees God has sent,
are planted in, our garden
are planted in, our gard… our garden’s beauty.

And all the birds, all the birds that roam the skies,
have nested in, our garden
have nested in, our gard… our garden’s beauty.

Among those birds… among those birds,
a peacock found, our little peacock,
a peacock found, the pea… our peacock’s beauty.

And it builds its… and it builds its nest,
on the apple tree’s branch, our peacock
on the apple tree’s branch, the pea… our peacock’s beauty.

I kept wondering what it meant and, most importantly, why a traditional folk song would mark the end of the concert.

And then it struck me: there is a clear metaphor, a parallelism, between the peacock and digitality. The peacock appears out of context, yet builds its nest in the garden. Digital media invade our analog lives and impose themselves; they become addictively habitual and naturalized.

What is interesting though here is the reversal of roles. The digital sphere, now established, turned into a garden, leaves room for trad life to invade post-modernity; whether it is a cause or an effect of the world’s turmoil remains to be seen.

On the other hand, perhaps I’m thinking too much.

February 27, 2025 / February 26, 2025

Decentralization and the social web revisited

Diagram of centralized, decentralized (hierarchical), and distributed communications networks from Paul Baran’s 1964 article “On distributed communications networks” (https://doi.org/10.1109/TCOM.1964.1088883), referenced by Christine Lemmer-Webber.

Last Spring, I hosted a “thought leader dialogue” on decentralizing social media through the Northwestern Center for Human-Computer Interaction and Design (HCI+D). It was fantastic and I highly recommend you check out the video.

Afterwards, two of the eponymous “thought-leaders” who headlined that session, Christine Lemmer-Webber and Bryan Newbold, followed up with each other in a series of blog posts about decentralization, Bluesky, and the Fediverse.

How decentralized is Bluesky really? by Christine Lemmer-Webber.
Reply on Bluesky and decentralization by Bryan Newbold.
Re: Re: Bluesky and decentralization by Christine Lemmer-Webber

I recently re-read these while preparing for a guest lecture in an undergraduate class here at NU. Fair warning: Despite the subtext of a debate about systems mainly used for micro-blogging, the posts are very long. That said, if you care about these topics (or are maybe just curious to learn more), all three are pure gold.

It helps to understand a bit about exactly who Christine and Bryan are or, more specifically, why they have some of the most informed and interesting perspectives on the topic. Christine helped to create the ActivityPub protocol and standard that is at the heart of the Fediverse (Mastodon, Lemmy, Pixelfed, PeerTube). She currently leads the Spritely Institute, pursuing the design and implementation of the next generation of decentralized communication tech. Bryan is the protocol engineer at Bluesky and, as such, one of the leaders building out the AT Protocol, which powers Bluesky and a whole other small-but-seemingly-growing ecosystem of applications.

The crux of their conversation revolves around the competing visions for the future of the social web being pursued within the respective ActivityPub and AT Protocol “universes.” The posts underscore some of the more profound differences between the two, in particular the questions around “shared heap” vs. “message passing” architectures, the relative degree of (de)centralization each system affords (this is where the diagram originally created by Paul Baran and reproduced at the top of this post comes in), and (more implicitly) the theories of change motivating the design and implementation choices prioritized in their respective work.
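For readers who have not met the terms before, here is a toy Python sketch of the "message passing" vs. "shared heap" distinction as I understand it from the posts. It is only an illustration under simplifying assumptions: the class and method names are hypothetical, nothing here is real ActivityPub or AT Protocol code, and both real systems are far more involved.

```python
from collections import defaultdict


class MessagePassingNetwork:
    """Toy model: each post is delivered only to its addressed recipients."""

    def __init__(self):
        self.inboxes = defaultdict(list)  # recipient -> list of (author, text)
        self.deliveries = 0               # how many copies were sent in total

    def publish(self, author, text, recipients):
        # The sender does the routing work: one delivery per recipient.
        for recipient in recipients:
            self.inboxes[recipient].append((author, text))
            self.deliveries += 1

    def timeline(self, reader):
        # A reader only ever sees what was addressed to them.
        return list(self.inboxes[reader])


class SharedHeapNetwork:
    """Toy model: every post lands in one shared pile that readers filter."""

    def __init__(self):
        self.heap = []                     # every post from everyone
        self.following = defaultdict(set)  # reader -> authors they follow

    def follow(self, reader, author):
        self.following[reader].add(author)

    def publish(self, author, text):
        # No addressing: the post simply joins the shared pile.
        self.heap.append((author, text))

    def timeline(self, reader):
        # The work shifts to whoever scans the whole pile and filters it.
        return [(a, t) for a, t in self.heap if a in self.following[reader]]


# Hypothetical usage with made-up accounts.
mp = MessagePassingNetwork()
mp.publish("alice", "hello", recipients=["bob"])

sh = SharedHeapNetwork()
sh.follow("bob", "alice")
sh.publish("alice", "hello")
sh.publish("carol", "unrelated")
```

In the first model the cost of a post scales with the number of recipients it is addressed to; in the second, someone has to hold and scan the entire pile. That is roughly where the (de)centralization questions in the posts come from.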

It’s worth noting that both Bryan and Christine are very careful and intentional to call out the overlaps between ActivityPub and AT Proto as well. I also appreciate how respectful and thoughtful they each are given that there are some substantive and high-stakes areas of disagreement being addressed in their conversation.

If you read the posts (and I hope you do!) and want to share your responses, please do so in the comments, via email, Toot, Skeet, or sundry other means. There are several of us in the lab pursuing work in these areas and I’m eager to understand perspectives on the topic.

January 17, 2025

Exit, Voice, and Fork: From One Community to a Network of Groups

Everyone knows that making friends can be a bit daunting as a new student (especially if you’re an international student or not from the area). With that in mind, a while after I arrived in Chicago to begin my PhD program last Fall, a Brazilian friend and I made a Northwestern WhatsApp group for international graduate students! Since the point of the group was to make friends, we were pretty laid back in there (still are!), and people mostly shared events across campus and the cities of Evanston and Chicago – especially those offering free food.

At some point, the group started to grow fast, and my friend and I lost track of people who were joining. It reached 81(!) members. Eventually, we started to make sub-groups. First, we made a group for the Brazilian grad students; then another group was made for the women called “Girl Energy”; later, one of my friends made a group only for Latinos (the name is interestingly only “🥳🥳🥳”). Lastly, the group “Friendos” emerged, including our guy friends this time!

After a while, activity on the all international grad students group died down. It goes weeks without a message. People have basically spread out into smaller groups. Now… Why is that?

I went to my advisor (Aaron Shaw) and we basically started geeking out. I didn’t plan on doing some random experiment, but I accidentally observed what is called a “fork” of online communities!

To better explain this, let’s go back to the 1970s. Albert O. Hirschman published the influential book Exit, Voice, and Loyalty: Responses to Decline in Firms, Organizations, and States. In the book, Hirschman makes the argument that consumers can show their dissatisfaction in two ways: they can either exit (stop using that service or buying that product) or they can use voice (communicate a complaint and try to suggest a change). The simplicity of this argument makes it applicable in a range of different fields, such as “personal relationships, emigration, workplace relations, political parties, as well as public policy” (Dowding, 2016). And, more recently, online communities!

There are a lot of works based on Hirschman’s book, and most recently the idea of “fork” has also been added as a concept together with exit and voice:

“Forking is a form of group secession (exit) that takes an existing set of institutions and creates a new ‘society’ with a shared history but divergent futures.” (Berg and Berg, 2020)

The term originally comes from open source software communities, where developers are allowed to copy a code repository, work on it separately, modify it, and release it in different forms. Seeing it this way, it makes a lot of sense that it could be applied to online communities as well. In fact, studies related to online community migration keep growing. For example, Fiesler and Dym (2020) explored how transformative fandom communities migrate across platforms over a period of 20 years. Migration was driven by changes in platform policies, user needs, or technical issues. Their work highlights how these migrations can lead to social fragmentation, the loss of shared cultural artifacts, and the reformation of communities in new spaces.

In our case, the international graduate students’ group provided the foundation for forming meaningful connections. However, as people developed closer friendships and found more specific communities of interest (e.g., Brazilian grad students or Girl Energy), the need for the larger, general group diminished. This isn’t necessarily a sign of failure for the original community but rather an indication of its success in fulfilling its initial purpose!

It’s fascinating to observe how the lifecycle of online communities can parallel theoretical concepts like Hirschman’s exit, voice, and loyalty and grow from there. These frameworks help explain not just why communities evolve but also how users actively shape their social environments to meet their changing needs. Have you noticed similar patterns in the communities you’re a part of?

January 11, 2025 / January 31, 2025

Thinking about AI harms to communities? Submit to our upcoming 4S Panel!

We advocate for consideration of harms to communities (including online communities) as they respond to, are used for, and incorporate generative AI algorithms. One area of risk is one we call ‘Social Model Collapse,’ the notion that changes to community dynamics — social and technical — can disrupt the fundamental processes they rely on to sustain themselves and flourish. We see this as a clear point of shared concern among STS, HCI, Sociology, and Communication scholars, and are hosting an open panel at the upcoming meeting of the Society for Social Studies of Science (4S), September 3-7 in Seattle. This panel is being coordinated by CDSC members Kaylea Champion (University of Washington) and Sohyeon Hwang (Princeton University) along with our colleague Hanlin Li (University of Texas).

From our call for submissions:

Model collapse in machine learning refers to the deterioration such a model faces if it is re-fed with its own output, removing variation and generating poor output; in this panel, we extend this notion to ask in what ways the use of algorithmic output in place of human participation in social systems places those social systems at risk. Recent research findings in the generation of synthetic text using large language models have fed and been fed by a rush to extract value from, and engage with, online communities. Such communities include the discussion forum Reddit, the software development communities producing open source, the participants in the question and answer forum StackExchange, and the contributors to the online knowledge base Wikipedia.

The success of these communities depends on a range of social phenomena threatened by adoption of synthetic text generation as a modality replacing human authors. Newcomers who ask naive questions are a source of members and leaders, but may shift their inquiries to LLMs and never join the community as contributors. Software communities are to some extent reliant on a sense of generalized reciprocity to turn users into contributors; such appreciation may falter if their apparent benefactor is a tireless bot. Knowledge communities are dependent on human curation, inquiry, and effort to create new knowledge, which may be systemically diluted by the presence of purported participants who are only algorithms echoing back reconstructions of the others. Meanwhile, extractive technology firms profit from anyone still engaging in a genuine manner or following their own inquiries.

In this panel, we invite consideration of current forms of social model collapse driven by a rush of scientific-industrial activity, as well as reflection on past examples of social model collapse to better contextualize and understand our present moment.

Submissions are 250-word abstracts due February 2nd; our panel is #223, “Risks of ‘Social Model Collapse’ in the Face of Scientific and Technological Advances” [Submission site link].

December 17, 2024

Puget Sound Python community hosting research talk by Kaylea Champion 12/18

Kaylea Champion will be presenting a practitioner-focused talk to the Puget Sound Python user group (PuPPY) on December 18th at 5:30 p.m. PST at the GitHub offices in Bellevue, Washington. Details on how to attend this free event are available here. The talk will draw from her own research as well as findings from across the academy.

November 20, 2024 / November 23, 2024

On The Challenges of Governing the Online Commons

“Elinor Ostrom and the eight principles of governing the commons.” Picture: Inkylab. License: CC-BY-SA 4.0

Over the past several months (post-general exam!), I have been thinking and reading about organizational and institutional perspectives on the governance of platforms and the online communities that populate them. While much of the research on the emerging area of “platform governance”¹ draws from legal traditions or socio-technical approaches, there is also a smaller subset of scholars drawing from political science and democratic theory, thinking about designing governance structures at the level of groups, organizations, and institutions that prove resilient to various collective threats.

I think these approaches hold a lot of promise. As far as addressing one collective threat I am interested in – the strategic manipulation of information environments – most interventions I have seen have either focused on empowering individuals to be more discerning of the information they encounter online or proposing structural changes to features of platforms, such as algorithmic ranking, that dampen the virality of false or misleading information. These are, respectively, micro and macro-level interventions. The integration of participatory and distributed self-governance approaches into existing and emerging platforms is distinct: it is a meso-level intervention, and meso-level approaches remain both theoretically and empirically under-explored in discussions of platform governance.

I recently read three works that do explore this meso layer, however: Paul Gowder’s The Networked Leviathan, Nathan Schneider’s Governable Spaces, and Jennifer Forestal’s Beyond Gatekeeping. All three draw on the work of scholars who look at governance dynamics in offline spaces – in particular, the ideas of political economist Elinor Ostrom and philosopher John Dewey feature prominently – to argue that centralized platforms that practice top-down content moderation are fundamentally hostile to democratic inquiry and practice. Gowder, for example, describes this condition as a “democratic deficit” in the form of governance structures that are fundamentally unaccountable to their users. Naturally, this democratic deficit leads to negative outcomes – online spaces are easily manipulated and degraded by motivated actors. To guard against this, Gowder, Schneider, and Forestal offer various proposals for integrating participatory structures (ones composed of workers, civil society members, and everyday users) into platform governance and decision-making.

I am on board with these approaches’ diagnosis of the problem, but I think the proposed solutions require more iteration. One thing I worry about is that proposals for integrating participatory and distributed governance into online platforms do not sufficiently take into account the qualitative differences between online spaces and the offline settings researchers have previously studied. When I was reading Ostrom’s Governing the Commons, for example, from which many of these interventions take at least some inspiration, I was struck by the three similarities that she noted virtually all of the common-pool resource settings she analyzed shared:

They had stable populations over long periods of time. Here’s how Ostrom describes it: “Individuals have shared a past and expect to share a future. It is important for individuals to maintain their reputations as reliable members of the community. These individuals live side by side and farm the same plots year after year. They expect their children and their grandchildren to inherit their land. In other words, their discount rates are low. If costly investments in provision are made at one point in time, the proprietors – or their families – are likely to reap the benefits.”
Norms of reciprocity and interdependence evolved in these settings among a largely similar group of individuals with shared interests. Ostrom explains: “Many of these norms make it feasible for individuals to live in close interdependence on many fronts without excessive conflict. Further, a reputation for keeping promises, honest dealings, and reliability in one arena is a valuable asset. Prudent, long-term self-interest reinforces the acceptance of the norms of proper behavior. None of these situations involves participants who vary greatly in regard to ownership of assets, skills, knowledge, ethnicity, race, or other variables that could strongly divide a group of individuals (R. Johnson and Libecap 1982).”
These cases were the success stories! Ostrom clarifies that the cases she analyzed “were specifically selected because they have endured while others have failed.” In other words, they already had sustainable resource systems and robust institutions in place.

Most (virtually all?) online platforms, and the communities that inhabit them, do not share these properties. In online spaces, individuals tend to be geographically scattered across the globe, and there’s no incentive to sustainably maintain the community for future generations to inherit, like there is with a plot of land. Moreover, members of online communities tend to have varying levels of commitment, and the anonymity and distance offered by technology make norms of social reciprocity and interdependence harder (although not impossible) to cultivate.

The common-pool resources (CPRs) Ostrom studied were already facing uncertain and complex background conditions, but they also possessed distinct qualities conducive to success. I generally think online spaces, and the digital institutions that govern them, do not possess these qualities, and are thus even more vulnerable to threats like appropriation, pollution, or capture than the CPRs Ostrom studied. Because of this, I think a direct porting of most of Ostrom’s design principles to online governing institutions is probably insufficient. But I see an evolved set of these principles that explicitly addresses the power differentials and adversarial incentives baked into the design of social software as one way forward. What these principles could look like should be the subject of future empirical research, and maybe a future post on this blog. I am excited that researchers are exploring these meso-level interventions, which is where I think a lot of the solution lies.

¹ Gorwa (2019) offers a definition of platform governance: “a concept intended to capture the layers of governance relationships structuring interactions between key parties in today’s platform society, including platform companies, users, advertisers, governments, and other political actors.”

November 10, 2024

CDSC at CSCW 2024: Moderation, Bots, Taboos, and Governance Capture!

If you are attending the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW) this year in San José, Costa Rica, you are warmly invited to join CDSC members during our talks and other scheduled events. Please come say hi!

The CDSC has four papers at CSCW, which we will be presenting over the next three days:

Monday: At 11:00 am in Talamanca, Kaylea Champion will be presenting “Life Histories of Taboo Knowledge Artifacts” (full paper)

Tuesday: At 11:00 am in Central 3, Zarine Kharazian will be presenting “Governance Capture in a Self-Governing Community: A Qualitative Comparison of the Croatian, Serbian, Bosnian, and Serbo-Croatian Wikipedias” (full paper, blog post), followed by Sohyeon Hwang presenting “Adopting third-party bots for managing online communities” (full paper, blog post)

Wednesday: At 2:30 pm in Guanacaste 3, Kaylea Champion will be co-presenting “Challenges in Restructuring Community-based Moderation” (full paper, preprint)

If you’re at CSCW, feel free to get in touch in person or via Discord!

November 9, 2024

Dr. Yoel Roth: Online Safety and Security

On Oct. 23, 2024, Dr. Yoel Roth gave a lecture titled “Decentralizing online safety and security: The promises and perils of federated social media,” hosted by the Department of Human-Centered Design and Engineering at the University of Washington. A number of CDSC faculty and students attended and discussed issues of digital governance with Dr. Roth.

Years of dedicated research, activism, regulatory efforts, and investment by civil society and technology companies have increased awareness and established some control over the harmful impacts of social media platforms. Today, as social media undergoes its most significant transformation in over a decade, with alternative platforms like Mastodon, Bluesky, and Threads gaining traction in the wake of Twitter’s decline, there is optimism about these new platforms and their potential for alternative governance models that offer users greater empowerment. Dr. Roth has highlighted how self-organizing practices in these emerging communities are constructing norms for content moderation in decentralized platforms, which resonates with research conducted by our group (see Colglazier, TeBlunthuis, & Shaw, 2024).

However, these platforms retain many of the same design features and risks for misuse as mainstream platforms, yet lack the robust moderation and detection systems that have been painstakingly developed elsewhere. Additionally, significant technological, governance, and financial challenges hinder the development of these essential safeguards. Drawing on empirical research into platform moderation capacities, Dr. Roth examines the complex outcomes of this transformation in social media and suggests the following potential solutions for improving collective safety and addressing the security risks identified: (1) institutionalize shared responses to critical harms, (2) build transparent governance into the system, (3) invest in open-source tooling, and (4) enable data sharing across instances (Roth & Lai, 2024).

References

Colglazier, C., TeBlunthuis, N., & Shaw, A. (2024, May). The Effects of Group Sanctions on Participation and Toxicity: Quasi-experimental Evidence from the Fediverse. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 18, pp. 315-328).
Roth, Y., & Lai, S. (2024). Securing federated platforms: Collective risks and responses. Journal of Online Trust and Safety, 2(2).

October 23, 2024 / October 15, 2024

FOSSY 2024 Wrap Up: Sophia Vargas on “A review of valuation models and their application to open source models”

In the seventh talk of the Science of Community track we organized for FOSSY, Google FOSS researcher Sophia Vargas offered an overview of different strategies for measuring the value of open source (particularly in the context of a company thinking about how to engage with FOSS).

Some of Sophia’s key insights are: models for measuring one-time cost are relatively widespread (but depend on outcome metrics like lines of code rather than difficulty); understanding the cost of maintenance and community is still in formative stages; and business leaders can turn research-grounded models developed to measure value and risk in an academic context into decision-making tools within a business context.

This is part 7 of an 8-part series sharing highlights from the Science of Community track at FOSSY. Visit the FOSSY site for more bio details and an abstract of the talk.