Meet us at FOSSY!

The Free and Open Source Software Yearly conference (FOSSY) is in less than a week and we will be there!

We will be running the Science of Community track on Saturday July 15.

Two photos. In one is Kaylea Champion, who has purple hair and a blue shirt. In the other are Sejal Khatri, Benjamin Mako Hill, and Aaron Shaw.
Kaylea Champion, and Benjamin Mako Hill and Aaron Shaw with Sejal Khatri (who won’t be at FOSSY)

The Science of Community track is inspired by the CDSC Science of Community Dialogues, which aim to bring together practitioners and researchers to discuss scholarly work that is relevant to the efforts of practitioners. As researchers, we get so much from the communities we work with and study, and we want them to also learn from the research they so generously take part in. While the Dialogues cover a broad range of topics and communities, FOSSY presentations focus on how that work relates to free and open source software communities, projects, and practitioners.

At FOSSY, we will have a number of really amazing researchers presenting their work. We wanted to share some highlights from the schedule.

Sophia Vargas, from Google’s Open Source Programs Office, will be presenting on how metrics can help us understand contributor burnout. Professor Shoji Kajita, from Kyoto University, will discuss research data management for FOSS communities. Mariam Guizani, from Oregon State University, will cover research on the why and how of corporate participation in FOSS. We will additionally have lightning talks by Adam Hyde, Anita Sarma, Shauna Gordon-McKeon, and incoming Northwestern Ph.D. student Matthew Gaughan.

We are really excited about our workshop “Let’s Get Real: Putting Research Findings Into Practice.” This workshop, designed for FOSS contributors and practitioners, will help guide you on how to get the most out of the incredible research on and relevant to FOSS. If you want to learn how to navigate the sheer volume of interesting research work happening or how to understand what it means, this is the session for you! Our workshop will be led by Kaylea Champion and Professors Aaron Shaw and Benjamin Mako Hill. You can read more on our wiki.

Due to scheduling issues, Eriol Fox will be presenting their talk, “Community lead user research and usability in Science and Research OSS: What we learned,” in the Wildcard Track. We recommend going!

We hope to see you at FOSSY. Even if you can’t make it to our sessions, we’ll be at the conference so stop by and say hello!

Community Dialogue on Digital Inequalities

Join the Community Data Science Collective (CDSC) for our 5th Science of Community Dialogue! This Community Dialogue will take place on May 19 at 10:00 am PDT (18:00 UTC). This Dialogue focuses on digital inequalities and online community participation. Professor Hernan Galperin (University of Southern California) will join Floor Fiers (Northwestern University) to present recent research on topics including:

  • Inequalities in online access and participation
  • Differentiated participation in online communities
  • Causes and consequences of online inequalities
  • Digital skills as a barrier to online participation
  • Combating digital discrimination

A full session description is on our website. Register online.

What is a Dialogue?

The Science of Community Dialogue Series is a series of conversations between researchers, experts, community organizers, and other people who are interested in how communities work, collaborate, and succeed. You can watch this short introduction video with Aaron Shaw.

What is the CDSC?

The Community Data Science Collective (CDSC) is an interdisciplinary research group made up of faculty and students at the University of Washington Department of Communication, the Northwestern University Department of Communication Studies, the Carleton College Computer Science Department, and the Purdue University School of Communication.

Learn more

If you’d like to learn more or get future updates about the Science of Community Dialogues, please join the low volume announcement list.

CDSC @ FOSSY: Call for Proposals

Help us build a dynamic and exciting program to facilitate conversations between free and open source software (FOSS) researchers and practitioners! Submit a session proposal for FOSSY! The deadline for submissions is May 18 (Edit: The deadline has been moved from May 14.).

A photo of a classroom full of people on laptops. The seating is tiered, with people in the back higher up.
A Spring 2015 CDSC workshop. Photo by Sage Ross, licensed Creative Commons Attribution Share Alike

Although scholars publish hundreds of papers about free and open source software, online governance, licensing, and other topics very relevant to FOSS communities each year, much of this work never makes it out of academic journals and conferences and back to FOSS communities. At the same time, FOSS communities have a range of insights, questions, and data that researchers studying FOSS could benefit from enormously.

This gap between research and communities inspired us to propose a track at FOSSY, the Free and Open Source Software Yearly conference. Our track is called FOSS Research for All: The Science of Community.

The goal of this track is to build bridges between FOSS communities and scientific research conducted with and about FOSS communities. We hope to provide opportunities for community members to hear about exciting results from researchers; opportunities for researchers to learn from FOSS community members; and spaces for the FOSS community to think together about how to improve FOSS projects by leveraging research insights.

This track will include opportunities for:

  • researchers to talk with practitioners (about their research)
  • practitioners to talk with researchers (about their needs)
  • researchers to talk with other researchers (for learning and collaboration)

FOSSY runs July 13 – 16, and we are hoping to have 2-3 days of content. Towards that end, we are seeking proposals! If you are a researcher with work of relevance to FOSS community members; a FOSS community member with experiences or opportunities relevant to research; or just want to be involved in this conversation, please consider proposing in one of the following formats:

  • Short Talks. Do you have a recent project to share in some depth? A topic that needs time to unpack? Take 20 minutes to present your thoughts. Following each presentation there will be time for group discussion to help participants apply your work to their practice.
  • Lightning Talks. Want to make a focused point, pitch, or problem report to a great audience? Bring your 5-minute talk to our lightning round.
  • Panels. We will be facilitating dialogue between researchers and community members. Would you be willing to share your thoughts as a panelist? Let us know your expertise and a few notes on your perspective so that we can develop a diverse and engaging panel. Contact us directly (details below) if you are interested in being on a panel.

If you have an idea that doesn’t fit into these formats, let’s chat! You can reach out to Kaylea (kaylea@uw.edu), Molly (molly.deblanc@northwestern.edu) or submit your idea as a proposal via the FOSSY form.

Submissions are non-archival, so we welcome ongoing, completed, and already published research work. Non-archival means that presentation of work at FOSSY does not constitute a publication. It’s just a way to get your work out there! Work that synthesizes or draws across a body of published papers is particularly welcome.

What kind of research are you looking to have presented? 

We are interested in any topic related to FOSS communities! This might include research from computing (including software engineering, computer security, social computing, HCI), the social sciences and humanities (including management, philosophy, law, economics, sociology, communication, and more), information sciences, and beyond.

For example: how to identify undermaintained FOSS packages and what to do about it; community growth and how to find success in small communities; effective rule making and enforcement in online communities. 

If it involves FOSS or is of interest to FOSS practitioners, we welcome it! We are eager to help you put your results into the hands of practitioners who can use your findings to inform their own community’s practices and policies on social, governance, and technical topics. 

We hope to welcome scholars and researchers from across academia, government, industry, or wherever else you are from!

Who will I be speaking with?

We expect a multi-disciplinary audience. FOSSY will be bringing together free and open source software practitioners including community managers, designers, legal experts, non-profit and project leaders, technical developers, technical writers, and researchers.

The track is being organized by:

Kaylea Champion, Community Data Science Collective and the University of Washington

Molly de Blanc, Community Data Science Collective and Northwestern University

Benjamin Mako Hill, Community Data Science Collective and the University of Washington

Aaron Shaw, Community Data Science Collective and Northwestern University

Community Dialogue on Accountable Governance and Data

Our fourth Community Dialogue covered topics on accountable governance and data leverage as a tool for accountable governance. It featured Amy X. Zhang (University of Washington) and recent CDSC graduate Nick Vincent (Northwestern, UC Davis).

Designing and Building Governance in Online Communities (Amy X. Zhang)

This session discussed different methods of engagement between communities and their governance structures, different models of governance, and empirical work to understand tensions within communities and governance structures. Amy presented PolicyKit, a tool her team built in response to what they learned from their research, which will also help researchers continue to better understand governance.

Can We Solve Emerging Problems in Technology and AI By Giving Communities Data Leverage? (Nick Vincent)

Nick Vincent looked at the question of how to hold governance structures accountable through collective action. He asked how groups can leverage control of data and the potential implications of data leverage on social structures and technical development.

If you are interested in attending a future Dialogue, sign up for our very low-volume mailing list.

2022 Year in Review

One of the fun things about being in a large lab is getting to celebrate everyone’s accomplishments, wins, and the good stuff that happens. Here is a brief-ish overview of some real successes from 2022.

A photo of the CDSC group on some steps with their hands in the air. There are nineteen people in the photo. NINETEEN!
Our 2022 Fall Retreat!

Graduations and New Positions

Our lab gained SIX new grad student members: Kevin Ackermann, Yibin Fang, Ellie Ross, Dyuti Jha, Hazel Chu, and Ryan Funkhouser. Kevin is a first year graduate student at Northwestern, and Yibin and Ellie are first year students at the University of Washington. Dyuti, Hazel, and Ryan joined us via Purdue and became Jeremy Foote’s first ever advisees. We had quite a number of undergraduate RAs. We also gained Divya Sikka from Interlake High School.

Nick Vincent became Dr. Nick Vincent, Ph.D. (Northwestern). He will do a postdoc at the University of California Davis and the University of Washington. Molly de Blanc earned their master’s degree (New York University). Dr. Nate TeBlunthuis joined the University of Michigan as a postdoc, working with Professor Ceren Budak.

Kaylea Champion and Regina Cheng had their dissertation proposals approved, and Floor Fiers finished their qualifying exams and is now a Ph.D. candidate. Carl Colglazier finished his coursework.

Aaron Shaw started an appointment as the Scholar-in-Residence for King County, Washington, as well as Visiting Professor in the Department of Communication at the University of Washington.

Teaching

As is expected of faculty, Jeremy Foote, Mako Hill, Sneha Narayan, and Aaron Shaw taught classes. As a class teaching assistant, Kaylea won an Outstanding Teaching Award! Floor also taught a public speaking class. CDSC members also served as teaching assistants, led workshops, and gave guest lectures in classes.

An icon of a silhouette holding a book and a wand, with stars and planets around them. Text reads "Best Teacher in the Universe."
“BEST TEACHER” by mickeymanzzz is licensed under CC BY-SA 2.0.

Presentations

This list is far from complete, but it includes some highlights!

Carl presented at ICA alongside Nicholas Diakopoulos, “Predictive Models in News Coverage of the COVID-19 Pandemic in the United States.”

Floor was present at the Eastern Sociological Society (ESS), AoIR (Association of Internet Researchers), and ICA. They won a top paper award at the National Communication Association (NCA): Walter, N., Suresh, S., Brooks, J. J., Saucier, C., Fiers, F., & Holbert, R. L. (2022, November). The Chaffee Principle: The Most Likely Effect of Communication…Is Further Communication. National Communication Association (NCA) National Convention, New Orleans, LA.

Kaylea had a whopping two papers at ICA, a keynote at the IEEE Symposium on Digital Privacy and Social Media, and presentations at the CSCW Doctoral Consortium, a CSCW workshop, and the DUB Doctoral Consortium. She also participated in Aaron Swartz Day, SeaGL, CHAOSSCon, MozFest, and an event at UMass Boston.

Molly also participated in Aaron Swartz Day, and a workshop at CSCW on volunteer labor and data.

Regina gave presentations at the MakeCode Team at Microsoft Research, Expertise@scale Salon (Emory University), Microsoft Research HCI Seminar, and CSCW (“Many Destinations, Many Pathways: A Quantitative Analysis of Legitimate Peripheral Participation in Scratch” and “Feedback Exchange and Online Affinity: A Case Study of Online Fanfiction Writers”), among others. She attended CHI and NAACL (with two additional papers). Regina’s paper with Sayamindu Dasgupta and Mako Hill at CHI 2022 (“How Interest-Driven Content Creation Shapes Opportunities for Informal Learning in Scratch: A Case Study on Novices’ Use of Data Structures”) won a Best Paper Honorable Mention Award.

Sohyeon was at GLF as a knowledge steward and presented two posters at the HCI+D Lambert Conference (one with Emily Zou and one with Charlie Kiene, Serene Ong, and Aaron). She also presented at ICWSM, had posters at ICSSI and IC2S2, and organized a workshop at CSCW. In addition to more traditional academic presentations, Sohyeon was on a fireside chat panel hosted by d/arc server, guest lectured at the University of Washington and Northwestern, and met with Discord moderators to talk about heterogeneity in online governance. Sohyeon also won the Half-Bake Off at the CDSC fall retreat.

Public Scholarship

A photo of four people. Two of them are sitting and looking at laptops, while two of them are standing and looking at the laptops thinking. Only one person is smiling.
This image is from 2016

We did a lot of public scholarship this year! In addition to giving presentations, leading workshops, and organizing public-facing events, CDSC also ran the Science of Community Dialogue Series. Presenters from within CDSC include Jeremy Foote, Sohyeon Hwang, Nate TeBlunthuis, Charlie Kiene, Kaylea Champion, Regina Cheng, and Nick Vincent. Guest speakers included Dr. Shruti Sannon, Dr. Denae Ford, and Dr. Amy X. Zhang. To attend future Dialogues, sign up for our low-volume email list!

These events are organized by Molly, with assistance from Aaron and Mako.

Publications

Rather than listing publications here, you can check them out on the wiki.

Announcing the Community Dialogue on Accountable Governance

Join the Community Data Science Collective (CDSC) for our 4th Science of Community Dialogue! This Community Dialogue will take place on January 20 at 10:00 PT (18:00 UTC). This Dialogue focuses on community governance and data. Professor Amy X. Zhang (University of Washington) will join Dr. Nick Vincent (Northwestern University, UC Davis) to cover topics including:

  • how communities can develop accountable governance
  • the distribution of power and decision making in communities
  • how collective action can impact systems
  • data leverage

You can register online.

Full Description:

How can communities develop and understand accountable governance? So many online environments rely on community members in profound ways without being accountable to them in direct ways. In this session, we will explore this topic and its implications for online communities and platforms. 

First, Nick Vincent (Northwestern, UC Davis) will discuss the opportunities for so-called “data leverage” and will highlight the potential to push back on the “data status quo” to build compelling alternatives, including the potential for “data dividends” that allow a broader set of users to economically benefit from their contributions. 

The idea of “data leverage” comes out of a basic but little-discussed fact: Many technologies are highly reliant on content and behavioral traces created by everyday Internet users, and particularly online community members who contribute text, images, code, editorial judgement, rankings, ratings, and more. The technologies that rely on these resources include ubiquitous and familiar tools like search engines as well as new bleeding-edge “Generative AI” systems that produce novel art, prose, code, and more. Because these systems rely on contributions from Internet users, collective action by these users (for instance, withholding content) has the potential to impact system performance and operators.

Next, Amy Zhang (University of Washington) will discuss how communities can think about their governance and the ways in which the distribution of power and decision-making are encoded into the online community software that communities use. She will then describe a tool called PolicyKit that has been developed with the aim of breaking out of common top-down models for governance in online communities to enable governance models that are more open, transparent, and democratic. PolicyKit works by integrating with a community’s platform(s) of choice for online participation (e.g., Slack, GitHub, Discord, Reddit, OpenCollective), and then provides tools for community members to create a wide range of governance policies and automatically carry out those policies on and across their home platforms. She will then conclude with a discussion of specific governance models and how they incorporate legitimacy and accountability in their design.

What is a Dialogue?

The Science of Community Dialogue Series is a series of conversations between researchers, experts, community organizers, and other people who are interested in how communities work, collaborate, and succeed. You can watch this short introduction video with Aaron Shaw.

Community Dialogue: Informal Learning

We had another Science of Community Dialogue! This most recent one was themed around informal learning, talking about communities as informal learning spaces and the sorts of tools and habits communities can adopt to help learners, mentors, and newcomers. We had presentations from Ruijia (Regina) Cheng (University of Washington, CDSC) and Dr. Denae Ford Robinson (Microsoft, University of Washington).

Regina Cheng covered three related research projects and relevant findings:

  • Ruijia Cheng and Benjamin Mako Hill. 2022. “Many Destinations, Many Pathways: A Quantitative Analysis of Legitimate Peripheral Participation in Scratch.” https://doi.org/10.1145/3555106
  • Ruijia Cheng, Sayamindu Dasgupta, and Benjamin Mako Hill. 2022. “How Interest-Driven Content Creation Shapes Opportunities for Informal Learning in Scratch: A Case Study on Novices’ Use of Data Structures.” https://doi.org/10.1145/3491102.3502124
  • Ruijia Cheng and Jenna Frens. 2022. “Feedback Exchange and Online Affinity: A Case Study of Online Fanfiction Writers.” https://doi.org/10.1145/3555127

Participants collaboratively put together three takeaways from Regina Cheng’s presentation.

We often talk about wanting to support “learning” in some general sense, but a critically important question to ask is “learning about what.” Let’s say we want people to learn three things: A, B, and C. The kinds of actions or behaviors that support learning goal A often have no effect on B and C, and sometimes they actively hurt them. We need to be more specific about what we want people to learn because there are tradeoffs.

Social support is wonderful in that users create examples and resources and answer questions. But it also has this narrowing effect. There’s a piling-on effect that makes it easier and easier (and more likely!) to learn the things that folks have learned before and less likely that people learn anything else.

Feedback is not about information transfer; it’s about relationships. To best promote learning, we should create rich, legitimate, inclusive social environments. These are perhaps good things to do anyway.

Dr. Denae Ford Robinson focused on free and open source software (FOSS) communities as a case study of learning communities. She covered theory and needs, and demonstrated tools designed to help with mentorship and the learning process.

Community-driven settings like FOSS (and social-good oriented projects in particular) rely enormously on volunteers and/or people opting into participation in ways that create huge challenges related to promoting project sustainability: the most active participants are overloaded in a way that is a recipe for burnout.

The path to sustainability involves attracting, retaining, and then sustaining contributions and understanding these processes as both (a) part of the lifecycle of a user and (b) part of a set of dynamics and lifecycle within the community (e.g., dynamics of community growth).

Approach 1 involves providing new information to help maintainers understand how things are going in their communities. A lack of insight and easy access to data is a cause of inefficiency and burnout.

Approach 2 involves making specific, structured recommendations to maintainers based on the experience of others in the past to do things like add tags and to shape behavior.

Approach 3 involves automating aspects of identifying and recognizing work (and perhaps other tasks) as a way of promoting newcomer experiences and reducing the load on maintainers for doing that.

This event and some of the research presented in it were supported by multiple awards from the National Science Foundation (DGE-1842165; IIS-2045055; IIS-1908850; IIS-1910202), Northwestern University, the University of Washington, and Purdue University.

How to Network from Home

We have been going on Lab Dates and it is pretty cool.

Five penguins standing in a cold looking environment. It appears as though three of them are chatting with one another and the other two are having their own conversation.
“Caption This Photo” by U.S. Geological Survey is marked with CC0 1.0.

CSCW 2021 introduced Lab Speed Dating wherein labs were matched and given an hour to get to know each other. Sohyeon Hwang organized our first lab date. It was so much fun we decided to go on more in order to meet other groups. I wanted to share a bit about this and our process in case you are interested in trying it out or want to have a meetup with us.

After the initial CSCW Lab Date we made a very long list of other labs we want to meet and have (slowly) been inviting them to come by. We also included individual researchers, people who collaborate in smaller, informal groups, co-authors, and corporate research teams.

We use our “softblock” to schedule meetings, rather than finding a new time for each meeting. The CDSC maintains a softblock, which is a block of time for whatever comes up, one-off meetings we need to schedule, and co-working sessions. (Today I am using the softblock to write this blog post!)

We are pretty open to different structures for our lab dates. So far the ones with full labs have been divided into two parts: 1) everyone introduces themselves as briefly as we can manage and then 2) we break out into small groups for short periods of time to talk. We try to cycle through 2-3 of these breakouts, depending on how many people are in attendance. When meeting with individuals, our guests typically present a piece of work that we workshop or discuss their interests in a more general sense, and we talk about them as a whole group. We are open to other models, but nothing has come up yet.

Blocks have focused on networking and getting to know other researchers on a professional level. Because we have been attending fewer in-person events, we have had fewer chances to meet new people. Even at events it can be hard to connect with the people you want to meet, and it is very hard (for us) to have everyone from the CDSC in a space together with another group.

If you are interested in going on a lab date with us, you can message me on IRC or email me (details here). We have a lot of open spots for the rest of the quarter and one of them could be yours!

Second Community Dialogue on Anonymity and Privacy

We recently held our second Community Dialogue around the theme of anonymity and privacy. Kaylea Champion presented on the role of anonymity in peer-contribution communities. Dr. Shruti Sannon joined us from the University of Michigan and talked about privacy in the gig economy.

What’s Anonymity Worth (Kaylea Champion)

Anonymity can protect and empower contributors in communities. Anonymity can make people feel safer or actually be safer. For example: Wikipedia editors who are working on controversial pages within contested geographies may be safer when they are able to contribute anonymously. Anonymous contribution is not without problems, as it can also empower trolls, harassers, and other bad actors. For more details, and actions you can take or policies to recommend within your communities, watch the video of Kaylea Champion’s presentation below.

Privacy and Surveillance in the Gig Economy (Dr. Shruti Sannon)

Gig workers can be asked or coerced to give up privacy in exchange for money through the design of the gig platforms they are using or by request of customers. Gig workers also use surveillance tools as a means of protecting themselves — some ride share drivers have cameras in their cars for this purpose. Dr. Sannon shared the broader implications of this situation, and what it can mean outside of the gig economy. To learn more, watch the video below.

Join us!

You can subscribe to our mailing list! We’ll be making announcements about future events there. It is a low volume mailing list.

Acknowledgements

Thanks to speakers Kaylea Champion and Shruti Sannon. The vision for this event borrows from the User and Open Innovation workshops organized by Eric von Hippel and colleagues, as well as others. This event and the research presented in it were supported by multiple awards from the National Science Foundation (DGE-1842165; IIS-2045055; IIS-1908850; IIS-1910202), Northwestern University, the University of Washington, and Purdue University.

Come meet us at CHI 2022

We’re going to be at CHI! The Community Data Science Collective will be presenting three papers. You can find us there in person in New Orleans, Louisiana, April 30 – May 5. If you’ve ever wanted a super cool CDSC sticker, this is your chance!

Two red street cars going down a tree lined street.
“Streetcars in New Orleans: 2000 series – Perley A. Thomas Car Works 900 Series Replicas” by Flavio~ is marked with CC BY 2.0.

Stefania (Stef) Druga (University of Washington) wrote “Family as a Third Space for AI Literacies: How do children and parents learn about AI together?” with Amy J. Ko and Fee Lia Christoph (University of Michigan). Stef will be presenting at “Interactive Learning Support Systems,” Monday May 2 at 14:15.

Sejal Khatri (University of Washington) received an honorable mention for her work “The Social Embeddedness of Peer Production: A Comparative Qualitative Analysis of Three Indian Language Wikipedia Editions,” co-authored by Sayamindu Dasgupta, Benjamin Mako Hill, and Aaron Shaw. Sejal will be presenting Tuesday May 3 at 14:15 in “Crowdwork and Collaboration.” Sejal, Aaron, Mako, and Sayamindu also have a blog post available.

Ruijia (Regina) Cheng (University of Washington) also received an honorable mention for her paper “How Interest-Driven Content Creation Shapes Opportunities for Informal Learning in Scratch: A Case Study on Novices’ Use of Data Structures,” co-authored by Benjamin Mako Hill and Sayamindu Dasgupta. Regina will be talking about it during the session “Programming and Coding Support” on Wednesday May 4 at 09:00. You can also read about Ruijia, Mako, and Sayamindu’s work on our blog.

The CDSC logo, which looks a bit like a cloud with four legs, and the text "Community Data Science Collective."
You can have this on a sticker!