Kaylea Champion, Benjamin Mako Hill, and Aaron Shaw with Sejal Khatri (who won’t be at FOSSY)
The Science of Community track is inspired by the CDSC Science of Community Dialogues, which aim to bring together practitioners and researchers to discuss scholarly work that is relevant to the efforts of practitioners. As researchers, we get so much from the communities we work with and study, and we want them to also learn from the research they so generously take part in. While the Dialogues cover a broad range of topics and communities, FOSSY presentations focus on how that work relates to free and open source software communities, projects, and practitioners.
At FOSSY, we will have a number of really amazing researchers presenting their work. We wanted to share some highlights from the schedule.
Sophia Vargas, from Google’s Open Source Programs Office, will be presenting on how metrics can help us understand contributor burnout. Professor Shoji Kajita, from Kyoto University, will discuss research data management for FOSS communities. Mariam Guizani, from Oregon State University, will cover research on the why and how of corporate participation in FOSS. We will additionally have lightning talks by Adam Hyde, Anita Sarma, Shauna Gordon-McKeon, and incoming Northwestern Ph.D. student Matthew Gaughan.
We are really excited about our workshop “Let’s Get Real: Putting Research Findings Into Practice.” This workshop, designed for FOSS contributors and practitioners, will help guide you on how to get the most out of the incredible research on and relevant to FOSS. If you want to learn how to navigate the sheer volume of interesting research work happening or how to understand what it means, this is the session for you! Our workshop will be led by Kaylea Champion and Professors Aaron Shaw and Benjamin Mako Hill. You can read more on our wiki.
The International Communication Association (ICA)’s 73rd annual conference is coming up soon. This year, the conference takes place in Toronto, Canada, and a subset of our collective is showing up to present work in person. We are looking forward to meeting up, talking about research, and hanging out together!
ICA takes place from Wednesday, May 24, to Monday, May 29, and CDSC members will take on various roles across the conference program, including chairing sessions, giving presentations, and co-organizing a preconference. Here is our participation in chronological order, so feel free to join us!
Thursday, May 25
We start off with a presentation by Yibin Fan on Thursday at 10:45 am at the Political Communication Graduate Student Preconference, held in the International Living Learning Centre of Toronto Metropolitan University. In a panel on The Causes and Outcomes of Political Polarization and Violence, Yibin will present a paper entitled “Does Incidental Political Discussion Make Political Expression Less Polarized? Evidence From Online Communities”.
Later on Thursday, another preconference, New Frontiers in Global Digital Inequalities Research, will take place in M – Room Linden (Sheraton) from 1:30pm to 5pm. Floor Fiers will serve as a co-chair alongside a number of scholars from various international academic institutions, and they will give a presentation on “The Gig Economy: A Site of Opportunity Vs. a Site of Risk?”. This preconference is affiliated with the Communication and Technology Division and the Communication Law & Policy Division of ICA.
Friday, May 26
On Friday, Carl Colglazier will present in a panel on Disinformation, Politics and Social Media at 3:00pm in M – Room York (Sheraton). The presentation is entitled “The Effects of Sanctions on Decentralized Social Networking Sites: Quasi-Experimental Evidence From the Fediverse”.
Sunday, May 28
On Sunday, we will take on different roles in various sessions. In the morning, Yibin Fan will serve as the moderator for a research paper panel on Political Deliberation and Expression, affiliated with the Political Communication Division, at 9:00 am in 2 – Room Simcoe (Sheraton).
Next comes our highlight, a paper that won the Top Paper Award from the Computational Methods Division: Nathan TeBlunthuis will present a methodological research paper entitled “Automated Content Misclassification Causes Bias in Regression: Can We Fix It? Yes We Can!” in a panel on Debate, Deliberation and Discussion in the Public Sphere at noon in M – Room Maple East (Sheraton). This is a project on which Nate collaborates with Valerie Hase at LMU Munich and Chung-hong Chan at the University of Mannheim. Congratulations to them on the Top Paper Award!
Last but not least, we will finish off our ICA 2023 with our faculty members serving as chairs and discussants in the Computational Methods Research Escalator Session at 1:30pm in M – Room Maple West (the same room as Nate’s presentation!). The session is designed to connect junior scholars who are less experienced in publishing with more senior researchers in the field. As the Call for Papers by the Computational Methods Division says, “Research escalator papers provide an opportunity for less experienced researchers to obtain feedback from more senior scholars about a paper-in-progress, with the goal of making the paper ready for submission to a conference or journal.” Aaron Shaw, together with Matthew Weber at Rutgers University, will serve as the chairs for the session. Benjamin Mako Hill and Jeremy Foote, together with a number of scholars from other institutions, will serve as discussants to help improve the research presented.
We look forward to sharing our research and connecting with you at ICA!
Join the Community Data Science Collective (CDSC) for our 5th Science of Community Dialogue! This Community Dialogue will take place on May 19 at 10:00 am PDT (18:00 UTC). This Dialogue focuses on digital inequalities and online community participation. Professor Hernan Galperin (University of Southern California) will join Floor Fiers (Northwestern University) to present recent research on topics including:
Inequalities in online access and participation
Differentiated participation in online communities
Causes and consequences of online inequalities
Digital skills as a barrier to online participation
The Community Data Science Collective (CDSC) is an interdisciplinary research group made up of faculty and students at the University of Washington Department of Communication, the Northwestern University Department of Communication Studies, the Carleton College Computer Science Department, and the Purdue University School of Communication.
Learn more
If you’d like to learn more or get future updates about the Science of Community Dialogues, please join the low-volume announcement list.
Help us build a dynamic and exciting program to facilitate conversations between free and open source software (FOSS) researchers and practitioners! Submit a session proposal for FOSSY! The deadline for submissions is May 18 (edit: the deadline has been moved from May 14).
A Spring 2015 CDSC workshop. Photo by Sage Ross, licensed Creative Commons Attribution Share Alike
Although scholars publish hundreds of papers about free and open source software, online governance, licensing, and other topics very relevant to FOSS communities each year, much of this work never makes it out of academic journals and conferences and back to FOSS communities. At the same time, FOSS communities have a range of insights, questions, and data that researchers studying FOSS could benefit from enormously.
The goal of this track is to build bridges between FOSS communities and scientific research conducted with and about FOSS communities. We hope to provide opportunities for community members to hear about exciting results from researchers; opportunities for researchers to learn from FOSS community members; and spaces for the FOSS community to think together about how to improve FOSS projects by leveraging research insights.
This track will include opportunities for:
researchers to talk with practitioners (about their research)
practitioners to talk with researchers (about their needs)
researchers to talk with other researchers (for learning and collaboration)
FOSSY runs July 13 – 16, and we are hoping to have 2-3 days of content. Towards that end, we are seeking proposals! If you are a researcher with work of relevance to FOSS community members; a FOSS community member with experiences or opportunities relevant to research; or just want to be involved in this conversation, please consider proposing in one of the following formats:
Short Talks. Do you have a recent project to share in some depth? A topic that needs time to unpack? Take 20 minutes to present your thoughts. Following each presentation there will be time for group discussion to help participants apply your work to their practice.
Lightning Talks. Want to make a focused point, pitch, or problem report to a great audience? Bring your 5-minute talk to our lightning round.
Panels. We will be facilitating dialogue between researchers and community members. Would you be willing to share your thoughts as a panelist? Let us know your expertise and a few notes on your perspective so that we can develop a diverse and engaging panel. Contact us directly (details below) if you are interested in being on a panel.
If you have an idea that doesn’t fit into these formats, let’s chat! You can reach out to Kaylea (kaylea@uw.edu) or Molly (molly.deblanc@northwestern.edu), or submit your idea as a proposal via the FOSSY form.
Submissions are non-archival, so we welcome ongoing, completed, and already published research work. Non-archival means that presentation of work at FOSSY does not constitute a publication. It’s just a way to get your work out there! Work that synthesizes or draws across a body of published papers is particularly welcome.
What kind of research are you looking to have presented?
We are interested in any topic related to FOSS communities! This might include research from computing (including software engineering, computer security, social computing, HCI), the social sciences and humanities (including management, philosophy, law, economics, sociology, communication, and more), information sciences, and beyond.
For example: how to identify undermaintained FOSS packages and what to do about it; community growth and how to find success in small communities; effective rule making and enforcement in online communities.
If it involves FOSS or is of interest to FOSS practitioners, we welcome it! We are eager to help you put your results into the hands of practitioners who can use your findings to inform their own community’s practices and policies on social, governance, and technical topics.
We hope to welcome scholars and researchers from across academia, government, industry, or wherever else you are from!
Who will I be speaking with?
We expect a multi-disciplinary audience. FOSSY will be bringing together free and open source software practitioners including community managers, designers, legal experts, non-profit and project leaders, technical developers, technical writers, and researchers.
The track is being organized by:
Kaylea Champion, Community Data Science Collective and the University of Washington
Molly de Blanc, Community Data Science Collective and Northwestern University
Benjamin Mako Hill, Community Data Science Collective and the University of Washington
Aaron Shaw, Community Data Science Collective and Northwestern University
Women in Data Science Puget Sound is part of a 50+-country conference series founded and organized in cooperation with Stanford University’s Data Science coalition. Anyone may attend, regardless of gender: events feature a speaker lineup composed of women in data science. The Puget Sound event is Tuesday, April 25 at the Expedia HQ in Seattle, and numerous affiliated regional and online events are scheduled in the coming weeks.
If you’re in the Seattle area, you might like to catch CDSC member Kaylea presenting a workshop! Here’s the pitch for attending her beginner-friendly session:
Let’s Re-think Political Bias & Build Our Own Classifier
How can we think about political bias without falling into assumptions about who's on what side and what that means?
Data science and ML offer us an alternative: we can parse political speech about a topic and use NLP/ML techniques to classify articles we scrape from the web.
In this hands-on workshop, we'll parse the Congressional Record, build a classifier, scrape search results, and analyze texts. You'll walk away with your own example of how to use data science to analyze political framing.
The full lineup of speakers for the Puget Sound conference is posted here. Tickets for the single-day event are $80 (see this link to request a discount code for half off).
Topics on the schedule for this event look juicy if quant work is your jam: AI, BERT, hypergraphs, visualization, forecasting, quantum computing, causal inference, survival analysis, writing better code and career management, with examples ranging from search, sales, and supply chain to economic disparity, DNA sequencing and saving wildlife!
Our fourth Community Dialogue covered topics on accountable governance and data leverage as a tool for accountable governance. It featured Amy X. Zhang (University of Washington) and recent CDSC graduate Nick Vincent (Northwestern, UC Davis).
Designing and Building Governance in Online Communities (Amy X. Zhang)
This session discussed different methods of engagement between communities and their governance structures, different models of governance, and empirical work to understand tensions within communities and governance structures. Amy presented PolicyKit, a tool her team built in response to what they learned from their research, which will also help them continue to better understand governance.
Can We Solve Emerging Problems in Technology and AI By Giving Communities Data Leverage? (Nick Vincent)
Nick Vincent looked at the question of how to hold governance structures accountable through collective action. He asked how groups can leverage control of data and the potential implications of data leverage on social structures and technical development.
How can communities develop and understand accountable governance? So many online environments rely on community members in profound ways without being accountable to them in direct ways. In this session, we will explore this topic and its implications for online communities and platforms.
First, Nick Vincent (Northwestern, UC Davis) will discuss the opportunities for so-called “data leverage” and will highlight the potential to push back on the “data status quo” to build compelling alternatives, including the potential for “data dividends” that allow a broader set of users to economically benefit from their contributions.
The idea of “data leverage” comes out of a basic but little-discussed fact: many technologies are highly reliant on content and behavioral traces created by everyday Internet users, and particularly online community members who contribute text, images, code, editorial judgement, rankings, ratings, and more. The technologies that rely on these resources include ubiquitous and familiar tools like search engines as well as new bleeding-edge “Generative AI” systems that produce novel art, prose, code and more. Because these systems rely on contributions from Internet users, collective action by these users (for instance, withholding content) has the potential to impact system performance and operators.
Next, Amy Zhang (University of Washington) will discuss how communities can think about their governance and the ways in which the distribution of power and decision-making are encoded into the online community software that communities use. She will then describe a tool called PolicyKit that has been developed with the aim of breaking out of common top-down models for governance in online communities to enable governance models that are more open, transparent, and democratic. PolicyKit works by integrating with a community’s platform(s) of choice for online participation (e.g., Slack, GitHub, Discord, Reddit, OpenCollective), and then provides tools for community members to create a wide range of governance policies and automatically carry out those policies on and across their home platforms. She will then conclude with a discussion of specific governance models and how they incorporate legitimacy and accountability in their design.
We had another Science of Community Dialogue! This most recent one was themed around informal learning, talking about communities as informal learning spaces and the sorts of tools and habits communities can adopt to help learners, mentors, and newcomers. We had presentations from Ruijia (Regina) Cheng (University of Washington, CDSC) and Dr. Denae Ford Robinson (Microsoft, University of Washington).
Regina Cheng covered three related research projects and relevant findings:
Ruijia Cheng and Benjamin Mako Hill. 2022. “Many Destinations, Many Pathways: A Quantitative Analysis of Legitimate Peripheral Participation in Scratch.” https://doi.org/10.1145/3555106
Ruijia Cheng, Sayamindu Dasgupta, and Benjamin Mako Hill. 2022. “How Interest-Driven Content Creation Shapes Opportunities for Informal Learning in Scratch: A Case Study on Novices’ Use of Data Structures.” https://doi.org/10.1145/3491102.3502124
Ruijia Cheng and Jenna Frens. 2022. “Feedback Exchange and Online Affinity: A Case Study of Online Fanfiction Writers.” https://doi.org/10.1145/3555127
Participants collaboratively put together three takeaways from Regina Cheng’s presentation.
We often talk about wanting to support “learning” in some general sense, but a critically important question to ask is “learning about what?” Let’s say we want people to learn three things: A, B, and C. The kinds of actions or behaviors that support learning goal A often have no effect on B and C, and sometimes they actively hurt them. We need to be more specific about what we want people to learn because there are tradeoffs.
Social support is wonderful in that users create examples and resources and answer questions. But it also has this narrowing effect. There’s a piling-on effect that makes it easier and easier (and more likely!) to learn the things that folks have learned before and less likely that people learn anything else.
Feedback is not about information transfer; it’s about relationships. To best promote learning, we should create rich, legitimate, inclusive social environments. These are perhaps good things to do anyway.
Dr. Denae Ford Robinson focused on free and open source software (FOSS) communities as a case study of learning communities. She covered theory and needs, and demonstrated tools designed to help with mentorship and the learning process.
Community-driven settings like FOSS (and social-good oriented projects in particular) rely enormously on volunteers and/or people opting into participation in ways that create huge challenges related to promoting project sustainability: the most active participants are overloaded in a way that is a recipe for burnout.
The path to sustainability involves attracting, retaining, and then sustaining contributions and understanding these processes as both (a) part of the lifecycle of a user and (b) part of a set of dynamics and lifecycle within the community (e.g., dynamics of community growth).
Approach 1 involves providing new information to help maintainers understand how things are going in their communities. A lack of insight and easy access to data is a cause of inefficiency and burnout.
Approach 2 involves making specific, structured recommendations to maintainers based on the experience of others in the past to do things like add tags and to shape behavior.
Approach 3 involves automating aspects of identifying and recognizing work (and perhaps other tasks) as a way of promoting newcomer experiences and reducing the load on maintainers for doing that.
This event and some of the research presented in it were supported by multiple awards from the National Science Foundation (DGE-1842165; IIS-2045055; IIS-1908850; IIS-1910202), Northwestern University, the University of Washington, and Purdue University.
It’s Ph.D. application season and the Community Data Science Collective is recruiting! As always, we are looking for talented people to join our research group. Applying to one of the Ph.D. programs that the CDSC faculty members are affiliated with is a great way to get involved in research on communities, collaboration, and peer production.
Because we know that you may have questions for us that are not answered on this webpage, we will be hosting a panel discussion and Q&A about the CDSC and Ph.D. opportunities on October 20 at 7:30pm UTC (3:30pm US Eastern, 2:30pm US Central, 12:30pm US Pacific). You can register online.
This post provides a very brief run-down on the CDSC, the different universities and Ph.D. programs our faculty members are affiliated with, and some general ideas about what we’re looking for when we review Ph.D. applications.
Group photo of the collective at a recent virtual retreat.
What are these different Ph.D. programs? Why would I choose one over the other?
This year the group includes three faculty principal investigators (PIs) who are actively recruiting Ph.D. students: Aaron Shaw (Northwestern University), Benjamin Mako Hill (University of Washington in Seattle), and Jeremy Foote (Purdue University). Each of these PIs advises Ph.D. students in Ph.D. programs at their respective universities. Our programs are each described below.
Although we often work together on research and serve as co-advisors to students in each other’s projects, each faculty member has specific areas of expertise and interests. The reasons you might choose to apply to one Ph.D. program or to work with a specific faculty member could include factors like your previous training, career goals, and the alignment of your specific research interests with our respective skills.
At the same time, a great thing about the CDSC is that we all collaborate and regularly co-advise students across our respective campuses, so the choice to apply to or attend one program does not prevent you from accessing the expertise of our whole group. But please keep in mind that our different Ph.D. programs have different application deadlines, requirements, and procedures!
Who is actively recruiting this year?
If you are interested in applying to any of the programs, we strongly encourage you to reach out to the specific faculty in that program before submitting an application.
Ph.D. Advisors
Benjamin Mako Hill
Benjamin Mako Hill is an Associate Professor of Communication at the University of Washington. He is also an Adjunct Assistant Professor at UW’s Department of Human-Centered Design and Engineering (HCDE), Computer Science and Engineering (CSE) and Information School. Although many of Mako’s students are in the Department of Communication, he has also advised students in all three other departments, though he typically has more limited ability to admit students into those programs on his own and usually does so with a co-advisor in those departments. Mako’s research focuses on population-level studies of peer production projects, computational social science, efforts to democratize data science, and informal learning. Mako has also put together a webpage for prospective graduate students with some useful links and information.
Aaron Shaw
Aaron Shaw is an Associate Professor in the Department of Communication Studies at Northwestern. This year, he’s also the “Scholar in Residence” for King County, Washington. In terms of Ph.D. programs, Aaron’s primary affiliations are with the Media, Technology and Society (MTS) and the Technology and Social Behavior (TSB) Ph.D. programs (please note: the TSB program is a joint degree between Communication and Computer Science). Aaron also has a courtesy appointment in the Sociology Department at Northwestern, but he has not directly supervised any Ph.D. advisees in that department (yet). Aaron’s current projects focus on comparative analysis of the organization of peer production communities and social computing projects, participation inequalities in online communities, and collaborative organizing in pursuit of public goods.
Jeremy Foote
Jeremy Foote is an Assistant Professor at the Brian Lamb School of Communication at Purdue University. He is affiliated with the Organizational Communication and Media, Technology, and Society programs. Jeremy’s current research focuses on how individuals decide when and in what ways to contribute to online communities, how communities change the people who participate in them, and how both of those processes can help us to understand which things become popular and influential. Most of his research is done using data science methods and agent-based simulations.
What do you look for in Ph.D. applicants?
There’s no easy or singular answer to this. In general, we look for curious, intelligent people driven to develop original research projects that advance scientific and practical understanding of topics that intersect with any of our collective research interests.
To get an idea of the interests and experiences present in the group, read our respective bios and CVs (follow the links above to our personal websites). Specific skills that we and our students tend to use on a regular basis include consuming and producing social science and/or social computing (human-computer interaction) research, applied statistics and statistical computing, various empirical research methods, social theory and cultural studies, and more.
Formal qualifications that speak to similar skills and show up in your resume, transcripts, or work history are great, but we are much more interested in your capacity to learn, think, write, analyze, and/or code effectively than in your credentials, test scores, grades, or previous affiliations. It’s graduate school and we do not expect you to show up knowing how to do all the things already.
Intellectual creativity, persistence, and a willingness to acquire new skills and problem-solve matter a lot. We think doctoral education is less about executing tasks that someone else hands you and more about learning how to identify a new, important problem; develop an appropriate approach to solving it; and explain all of the above and why it matters so that other people can learn from you in the future. Evidence that you can or at least want to do these things is critical. Indications that you can also play well with others and would make a generous, friendly colleague are really important too.
All of this is to say, we do not have any one trait or skill set we look for in prospective students. We strive to be inclusive along every possible dimension. Each person who has joined our group has contributed unique skills and experiences as well as their own personal interests. We want our future students and colleagues to do the same.
Now what?
Still not sure whether or how your interests might fit with the group? Still have questions? Still reading and just don’t want to stop? Follow the links above for more information. Feel free to send at least one of us an email. We are happy to try to answer your questions and always eager to chat. You can also join our panel discussion on October 20 at 3:30pm ET (UTC-4).
We recently held our second Community Dialogue around the theme of anonymity and privacy. Kaylea Champion presented on the role of anonymity in peer-contribution communities. Dr. Shruti Sannon joined us from the University of Michigan and talked about privacy in the gig economy.
What’s Anonymity Worth (Kaylea Champion)
Anonymity can protect and empower contributors in communities. Anonymity can make people feel safer or actually be safer. For example: Wikipedia editors who are working on controversial pages within contested geographies may be safer when they are able to contribute anonymously. Anonymous contribution is not without problems, as it can also empower trolls, harassers, and other bad actors. For more details, and actions you can take or policies to recommend within your communities, watch the video of Kaylea Champion’s presentation below.
Privacy and Surveillance in the Gig Economy (Dr. Shruti Sannon)
Gig workers can be asked or coerced to give up privacy in exchange for money through the design of the gig platforms they are using or by request of customers. Gig workers also use surveillance tools as a means of protecting themselves — some ride share drivers have cameras in their cars for this purpose. Dr. Sannon shared the broader implications of this situation, and what it can mean outside of the gig economy. To learn more, watch the video below.
Join us!
You can subscribe to our mailing list! We’ll be making announcements about future events there. It is a low-volume mailing list.
Acknowledgements
Thanks to speakers Kaylea Champion and Shruti Sannon. The vision for this event borrows from the User and Open Innovation workshops organized by Eric von Hippel and colleagues, as well as others. This event and the research presented in it were supported by multiple awards from the National Science Foundation (DGE-1842165; IIS-2045055; IIS-1908850; IIS-1910202), Northwestern University, the University of Washington, and Purdue University.