The International Communication Association (ICA)’s 73nd annual conference is coming up soon. This year, the conference takes place in Toronto, Canada, and a subset of our collective is showing up to present work in person. We are looking forward to meeting up, talking about research, and hanging out together!
ICA takes place from Wednesday, May 24, to Monday, May 29, and CDSC members will take various roles in a number of different conference programs, including chairing, presentations, and co-organizing of preconference. Here is the list of our participation by the time order, so feel free to join us!
Thursday, May 25
We start off with a presentation by Yibin Fan on Thursday at 10:45 am in the International Living Learning Centre of Toronto Metropolitan University on Political Communication Graduate Student Preconference. In a panel on The Causes and Outcomes of Political Polarization and Violence, Yibin will present a paper entitled “Does Incidental Political Discussion Make Political Expression Less Polarized? Evidence From Online Communities”.
Later on Thursday, another preconference on New Frontiers in Global Digital Inequalities Research will take place in M – Room Linden (Sheraton) from 1:30pm to 5pm, Floor Fiers will work as a co-chair with a number of scholars from international, various academic institutions, and they will give a presentation on “The Gig Economy: A Site of Opportunity Vs. a Site of Risk?”. This preconference is affiliated with Communication and Technology Division and Communication Law & Policy Division of ICA.
Friday, May 26
On Friday, Carl Colglazier will present in a panel on Disinformation, Politics and Social Media at 3:00pm in M – Room York (Sheraton), and the presentation is entitled as “The Effects of Sanctions on Decentralized Social Networking Sites: Quasi-Experimental Evidence From the Fediverse”.
Sunday, May 28
On Sunday we will be actively taking different roles in various sessions. In the morning, Yibin Fan will serve as the moderator for a research paper panel on Political Deliberation and Expression affiliated by Political Communication Division at 9:00 am in 2 – Room Simcoe (Sheraton).
Then there comes our highlighting paper that won the Top Paper Award by Computational Methods Division: Nathan TeBlunthuis will present a methodological research paper entitled as “Automated Content Misclassification Causes Bias in Regression: Can We Fix It? Yes We Can!” in a panel on Debate, Deliberation and Discussion in the Public Sphere at noon in M – Room Maple East (Sheraton). This is a project on which Nate collaborates with Valerie Hase at LMU Munich and Chung-hong Chan at University of Mannheim. Congratulations to them for getting the Top Paper Award!
Last but not least, we will finish off our ICA 2023 by seeing our faculty members serving as the chair and discussants in the Computational Methods Research Escalator Session at 1:30pm in M – Room Maple West (the same room as Nate’s presentation!). The session is junior scholars who are inexperienced in publishing to have connections with more senior researchers in the field. As the Call for Papers by Computational Methods Division says, “Research escalator papers provide an opportunity for less experienced researchers to obtain feedback from more senior scholars about a paper-in-progress, with the goal of making the paper ready for submission to a conference or journal.” Aaron Shaw, together with Matthew Weber at Rutgers University, will serve as the chairs for the session. Benjamin Mako Hill and Jeremy Foote, together with a bunch of scholars from other institutions, will work as discussants for improving the research presented here.
We look forward to sharing our research and connecting with you at ICA!
Help us build a dynamic and exciting program to facilitate conversations between free and open source software (FOSS) researchers and practitioners! Submit a session proposal for FOSSY! The deadline for submissions is May 14 May 18 (Edit: The deadline has been moved.).
A Spring 2015 CDSC workshop. Photo by Sage Ross, licensed Creative Commons Attribution Share Alike
Although scholars publish hundreds of papers about free and open source software, online governance, licensing, and other topics very relevant to FOSS communities each year, much of this work never makes it out of academic journals and conferences and back to FOSS communities. At the same time, FOSS communities have a range of insights, questions, and data that researchers studying FOSS could benefit from enormously.
The goal of this track is to build bridges between FOSS communities and scientific research conducted with and about FOSS communities. We hope to provide opportunities for community members to hear about exciting results from researchers; opportunities for researchers to learn from the FOSS community members; and spaces for the FOSS community to think together about how to improve FOSS projects by leveraging research insights and research.
This track will include opportunities for:
researchers to talk with practitioners (about their research)
practitioners to talk with researchers (about their needs)
researchers to talk with other researchers (for learning and collaboration)
FOSSY runs July 13 – 16, and we are hoping to have 2-3 days of content. Towards that end, we are seeking proposals! If you are a researcher with work of relevance to FOSS community members; a FOSS community member with experiences or opportunities relevant to research; or just want to be involved in this conversation, please consider proposing in one of the following formats:
Short Talks. Do you have a recent project to share in some depth? A topic that needs time to unpack? Take 20 minutes to present your thoughts. Following each presentation there will be time for group discussion to help participants apply your work to their practice.
Lightning Talks. Want to make a focused point, pitch, or problem report to a great audience? Bring your 5-minute talk to our lightning round.
Panels. We will be facilitating dialogue between researchers and community members. Would you be willing to share your thoughts as a panelist? Let us know your expertise and a few notes on your perspective so that we can develop a diverse and engaging panel. Contact us directly (details below) if you are interested in being on a panel.
If you have an idea that doesn’t fit into these formats, let’s chat! You can reach out to Kaylea (kaylea@uw.edu), Molly (molly.deblanc@northwestern.edu) or submit your idea as a proposal via the FOSSY form.
Submissions are non-archival, so we welcome ongoing, completed, and already published research work. Non-archival means that presentation of work at FOSSY does not constitute a publication. It’s just a way to get your work out there! Work that synthesizes or draws across a body of published papers is particularly welcome.
What kind of research are you looking to have presented?
We are interested in any topic related to FOSS communities! This might include research from computing (including software engineering, computer security, social computing, HCI), the social sciences and humanities (including management, philosophy, law, economics, sociology, communication, and more), information sciences, and beyond.
For example: how to identify undermaintained FOSS packages and what to do about it; community growth and how to find success in small communities; effective rule making and enforcement in online communities.
If it involves FOSS or is of interest to FOSS practitioners, we welcome it! We are eager to help you put your results into the hands of practitioners who can use your findings to inform their own community’s practices and policies on social, governance, and technical topics.
We hope to welcome scholars and researchers from across academia, government, industry, or wherever else you are from!
Who will I be speaking with?
We expect a multi-disciplinary audience. FOSSY will be bringing together free and open source software practitioners including community managers, designers, legal experts, non-profit and project leaders, technical developers, technical writers, and researchers.
The track is being organized by:
Kaylea Champion, Community Data Science Collective and the University of Washington
Molly de Blanc, Community Data Science Collective and Northwestern University
Benjamin Mako Hill, Community Data Science Collective and the University of Washington
Aaron Shaw, Community Data Science Collective and Northwestern University
Women in Data Science Puget Sound is part of a 50+-country conference series founded and organized in cooperation with Stanford University’s Data Science coalition. Anyone may attend, regardless of gender: events feature a speaker lineup composed of women in data science. The Puget Sound event is Tuesday, April 25 at the Expedia HQ in Seattle, and numerous affiliated regional and online events are scheduled in the coming weeks.
If you’re in the Seattle area, you might like to catch CDSC member Kaylea presenting a workshop! Here’s the pitch for attending her beginner-friendly session:
Let’s Re-think Political Bias & Build Our Own Classifier
How can we think about political bias without falling into assumptions about who's on what side and what that means?
Data science and ML offer us an alternative: we can parse political speech about a topic and use NLP/ML techniques to classify articles we scrape from the web.
In this hands-on workshop, we'll parse the Congressional Record, build a classifier, scrape search results, and analyze texts. You'll walk away with your own example of how to use data science to analyze political framing.
The full lineup of speakers for the Puget Sound conference is posted here. Tickets for the single-day event are $80 (see this link to request a discount code for half off).
Topics on the schedule for this event look juicy if quant work is your jam: AI, BERT, hypergraphs, visualization, forecasting, quantum computing, causal inference, survival analysis, writing better code and career management, with examples ranging from search, sales, and supply chain to economic disparity, DNA sequencing and saving wildlife!
CDSC members Molly deBlanc and Kaylea Champion will be presenting at this year’s Aaron Swartz Day and International Hackathon. Molly will speak at 2:50 p.m. Pacific (talk title: My (Extended) Body, My Choice). Kaylea will speak at 3:15 p.m. Pacific (talk title: The Value of Anonymity: Evidence from Wikipedia). Registration and live stream details are available here: https://www.aaronswartzday.org/
If you’re attending ACM-CSCW this year, you are warmly invited to join CDSC members during our talks and other scheduled events. CSCW is not only virtual but spread across multiple weeks and offering sessions multiple times to accommodate timezones. We hope to see you there — we are eager to discuss our work with you!
Tuesday, November 8
6pm-7pm Pacific, “No Community Can Do Everything: Why People Participate in Similar Online Communities” Details at: https://programs.sigchi.org/cscw/2022/index/content/87413 By: Nathan Te Blunthuis, Charles Kiene, Isabella Brown, Nicole McGinnis, Laura Levi, Benjamin Mako Hill
3am-4am Pacific “The Risks, Benefits, and Consequences of Prepublication Moderation: Evidence from 17 Wikipedia Language Editions” Details at: https://programs.sigchi.org/cscw/2022/index/content/87945 By:Chau Tran, Kaylea Champion, Benjamin Mako Hill, Rachel Greenstadt
Friday, November 11
8am-9am Pacific. Many Destinations, Many Pathways: A Quantitative Analysis of Legitimate Peripheral Participation in Scratch. Details at: https://programs.sigchi.org/cscw/2022/index/content/87487 By: Ruijia Cheng, Benjamin Mako Hill
3pm-4pm Pacific “The Risks, Benefits, and Consequences of Prepublication Moderation: Evidence from 17 Wikipedia Language Editions” Details at: https://programs.sigchi.org/cscw/2022/index/content/87945 By: Chau Tran, Kaylea Champion, Benjamin Mako Hill, Rachel Greenstadt
8pm-9pm Pacific. Many Destinations, Many Pathways: A Quantitative Analysis of Legitimate Peripheral Participation in Scratch. Details at: https://programs.sigchi.org/cscw/2022/index/content/87487 By: Ruijia Cheng, Benjamin Mako Hill
Friday, November 18
6am-7am Pacific, “No Community Can Do Everything: Why People Participate in Similar Online Communities” Details at: https://programs.sigchi.org/cscw/2022/index/content/87413 By: Nathan Te Blunthuis, Charles Kiene, Isabella Brown, Nicole McGinnis, Laura Levi, Benjamin Mako Hill
And there’s more…
CDSC members and affiliates are involved in CSCW beyond these public presentations. Nicholas Vincent, Sohyeon Hwang, and Sneha Narayan are part of the organizing team for the “Ethical Tensions, Norms, and Directions in the Extraction of Online Volunteer Work” workshop, where Molly de Blanc is scheduled to present and Kaylea Champion will be giving a lightning talk. Katherina Kloppenborg and Kaylea Champion are presenting in the Doctoral Consortium.
Systems theory is a broad and multidisciplinary scientific approach that studies how things (molecules or cells or organs or people or companies) interact with each other. It argues that understanding how something works requires understanding its relationships and interdependencies.
For example, if we want to predict whether a new online community will grow, an individual perspective might focus on who the founder is, what software it is running on, how well it is designed, etc. A systems approach would argue that it is at least as important to understand things like how many similar communities there are, how active they are, and whether the platform is growing or shrinking.
In a paper just published in Media and Communication, I (Jeremy) argue that 1) it is particularly important to use a systems lens to study online communities, 2) that online communities provide ideal data for taking these approaches, and 3) that there is already really neat research in this area and there should be more of it.
The role of platforms
So, why is it so important to study online communities as interdependent “systems”? The first reason is that many online communities have a really important interdependence with the platforms that they run on. Platforms like Reddit or Facebook provide the servers and software for millions of communities, which are run mostly independently by the community managers and moderators.
However, this is an ambivalent relationship and often the goals and desires of at least some moderators are at odds with those of the platform, and things like community bans from the platform side or protests from the community side are not uncommon. The ways that platform decisions influence communities and how communities can work together to influence platforms are inherently systems questions.
Low barriers to entry and exit
A second feature of online communities is the relative ease with which people can join or leave them. Unlike offline groups, which at least require participants to get dressed, do their hair, and show up somewhere, online community participants can participate in an online community literally within seconds of knowing that it exists.
Similarly, people can leave incredibly easily, and most people do. This figure shows the number of comments made per person across 100 randomly selected subreddits (each line represents a subreddit; axes are both log-scaled). In every case, the vast majority of people only commented once while a few people made many comments.
Fuzzy boundaries
Finally, it’s often really difficult to draw clear boundaries around where one online community ends and another begins. For example, is all of Wikipedia one “community”? It might make sense to think of a language edition, a WikiProject, or even a single page as a community, and researchers have done all of the above. Even on platforms like Reddit, where there is a clearl delineation between communities, there are dependencies, with people and conversations moving across and between communities on similar topics.
In other words, online communities are semi-autonomous, interdependent, contingent organizations, deeply influenced by their environments. Online community scholars have often ignored this larger context, but systems theory gives us a rich set of tools for studying these interdependencies. One reason that it is so ideal is because online communities provide ideal data.
Data from Online Communities
Systems theory is not new – many of the main concepts were developed in the 1950s and 1960s or earlier. Organizational communication researchers saw how applicable these ideas were, and manyresearchersproposed treating organizations as systems.
However, it was really tough to get the data needed to do systems-based research. To study a group or organization as a system, you need to know about not only the internal workings of the group, but how it relates to other groups, how it is influenced by and influences its environment, etc. Gathering data about even one group was difficult and expensive; getting the data to study many groups and how they interact with each other over time was impossible.
The internet has entered the chat
Online communities provide the kind of data that these earlier researchers could have only dreamed of. Instead of data about one organization, platforms store data about thousands of organizations. And this is not just high-level data about activity levels or participation; on the contrary, we often have longitudinal, full-text conversations of millions of people as they interact within and move between communities.
Systems Approaches
In part, this article is a call for researchers to think more explicitly about online communities as systems, and to apply systems theory as a way of understanding how online communities work and how we can design research projects to understand them better. It is also an attempt to highlight strands of research that are already doing this. In the paper, I talk about four: Community Comparisons and Interactions, Individual Trajectories, Cross-level Mechanisms, and Simulating Emergent Behavior. Here, I’ll focus on just two.
Individual Trajectories
Figure from Panciera, K., Halfaker, A., & Terveen, L. (2009). Wikipedians are born, not made: A study of power editors on Wikipedia. Proceedings of the ACM 2009 International Conference on Supporting Group Work, 51–60. https://doi.org/10.1145/1531674.1531682
The first is what I call “Individual Trajectories”. In this approach, researchers can look at how individual people behave across a platform. One of the neat things about having longitudinal, unobtrusively collected data is that we can identify something interesting about users and go “back in time” to look for differences in earlier behavior. For example, in the plot above, Panciera et al. identified people who became active Wikipedia editors; they then went back and looked at how their behavior differed from typical editors from their early days on the site.
Researchers could and should do more work that looks at how people move between communities, and how communities influence the behavior of their members.
Simulating Emergent Behavior
The second approach is to use simulations to study emergent behaviors. Agent-based modeling software like NetLogo or Mesa allows researchers to create virtual worlds, where computational “agents” act according to theories of how the world works. Many communication theories make predictions about how individual‐level behavior produces higher‐level patterns, often through feedback loops (e.g., the Spiral of Silence theory). If agent-based models don’t produce those patterns, then we know that something about the theory—or its computational representation—is wrong.
Model of misinformation spread, from Hu et al. (under review)
Agent-based modeling has received some attention from communication researchers lately, including a wonderful special issue was recently published in Communication Methods and Measures; the editorial article makes some great arguments for the promise and benefits of simulations for communication research.
New Opportunities
It is a really exciting time to be a computational social scientist, especially one that is interested in online organizations and organizing. We have only scratched the surface of what we can learn from the data that is pouring down around us, especially when it comes to systems theory questions. Tools, methods, and computational advances are constantly evolving and opening up new avenues of research.
Of course, taking advantage of these data sources and computational advances requires a different set of skills than Communication departments have traditionally focused on, and complicated, large-scale analyses require the use of supercomputers and extensive computational expertise.
However, there are many approaches like agent-based modeling or simple web scraping that can be taught to graduate students in one or two semesters, and open up lots of possibilities for doing this kind of research.
I’d love to talk more about these ideas—please reach out, or if you are coming to ICA, come talk to me!
The International Communication Association (ICA)’s 72nd annual conference is coming up in just a couple of weeks. This year, the conference takes place in Paris and a subset of our collective is flying out to present work in person. We are looking forward to meeting up, talking research, and eating croissants. À bientôt!
ICA takes place from Thursday, May 26th to Monday, May 30th, and we are presenting a total of ten (!!) times. All presentations given by members of the collective are scheduled between Friday and Sunday.
Friday
We start off with a presentation by Nathan TeBlunthuis on Friday at 11.00 AM, in Room 351 M (Palais des Congres). In a high-density paper session on Computational Approaches to Online Communities, Nate will present a paper entitled “Dynamics of Ecological Adaptation in Online Communities.”
Later that same day, at 3.30 PM in the Amphitheatre Havana (level 3; Palais des Congres), Carl Colglazier will discuss a paper that he collaborated on with Nick Diakopoulos: “Predictive Models in News Coverage of the COVID-19 Pandemic in the U.S.” This paper session is part of the ICA division Journalism Studies.
Saturday
On Saturday, Floor Fiers will present in the paper session “Impression Management Online: FabriCATing An Image.” Their project, which they wrote with Nathan Walter, discusses “Comments on Airbnb and the Potential for Racial Bias” at 2.00 PM in Regency 1 (Hyatt).
Shortly after, that same afternoon, you’ll find two of our poster presentations at 5.00 PM in the Exhibit Hall (Havana; Palais des Congres, level 3). In one of them, Jeremy Foote will discuss his take on “a systems approach to studying online communities.”
The other poster, presented at the same time and place, is by Kaylea Champion and Benjamin Mako Hill on “Resisting Taboo in the Collaborative Production of Knowledge: Evidence from Wikipedia.”
Sunday
Most of our presentations are on the fourth day of the conference. At 9.30 AM, we’ll be presenting in three locations at the same time! First, Floor will discuss their paper “Inequality and Discrimination in the Online Labor Market: a Scoping Review” in Room 311+312 (Palais des Congres). This presentation is part of the paper session “All Things Are Not Equal: CompliCATions From Digital Inequalities.”
Second, Carl will present work on behalf of himself, Aaron Shaw, and Benjamin Mako Hill during a high-density paper session in Room 242A (Palais des Congres). The title of their project is “Extended Abstract: Exhaustive Longitudinal Trace Data From Over 70,000 Wiki.”
Lastly, at the same time in Room 352B (Palais des Congres), Jeremy will present an interview study entitled “What Communication Supports Multifunctional Public Goods in Organizations? Using Agent-Based Modeling to Explore Differential Uses of Enterprise Social Media.” Jeremy’s co-authors on this paper are Jeffrey Treem and Bart van den Hooff.
On Sunday afternoon, at 3.30 PM in Room 311+312 (Palais des Congres), Tiwaladeoluwa Adekunle will talk about a qualitative project she collaborated on with Jeremy, Nate, and Laura Nelson: “Co-Creating Risk Online: Exploring Conceptualizations of COVID-19 Risk in Ideologically Distinct Online Communities.”
We will finish off our ICA 2022 presentations at 5.00 PM in Room 313+314 (Palais des Congres), where Kaylea will present on behalf of Isabella Brown, Lucy Bao, Jacinta Harshe, and Mako. The title of their paper is “Making Sense of Covid-19: Search Results and Information Providers”.
We look forward to sharing our research and connecting with you at ICA!
We’re going to be at CHI! The Community Data Science Collective will be presenting three papers. You can find us there in person in New Orleans, Louisiana, April 30 – May 5. If you’ve ever wanted a super cool CDSC sticker, this is your chance!
This year was packed with things we’re excited about and want to celebrate and share. Great things happened to Community Data Science Collective members within our schools and the wider research community.
Charlie Kiene and Regina Cheng completed their comprehensive exams and are now PhD candidates!
Nate TeBlunthuis defended his dissertation and started a post-doctoral fellowship at Northwestern. Jim Maddock defended his dissertation on December 16th.
Regina was a teaching assistant for senior undergraduate students on their capstone projects. Regina’s mentees won Best Design and Best Engineering awards.
Salt was interviewed on the FOSS and Crafts podcast. His conference presentations included Linux App Summit, SeaGL and DebConf. Kaylea Champion spoke at SeaGL and DebConf. Kaylea’s DebConf present was on her research on detecting at-risk projects in Debian.
Champion, Kaylea. 2021. “Underproduction: An approach for measuring risk in open source software.” 28th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). pp. 388-399, doi: 10.1109/SANER50967.2021.00043.
Fiers, Floor , Aaron Shaw , and Eszter Hargittai. 2021. “Generous Attitudes and Online Participation.” Journal of Quantitative Description: Digital Media, 1. https://doi.org/10.51685/jqd.2021.008
Hill, Benjamin Mako , and Aaron Shaw , 2021. “The hidden costs of requiring accounts: Quasi-experimental evidence from peer production.” Communication Research 48(6): 771-795. https://doi.org/10.1177%2F0093650220910345.
Hwang, Sohyeon and Jeremy Foote . 2021. “Why do people participate in small online communities?”. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), 462:1-462:25. https://doi.org/10.1145/3479606
Shaw, Aaron and Eszter Hargittai. 2021. “Do the Online Activities of Amazon Mechanical Turk Workers Mirror Those of the General Population? A Comparison of Two Survey Samples.” International Journal of Communication 15: 4383–4398. https://ijoc.org/index.php/ijoc/article/view/16942
TeBlunthuis, Nathan , Benjamin Mako Hill , and Aaron Halfaker. 2021. “Effects of Algorithmic Flagging on Fairness: Quasi-experimental Evidence from Wikipedia.” Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 56 (April 2021), 27 pages. https://doi.org/10.1145/3449130
TeBlunthuis, Nathan. 2021 “Measuring Wikipedia Article Quality in One Dimension.” In Proceedings of the 17th International Symposium on Open Collaboration (OpenSym ’21). Online: ACM Press. https://doi.org/10.1145/3479986.3479991.
The conference will feature two new papers by collective students and faculty that were published in the journal Proceedings of the ACM on Human-Computer Interaction: CSCW.
Information on the talks as well as links to the papers are available here (CSCW members are listed in italics):
In addition, Benjamin Mako Hill is a panel co-chair.
Mako, Sohyeon, Jeremy, and Nathan will all be at the conference and so will tons of our social computing friends. Please come and say “Hello” to any of us and introduce yourself if you don’t already know us :)