Uncategorized – Community Data Science Collective

November 18, 2025November 18, 2025

CDSC at the NCA 2025:

CDSC members will be presenting at this years National Communication Association (NCA) Convention in Denver! You are warmly invited to join CDSC members during our talks and other scheduled events. Please come say hi!

Check out group members attending and what research they’ll be sharing:

Dyuti Jha: Dyuti will be presenting her paper titled “Mapping the Digital Life of Caste-Based Hate Speech” on Thursday, November 20th from 11:00 AM to 12:15 PM, discussing the processes of subreddit level moderation of caste-based hate speech in the absence of Reddit’s acknowledgement of Caste as a system of discrimination. The project seeks to understand how moderators in South Asian subreddits such as r/India and r/UnitedStatesofIndia identify caste based hate speech, how, if at all, they moderate it, whether the caste composition of the moderator teams for these subreddits influence the decision making on whether to or how to moderate such speech, and how they do so without any platform level guidelines. This project highlights a severely under researched area of caste in computing, particularly about how issues of identity bolsters or hinders inclusivity of a community.

Maddie Douglas: Maddie will be part of the Digital Rhetorics and Contemporary Media panel (Rhetorical and Communication Theory Division) on Friday, November 21st from 8:00 AM – 9:15 AM presenting a full paper titled “Strategic Ambiguity in the Modern Digital Age: Polysemy, Controversy, and AI Hype in the ‘Pause Giant AI Experiments’ Open Letter.” Maddie’s paper conducts a close reading of an “AI ethics” open letter that was spread shortly after GPT-4’s release, viewing it as a strategy to benefit AI investors and amplify hype. This reading makes a case for redefining “strategic ambiguity” from Leah Ceccarelli’s 1998 definition to include polysemy that achieves criticism (as well as praise) from audiences.

Hsuen-Chi (Hazel) Chiu: Hazel will be presenting her paper “AI Companions and the Illusion of Privacy: When Social Connection Meets Data Exposure” on Thursday, November 20th from 2:30 – 3:45 PM. She’ll discuss how users manage privacy when forming emotional relationships with AI companions. Drawing on interviews with 15 users of Replika and Character.AI, her study shows that people often treat these chatbots like trusted confidants while simultaneously worrying about how companies might use their data. Using Communication Privacy Management theory and the horizontal/vertical privacy framework, she highlights how users negotiate this tension between intimate disclosure and institutional surveillance. Her findings point to the need for more transparent, user-centered privacy design in emotionally supportive AI systems.

Srish Chatterjee: Srish will be at a day-long pre-conference called ‘Conspiratorial Economies,’ where they’ll present a full paper called “The Invisible Hand: Rhetorical Patterns in Conspiracy Theories on Technology’s Ubiquity.” Srish’s paper examines the rhetorical patterns of technological conspiracy theories and how they function as sophisticated folk epistemology that, while often factually inaccurate, articulate legitimate public anxieties about agency, surveillance, and corporate power in complex digital societies.

October 29, 2025October 29, 2025

Welcome new student members of the CDSC!

Most years, the CDSC is lucky enough to recruit some amazing new Ph.D. students to the lab. This fall is no exception and we are thrilled to welcome an extraordinary group across several of our group’s campuses. The students join us from a wide variety of places, backgrounds, and prior affiliations (which should be encouraging for any prospective students looking to join the group in the future!). Some short bios and photos follow below (in alphabetical order by last name) with the text largely taken from the people page on our wiki. You can look forward to reading more about their research in the coming years!

Eric Fassbender is a first year PhD student in the Media Technology and Society program at Northwestern University. He is interested in studying technology adoption as an expression of resistance and protest. He is currently researching the ways that people form decisions to leave online groups around issues of surveillance and political alignment. You can learn more about him on his website here or on mastodon here. Outside of work, Eric loves reading sci-fi, all things cyberpunk, and hiking to improve his landscape photography.

Jonghyun Jee (pronounced Jong-H-yuh-n, not Hyoon) is a first-year PhD student in the Media, Technology, and Society program at Northwestern University. He studies how online communities create and enforce their rules. His research has looked at a range of platforms, from established ones like Wikipedia, YouTube, and Discord to decentralized networks such as Bluesky and Mastodon. Lately, Jonghyun has been exploring how to use LLMs to simulate these social environments at scale. He’s driven by the belief that critiques of technology (even dystopian ones) are less calls for its undoing than invitations to reimagine it. When Jonghyun procrastinates, he practices zen meditation and writes short film synopses.

Manish Kumar is a first-year PhD student at the School of Information at the University of Texas at Austin advised by Dr Nathan TeBlunthuis and Dr Edgar Gómez Cruz. Manish’s work explores political expression on social media and how it connects to people’s offline relationships. He’s fascinated by human experiences at scale, and tries to bridge qualitative inquiry with computational techniques to capture that complexity. Broadly, Manish studies how social media/technology becomes woven into people’s everyday political sensemaking. Manish grew up in Patna, India. He earned a degree in Information Science & Engineering and spent a few years as a software developer, but has always been drawn to the sociological side of technology. That curiosity took Manish to UC Berkeley for a Master’s of Information Management and Systems, where he discovered research and got completely hooked. In his free time, Manish like to do nerdy stuff like reading historical fiction, going on walking tours, learning about local history (that includes the petty neighbourhood rivalries) and going to museums.

Jianghui Li is a first-year PhD student at the University of Texas at Austin’s School of Information, advised by Dr. Nathan TeBlunthuis. He is interested in researching belief dynamics, collective behavior, and sustainability in sociotechnical systems through the lens of complex adaptive systems. Before studying at UT Austin, Jianghui earned bachelor’s and master’s degrees at Syracuse University’s School of Information Studies, and he misses the cool Syracuse weather Outside of research, Jianghui enjoys fishing, learning about fish, and sometimes thinking about the similarities between ecological systems and human networks.

Dylan Smith is a first-year MA/PhD student at University of Washington—Seattle in Communication. They grew up in Portland, Oregon and got a bachelor’s degree in Computer Science at Carleton College in Minnesota. Dylan’s research interests are in online interpersonal communication and online governance. For the past few years, Dylan has been working on a research project studying Wikipedia’s arbitration process. In their free time, Dylan likes reading fiction, spending time with friends, hiking, and long-distance running. Last Spring, Dylan ran their first marathon!!!

Ran Tang is a MA/PhD student in the Department of Communication at the University of Washington. Her research focuses on the moderation of online communities. She primarily use qualitative methods to study the daily work of volunteer moderators, and is also exploring the use of quantitative approaches in future projects. In her free time, Ran enjoys playing table tennis and swimming.

Yiwei Wu is a first-year PhD student at UT Austin. Previously, she attended the University of Washington for her bachelor’s degree. Her research interests include online collective action, peer production, and community data governance. In her free time, Yiwei enjoys baking, playing musical instruments (bass and Chinese flute), and playing farming games (e.g., Stardew Valley).

October 22, 2025October 22, 2025

Science of Community Dialogue: The Impacts of Organizational Interventions in Open Source Software Engineering

This dialogue will take place on November 7th at 12pm CT and will explore how free/libre and open source software (FLOSS) projects adapt their work processes to recruit new contributors and build the project communities that they want, and how FLOSS projects redesign collaboration processes within different environments and moments in project lifecycles. Professor Igor Steinmacher (Northern Arizona University) will be joining Matt Gaughan (Northwestern University) to present recent research on topics including:

Documentation practices in FLOSS projects
Disconnect between guidelines and reality
Sustainability challenges in FOSS communities
Community-based governance redevelopment
Structural shifts for long-term health

A full session description is on our website. Register online.

What is a Dialogue?

The Science of Community Dialogue Series is a series of conversations between researchers, experts, community organizers, and other people who are interested in how communities work, collaborate, and succeed. You can watch this short introduction video with Aaron Shaw.

What is the CDSC?

The Community Data Science Collective (CDSC) is an interdisciplinary research group made of up of faculty and students at the University of Washington Department of Communication, the Northwestern University Department of Communication Studies, the Carleton College Computer Science Department, the School of Information at UT Austin, and the Purdue University School of Communication.

Learn more

If you’d like to learn more or get future updates about the Science of Community Dialogues, please join the low volume announcement list.

October 1, 2025

Community Dialogue – AI Boundaries: Refusal and Privacy

Join the Community Data Science Collective (CDSC) for our 12th Science of Community Dialogue! This Community Dialogue will take place on October 17th, 2025 at 12:00 pm CT. This dialogue explores how companion chatbots invite deep emotional disclosure while raising concerns about data privacy—and how some communities are pushing back through AI refusal. Professor Jasmine McNealy (University of Florida) will join Hsuen-Chi (Hazel) Chiu (Purdue University) to present recent research on topics including:

Emotional disclosure in chatbot interactions
Data privacy and AI refusal
Designing emotionally intelligent, boundary-aware AI
Cultural implications of opting out of AI companionship

A full session descriptions is on our website. Register online.

What is a Dialogue?

What is the CDSC?

Learn more

If you’d like to learn more or get future updates about the Science of Community Dialogues, please join the low volume announcement list.

September 25, 2025

FOSSY 2025 Wrap-Up: Kaylea Champion “Plausible Slop: Generative AI and Open Source Cybersecurity”

For our final talk of the Science of Community track at FOSSY 2025, Kaylea Champion explored how generative AI tools are disrupting open source cybersecurity—not through advanced attacks, but by flooding communities with “plausible slop,” or misleading, low-effort reports. She shared research on the burden this places on experts, who must balance welcoming newcomers with filtering out noise. Drawing on historical parallels and case studies, she proposed strategies to address these challenges and invited community input to shape future solutions.

This is the final of our 11 part series sharing highlights from the Science of Community track at FOSSY 2025. Visit the FOSSY site for bio details and a full abstract.

September 23, 2025September 23, 2025

FOSSY 2025 Wrap Up: Steve Feng and Anita Sarma “Glue Work Makes the Community Work: Sustaining OSS Through Invisible Labor”

Our tenth talk for the Science of Community Track at FOSSY 2025 featured Zixuan (Steve) Feng and Anita Sarma discussed how glue work —like maintaining code, updating docs, and supporting users—is essential to OSS success but often overlooked and undervalued. The talked about their teams 300+ OSS practitioner studies to define, trace, and elevate these contributions, offering practical strategies to recognize their impact.

This is the 10th of our 11 part series sharing highlights from the Science of Community track at FOSSY 2025. Visit the FOSSY site for bio details and a full abstract.

September 19, 2025September 18, 2025

FOSSY 2025 Wrap Up: Igor Steinmacher “Lessons from a Decade of Open Source Sustainability Research”

Igor Steinmacher walked through lessons learned from a decade of OSS research for our 9th talk of the Science of Community Track. He explored long-term sustainability challenges in FOSS communities, including onboarding, maintainer burnout, and governance. He presented interventions like mentorship strategies, structured contribution paths, and the use of LLMs to support contributors and scale community engagement. Through case studies and longitudinal data, he offered a holistic vision for building more inclusive, resilient, and human-centered open source ecosystems.

This is the 9th of our 11 part series sharing highlights from the Science of Community track at FOSSY 2025. Visit the FOSSY site for bio details and a full abstract.

September 16, 2025September 23, 2025

FOSSY 2025 Wrap Up: Dawn Foster on “Power Dynamics, Rug Pulls, and Other Impacts on FOSS Sustainability

The 8th presenter for the Science of Community track was Dawn Foster, who talked about the power imbalances in FOSS projects, and their potential for disruption. She explored real-world case studies and offered practical steps to help contributors make smarter, more sustainable choices.

This is the 8th of our 11 part series sharing highlights from the Science of Community track at FOSSY 2025. Visit the FOSSY site for bio details and a full abstract .

September 10, 2025September 14, 2025

Prospective PhD Student Q&A – September 26th

Thinking about applying to graduate school? Wonder what it’s like to pursue a PhD or research-based master’s degree? Interested in understanding relationships between technology and society? Curious about how to do research on online communities like Reddit, Wikipedia, or GNU/Linux? The Community Data Science Collective is hosting a virtual Q&A session on Friday, September 26th at 10am PT / 12pm CT / 1pm ET for prospective students. This session is scheduled for an hour, to be divided between a larger group session with faculty and then smaller groups with current graduate students. If you would like to attend, register at this link!

This post provides a very brief run-down on the CDSC, the different universities and Ph.D. and research master’s programs our faculty members are affiliated with, and some general ideas about what we’re looking for when we review applications.

What is the Community Data Science Collective?

The Community Data Science Collective (or CDSC) is a joint research group of (mostly quantitative) empirical social scientists and designers pursuing research about the organization of online communities, peer production, and learning and collaboration in social computing systems. We are based at Northwestern University, the University of Washington-Seattle, University of Washington-Bothell, The University of Texas at Austin, Purdue University, and a few other places. You can read more about us and our work on our research group blog and on the collective’s website/wiki.

What are these different Ph.D. programs? Why would I choose one over the other?

This year the group includes multiple faculty principal investigators (PIs) who are actively recruiting PhD students: Aaron Shaw (Northwestern University), Benjamin Mako Hill (University of Washington in Seattle), Kaylea Champion (University of Washington in Bothell) and Jeremy Foote (Purdue University). Each of these PIs advise Ph.D. and / or M.S. students in graduate programs at their respective universities. We also have faculty PIs who are not currently recruiting students, but are active members of the group: Nathan TeBlunthuis (University of Texas – Austin) and Ryan Funkhouser (University of Idaho). Our programs are each described below.

Although we often work together on research and serve as co-advisors to students in each others’ projects, each faculty member has specific areas of expertise and interests. The reasons you might choose to apply to one of these Ph.D. or M.S. programs or to work with a specific faculty member could include factors like your previous training, career goals, and the alignment of your specific research interests with our respective skills.

At the same time, a great thing about the CDSC is that we all collaborate and regularly co-advise students across our respective campuses, so the choice to apply to or attend one program does not prevent you from accessing the expertise of our whole group. But please keep in mind that our different Ph.D. and M.S. programs have different application deadlines, requirements, and procedures!

Faculty who are actively recruiting this year

If you are interested in applying to any of the programs, we strongly encourage you to reach out to the specific faculty in that program before submitting an application.

Jeremy Foote

Jeremy Foote is an Assistant Professor at the Brian Lamb School of Communication at Purdue University. He is affiliated with the Organizational Communication and Media, Technology, and Society programs. Jeremy’s research focuses on how individuals decide when and in what ways to participate in online communities, how communities change the people who participate in them, and how both of those processes can help us to understand what people believe and which things become popular and influential. He and his students use multiple methods, including data science, agent-based modeling, field experiments, and interviews.

Benjamin Mako Hill

Benjamin Mako Hill is an Associate Professor of Communication at the University of Washington. He is also adjunct faculty at UW’s Department of Human-Centered Design and Engineering (HCDE), Computer Science and Engineering (CSE) and Information School. Although many of Mako’s students are in the Department of Communication, he has also advised students in all three other departments—although he typically has more limited ability to admit students into those programs on his own and usually does so with a co-advisor in those departments. Mako’s research focuses on population-level studies of peer production projects, computational social science, efforts to democratize data science, and informal learning. Mako has also put together a webpage for prospective graduate students with some useful links and information..

Aaron Shaw, Nikki Ritcher Photography

Aaron Shaw is an Associate Professor in the Department of Communication Studies at Northwestern. In terms of Ph.D. programs, Aaron’s primary affiliations are with the Media, Technology and Society (MTS) and the Technology and Social Behavior (TSB) Ph.D. programs (please note: the TSB program is a joint degree between Communication and Computer Science). Aaron also has a courtesy appointment in the Sociology Department at Northwestern, but he has not directly supervised any Ph.D. advisees in that department (yet). Aaron’s current projects focus on comparative analysis of the organization of peer production communities and social computing projects, participation inequalities in online communities, and collaborative organizing in pursuit of public goods.

Kaylea Champion

Kaylea Champion is an Assistant Professor in Computing & Software Systems at the University of Washington-Bothell. Kaylea’s research investigates how people collaborate to build digital infrastructure, including operating systems, programming languages, and information repositories. What gets made and maintained and secured—and what gets neglected? What risks do we face (including from AI and cybercriminals)? What practices lead to better outcomes? How can we work smarter and what can we stop doing? Kaylea’s work seeks to bridge the divide between research and practice, which for her means building relationships with practitioner communities, organizations, and industry to directly share research findings. If you are interested in the Computing & Software Systems graduate programs at the University of Washington – Bothell, you should register for one of the information sessions specific to these programs as well. Kaylea’s department only admits M.S. and B.S. students. In addition, she is happy to support any students who are working with others in the CDSC and can serve on thesis and dissertation committees.

Other Faculty members of the CDSC

Nathan TeBlunthuis, Ventrait Pictures

Nathan TeBlunthuis is an Assistant Professor in the School of Information at the University of Texas at Austin in the area of social informatics. Nathan’s research focuses on analyzing ecosystems of online communities, AI tools in peer production, and methods in computational social science. His current projects continue in these areas and also draw from them all to understand how information sources achieve legitimacy in online communities. He works primarily using computational tools and big data, but also grounds his work in qualitative evidence.

Ryan Funkhouser

Ryan Funkhouser is an Assistant Professor in the Department of Psychology and Communication at the University of Idaho. Ryan’s research focuses on communication processes for bridging ideological divides in online spaces. His work includes explorations of deliberation-focused online communities, the role of narrative in persuasion, and the mechanisms of belief change. Ryan utilizes both computational and qualitative methods to explore text data, primarily from online sources.

What do you look for in Ph.D. applicants?

There’s no easy or singular answer to this. In general, we look for curious, intelligent people driven to develop original research projects that advance scientific and practical understanding of topics that intersect with any of our collective research interests.

To get an idea of the interests and experiences present in the group, read our respective bios and CVs (follow the links above to our personal websites). Specific skills that we and our students tend to use on a regular basis include consuming and producing social science and/or social computing (human-computer interaction) research; applied statistics and statistical computing, various empirical research methods, social theory and cultural studies, and more.

Formal qualifications that speak to similar skills and show up in your resume, transcripts, or work history are great, but we are much more interested in your capacity to learn, think, write, analyze, and/or code effectively than in your credentials, test scores, grades, or previous affiliations. It’s graduate school and we do not expect you to show up knowing how to do all the things already.

Intellectual creativity, persistence, and a willingness to acquire new skills and problem-solving matter a lot. We think doctoral education is less about executing tasks that someone else hands you and more about learning how to identify a new, important problem; develop an appropriate approach to solving it; and explain all of the above and why it matters so that other people can learn from you in the future. Evidence that you can or at least want to do these things is critical. Indications that you can also play well with others and would make a generous, friendly colleague are really important too.

All of this is to say, we do not have any one trait or skill set we look for in prospective students. We strive to be inclusive along every possible dimension. Each person who has joined our group has contributed unique skills and experiences as well as their own personal interests. We want our future students and colleagues to do the same.

Now what?

Still not sure whether or how your interests might fit with the group? Still have questions? Still reading and just don’t want to stop? Follow the links above for more information. Feel free to send at least one of us an email. We are happy to try to answer your questions and always eager to chat.

September 3, 2025September 3, 2025

FOSSY 2025 Wrap-Up: Ben Ford “It’s all about the ecosystem!”

Ben Ford wrapped up Friday as the sixth speaker in the Science of Community track at FOSSY 2025, talking about the idea that the ecosystem is the product and the thing that you build and sell only exists to support it, something OSS companies might learn from.

You can check out Ben’s slide deck here.

This is the 6th of our 11 part series sharing highlights from the Science of Community track at FOSSY 2025. Visit the FOSSY site for bio details and a full abstract.