Aaron Shaw

February 27, 2025

Decentralization and the social web revisited

Diagram of centralized, decentralized (hierarchical), and distributed communications networks from Paul Baran’s 1964 article “On distributed communications networks” (https://doi.org/10.1109/TCOM.1964.1088883), as referenced by Christine Lemmer-Webber.

Last Spring, I hosted a “thought leader dialogue” on decentralizing social media through the Northwestern Center for Human-Computer Interaction and Design (HCI+D). It was fantastic and I highly recommend you check out the video.

Afterwards, two of the eponymous “thought leaders” who headlined that session, Christine Lemmer-Webber and Bryan Newbold, followed up with each other in a series of blog posts about decentralization, Bluesky, and the Fediverse.

How decentralized is Bluesky really? by Christine Lemmer-Webber.
Reply on Bluesky and decentralization by Bryan Newbold.
Re: Re: Bluesky and decentralization by Christine Lemmer-Webber

I recently re-read these while preparing for a guest lecture in an undergraduate class here at NU. Fair warning: despite being a debate about systems mainly used for micro-blogging, the posts are very long. That said, if you care about these topics (or are maybe just curious to learn more), all three are pure gold.

It helps to understand a bit about exactly who Christine and Bryan are, or more specifically, why theirs are among the most informed and interesting perspectives on the topic. Christine helped to create the ActivityPub protocol and standard that is at the heart of the Fediverse (Mastodon, Lemmy, Pixelfed, PeerTube). She currently leads the Spritely Institute, which is pursuing the design and implementation of the next generation of decentralized communication tech. Bryan is the protocol engineer at Bluesky and, as such, one of the leaders building out the AT Protocol, which powers Bluesky and a whole other small-but-seemingly-growing ecosystem of applications.

The crux of their conversation revolves around the competing visions for the future of the social web being pursued within the respective ActivityPub and AT Protocol “universes.” The posts underscore some of the more profound differences between the two, in particular the questions around “shared heap” vs. “message passing” architectures, the relative degree of (de)centralization each system affords (this is where the diagram originally created by Paul Baran and reproduced at the top of this post comes in), and (more implicitly) the theories of change motivating the design and implementation choices prioritized in their respective work.
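For readers who think better in code, a toy sketch may help make the “message passing” vs. “shared heap” contrast concrete. This is a deliberate oversimplification and not how ActivityPub or the AT Protocol are actually implemented: the class and method names (`MessagePassingServer`, `SharedHeap`, `follow`, `publish`, `timeline`) and the example server names are all invented for illustration.

```python
from collections import defaultdict


class MessagePassingServer:
    """Toy server: posts are pushed only to the servers of known followers."""

    def __init__(self, name):
        self.name = name
        self.followers = defaultdict(set)  # local author -> {(server, user)}
        self.inboxes = defaultdict(list)   # local user -> received posts

    def follow(self, follower_server, follower, author):
        # Record that `follower` (hosted on `follower_server`) follows `author`.
        self.followers[author].add((follower_server, follower))

    def publish(self, author, text):
        # Deliver a copy directly to each follower's inbox on their own server.
        for server, user in self.followers[author]:
            server.inboxes[user].append((author, text))


class SharedHeap:
    """Toy index: every post lands in one big pile that readers filter."""

    def __init__(self):
        self.posts = []  # (author, text) pairs from everyone

    def publish(self, author, text):
        self.posts.append((author, text))

    def timeline(self, following):
        # A reader (or a service acting for them) filters the whole heap.
        return [(a, t) for a, t in self.posts if a in following]


# Message passing: only bob's server ever receives alice's post.
home = MessagePassingServer("home.example")
away = MessagePassingServer("away.example")
home.follow(away, "bob", "alice")
home.publish("alice", "hello")
print(away.inboxes["bob"])

# Shared heap: everything is collected, then filtered per reader.
heap = SharedHeap()
heap.publish("alice", "hello")
heap.publish("carol", "hi")
print(heap.timeline({"alice"}))
```

In the first style, who sees a post depends on where it gets delivered; in the second, it depends on who can afford to hold and filter the whole pile. That difference is a big part of what the posts argue about.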

It’s worth noting that both Bryan and Christine are very careful and intentional about calling out the overlaps between ActivityPub and AT Proto as well. I also appreciate how respectful and thoughtful they each are given that there are some substantive and high-stakes areas of disagreement being addressed in their conversation.

If you read the posts (and I hope you do!) and want to share your responses, please do so in the comments, via email, Toot, Skeet, or sundry other means. There are several of us in the lab pursuing work in these areas and I’m eager to understand perspectives on the topic.

August 14, 2024

Professor Floor Fiers!

Dr. Floor Fiers and proud faculty mentor Aaron Shaw, just before Northwestern’s doctoral hooding ceremony.

A very special congratulations to CDSC member Floor Fiers on the completion of their Ph.D. in Media, Technology & Society at Northwestern!

Floor’s dissertation, Chasing the Ideal and Making It Work: Pursuing Employment in the Remote Gig Economy, seeks to understand inequality among workers in the gig economy and how they navigate the precarity involved in remote gig work. Several of the chapters have already appeared as standalone, peer-reviewed publications, but there’s plenty of new, exciting, and as-yet-unpublished material in there as well.

This week (!), Floor will begin a position as Assistant Professor in the Amsterdam School of Communication Research (ASCoR) at the University of Amsterdam.

Since I (Aaron) am posting this one myself, it seems appropriate to add that it’s been wonderful working with Floor over the past five+ years. Indeed, I’m still in denial about the fact that Floor won’t be physically present in our lab meetings this year. At the same time, I couldn’t be happier for Floor and definitely get a goofy, proud-faculty-mentor grin on my face whenever I think about the incredible things they’ve accomplished already (never mind all the cool stuff yet to come).

Congratulations again, Floor!

July 23, 2024

Academic Year-in-review (2023-2024) and celebration!

CDSC group photo taken at Northwestern in Fall, 2023

We love celebrating the accomplishments of CDSC lab and community members! Here’s a less-than-complete, not-quite-brief summary of some of those accomplishments over the past academic year+ (since the last time we wrote a post like this). Congratulations to everyone involved—including those members of the CDSC community not named below. It truly takes a village to do all of these things and we appreciate the achievements and contributions of all.

Awards, degrees, and fellowships:

Hazel Chiu received a Top Paper Award from the International Communication Association (ICA) Communication and Technology (CaT) Division for “User Acceptance of Multiple Accounts Management on SNS: A Technology Acceptance Model Perspective.”
Nathan TeBlunthuis received a Top Paper Award from the ICA Computational Methods (CM) Division for “Misclassification in Automated Content Analysis Causes Bias in Regression. Can We Fix It? Yes We Can!”
Dyuti Jha and Ryan Funkhouser were named runners-up for the National Communication Association (NCA) Sam Keltner Top Student paper for “Freedom to flourish: A systematic review of the literature at the intersection of resilience, communication, and peacebuilding.”
Floor Fiers completed their Ph.D. and will begin a new role as an Assistant Professor of Communication at the University of Amsterdam.
Nathan TeBlunthuis will begin a new role as an Assistant Professor in the Information School at the University of Texas, Austin.
Tommy Rousse completed his J.D. and MTS Ph.D. at Northwestern.
Sohyeon Hwang will begin a Postdoctoral Fellowship at the Center for Information Technology & Policy (CITP) at Princeton University in the Fall.
Yibin Fan completed a Master’s degree in Communication at the University of Washington.
Benjamin Mako Hill was a fellow at CITP at Princeton University during 2023-2024.
Emily Zou graduated with honors from Northwestern in American Studies with her thesis, “‘Did Bro Just Grief the US Government?’: How online community identities create new genres of political communication.” Starting in the Fall, Emily will begin a Ph.D. in Communication at Stanford University.
Carolyn Zou graduated with honors from Northwestern in Communication Studies with their thesis, “Sociotechnical Risks of Simulating Humans with Language Model Agents.” Starting in the Fall, Carolyn will begin a Ph.D. in Computer Science at Stanford University.
Carolyn Zou was awarded an NSF Graduate Research Fellowship (GRFP).

Publications:

Members of the lab published more than 15 papers and articles. This is too many to list here, but you should check our publications page for more.

Talks and conference presentations:

Members of the group gave way too many presentations to list.

Select venues include: Seattle GNU/Linux Conference (SeaGL); Free and Open Source Software Yearly Conference (FOSSY); PyCon; Wikimania; the Annual Meeting of the International Communication Association (ICA); the Annual Meeting of the National Communication Association (NCA); the ACM Conferences CSCW and CHI; the Yale Internet & Society Project; the Berkman-Klein Center for Internet & Society at Harvard University; Stanford University HCI Speaker Series; University of Maryland, College Park; Rutgers University; Cornell Tech, Digital Life Institute; Learning Planet Institute, Paris; the University of Pennsylvania, Annenberg School for Communication; The Rockefeller Foundation’s Bellagio Center in Bellagio, Italy; the Stanford Trust & Safety Research Conference; the International AAAI Conference on Web and Social Media (ICWSM); and the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER).

Teaching:

A selection of the courses taught or TA’ed by members of the group in the past year includes:

Introduction to Communication
Introduction to Programming and Data Science
Public speaking
Online communities
History & theories of information
Social Network Analysis
Communication technology & politics

Many of these are available via our workshops and classes page.

Events:

Members of the group planned, hosted, or otherwise played leadership roles in the following events:

The CDSC Science of Community Dialogues Series
The Northwestern Center for HCI+Design Thought-Leader Dialogue Series
Free Open Source Software Yearly (FOSSY) Conference, Science of Community Track (2023 and 2024).
The Decentralized Social Media Workshop, Princeton University
Hongerige Wolf Festival (“Science” branch)

Other career and degree milestones:

Madison Deyo joined the group as Program Coordinator!
Molly de Blanc began a Ph.D. in Media, Technology, and Society at Northwestern.
Haomin Lin and Matt Gaughan joined the CDSC at UW and Northwestern respectively.
Carl Colglazier, Ryan Funkhouser, and Zarine Kharazian passed their general/preliminary/qualifying exams.
Charlie Kiene completed an internship at Amazon.

June 10, 2024

Recording of Thought-leader dialogue: Decentralizing social media

A couple of weeks ago, I moderated a “Thought leader dialogue” panel on “Decentralizing Social Media” co-hosted by the Northwestern Center for Human-Computer Interaction + Design (HCI+D) and the Community Data Science Collective.

The (extraordinary!) panelists were Jaz-Michael King (IFTAS), Christine Lemmer-Webber (Spritely Institute), and Bryan Newbold (Bluesky). The discussion ranged far and wide over some key background on decentralized and federated social media as well as some urgent challenges and opportunities in the space.

The recording of the session is up and you can watch it here (or in the frame below).

Thanks to the panelists, Madison Deyo, and the HCI+D team for making this happen!

January 3, 2024

New year, new job with us? CDSC is hiring!

Do you care about community, design, computing, and research? We are looking for a person to grow the public impact of the Community Data Science Collective (CDSC) and Northwestern University Center for Human Computer Interaction + Design (HCI+D). We are hiring a full-time Program Coordinator to work in both groups. This person will focus on outreach, communications, research community development, strategic event planning, and administration for both the CDSC and HCI+D.

Although a portion of the work may be done remotely, attendance for in-person meetings and workshops is required and the position is located in Evanston on the Northwestern University campus. The average salary for similar positions at Northwestern is around $55,000 per year and includes excellent benefits (compensation details for this position can only be determined by Northwestern HR in the hiring process). We’re looking for a minimum two-year commitment.

Duties

These fall into four categories, with specific examples in each listed below:

Outreach & communications
- Manage social media posting (LinkedIn, Mastodon, X, WordPress etc.)
- Post events to listservs and websites
- Advertise events such as the Collective’s “Science of Community” series and the Center’s “Thought Leader Dialogues”
- Build contact-lists around specific events and topics
- Share messages with internal and external audiences
Research community development
- Recruit participants to community events
- Organize group retreats (3-4 per year total)
- Engage with community members of both the Collective and Center
Strategic event planning
- Develop and execute a strategic event plan for in-person/virtual events
- Collaborate with Collective and Center members to plan and recruit speakers for events
Administration
- Schedule and plan research meetings
- Track and report on collective and center achievements
- Draft annual research and donor reports
- Document processes and initiatives

Core competencies:

Ability to use and learn web content management tools, such as WordPress and wikis.
General organization
Communication (be clear, be concise)
Meeting facilitation
Managing upwards
Small/medium scale (20-50 people) event planning
Creative thinking and problem solving

Qualifications

Candidates must hold at least a bachelor’s degree. Familiarity with event planning, community management, project management, and/or scientific research is a plus, as is prior experience in the social or computer sciences, research organizations, online communities, and/or public interest technology and advocacy projects of any kind.

About Northwestern’s Center for HCI+Design and the Community Data Science Collective

The Community Data Science Collective is an interdisciplinary research group made up of faculty, students, and affiliates mainly at the University of Washington Department of Communication, the Northwestern University Department of Communication Studies, the Carleton College Computer Science Department, and the Purdue University School of Communication. To learn more about the Community Data Science Collective, you should check out our wiki, blog, and recent publications.

Northwestern’s Center for Human Computer Interaction + Design is an interdisciplinary research center that brings together researchers and practitioners from across the University to study, design, and develop the future of human and computer interaction at home, work, and play in the pursuit of new interaction paradigms to support a collaborative, sustainable, and equitable society.

Contact

Please contact Aaron Shaw with questions. Both the CDSC and the Center for HCI+D are committed to creating diverse, inclusive, equitable, and accessible environments and we look forward to working with someone who shares these values.

Ready to apply?

Please apply via the Northwestern University job posting (and note that the job ID is 49284). We will begin reviewing applications immediately (continuing on a rolling basis until the position is filled).

(revised to fix a broken link)

April 18, 2023

Excavating online futures past

Cover of Kevin Driscoll's book, The Modem World.

The International Journal of Communication (IJOC) has just published my review of Kevin Driscoll’s The Modem World: A Prehistory of Social Media (Yale UP, 2022).

In The Modem World, Driscoll provides an engaging social history of Bulletin Board Systems (BBSes), an early, dial-up precursor to social media that predated the World Wide Web. You might have heard of the most famous BBSes (likely Stewart Brand’s Whole Earth ‘Lectronic Link, or the WELL) but, as Driscoll elaborates, there were many others. Indeed, thousands of decentralized, autonomous virtual communities thrived around the world in the decades before the Internet became accessible to the general public. Through Driscoll’s eyes, these communities offer a glimpse of a bygone sociotechnical era that prefigured and shaped our own in numerous ways. The “modem world” also suggests some paths beyond our current moment of disenchantment with the venture-funded, surveillance capitalist, billionaire-backed platforms that dominate social media today.

The book, like everything of Driscoll’s that I’ve ever read, is both enjoyable and informative and I recommend it for a number of reasons. I also (more selfishly) recommend the book review, which was fun to write and is just a few pages long. I got helpful feedback along the way from Yibin Fan, Kaylea Champion, and Hannah Cutts.

Because IJOC is an open access journal that publishes under a CC-BY-NC-ND license, you can read the review without paywalls, proxies, piracy, etc. Please feel free to send along any comments or feedback! For example, at least one person (who I won’t name here) thinks I should have emphasized the importance of porn in Driscoll’s account more heavily! While porn was definitely an important part of the BBS universe, I didn’t think it was such a central component of The Modem World. Ymmv?

Shaw, A. (2023). Kevin Driscoll, The Modem World: A Prehistory of Social Media. International Journal Of Communication, 17, 4. Retrieved from https://ijoc.org/index.php/ijoc/article/view/21215/4162

February 8, 2023

How to cite Wikipedia (better)

Two participants of the “Rally to Restore Sanity and/or Fear” in Washington D.C. (USA), holding signs saying “Wikipedia is a valid source” and “citation needed.” October 30, 2010. Kat Walsh (User:Mindspillage), CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

Wikipedia provides the best and most accessible single source of information on the largest number of topics in the largest number of languages. If you’re anything like me, you use it all the time. If you (also like me) use Wikipedia to inform your research, teaching, or other sorts of projects that result in shared, public, or even published work, you may also want to cite Wikipedia. I wrote a short tutorial to help people do that more accurately and effectively.

The days when teachers and professors banned students from citing Wikipedia are perhaps not entirely behind us, but do you know what to do if you find yourself in a situation where it is socially/professionally acceptable to cite Wikipedia (such as one of my classes!) and you want to do so in a responsible, durable way?

More specifically, what can you do about the fact that any Wikipedia page you cite can and probably will change? How do you provide a useful citation to a dynamic web resource that is continuously in flux?

This question has come up frequently enough in my classes over the years that I drafted a short tutorial on doing better Wikipedia citations for my students back in 2020. It’s been through a few revisions since then and I don’t find it completely embarrassing, so I am blogging about it now in the hopes that others might find it useful and share it more widely. Also, since it’s on my research group’s wiki, you (and anyone you know) can even make further revisions or chat about it with me on my user:talk page.
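To make the problem concrete: one widely used way to pin down a moving target like this is to cite a specific revision of a page via its permanent link. I offer that here as an illustration, not as a summary of the tutorial. Below is a minimal Python sketch. The function name `permalink` is hypothetical and the revision ID in the example is a placeholder, not a real revision; the `index.php?title=...&oldid=...` pattern is the standard MediaWiki form for permanent links.

```python
from urllib.parse import urlencode


def permalink(title, revision_id, lang="en"):
    """Build a URL that points at one fixed revision of a Wikipedia page."""
    # MediaWiki page titles use underscores in place of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": revision_id})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"


# The revision ID below is a placeholder, not a real revision.
print(permalink("Inclusive design", 123456789))
# prints https://en.wikipedia.org/w/index.php?title=Inclusive_design&oldid=123456789
```

A link like this keeps pointing at the text as it stood when you read it, even after the live article changes.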

You might be thinking, "so wait, does this mean I can cite Wikipedia for anything"??? To which I would respond "Just hold on there, cowboy."

Wikipedia is, like any other information source, only as good as the evidence behind it. In that regard, nothing about my recommendations here makes any of the information on Wikipedia any more reliable than it was before. You have to use other skills and resources to assess the quality of the information you’re citing on Wikipedia (e.g., the content/quality of the references used to support the claims made in any given article).

Like I said above, the problem this really tries to solve is more about how to best cite something on Wikipedia, given that you have some good reason to cite it in the first place.

December 6, 2022

CDSC students, courses, & (award winning!) instructors featured by Wiki Education

Wiki Education (a.k.a., WikiEdu) is an independent non-profit organization that promotes the integration of Wikipedia into education and classrooms. In pursuit of this mission, WikiEdu has created incredible resources for students and instructors, including tools that facilitate classroom assignments where students create and improve Wikipedia articles.

At both Northwestern and the University of Washington, CDSC faculty and students have offered courses with Wikipedia assignments for over a decade. In the past two weeks, WikiEdu has featured the most recent instances of these courses on their blog.

The first WikiEdu post celebrated the work of a team of Northwestern students that included Carl Colglazier (TSB and CDSC Ph.D. student) and Hannah Yang (undergraduate Communication Studies major and former CDSC research assistant). The team, all members of the Online Communities & Crowds course I taught with CDSC Ph.D. student Sohyeon Hwang in Winter 2022, overhauled an article on Inclusive design in English Wikipedia. Since the article’s initial publication back in March, other Wikipedia editors have improved it further and it has attracted over 10,000 pageviews. Amazing work, team!

The second post celebrates UW Communication doctoral student Kaylea Champion, recipient of an Outstanding Teaching Award from the Communication Department on the strength of her work in another Winter 2022 undergraduate course on Online Communities (also taught by Benjamin Mako Hill) that features a Wikipedia assignment. Several of Kaylea’s students thought so highly of her work in the course that they collaborated in nominating her for the award. Kaylea enjoyed the experience enough that she’s about to offer the course again as the lead instructor at UW this upcoming Winter term. I should also note that Kaylea has been nominated for a university-wide award, but we won’t know the outcome of that process for a while yet. Congratulations, Kaylea!

The public recognition of CDSC students and teaching is gratifying and provides a great reminder of why assignments that ask students to edit Wikipedia are so valuable in the first place. Most fundamentally, editing Wikipedia engages students in the production of public, open access knowledge resources that serve a much greater and broader purpose than your typical term paper, pop quiz, or exam. When students develop encyclopedic materials on topics of their interest, motivated undergraduates like Hannah Yang can directly connect coursework with practical, real-world concerns in ways that build on the expertise of graduate students like Carl Colglazier. This kind of school work creates unusually high impact products. Kaylea Champion puts the idea eloquently in that WikiEdu post: “Instead of locking away my synthesis efforts in a paper no one but my instructors would read, the Wikipedia assignment pushed me to address the public.”

Just think, how many people ever read a word of most college (or high school or graduate school) term papers? By contrast, the Wikipedia articles created by our students have routinely been viewed over 100,000 times in aggregate by the end of the term in which we offer the course. Extrapolate this out over a decade and our students’ work has likely been read millions of times by now. As with other content on Wikipedia, this work will shape public discourse, including judicial decisions, scientific research, search engine results, and more. There’s absolutely nothing academic about that!

November 16, 2021

Fool’s gold? The perils of using online survey samples to study online behavior

When it comes to research about participation in social media, sampling and bias are topics that often get ignored or politely buried in the "limitations" sections of papers. This is even true in survey research using samples recruited through idiosyncratic sites like Amazon’s Mechanical Turk. Together with Eszter Hargittai, I (Aaron) have a new paper (pdf) out in the International Journal of Communication (IJOC) that illustrates why ignoring sampling and bias in online survey research about online participation can be a particularly bad idea.

Surveys remain a workhorse method of social science, policy, and market research. But high-quality survey research that produces generalizable insights into big (e.g., national) populations is expensive, time-consuming, and difficult. Online surveys conducted through sites like Amazon Mechanical Turk (AMT), Qualtrics, and others offer a popular alternative for researchers looking to reduce the costs and increase the speed of their work. Some people even go so far as to claim that AMT has "ushered in a golden age in survey research" (and focus their critical energies on other important issues with AMT, like research ethics!).

Despite the hype, the quality of the online samples recruited through AMT and other sites often remains poorly or incompletely documented. Sampling bias online is especially important for research that studies online behaviors, such as social media use. Even with complex survey weighting schemes and sophisticated techniques like multilevel regression with post-stratification (MRP), surveys gathered online may incorporate subtle sources of bias because the people who complete the surveys online are also more likely to engage in other kinds of activities online.
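A small worked example shows why weighting alone may not rescue such a sample. Every number below is invented for illustration and none comes from the paper. The sketch uses simple post-stratification on a single variable (age), not full MRP, and assumes that people who take online surveys are more likely to be "very online" within every age group.

```python
# All numbers below are invented for illustration; none come from the paper.
POP_AGE_SHARE = {"young": 0.4, "old": 0.6}           # age mix in the population
USE_RATE = {"very_online": 0.9, "less_online": 0.4}  # social media use by type

# Share of "very online" people within each age group.
POP_VERY_ONLINE = {"young": 0.5, "old": 0.2}     # general population
SAMPLE_VERY_ONLINE = {"young": 0.8, "old": 0.6}  # online survey sample
SAMPLE_AGE_SHARE = {"young": 0.7, "old": 0.3}    # online sample skews young


def cell_rate(very_online_share):
    """Social media use rate in a group with the given 'very online' share."""
    return (very_online_share * USE_RATE["very_online"]
            + (1 - very_online_share) * USE_RATE["less_online"])


def weighted_rate(age_share, very_online_by_age):
    """Average the per-age use rates using the given age shares as weights."""
    return sum(age_share[a] * cell_rate(very_online_by_age[a]) for a in age_share)


truth = weighted_rate(POP_AGE_SHARE, POP_VERY_ONLINE)
raw = weighted_rate(SAMPLE_AGE_SHARE, SAMPLE_VERY_ONLINE)
# Post-stratify on age: sample cell rates, reweighted to population age shares.
poststratified = weighted_rate(POP_AGE_SHARE, SAMPLE_VERY_ONLINE)

print(round(truth, 2), round(raw, 2), round(poststratified, 2))
# prints 0.56 0.77 0.74
```

Reweighting by age fixes the age mix but moves the estimate only from 0.77 to 0.74, still far from the true 0.56, because the remaining bias sits inside each age group, where the weights cannot reach it.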

Surprisingly little research has investigated these concerns directly. Eszter and I do so by using a survey instrument administered concurrently on AMT and a national sample of U.S. adults recruited through NORC at the University of Chicago (note that we published another paper in Socius using parts of the same dataset last year). The results suggest that AMT survey respondents are significantly more likely to use numerous social media, from Twitter to Pinterest and Reddit, as well as have significantly more experiences contributing their own online content, from posting videos to participating in various online forums and signing online petitions.

Such findings may not be shocking, but prevalent research practices often overlook the implications: you cannot rely on a sample recruited from an online platform like AMT to map directly to a general population when it comes to online behaviors. Whether AMT has created a survey research "golden age" or not, analysis conducted on a biased sample produces results that are less valuable than they seem.

January 29, 2021

CDSC is hiring research assistants

The Northwestern University branch of the Community Data Science Collective (CDSC) is hiring research assistants. CDSC is an interdisciplinary research group made up of faculty and students at multiple institutions, including Northwestern University, Purdue University, and the University of Washington. We’re social and computer scientists studying online communities such as Wikipedia, Reddit, Scratch, and more.

A screenshot from a recent remote meeting of the CDSC…

Recent work by the group includes studies of participation inequalities in online communities and the gig economy, comparisons of different online community rules and norms, and evaluations of design changes deployed across thousands of sites. More examples and information can be found on our list of publications and our research blog (you’re probably reading our blog right now).

This posting is specifically to work on some projects through the Northwestern University part of the CDSC. Northwestern Research Assistants will contribute to data collection, analysis, documentation, and administration on one (or more) of the group’s ongoing projects. Some research projects you might help with include:

A study of rules across the five largest language editions of Wikipedia.
A systematic literature review on the gig economy.
Interviews with contributors to small, niche subreddit communities.
A large-scale analysis of the relationships between communities.

Successful applicants will have an interest in online communities, social science or social computing research, and the ability to balance collaborative and independent work. No specialized skills are required and we will adapt work assignments and training to the skills and interests of the person(s) hired. Relevant skills might include: coursework, research, and/or familiarity with digital media, online communities, human computer interaction, social science research methods such as interviewing, applied statistics, and/or data science. Relevant software experience might include: R, Python, Git, Zotero, or LaTeX. Again, no prior experience or specialized skills are required.

Expected minimum time commitment is 10 hours per week through the remainder of the Winter quarter (late March) with the possibility of working additional hours and/or continuing into the Spring quarter (April-June). All work will be performed remotely.

Interested applicants should submit a resume (or CV) along with a short cover letter explaining their interest in the position and any relevant experience or skills. Applicants should indicate whether they would prefer to pursue this through Federal work-study, for course credit (most likely available only to current students at one of the institutions where CDSC affiliates work), or as a paid position (not Federal work-study). For paid positions, compensation will be $15 per hour. Some funding may be restricted to current undergraduate students (at any institution), which may impact hiring decisions.

Questions and/or applications should be sent to Professor Aaron Shaw. Work-study eligible Northwestern University students should indicate this in their cover letter. Applications will be reviewed by Professor Shaw and current CDSC-NU team members on a rolling basis and finalists will be contacted for an interview.

The CDSC strives to be an inclusive and accessible research community. We particularly welcome applications from members of groups historically underrepresented in computing and/or data sciences. Some of these positions are funded through a U.S. National Science Foundation Research Experience for Undergraduates (REU) supplement to award numbers IIS-1910202 and IIS-1617468.