Graph of subscribers and moderators over time in /r/NoSleep. The image is taken from our 2016 CHI paper.
Last year at CHI 2016, we published a qualitative study examining the effects of a large influx of newcomers to the /r/nosleep online community on Reddit. Our study began with the observation that most research on sustained waves of newcomers focuses on the destructive effect of newcomers and frequently invokes Usenet’s infamous “Eternal September.” Our qualitative study argued that the /r/nosleep community managed its surge of newcomers gracefully through strategic preparation by moderators, technological systems to rein in norm violations, and a shared sense of protecting the community’s immersive environment among participants.
We are thrilled that, less than a year after the publication of our study, Zhiyuan “Jerry” Lin and a group of researchers at Stanford have published a quantitative test of our study’s findings! Lin analyzed 45 million comments and upvote patterns from 10 Reddit communities that experienced a massive inundation of newcomers like the one we studied on /r/nosleep. Lin’s group found that these communities retained their quality despite a slight dip during their initial growth period.
Our team discussed doing a quantitative study like Lin’s at some length, and our paper ends with a lament that our findings merely reflected “propositions for testing in future work.” Lin’s study provides exactly such a test! Lin et al.’s results suggest that our qualitative findings generalize and that a sustained influx of newcomers need not doom a community to a descent into an “Eternal September.” Through strong moderation and the use of a voting system, the subreddits analyzed by Lin appear to retain their identities despite the surge of new users.
There are always limits to research projects, both quantitative and qualitative. We think Lin’s paper complements ours beautifully, we are excited that Lin built on our work, and we’re thrilled that our propositions seem to have held up!
You may have heard of Change.org. It’s a popular online petitioning platform. You may have even noticed that there can be many online petitions about popular topics. For instance, it is easy to find dozens of petitions protesting the Lychee and Dog Meat Festival with varying levels of support.
Imagine you want to start an online petition. You might worry if your petition is very similar to other people’s petitions that already have signatures. These other petitions have a head start and will get all the attention. That said, if nobody has made any similar petitions, maybe that’s because the issue you are petitioning about doesn’t yet have a lot of popular support. You might also worry if your petition is unusual. Which of these two worries (making a duplicate petition and making a petition no one cares about) should concern you, dear petition creator? In my research, I set out to answer this question. The project is still in progress. I recently presented it as a poster at CSCW ’17.
Sociologists of organizational ecology considered similar questions about businesses and social movement organizations. They wanted to explain why organizations were more likely to die when an industry was young or old, but less likely to die in between. They argued that density, or the number of organizations in the population, was tied both to processes of legitimation and competition. There aren’t many firms in unproven industries because it’s not clear the industry will succeed, but when an industry is mature it becomes competitive. Everybody wants a piece of the pie, but you might not get enough pie to survive! This notion is called density dependence theory.
I think it is intuitive to apply this logic to online petitions and topics. If you make a petition about a low-density topic, chances for success should be lower because the petition is more likely to be unusual or illegitimate. However, if you make a petition in a high-density topic, now you have to worry about competition with all the other petitions in the topic. You want your petition to be original, but not weird!
To collect data to test this theory, I downloaded a large set of petitions from Change.org, spam filtered them, and removed very short ones. Next I used LDA topic modeling to group petitions into topics. This makes it possible to assign petitions to points in a topic space. The more crowded this part of topic space, the denser the petition’s environment.
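To make the idea of density in topic space concrete, here is a minimal sketch. It assumes each petition is represented by a vector of topic proportions (the kind of output LDA produces) and it measures density as the number of other petitions within a fixed Euclidean distance. The function name `topic_density`, the `radius` value, and the toy vectors are all hypothetical illustrations, not the measure or data used in the project.

```python
import math


def topic_density(topic_vectors, radius=0.25):
    """For each petition, count the other petitions within `radius` of it.

    `topic_vectors` is a list of equal-length tuples of topic proportions.
    The radius-based count is an assumed stand-in for "density".
    """
    densities = []
    for i, v in enumerate(topic_vectors):
        neighbors = sum(
            1
            for j, w in enumerate(topic_vectors)
            if i != j and math.dist(v, w) <= radius
        )
        densities.append(neighbors)
    return densities


# Toy data: three petitions concentrated on the first topic, one on the third.
petitions = [
    (0.90, 0.05, 0.05),
    (0.85, 0.10, 0.05),
    (0.80, 0.10, 0.10),
    (0.05, 0.05, 0.90),
]

print(topic_density(petitions))  # [2, 2, 2, 0]
```

The three clustered petitions each have two close neighbors, while the lone petition about the third topic sits in an empty part of the space.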
Finally, I used a regression model to predict petition signature counts. Since density dependence theory predicts that the relationship between density and signature count is shaped like an upside-down U, I included a quadratic term for density. The plot below shows that the observed relationship between density in topic space and signature count is what the theory predicted. The darkness of the lines at the bottom of the plot shows that most petitions are in less dense parts of topic space. So you, dear petition creator, should worry about competition and legitimacy, but worry about legitimacy first!
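For readers who want to see what a quadratic term buys you, here is a rough sketch of fitting y = b0 + b1·x + b2·x² by ordinary least squares in plain Python. The data are synthetic and generated from a known upside-down U, and ordinary least squares is an assumption made for illustration; the post does not say which regression family the project used. A negative b2 indicates the inverted-U shape, and the curve peaks at -b1 / (2·b2).

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Power sums of x (up to x**4) and cross sums with y.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix for the normal equations.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


# Synthetic example (not the Change.org data): outcome peaks at density 0.5.
density = [0.0, 0.25, 0.5, 0.75, 1.0]
outcome = [1 + 4 * d - 4 * d ** 2 for d in density]

b0, b1, b2 = fit_quadratic(density, outcome)
peak = -b1 / (2 * b2)  # density at which the fitted curve is highest
print(b2 < 0, round(peak, 3))
```

With real signature counts one would more likely reach for a statistics package and a model suited to count data, but the role of the squared term is the same.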
I’m excited by this result because it shows interesting similarities between efforts to organize coordinated activism online and traditional organizations like firms. I’m planning to apply this method to other forms of online coordination like wikis and online communities.
This blog post and the work it describes are a collaborative project between Nate TeBlunthuis, Benjamin Mako Hill and Aaron Shaw. We are still at work writing this project up as a research article. The work has been supported by the US National Science Foundation.
When major league baseball held its opening day on April 15 in 1947, a 28-year-old infielder made his highly-anticipated debut at first base for the Brooklyn Dodgers. He would go on to record an extraordinary season and career worthy of induction into the Baseball Hall of Fame, winning Rookie of the Year honors in 1947, a batting title and Most Valuable Player award in 1949, and a World Series title in 1955. He also produced two seasons that rank among the top 100 ever (by the metric of Wins-Above-Replacement among position players).
Jackie Robinson (1954 public domain photo by Bob Sandberg for Look Magazine).
Looking at the box score, Jackie Robinson didn’t make an overwhelming impact on the outcome of his first game, but his presence on the field challenged the racist status quo of professional baseball and American society. What’s more, the intense public-ness of the challenge made Robinson’s presence a symbol and a spectacle: of the roughly 26,500 spectators in attendance at Ebbets Field, an estimated 14,000 were black. I cannot imagine what it was like to be at that game, one of those rare places and moments where it becomes possible to see an historic social transformation as it unfolds. Just the thought gives me goosebumps.
Every major league player, coach, and umpire will don Robinson’s iconic number 42 in recognition today. Watching games and highlights from Jackie Robinson Days past, I’ve been troubled by how easily such observances drift into a hagiographic reverie that sometimes even takes on a self-congratulatory tone. Stories of Robinson’s incredible athletic and personal accomplishments sometimes efface his struggle against horrible, violent, and aggressive responses. Worse yet, the stories usually play down the persistence of racism and its effects today. Baseball celebrates Jackie Robinson Day out of a strange combination of guilt and pride; knowledge and ignorance; resistance and complicity.
As I indicated earlier, Robinson’s performance and impact qualified him for the Hall of Fame along multiple dimensions. However, another way to think about his unique contribution to baseball is to consider how such virulent racism likely affected his play and how unbelievably, mind-blowingly great a player he might have been under less racist conditions.
There’s no obviously valid way to construct a counterfactual Jackie Robinson, but research on the phenomenon of stereotype threat suggests a very simple, naive statistical adjustment strategy. To paraphrase a bunch of scholarly studies and the (pretty extensive) Wikipedia article, stereotype threat reduces the performance of individuals who belong to negatively stereotyped groups, largely by inducing feelings of anxiety.
Stereotype threat affects various kinds of behaviors including athletic achievement. A 1999 study by Jeff Stone and colleagues (pdf) estimates the effects of some typical forms of stereotype threat on a sample of black men’s athletic performance, reporting that race-based priming resulted in a 23.5% worse outcome on a miniature golf (!) task than a control condition with no priming.
Consider that the priming in this Stone et al. study was done in a fairly polite, impersonal, non-hateful, non-threatening way in relation to a mini-golf task with absolutely nothing at stake. Consider just how personal, vitriolic, and violent the responses to Jackie Robinson were — many of them coming directly from opposing players and “fans” who went to great pains to heckle him in the middle of at-bats, physically target him with violent slides and more on the field, or issue death threats to him and his family. Consider how much Robinson had at stake and just how public his successes and his failures would have been.
Some people may like to imagine (and filmmakers may like to depict) that the hatred helped to motivate and focus Robinson, spurring him to even greater performance. Similarly, part of the mystique of the greatest athletes is that they seem to empty their heads of all the noise and distractions that would debilitate the rest of us at precisely those moments when the stakes and pressures are highest. It’s easy to say that Robinson didn’t respond to the pressure in the same way as most humans would, but the research on stereotype threat suggests that it probably affected him on the field anyway. Just being reminded — even in very subtle, socially-coded ways — that you belong to a socially excluded group reduced athletic performance by nearly a quarter. The sort of cognitive burden that comes along with being singled out and targeted by the kind of racial hatred that Robinson experienced must be orders of magnitude greater. What sort of impact would this burden have had on Robinson’s play?
Now, go look at the stat lines again from those two spectacular seasons (1949, WAR 9.6, and 1951, WAR 9.7) that Robinson had and imagine them without the stress, the pain, the distraction of all that hate. Be a little bit generous and inflate the WAR statistics by the same 23.5% that Stone et al.’s subjects’ performance dropped in a laboratory study in ridiculously low-key conditions. Under these assumptions, Robinson’s two greatest seasons might have yielded WAR of 11.9 and 12.0 respectively, easily placing them both among the top 10 seasons by a position player ever.
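The arithmetic behind that naive adjustment is just a 23.5% multiplicative bump on each season's WAR, as in this short sketch. The variable names are made up for illustration, and the adjustment itself is the deliberately simple assumption described above, not an established sabermetric method.

```python
# Stereotype-threat effect reported by Stone et al., used as a naive inflation factor.
PENALTY = 0.235

# Robinson's two best seasons by WAR, as quoted in the post.
seasons = {1949: 9.6, 1951: 9.7}

# Inflate each season's WAR by 23.5% and round to one decimal place.
adjusted = {year: round(war * (1 + PENALTY), 1) for year, war in seasons.items()}

print(adjusted)  # {1949: 11.9, 1951: 12.0}
```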
This quarter, I am teaching a graduate seminar called “The Practice of Scholarship” that is required for second-year students in the Northwestern MTS and TSB programs. Following Mako’s lead, I am using the Community Data Science Collective wiki to host the (editable) syllabus. In other words, I am eating to my heart’s content.
We had our first class session yesterday and it went really well. The goal for the quarter is for every student to prepare a manuscript for submission to a peer reviewed venue. I told the students that the course will serve as a hybrid writing boot camp and extended group therapy session. There will be much workshopping and iteration and sharing of feelings. There will also be polite, friendly, and unyielding pressure to produce scholarly work of exceptional quality.
In keeping with the wikified ethos, much of the course schedule remains tbd at this point, so please drop me a line with comments, suggestions, or pointers to great readings that brilliant, interdisciplinary, empirical social scientists and HCI researchers like my students would appreciate.
Hello world! It’s been a while since I’ve done any blogging, but I’ve been wanting to return for some time now, so here we are. My old blog was a hodge podge that hovered at the edges of my research. Current events featured prominently, especially those having to do with governance in online communities, knowledge production and access, and research ideas. I have a few different goals for this blog.
A new day dawns for blogging on the shores of Lake Michigan…
First, since it’s part of the Community Data Science Collective site, I plan to talk about our research, affiliates, community events, and related topics. Second, I want to use the blog as a space to sketch out research ideas more regularly. When I blogged previously, I was a graduate student. I had more unstructured time in which to brainstorm and reflect. The transition to faculty and the subsequent accumulation of responsibilities, projects, students, and commitments has left me seeking time to think broadly and with less structure. I need a semi-structured space and time to do so. As a result, I return to blogging.
This relates to a third goal: a minimum of one post per week. In the old days, Mako coordinated the Cambridge instance of Iron Blogger, a group blogging accountability project in which all the participants agreed to write one post per week or pay $5 into a common pot (that we then used to throw a party whenever it got big enough). The incentives sound misaligned, but the semi-public commitment, a deadline, and the nominal material cost of failure got a weekly post out of me roughly 90% of the time.
There is no iron blogger group in Chicago (yet?), but I’m going to recreate the structure with a little public accountability infrastructure with some friends. So far, Rachel and I have committed to posting weekly and tracking our posts. If others want to join, we can add further infrastructure as needed. No fines for now, but if I fail to post frequently between now and the end of the academic year, I’ll revisit.
Finally, since I do a lot more mentoring and teaching now than I used to, I imagine that these activities will occupy a fair amount of my attention as well. I feel more comfortable publishing material about my teaching now than when I first started at Northwestern. I am also realizing that my approach to teaching would lend itself really well to blogging as I am continually tinkering with the structure of my assignments, readings, evaluations, and lessons. A space to reflect on my experiences more actively and to solicit feedback from students and others seems like a helpful thing.
That’s it for this opening post. Thanks for reading.