Fool’s gold? The perils of using online survey samples to study online behavior

The OG Mechanical Turk (public domain via Wikimedia Commons). It probably was not useful for unbiased survey research sampling either.

When it comes to research about participation in social media, sampling and bias are topics that often get ignored or politely buried in the "limitations" sections of papers. This is true even in survey research using samples recruited through idiosyncratic sites like Amazon's Mechanical Turk. Together with Eszter Hargittai, I (Aaron) have a new paper (pdf) out in the International Journal of Communication (IJOC) that illustrates why ignoring sampling and bias in online survey research about online participation can be a particularly bad idea.

Surveys remain a workhorse method of social science, policy, and market research. But high-quality survey research that produces generalizable insights into big (e.g., national) populations is expensive, time-consuming, and difficult. Online surveys conducted through sites like Amazon Mechanical Turk (AMT), Qualtrics, and others offer a popular alternative for researchers looking to reduce the costs and increase the speed of their work. Some people even go so far as to claim that AMT has "ushered in a golden age in survey research" (and focus their critical energies on other important issues with AMT, like research ethics!).

Despite the hype, the quality of the samples recruited through AMT and other sites often remains poorly or incompletely documented. Online sampling bias matters especially for research that studies online behaviors, such as social media use. Even with complex survey weighting schemes and sophisticated techniques like multilevel regression with post-stratification (MRP), surveys gathered online may retain subtle biases, because the people who complete surveys online are, almost by definition, also more likely to engage in other kinds of online activity.
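To see why demographic weighting alone cannot fix this, consider a minimal simulation (our own illustration, not an analysis from the paper; every number in it is invented). It post-stratifies a hypothetical online panel so its age distribution matches the population's, yet the weighted estimate of social media use stays inflated, because selection into the panel depends on a latent "online activity" trait that age weights cannot reach:

```python
# Hypothetical simulation: post-stratification on demographics cannot
# remove bias that is correlated with the recruitment channel itself.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

N = 100_000
age = rng.choice(["18-29", "30-49", "50+"], size=N, p=[0.20, 0.35, 0.45])
# Latent "online activity" trait: a bit higher among younger people,
# but varying within every demographic group.
activity = rng.normal(loc=(age == "18-29") * 0.5, scale=1.0)

# Outcome: whether someone uses a given social media platform,
# driven largely by the latent trait.
uses = rng.random(N) < 1 / (1 + np.exp(-(-0.5 + 1.0 * activity)))
pop = pd.DataFrame({"age": age, "uses": uses})

# Selection into the online panel also depends on the latent trait,
# so panelists are more active online *within every demographic cell*.
in_panel = rng.random(N) < 1 / (1 + np.exp(-(-3.0 + 1.5 * activity)))
panel = pop[in_panel]

# Post-stratification: reweight panelists so the panel's age
# distribution matches the population's.
w = panel["age"].map(
    pop["age"].value_counts(normalize=True)
    / panel["age"].value_counts(normalize=True)
)

print(f"population usage rate:       {pop['uses'].mean():.3f}")
print(f"raw panel estimate:          {panel['uses'].mean():.3f}")
print(f"age-weighted panel estimate: {np.average(panel['uses'], weights=w):.3f}")
# The weighted estimate stays biased upward: weights can only correct
# for variables they condition on, and the propensity to be online is
# rarely one of them.
```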

Surprisingly little research has investigated these concerns directly. Eszter and I do so using a survey instrument administered concurrently to an AMT sample and to a national sample of U.S. adults recruited through NORC at the University of Chicago (note that we published another paper in Socius using parts of the same dataset last year). The results suggest that AMT respondents are significantly more likely to use numerous social media platforms, from Twitter to Pinterest and Reddit, and have significantly more experience contributing their own online content, from posting videos to participating in various online forums and signing online petitions.
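For readers who want a concrete sense of what "significantly more likely" means operationally, here is a sketch of the kind of two-sample proportion comparison such results rest on. The counts below are placeholders we made up, not figures from the paper, and the statsmodels call is just one standard way to run such a test:

```python
# Hypothetical counts, NOT the paper's data: 420 of 700 AMT respondents
# vs. 310 of 1,000 national-sample respondents report using a platform.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

count = np.array([420, 310])   # respondents reporting use, per sample
nobs = np.array([700, 1000])   # total respondents, per sample

z, p = proportions_ztest(count, nobs)
print(f"AMT: {count[0] / nobs[0]:.1%}, national: {count[1] / nobs[1]:.1%}")
print(f"two-sample z = {z:.2f}, p = {p:.2g}")
```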

Such findings may not be shocking, but prevalent research practices often overlook the implications: you cannot rely on a sample recruited from an online platform like AMT to map directly to a general population when it comes to online behaviors. Whether AMT has created a survey research "golden age" or not, analysis conducted on a biased sample produces results that are less valuable than they seem.

