Community Data Science Workshops at UW

Photo from the Boston Python Workshop – a similar workshop run in Boston that has inspired and provided a template for the CDSW.

The Community Data Science Workshops are a series of project-based workshops being held at the University of Washington for anyone interested in learning how to use programming and data science tools to ask and answer questions about online communities like Wikipedia, Twitter, free and open source software, and civic media.

The workshops are for people with no previous programming experience. The goal is to bring together researchers and academics with participants and leaders in online communities. The workshops will all be free of charge. Participants from outside UW are encouraged to apply.

There will be three workshops held from 9am to 4pm on three Saturdays in April and May. Each session will begin with a period of lecture and technical demonstrations in the morning, followed by a lunch graciously provided by the eScience Institute at UW. The rest of the day will be devoted to group work on programming and data science projects, supported by more experienced mentors.

Introduction to Programming (April 5): Programming is an essential tool for data science and is useful for solving many other problems. The goal of this session will be to introduce programming in the Python programming language. Each participant will leave having solved a real problem and having built their first real program in their group. We will be relying on the curriculum from the Boston Python Workshops. Because we expect to hit the ground running, we will also run a session on the evening of Friday, April 4 to help participants get software installed.

Importing Data from Wikipedia and Twitter APIs (May 3): An important step in doing data science is collecting data. The goal of this session will be to teach participants how to get data from the public application programming interfaces (“APIs”) common to many social media platforms and online communities. Although we will use the APIs provided by Wikipedia and Twitter in the session, the principles and techniques apply to many other online communities.
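To give a rough flavor of what getting data from an API can look like, here is a minimal Python sketch. It is not the workshop curriculum. It builds a request URL for the public MediaWiki API on English Wikipedia and counts edits per user in a response. The helper names (`build_revisions_url`, `count_edits_by_user`) and the sample response are made up for illustration, and the sketch parses a canned response instead of contacting the network.

```python
import json
from collections import Counter
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_revisions_url(title, limit=50):
    """Build a URL asking for the most recent revisions of one article."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "user|timestamp",  # who edited, and when
        "rvlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def count_edits_by_user(response_text):
    """Count how many revisions each user made in a JSON API response."""
    data = json.loads(response_text)
    counts = Counter()
    for page in data["query"]["pages"].values():
        for revision in page.get("revisions", []):
            counts[revision.get("user", "(hidden)")] += 1
    return counts


# A made-up response in the shape the API returns, so the example
# runs without network access. The users and timestamps are invented.
SAMPLE_RESPONSE = json.dumps({
    "query": {
        "pages": {
            "12345": {
                "pageid": 12345,
                "title": "Example article",
                "revisions": [
                    {"user": "ExampleUserA", "timestamp": "2014-03-01T10:00:00Z"},
                    {"user": "ExampleUserB", "timestamp": "2014-03-01T09:30:00Z"},
                    {"user": "ExampleUserA", "timestamp": "2014-02-28T22:15:00Z"},
                ],
            }
        }
    }
})

url = build_revisions_url("Python (programming language)")
edit_counts = count_edits_by_user(SAMPLE_RESPONSE)
print(url)
print(edit_counts.most_common())
```

In practice, the text passed to `count_edits_by_user` would come from fetching `url` (for example with `urllib.request.urlopen`) instead of from the canned sample.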

Data Analysis and Visualization (May 31): The goal of data science is to use data to answer questions. In our final session, we will use the Python skills we learned in the first session and the datasets we created in the second to ask and answer common questions about the activity and health of online communities. We will focus on learning how to generate visualizations, create summary statistics, and test hypotheses.

Our goal is that, after the three workshops, participants will be able to use data to produce numbers, hypothesis tests, tables, and graphical visualizations to answer questions like:

  • Are new contributors to an article in Wikipedia sticking around longer or contributing more than people who joined last year?
  • Who are the most active or influential users of a particular Twitter hashtag?
  • Are people who participated in a Wikipedia outreach event staying involved? How do they compare to people who joined the project outside of the event?
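
As an illustrative sketch only, here is how the last question might be approached in Python using nothing but the standard library. The edit counts are invented, the function names are hypothetical, and a permutation test is just one of several reasonable ways to test the hypothesis. The workshop sessions may well use different tools.

```python
import random
import statistics

# Invented data: edits made in the following month by people who joined
# at an outreach event versus people who joined on their own.
event_joiners = [12, 3, 0, 7, 25, 1, 9, 4]
other_joiners = [2, 0, 1, 5, 0, 3, 1, 0]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def mean_difference(a, b):
    """Difference between the mean of group a and the mean of group b."""
    return mean(a) - mean(b)


def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Repeatedly shuffles the pooled values, splits them into two groups
    of the original sizes, and reports the share of shuffles whose mean
    difference is at least as large as the one actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean_difference(a, b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_diff = mean_difference(pooled[:len(a)], pooled[len(a):])
        if abs(shuffled_diff) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


observed_diff = mean_difference(event_joiners, other_joiners)
p_value = permutation_p_value(event_joiners, other_joiners)

print("Median edits (event):", statistics.median(event_joiners))
print("Median edits (other):", statistics.median(other_joiners))
print("Difference in means:", observed_diff)
print("Permutation p-value:", p_value)
```

A small p-value would suggest that a gap this large is unlikely to arise from random group assignment alone. With only eight made-up people per group, the result should not be read as evidence of anything.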

Our first session will be modeled after the Boston Python Workshops, but the curriculum of the later sessions is still in development and will be influenced by the needs of the participants.

Sign up and Participate!

Participants! If you are interested in learning data science, fill out our registration form here. The deadline to register is Wednesday, March 26th. We will let participants know if we have room for them by Saturday, March 29th. Space is limited and will depend on how many mentors we can recruit for the sessions.

Interested in being a mentor? If you already have experience with Python, please consider helping out at the sessions as a mentor. Being a mentor will involve working with participants and talking them through the challenges they encounter in programming. No special preparation is required. And we’ll feed you! Because we want to keep a very high mentor-to-student ratio, recruiting more mentors means we can accept more participants. If you’re interested, email makohill@uw.edu. Also, thank you, thank you, thank you!

About the Organizers

The workshops are being coordinated, organized, and led by Benjamin Mako Hill at the University of Washington Department of Communication and Jonathan Morgan at the Wikimedia Foundation. They have been designed with lots of help and inspiration from Shauna Gordon-McKeon and Asheesh Laroia of OpenHatch, and they draw heavily on the Boston Python Workshop.

These workshops are an all-volunteer effort. Fundamentally, we’re doing this because we’re programmers and data scientists who work in online communities, and we really believe that the skills you’ll learn in these sessions are important and empowering tools.

The workshops are being supported by the UW Department of Communication and the eScience Institute.

If you have any questions or concerns, contact Benjamin Mako Hill at makohill@uw.edu.
