Mozilla Contributor Analysis Project (Joint MoCo & MoFo)

I’m back at the screen after a week of paternity leave, and I’ll be working part-time for the next two weeks while we settle into the new family routine at home.

In the meantime, I wanted to mention a Mozilla contributor analysis project in case people would like to get involved.

We have a wiki page now, which means it’s a real thing. And here are some words my sleep-deprived brain prepared for you earlier today:

The goal and scope of the work:

Explore existing contribution datasets to look for possible insights and metrics that would be useful to monitor on an ongoing basis, before the co-incident workweek in Portland at the beginning of December.

We will:

  • Stress-test our current capacity to use existing contribution data
  • Look for actionable insights to support Mozilla-wide community building efforts
  • Run ad-hoc analysis before building any ‘tools’
  • If useful, prototype tools that can be re-used for ongoing insights into community health
  • Build processes so that contributors can get involved in this metrics work
  • Document gaps in our existing data / knowledge
  • Document ideas for future analysis and exploration

Find out more about the project here.

I’m very excited that three members of the community have already offered to support the project and we’ve barely even started.

In the end, these numbers we’re looking at are about the community, and for the benefit of the community, so the more community involvement there is in this process, the better.

If you’re interested in data analysis, or know someone who is, send them the link.

This project is one of my priorities over the next 4-8 weeks. On that note, this looks quite appealing right now.

So I’m going to make more tea and eat more biscuits.


  1. If within the scope of the project, an interesting metric to keep track of would be volunteer vs. contractor vs. employee contributions on the main mozilla-central tree, as well as in other known key repositories.

    It would also be interesting to see the recurrence and retention rates of these groups, i.e. how common it is for a new contributor to stay around and commit more patches, and how often the various groups do so.

    1. That’s an excellent question, and one that is surprisingly tricky: employees often work from personal accounts and use separate email addresses for bugs, and in many cases contributors move between those three categories. I tried a version of this using membership of the GitHub organization as an indicator, which was close enough when counting the number of contributors. But when you look at the number of contributions per contributor, employees naturally have the luxury of being able to spend much more time contributing than the average volunteer. Josh Matthews has looked at this by cross-checking usernames against a staff list, which I think will be the most accurate approach at this point.

      We will definitely need to do some segmentation as we analyze this data going forward. It might be that we segment on activity level regardless of contracts (i.e. casual/active/core contributors).

      1. If you have access to the phonebook (the LDAP directory), most employees list their primary and work emails there, which, as you say, may not be the same.
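
To make the ideas in this thread a little more concrete, here is a minimal sketch of cross-checking contributor emails against a staff list, computing a simple retention rate per group, and segmenting by activity level. Everything in it is an assumption for illustration: the sample data, the `staff_emails` set, the function names and the activity thresholds are all hypothetical, and it ignores the contractor category and the problem of people using several addresses.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (contributor email, patch id). Not real Mozilla data.
contributions = [
    ("alice@example.org", "p1"),
    ("alice@example.org", "p2"),
    ("bob@example.com", "p3"),
    ("carol@staff.example", "p4"),
    ("carol@staff.example", "p5"),
    ("carol@staff.example", "p6"),
    ("dave@example.net", "p7"),
]

# Hypothetical staff list (lowercase), e.g. exported from a directory.
staff_emails = {"carol@staff.example"}


def group_of(email, staff):
    """Label an email as 'employee' if it is on the staff list, else 'volunteer'."""
    return "employee" if email.lower() in staff else "volunteer"


def retention_by_group(contributions, staff):
    """Share of contributors in each group who made more than one contribution."""
    counts = Counter(email.lower() for email, _ in contributions)
    total = defaultdict(int)
    returned = defaultdict(int)
    for email, n in counts.items():
        group = group_of(email, staff)
        total[group] += 1
        if n > 1:
            returned[group] += 1
    return {group: returned[group] / total[group] for group in total}


def activity_level(n, active_min=2, core_min=3):
    """Segment on contribution count alone; the thresholds are arbitrary placeholders."""
    if n >= core_min:
        return "core"
    if n >= active_min:
        return "active"
    return "casual"


if __name__ == "__main__":
    print(retention_by_group(contributions, staff_emails))
    counts = Counter(email for email, _ in contributions)
    print({email: activity_level(n) for email, n in counts.items()})
```

A real analysis would need to merge multiple addresses per person first, which is where a directory lookup like the one suggested above could help.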
