What I see in these graphs of Github contributions

Context: Last week I shared a few graphs (1, 2, 3, 4) looking at data from our repositories on Github, extracted using this Gitribution app thing, as part of our work to dashboard contributor numbers for the Mozilla Foundation.

I didn’t comment on the graphs at the time because I wanted time for others to look at them without my opinions skewing what they might see. This follow-up post is a walk-through of some things I see in the graphs/data.

The real value in looking at data is finding ways to make things better by challenging ourselves, and being honest about what the numbers show, so this will be as much about questions as answers…

Also, publishing this last week flagged up some missing repositories and identified some other members of staff, so these graphs are based on the latest version of the data (there was no impact on shapes, but some numbers will be different).

What time of day do people contribute (UTC)?

By Hour of Day

Our paid staff who are committing code are mostly in US/Canadian timezones, and it makes sense that most of their commits happen during these hours (graphed by UTC). But what caught my attention here is that the volunteer contribution times follow the same shape.

Questions to ask:

  • Do volunteer contributions follow the same shape because contributing code has a dependency on being able to talk in real time with staff? For example in IRC. If so, is this a bottleneck for contributing code?
  • If not, what is creating this shape for volunteer contributors? Perhaps it’s biased to timezones where more people are interested in the things we are building, and potentially biased by language? But looking at support for Maker Party and other activities there is a global audience for our tools.
  • What does a code contribution pathway look like for people in the 0300-1300UTC times? Is there anything we can do to make things easier or more appealing?
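For anyone curious how a by-hour graph like this gets built from the raw data, here’s a rough sketch of the grouping step. It is not the actual Gitribution or Tableau logic, and the record shape is just an assumption for illustration.

    // Rough sketch: bucket contribution timestamps by UTC hour.
    // The record shape ({ createdAt, isStaff }) is illustrative, not Gitribution's actual schema.
    function contributionsByHour(records) {
      const buckets = {
        staff: new Array(24).fill(0),
        volunteer: new Array(24).fill(0),
      };
      for (const record of records) {
        const hour = new Date(record.createdAt).getUTCHours(); // 0-23, in UTC
        const group = record.isStaff ? 'staff' : 'volunteer';
        buckets[group][hour] += 1;
      }
      return buckets;
    }

    // Example: one staff commit at 16:05 UTC, one volunteer commit at 03:30 UTC.
    console.log(contributionsByHour([
      { createdAt: '2013-11-04T16:05:00Z', isStaff: true },
      { createdAt: '2013-11-04T03:30:00Z', isStaff: false },
    ]));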

The shape of volunteer contributions

Shape

The shape of this graph is pretty typical for any kind of volunteering or activity involving human interactions. It’s close to a power-law distribution with a long tail.

If you’ve not looked at a data set like this before, don’t panic that so many people only make a single contribution. At the same time, don’t use the knowledge that this is typical as an excuse not to ask questions about how we can do better.

Lots of people want to get involved in volunteering projects but often their good intentions don’t align with their actual available free time. I say this as someone who signs up for more things than fit into my available hours for personal projects.
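To make the ‘shape’ concrete: the graph comes from counting contributions per username and sorting those counts. A minimal sketch of that grouping (again, the record shape is an assumption, not Gitribution’s actual schema):

    // Rough sketch: contributions per volunteer contributor, sorted to show the long tail.
    // Records are assumed to look like { username, isStaff }.
    function contributionCounts(records) {
      const counts = new Map();
      for (const record of records) {
        if (record.isStaff) continue; // volunteers only
        counts.set(record.username, (counts.get(record.username) || 0) + 1);
      }
      // Sort descending: a few very active people first, then a long tail of single contributions.
      return [...counts.entries()].sort((a, b) => b[1] - a[1]);
    }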

The two questions I want to ask of this graph are:

  1. Where could our efforts to support contributors best influence the overall shape?
  2. What does this look like at 10x scale?

So, starting with where we could influence the shape… my opinion (no data here) says to think about the people in this range.

Shape Highlight

To the left of this highlighted area, people are already making code contributions over and above even many staff. Shower them in endless gratitude! But I don’t think they need practical help from us. To the right of this highlighted area is the natural long tail. Supporting that bigger group of people for single-touch interactions is about clear documentation and easy-to-follow processes.

The group of people roughly highlighted in that graph, though, are people we can reach out to. These people potentially have the capacity to do more. We should find out what they are interested in and what they want to get out of contributing, and build relationships with them. In practical terms, we have finite time to invest in direct relationships with contributors. I think this is an effective place to invest some of that time.

I think the second question is more challenging. What does this look like at 10x scale?

In 2013, ~50 people made a one-time contribution.

  • What do we need in place for 500 people to make a one-time code contribution?
  • Do we have 500 suitable ‘first’ bugs for 2014?
  • Is the amount of setup work required to contribute to our tools appropriate for people making a single contribution?
  • If not, is that a blocker to growing contributor numbers?

In 2013, there were ~1,500 code commits by volunteers.

  • What do we need in place for 15,000 activities on top of planned staff activity?
  • How does this much activity align with a common product roadmap?
  • How is it scheduled, allocated, reviewed and shipped?

When planning to work with 10x contributor numbers, possibly the biggest shift to consider is the ratio of staff to volunteers:

Contributor Ratio

  • How does this impact the time allocated for code reviews?
  • How do we write bugs?
  • How do we prioritize bugs? Etc.
  • Even, what does an IRC channel or a dev mailing list look like after this change?

Other questions to ask:

  • What do we think is the current ‘ceiling’ on our contributor numbers for people writing code?
    • Is it the number of developers who know about our tools and want to help? (i.e. a ‘marketing’ challenge to inspire more people)
    • Is it the amount of suitable work ready and available for people who want to help? (are we losing people who want to help because it’s too hard to get involved?)
    • Both? With any bias?

 What do you think?

I’m only one set of eyes on this, so please challenge my observations and feel free to build on this too.

Also, as the data in here is publicly accessible already I think I can publish this Tableau view as an interactive tool you can play with, but I need to check the terms first.

Contribution Graphs part 4: Contributions by Contributors over time

I’m posting a quick series of these without much comment on my part as I’d love to know what you see in each of them.

This is looking at activity in Github (commits and issues), for the repositories listed here. It’s an initial dive into the data, so don’t be afraid to ask questions of it, or request other cuts of this. In the not so distant future, we’ll be able to look at this kind of data across our combined contribution activities, so this is a bit of a taster.

Click for the full-size images.

Contributions by Contributors over time

Last but not least for today, I think there are some stories in this one…

Contributions by Contributors over Time

Is anything here a surprise? What do you see in this?

Contribution Graphs part 3: Distribution of contributions

I’m posting a quick series of these without much comment on my part as I’d love to know what you see in each of them.

This is looking at activity in Github (commits and issues), for the repositories listed here. It’s an initial dive into the data, so don’t be afraid to ask questions of it, or request other cuts of this. In the not so distant future, we’ll be able to look at this kind of data across our combined contribution activities, so this is a bit of a taster.

Click for the full-size images.

Distribution of contributions (excluding staff work)

Here are a couple of ways of visualizing this same data.

Distribution 2

Distribution 1

Is anything here a surprise? What do you see in this?

Contribution Graphs part 2: By hour of the day

I’m posting a quick series of these without much comment on my part as I’d love to know what you see in each of them.

This is looking at activity in Github (commits and issues), for the repositories listed here. It’s an initial dive into the data, so don’t be afraid to ask questions of it, or request other cuts of this. In the not so distant future, we’ll be able to look at this kind of data across our combined contribution activities, so this is a bit of a taster.

Click for the full-size images.

By hour of the day

By hour of the day

Is anything here a surprise? What do you see in this?

Contribution Graphs part 1: Contributions over time

I’m posting a quick series of these without much comment on my part as I’d love to know what you see in each of them.

This is looking at activity in Github (commits and issues), for the repositories listed here. It’s an initial dive into the data, so don’t be afraid to ask questions of it, or request other cuts of this. In the not so distant future, we’ll be able to look at this kind of data across our combined contribution activities, so this is a bit of a taster.

Click for the full-size images.

Contributions over time

1 combined Over time

Broken down by teams

2 By team

Broken down further by repository

3 By Repo

Is anything here a surprise? What do you see in this?

Is being a member of the mozilla ‘organization’ on github a good proxy indicator of being staff?

Following on from the post about Gitribution, these are my notes around my initial exploration of the data extracted from Github.

One of the challenges of counting volunteer contributors to Mozilla is working out who is a volunteer and who is paid-staff. The concept of a volunteer contributor in itself is full of complications, as paid staff will volunteer their free time on other projects they care about, and contributors become employees, or employees will work using their personal email addresses and so on. The fidelity of tracking that would be required to *perfectly* identify when someone does something on a ‘voluntary’ basis would not be proportionate to the impact this would have on the usefulness of the final reporting. So perfect tracking is not the goal here.

My first pass at filtering out staff from contributor counts on github was to look at whether someone is a member of the mozilla organization on github. I thought this would be a good proxy for ‘staff’, and doing this gave us this breakdown:

Without manually checking usernames, this is how the contribution counts are split between staff and contributors

However, in this non-staff contributor segment of the data, there are a few names I know are definitely staff, and as I don’t know all of Mozilla’s staff I assume others in here are staff too.

Some names here are definitely in the wrong buckets, with significant contribution numbers linked to them

So, it’s safe to say the inverse doesn’t hold. That is: not being a member of the org on github is not a good enough proxy to say someone is not a paid member of staff.

This is less critical when counting the number of people. For example this is the split of volunteers to staff using this github membership status as the proxy measure:

There might be ~10 people who technically need to move from the blue to the orange bar, but that’s not important if the aim is growing the blue bar 10x without much change to the orange bar.

But if we want to analyze contribution activity (we do!), I need to manually (with a little automation in Tableau) check these github accounts, and add those who are staff to an extra list within Gitribution to cross-check when saving the data:

These are the most significant accounts to check for people who are staff

Getting back to the original question… Is being a member of the mozilla ‘organization’ on github a good proxy indicator of being staff? 

The quick no-data-query-required test is to click through to a few profiles and look for examples of people who are not staff: https://github.com/orgs/mozilla/members. I found a few on the first page alone. But as stated earlier, it can also be hard to tell! Mozillians are a connected bunch who often work on other projects too. However, I found enough people in that list employed at other organizations to assume they are not all staff (though in some cases they used to be staff but are not now).

So to answer the question in its strictest sense, the answer is no. Being a member of the github organisation is not a certain indicator of being a paid member of staff.

But our context is more specific than this, so I need to refine the question: Is being a member of the mozilla ‘organization’ on github a good proxy indicator of being staff with regards to people actively contributing to Foundation projects on Github?

For this we go back to the data to check the most significant buckets of activity…

These are the priority accounts to manually check as they could skew the overall stats

I can manually check this list of usernames making up the biggest chunks of contribution activity from those marked as ‘staff’.

There are a couple of people in here who are not current staff (and some former staff with fewer than 100 activities), but this would not skew the data enough to justify maintaining yet another list of exceptions. There is also a further ‘grey area’ in the overlap between Mozilla contribution and CDOT-supported/funded contribution to Mozilla.

I think for now at least, I will leave this list as it is, and say that the check against membership of the github organization is a meaningful filter, but we also need to maintain an extra list of ‘further people who are staff’.
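To sketch what that combined check could look like in node: the public-members endpoint below is the real Github API, but the function and the extra-staff list are illustrative rather than Gitribution’s actual code.

    const https = require('https');

    // Usernames we know are staff even though they are not (public) members of the
    // github org. This list is illustrative, not the real Gitribution exception list.
    const extraStaff = ['example-staff-user'];

    // Github's public membership endpoint returns 204 if the user is a public member
    // of the org, and 404 otherwise.
    function isStaff(username, callback) {
      if (extraStaff.indexOf(username) !== -1) return callback(null, true);
      const options = {
        hostname: 'api.github.com',
        path: '/orgs/mozilla/public_members/' + username,
        method: 'GET',
        headers: { 'User-Agent': 'gitribution-sketch' },
      };
      https.request(options, function (res) {
        res.resume(); // drain the response body
        callback(null, res.statusCode === 204);
      }).on('error', callback).end();
    }

    // Usage: isStaff('some-username', function (err, staff) { console.log(staff); });

Note this only sees *public* org membership, which is part of why the extra list is needed at all.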

So, I made these amendments to Gitribution, rebuilt the database, and ran the queries again, which gets us to here:

Comparing contributor numbers of staff to volunteers has barely changed, but the contribution activity is significantly different and will make our next analysis phase more accurate.

With the data in reasonable shape, we can do some more interesting analysis, which we’ll save for another post.

Gitribution

Click to embiggen. This was a check to see how well being a member of the github organisation flags someone as being staff.

Over the last week or so I’ve been building a thing: Gitribution. It’s an attempt to understand contributions to Mozilla Foundation work that happen on Github. It’s not perfect yet, but it’s in a state to get feedback on now.

Why did I build this?

For these reasons (in this order):

  1. Counting: To extract contributor counts from Github across Foundation projects on an automated, ongoing basis
  2. Testing: To demo the API format we need for other sources of data to power our interim contributor dashboard
  3. Learning: To learn a bit about node.js so I can support metrics work on other projects more directly when it’s helpful (i.e. submitting pull requests rather than just opening bugs)

1. Counting

The data in this tool is all public data from the Github API, but it’s been restructured so it can be queried in ways that answer questions specific to our goals this year, and has some additional categorization of repositories so we can query against individual teams. The Github API on its own couldn’t answer our questions directly.

This also gives me data in a format that can be explored visually in Tableau (I’ll share this in a follow up blog post). We can now count Github contributors, and also analyze contributions.
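For illustration, that extra categorization is essentially a mapping from repository to team, something along these lines (the team names are the ones used later in this post; the repository names are made-up placeholders, not the actual list Gitribution uses):

    // Illustrative repo-to-team mapping; repository names here are placeholders.
    const repoTeams = {
      'some-webmaker-repo': 'webmaker',
      'some-badges-repo': 'openbadge',
      'some-journalism-repo': 'openNews',
    };

    function teamForRepo(repoName) {
      return repoTeams[repoName] || 'uncategorized';
    }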

2. Testing

Our interim dashboard plans include a standard format for reporting on the numbers of active and new contributors for a given activity. Building this tool was a way to test whether that format makes sense. The output is an API that you can ping with a date and see two numbers (a rough sketch of the counting logic follows this list):

  1. The number of unique usernames that contributed in the 12 months prior (excluding users who are members of the Github organization that owns the repositories – i.e. Mozilla or openNews)
  2. The number of those who only contributed in the 7 days prior (i.e. new contributors)
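Roughly, the counting behind those two numbers works like this. It’s a sketch only; the record shape and function name are assumptions, not the actual Gitribution implementation.

    // Rough sketch: given a date, count active contributors (any activity in the
    // previous 12 months) and new contributors (all of their activity falls in the
    // previous 7 days). Records are assumed to look like { username, createdAt, isStaff }.
    function contributorCounts(records, asOf) {
      const end = new Date(asOf);
      const yearAgo = new Date(end); yearAgo.setUTCFullYear(end.getUTCFullYear() - 1);
      const weekAgo = new Date(end); weekAgo.setUTCDate(end.getUTCDate() - 7);

      const firstSeen = new Map(); // username -> earliest activity date
      const activeInYear = new Set();

      for (const r of records) {
        if (r.isStaff) continue; // exclude org members / known staff
        const when = new Date(r.createdAt);
        if (when > end) continue;
        if (!firstSeen.has(r.username) || when < firstSeen.get(r.username)) {
          firstSeen.set(r.username, when);
        }
        if (when >= yearAgo) activeInYear.add(r.username);
      }

      let newContributors = 0;
      for (const username of activeInYear) {
        // If their earliest activity is within the last 7 days, everything they
        // have done is within the last 7 days, i.e. they are new.
        if (firstSeen.get(username) >= weekAgo) newContributors += 1;
      }
      return { active: activeInYear.size, new: newContributors };
    }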

You can test the API here (change the date, or the team names – currently webmaker, openbadge, openNews)

We can use this in the dashboard soon.

3. Learning

I know a lot more about node.js than I did last week. So that’s something 🙂

I started out writing this as though it was a python app using JavaScript syntax before grasping the full implications of node’s non-blocking model.

I descended into what I later found out is called callback hell and felt much better when I learned that callback hell is a shared experience!

I tried an extreme escape from callback hell by re-building the app in a fire-and-forget process that kicked off several thousand pings to the Github API and paid no attention to whether or not they succeeded (clearly not a winning solution).

And I’ve ended up with something that isn’t too hellish but uses callbacks to manage the process flow. The current process is pretty linear, so I was able to sense-check what it’s doing, but it also works mostly on one task at a time, so it isn’t getting the potential value out of node’s non-blocking model.
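For the curious, that ‘not too hellish’ pattern boils down to chaining one callback per step and processing items one at a time. A simplified sketch (the fetch/save functions are placeholders, not the real Gitribution code):

    // Simplified sketch of a linear, callback-driven flow: fetch the list of repos,
    // then fetch activity for each repo one at a time, then save everything.
    // fetchRepos / fetchActivity / saveAll are placeholders for real implementations.
    function run(fetchRepos, fetchActivity, saveAll, done) {
      fetchRepos(function (err, repos) {
        if (err) return done(err);
        const results = [];
        (function next(i) {
          if (i === repos.length) return saveAll(results, done); // all repos processed
          fetchActivity(repos[i], function (err, activity) {
            if (err) return done(err);
            results.push(activity);
            next(i + 1); // one task at a time, so no parallelism
          });
        })(0);
      });
    }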

Next steps

  • Tweaks to the categorization of ‘members=staff’
    • See the attached image of contributions by username. There are some members of staff with many contributions who are not members of Mozilla on Github. This is not material when counting the number of contributors in relation to targets, but when we analyze contribution activity, those users with a lot of contributions skew the data significantly.
  • Check and correct the list of repos assigned to each team
    • Currently a best guess based on my limited knowledge and some time trawling through all the repos on the main Mozilla Github page
  • Work out how to use this with Science Lab projects
    • As Software Carpentry use Github as part of their training (which I love), the data in their account doesn’t represent the same kinds of activities as the other repos. I need to think about this.
  • Pick the brains of my knowledgeable colleagues and get a review of this code

What else is this good for?

User Testing vs. A/B Testing

So you have a page, or a system, or a form or an app or anything, and you know you want to make it ‘better’.

And the question is…

Should we use User Testing or A/B Testing?

TL;DR: Both can be valuable, and you can do both, so long as you have an underlying indicator of ‘better’ that you can track over the long term.

A/B Testing

A/B testing is good because:

  • It shows you ‘for real’, at scale, what actually prompts your users to do the things you most want them to do
  • It can further polish processes that have been through thorough user testing; to a degree that’s just not possible with a small group of people
  • It can self-optimize content in real-time to maximize the impact of short spikes of campaign activity

A/B testing is bad because:

  • You need a lot of traffic to get statistically significant results, which can prevent testing of new or niche products (see the rough sample-size sketch after this list)
  • You can get distracted by polishing the details, and overlook major UX changes that would have significant impact
  • It can slow down the evolution of your project if you try to test *everything* before you ship
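To put a rough number on ‘a lot of traffic’: a common back-of-the-envelope estimate of the visitors needed per variant, given a baseline conversion rate and the smallest lift you care about (95% confidence, 80% power), looks like this. It’s a sketch, not a substitute for a proper stats tool.

    // Rough sample-size estimate per variant for a two-proportion A/B test
    // (alpha = 0.05 two-sided, power = 0.8). A sketch only; use a proper
    // statistics tool for real decisions.
    function sampleSizePerVariant(baselineRate, expectedRate) {
      const zAlpha = 1.96; // 95% confidence
      const zBeta = 0.84;  // 80% power
      const variance =
        baselineRate * (1 - baselineRate) + expectedRate * (1 - expectedRate);
      const delta = expectedRate - baselineRate;
      return Math.ceil(Math.pow(zAlpha + zBeta, 2) * variance / Math.pow(delta, 2));
    }

    // Detecting a lift from a 2% to a 2.5% conversion rate needs roughly 14,000
    // visitors per variant; new or niche projects rarely have that traffic.
    console.log(sampleSizePerVariant(0.02, 0.025));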

User Testing

User testing is good because:

  • You can quickly get feedback on new products even before they are live to the public; you don’t need any real traffic
  • It flags up big issues that the original designers might miss because of over-familiarity with their offering (the ‘obvious with hindsight’ things)
  • Watching someone struggle to use the thing you’ve built inspires action (it does for me at least!)

User testing is bad because:

  • Without real traffic, you’re not measuring how this works at scale
  • What people say they will do is not the same as what they do (this can be mitigated, but it’s a risk)
  • If you don’t connect this to some metrics somewhere, you don’t know if you’re actually making things better or you’re stuck in an infinite loop of iterative design

An ideal testing process combines both

  1. For the thing you want to improve, say a webpage, decide on your measure of success, e.g. number of sign-ups
  2. Ensure this measure of success is tracked, reported on and analyzed over time, and understand whether there are underlying seasonal trends
  3. Start with User Testing to look for immediate problems/opportunities
  4. If you have enough traffic, A/B Test continually to polish your content and design features
  5. Keep checking the impact of your changes (A/B and User Testing) against your underlying measure of success
  6. Repeat User Testing
  7. A/B Testing can include ideas that come from User Testing sessions
  8. Expect to see diminishing returns on each round of testing and optimization
  9. Regularly ask yourself: are the potential gains of another round of optimization worth the effort? If they’re not, move on to something else you can optimize.

That’s simplified in parts, but is it useful for thinking about when to run user testing vs. A/B testing?

The Language of Contribution Metrics

TL;DR: Share your thoughts on the language we use around contribution metrics here (anyone can contribute): https://etherpad.mozilla.org/contributors-dashboard-language

Then if you have the time, here are some of my thoughts on this topic…

What does language have to do with metrics?

You’d be forgiven for thinking that working with data and metrics is a clean and scientific-like process of running queries against a database or two and generating a report. In many ways, I’m glad it’s not as simple as that.

Metrics are only as good as the things they enable us to improve. Which means while metrics need to be grounded in good clean data, they are primarily for people; and not just for people to read.

In their best incarnation, metrics motivate people to change things for the better.

At this scale, motivating people is definitely more art than science, which gets us to the topic of this post: the language that frames our metrics.

The project this conversation relates to is building a dashboard to measure the number of active contributors to the Mozilla Foundation. Counting is a *reasonably* clean task on its own, but the reason this project exists is to support our goals to grow Mozilla at scale (10x at first, 1 Million Mozillians in time).

The language we use to frame the numbers on this dashboard affects how well the dashboard motivates us to ask tough questions of our plans and processes in relation to this goal. So it’s an important part of the dashboard UX.

My intro to this project was framed as such: “Our goal for 2014 is to ship 10,000 contributors”. And as someone who likes agile development, hacking things together and getting things done, ‘ship’ is a word that appeals to me and I think resonates for many at Mozilla, but it’s also internal parlance. Not secret by any means, but intended for a particular audience.

Where this language becomes ‘trickyish’ is in having this conversation in the open (our plans are open and acknowledge this challenge). Our contributors are not a product, and the word ‘shipped’ might not sit right with them.

So how do we talk about growing contribution without risking taking away an individual’s feeling that contribution is something they choose to do?

It’s not a challenge unique to Mozilla, but our approach to working open might enable a unique solution.

This anecdote is from my time at WWF, and I think highlights the risk we are talking about:

I remember looking at supporter comments in response to the question “what prompted you to give?”, which we asked at the end of our donation process. Though this accounts for a minority of our respondents, there was a recurring theme: people’s records showed they had responded to something like a unique marketing URL in a TV ad and donated to the particular issue highlighted by that ad, but they would go on to make comments like this: “Nothing *prompted* me to donate. I did so off my own back, because I care about [tigers/pandas/etc] and I myself decided I should do something to help.”

Many people (supporters/volunteers) strongly want to think they are entirely responsible for the things they choose to do.

But organisations recruiting supporters at scale know that you have to actively do things to bring people on board – from marketing and advertising through to welcoming and community management and much more. Fundraising 101: if you don’t ask you don’t get.

Traditional non-profit organisations, due to the economic pressures of fundraising effectively, skip this conversation and focus on the story of how support directly impacts the end goals of their mission… e.g. £10 can X. They cannot afford to lose the profitability of their fundraising in order to talk more openly about the challenge and process of fundraising (at least at scale), even though these business-like processes exist to support the same underlying mission that supporters care about.

But Mozilla’s way of working is far from traditional, so I think we’re in a great place to talk about why and how we’re counting contribution. More excitingly, we can open up the tools that we hope will motivate staff to grow contributor numbers, so they can be used by the community too (this is another blog post to follow).

For now though, let’s talk about the word ‘shipping’.

Does the phrase ‘shipping contributors’ help you think at scale? Or does it sit uneasily with you? What word would you use instead?

Please join the conversation here:
https://etherpad.mozilla.org/contributors-dashboard-language