Mozilla Contributor Analysis Project (Joint MoCo & MoFo)

I’m back at the screen after a week of paternity leave, and I’ll be working part-time for the next two weeks while we settle in to the new family routine at home.

In the meantime, I wanted to mention a Mozilla contributor analysis project in case people would like to get involved.

We have a wiki page now, which means it’s a real thing. And here are some words my sleep-deprived brain prepared for you earlier today:

The goal and scope of the work:

Explore existing contribution datasets to look for possible insights and metrics that would be useful to monitor on an ongoing basis, before the co-incident workweek in Portland at the beginning of December.

We will:

  • Stress-test our current capacity to use existing contribution data
  • Look for actionable insights to support Mozilla-wide community building efforts
  • Run ad-hoc analysis before building any ‘tools’
  • If useful, prototype tools that can be re-used for ongoing insights into community health
  • Build processes so that contributors can get involved in this metrics work
  • Document gaps in our existing data / knowledge
  • Document ideas for future analysis and exploration

Find out more about the project here.

I’m very excited that three members of the community have already offered to support the project and we’ve barely even started.

In the end, these numbers we’re looking at are about the community, and for the benefit of the community, so the more community involvement there is in this process, the better.

If you’re interested in data analysis, or know someone who is, send them the link.

This project is one of my priorities over the following 4-8 weeks. On that note, this looks quite appealing right now.

So I’m going to make more tea and eat more biscuits.

Overlapping types of contribution

Screen Shot 2014-08-21 at 14.02.27

TL;DR: Check out this graph!

Ever wondered how many Mozfest Volunteers also host events for Webmaker? Or how many code contributors have a Webmaker contributor badge? Now you can find out.

The reason the MoFo Contributor dashboard we’re working from at the moment is called our interim dashboard is because it’s combining numbers from multiple data sources, but the number of contributors is not de-duped across systems.

So if you’re counted as a contributor because you host an event for Webmaker, you will be double counted if you also file bugs in Bugzilla. And until now, we haven’t known what those overlaps look like.
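To make the double counting concrete, here’s a minimal sketch. It assumes each system can give us a set of contributor identifiers that match across systems (a big assumption in practice), and the source names and people below are made up for illustration; this is not how the integration database itself works.

```python
from itertools import combinations

# Hypothetical contributor IDs per source system (illustrative data only).
sources = {
    "webmaker_events": {"ana", "ben", "cat"},
    "bugzilla": {"ben", "cat", "dev"},
    "mozfest": {"cat", "eve"},
}


def naive_total(sources):
    """Add up per-system counts; anyone in several systems is counted again."""
    return sum(len(people) for people in sources.values())


def deduped_total(sources):
    """Count each person once, however many systems they appear in."""
    return len(set().union(*sources.values()))


def pairwise_overlaps(sources):
    """How many people each pair of systems has in common."""
    return {
        (a, b): len(sources[a] & sources[b])
        for a, b in combinations(sorted(sources), 2)
    }


print(naive_total(sources), deduped_total(sources))
print(pairwise_overlaps(sources))
```

With this toy data the non de-duped total is 8 and the de-duped total is 5, and the pairwise overlaps are the kind of numbers the graph above is showing.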

This interim solution wasn’t perfect, but it’s given us something to work with while we’re building out Baloo and the cross-org areweamillionyet.org (and by ‘we’, the vast credit for Baloo is due to our hard working MoCo friends Pierros and Sheeri).

To help with prepping MoFo data for inclusion in Baloo, and by generally being awesome, JP wired up an integration database for our MoFo projects (skipping a night of sleep to ship V1!).

We’ve tweaked and tuned this in the last few weeks and we’re now extracting all sorts of useful insights we didn’t have before. For example, this integration database is behind quite a few of the stats in OpenMatt’s recent Webmaker update.

The downside to this is we will soon have a de-duped number for our dashboard, which will be smaller than the current number. Which will feel like a bit of a downer because we’ve been enthusiastically watching that number go up as we’ve built out contribution tracking systems throughout the year.

But, a smaller more accurate number is a good thing in the long run, and we will also gain new understanding about the multiple ways people contribute over time.

We will be able to see how people move around the project, and find that what looks like someone ‘stopping’ contributing, might be them switching focus to another team, for example. There are lots of exciting possibilities here.

And while I’m looking at this from a metrics point of view today, the same data allows us to make sure we say hello and thanks to any new contributors who joined this week, or to reach out and talk to long running active contributors who have recently stopped, and so on.

Trendlines and Stacking Logs

TL;DR

  • Our MoFo dashboards now have trendlines based on known activity to date
  • The recent uptick in activity is partly new contributors, and partly new recognition of existing contributors (all of which is good, but some of which is misleading for the trendline in the short term)
  • Below is a rambling analogy for thinking about our contributor goals and how we answer the question ‘are we on track for 2014?’
  • + if you haven’t seen it, OpenMatt has crisply summarized a tonne of the data and insights that we’ve unpicked during Maker Party

Stacking Logs

I was stacking logs over the weekend, and wondering if I had enough for winter, when it struck me that this might be a useful analogy for a post I was planning to write. So bear with me, I hope this works…

To be clear, this is an analogy about predicting and planning, not a metaphor for contributors* :D

So the trendline looks good, but…

Screen Shot 2014-08-19 at 11.47.27

Trendlines can be misleading.

What if our task was gathering and splitting logs?

Vedstapel, Johannes Jansson (1)

We’re halfway through the year, and the log store is half full. The important question is, ‘will it be full when the snow starts falling?’

Well, it depends.

It depends how quickly we add new logs to the store, and it depends how many get used.

So let’s push this analogy a bit.

Firewood in the snow

Before this year, we had scattered stacks of logs here and there, in teams and projects. Some we knew about, some we didn’t. Some we thought were big stacks of logs but were actually stacked on top of something else.

Vedstapel, Johannes Jansson

Setting a target was like building a log store and deciding to fill it. We built ours to hold 10,000 logs. There was a bit of guesswork in that.

It took a while to gather up our existing logs (build our databases and counting tools). But the good news is, we had more logs than we thought.

Now we need to start finding and splitting more logs*.

Switching from analogy to reality for a minute…

This week we added trendlines to our dashboard. These are two linear regression lines. One based on all activity for the year to-date, and one based on the most recent 4 weeks. It gives a quick feedback mechanism on whether recent actions are helping us towards our targets and whether we’re improving over the year to-date.
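For anyone curious what two trendlines like these involve, here’s a rough sketch. It assumes one contributor total per week and a plain least-squares fit; the function names and the sample numbers are hypothetical, and this is not the dashboard’s actual code.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = slope * x + intercept for x = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project(ys, window, target_week):
    """Fit the last `window` points and extend that line to `target_week`.

    Weeks are numbered from 0, matching positions in `ys`.
    """
    recent = ys[-window:]
    offset = len(ys) - len(recent)  # absolute week number of recent[0]
    slope, intercept = fit_line(recent)
    return slope * (target_week - offset) + intercept


# Hypothetical weekly totals, made up for illustration.
weekly_totals = [100, 110, 120, 130, 150, 170, 190, 210]
year_to_date = project(weekly_totals, len(weekly_totals), target_week=51)
last_four_weeks = project(weekly_totals, 4, target_week=51)
print(round(year_to_date), round(last_four_weeks))
```

When the most recent weeks are steeper than the year as a whole, the 4 week line lands higher at the end of the year than the year-to-date line, which is the quick feedback we’re after.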

These are interesting, but can be misleading given our current working practices. The trendline implies some form of destiny. You do a load of work recruiting new contributors, see the trendline is on target, and relax. But relaxing isn’t an option because of the way we’re currently recruiting contributors.

Switching back to the analogy…

We’re mostly splitting logs by hand.

Špalek na štípání.jpg

Things happen because we go out and make them happen.

Hard work is the reason we have 1,800 Maker Party events on the map this year and we’re only half-way through the campaign.

There’s a lot to be said for this way of making things happen, and I think there’s enough time left in the year to fill the log store this way.

But this is not mathematical or automated, which makes trendlines based on this activity a bit misleading.

In this mode of working, the answer to ‘Are we on track for 2014?‘ is: ‘the log store will be filled… if we fill it‘.

Scaling

Holzspalter 2

As we move forward and think about scale (say a hundred-thousand logs, or even better, a Million Mozillians), we need to think about log splitting machines (or ‘systems’).

Systems can be tested, tuned, modified and multiplied. In a world of ‘systems’ we can apply trendlines to our graphs that are much better predictors of future growth.

We should be experimenting with systems now (and we are a little bit). But we don’t yet know what a contributor growth system looks like that works as well as the log splitting machines of the forestry industry. These are things to be invented, tested and iterated on, but I wouldn’t bet on them as the solution for 2014 as this could take a while to solve.

I should also state explicitly that systems are not necessarily software (or hardware). Technology is a relatively small part of the systems of movement building. For an interesting but time consuming distraction, this talk on Social Machines from last week’s Wikimania conference is worth a ponder:

Predicting 2014 today?

Even if you’re splitting logs by hand, you can schedule time to do it. Plan each month, check in on targets and spend more or less time as required to stay on track for the year.

This boils down to a planning exercise, with a little bit of guess work to get started.

In simple terms, you list all the things you plan to do this year that could recruit contributors, and how many contributors you think each will recruit. As you complete some of these activities you reflect on your predictions, and modify the plans and update estimates for the rest of the year.

Geoffrey has put together a training workshop for this, along with a spreadsheet structure to make this simple for teams to implement. It’s not scary, and it helps you get a grip on the future.

From there, we can start to feed our planned activity and forecast recruitment numbers into our dashboard as a trendline rather than relying solely on past activity.
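As a sketch of what a plan-based trendline could look like (this is not Geoffrey’s spreadsheet, and the activities, months and numbers are all invented), you start from today’s total and add each activity’s forecast in the month it happens, swapping in the real number once the activity is done:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlannedActivity:
    name: str
    month: int                    # calendar month the activity lands in
    forecast: int                 # predicted new contributors
    actual: Optional[int] = None  # filled in once the activity is done


def planned_trendline(current_total, activities, months):
    """Running total per month, using actuals where known, forecasts otherwise."""
    totals = []
    running = current_total
    for month in months:
        for activity in activities:
            if activity.month == month:
                known = activity.actual is not None
                running += activity.actual if known else activity.forecast
        totals.append(running)
    return totals


# Hypothetical plan for August to December.
plan = [
    PlannedActivity("summer campaign", 8, forecast=1500, actual=1200),
    PlannedActivity("training workshops", 9, forecast=500),
    PlannedActivity("autumn festival", 10, forecast=900),
]
trend = planned_trendline(5000, plan, range(8, 13))
print(trend)
```

This deliberately ignores people dropping out of the active window, so treat it as the optimistic edge of the forecast.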

The manual nature of the splitting-wood-like-activity means what we plan to do is a much more important predictor of the future than extrapolating what we have done in the past, and that changing the future is something you can go out and do.

*Contributors are not logs. Do not swing axes at them, and do not under any circumstances put them in your fireplace or wood burning stove.

2014 Contributor Goals: Half-time check-in

We’re a little over halfway through the year now, and our dashboard is now good enough to tell us how we’re doing.

TL;DR:

  • The existing trend lines won’t get us to our 2014 goals
    • but knowing this is helpful
    • and getting there is possible
  • Ask less: How do we count our contributors?
  • Ask more: What are we doing to grow the contributor community? And, are we on track?

Changing the question

Our dashboard now needs to move from being a project to being a tool that helps us do better. After all, Mozilla’s unique strength is that we’re a community of contributors and this dashboard, and the 2014 contributor goal, exist to help us focus our workflows, decisions and investments in ways that empower the community. Not just for the fun of counting things.

The first half of the year focused us on the question “How do we count contributors?”. By and large, this has now been answered.

We need to switch our focus to:

  1. Are we on track?
  2. What are we doing to grow the contributor community?

Then repeat these two questions regularly throughout the year, and adjust our strategy as we go.

Are we on track?

Wearing my cold-dispassionate-metrics hat, and not my “I know how hard you’re all working already” hat, I have to say no (or, not yet).

I’m going to look at this team by team and then look at the All Mozilla Foundation view at the end.

Your task, for each graph below, is to take an imaginary marker pen and draw the line for the rest of the year based on the data you can see to date. And only on the data you can see to-date.

  • What does your trend line look like?
  • Is it going to cross the dotted target line in 2014?

OpenNews

Screen Shot 2014-07-18 at 19.48.44

Based on the data to-date, I’d draw a flat line here. Although there are new contributors joining pretty regularly, the overall trend is flat. In marketing terms there is ‘churn’; not a nice term, but a useful one to talk about the data. To use other crass marketing terms, ‘retention’ is as important as ‘acquisition’ in changing the shape of this graph.

Science Lab

Screen Shot 2014-07-18 at 19.49.55

Dispassionately here, I’d have to draw a trend line that’s pointing slightly down. One thing to note in this view is that the Science Lab team have good historic data, so what we’re seeing here is the result of the size of the community in early 2013, and some drop-off from those people.

Appmaker

Screen Shot 2014-07-18 at 19.50.57

This graph is closest to what we want to see generally, i.e. pointing up. But I’ll caveat that with a couple of points. First, taking the imaginary marker pen, this isn’t going to cross the 2014 target line at the current rate. Second, unlike the Science Lab and OpenNews data above, much of this Appmaker counting is new. And when you count things for the first time, a 12 month rolling active total has a cumulative effect in the first year, which increases the appearance of growth, but might not be a long term trend. This is because Appmaker community churn won’t be a visible thing until next year when people could first drop out of the twelve month active time-frame.
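The first-year effect is easier to see with a toy example. The sketch below assumes we know which months each contributor was active in, and it invents the simplest possible community: one new person shows up each month, contributes once, and never returns.

```python
def rolling_active_counts(activity, months, window=12):
    """Contributors with at least one contribution in the `window` months
    ending at each month 0..months-1.

    `activity` maps a contributor to the set of month indices they were active in.
    """
    counts = []
    for month in range(months):
        start = month - window + 1
        counts.append(sum(
            1
            for active_months in activity.values()
            if any(start <= m <= month for m in active_months)
        ))
    return counts


# Hypothetical data: counting starts at month 0, one one-off contributor per month.
activity = {f"contributor_{i}": {i} for i in range(24)}
counts = rolling_active_counts(activity, months=24)
print(counts)
```

The rolling total climbs every month for the first year and then goes completely flat, even though nothing about the community changed. The early climb comes purely from when the counting started.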

Webmaker

Screen Shot 2014-07-18 at 19.51.47

This graph is the hardest to extend with our imaginary marker pen, especially with the positive incline we can see as Maker Party kicks off. The Webmaker plan expects much of the contributor community growth to come from the Maker Party campaign, so a steady incline was not the expectation across the year. But, we can still play with the imaginary marker pen.

I’d do the following exercise: In the first six months, active contributors grew by ~800 (~130 per month), so assuming that’s a general trend (big assumption) and you work back from 10k in December you would need to be at ~9,500 by the end of September. Mark a point at 9,500 contributors above the October tick and look at the angle of growth required throughout Maker Party to get there. That’s not impossible, but it’s a big challenge and I don’t have any historic data to make an informed call here.
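The working-back step is simple arithmetic, sketched here with the rounded figures from the paragraph above. How many months of steady growth you count after the checkpoint is my assumption, which is why both three and four are shown.

```python
def level_needed_at_checkpoint(target, monthly_growth, months_remaining):
    """Where the total must be at the checkpoint if only steady background
    growth happens between the checkpoint and the end of the year."""
    return target - monthly_growth * months_remaining


for months_remaining in (3, 4):
    print(months_remaining, level_needed_at_checkpoint(10_000, 130, months_remaining))
```

Either way you land in the same neighbourhood as the ~9,500 mark, which is as precise as the marker pen needs to be.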

Note: the Appmaker/Webmaker separation here is a legacy thing from the beginning of the year when we started this project. The de-duped datastore we’re working on next will allow us to graph: Webmaker Total > Webmaker Tools > Appmaker as separate graphs with separate goals, but which get de-duped and roll-up into the total numbers above, and in turn roll-up into the Mozilla wide total at areweamillionyet.org – this will better reflect the actual overlaps.

Metrics

[ 0 contributors ]

The MoFo metrics team currently has zero active volunteer contributors, and based on the data available to date is trending absolutely flat. Action is required here, or this isn’t going to change. I also need to set a target. Growing 0 by 10X doesn’t really work. So I’ll aim for 10 volunteer contributors in 2014.

All Mozilla Foundation

Screen Shot 2014-07-18 at 19.52.40

Here we’re adding up the other graphs and also adding in ~900 people who contributed to MozFest in October 2013. That MozFest number isn’t counted towards a particular team and simply lifts the total for the year. There is no trend for the MozFest data because all the activity happened at once, but if there wasn’t a MozFest this year (don’t worry, there is!) in October the total line would drop by 900 in a single week. Beyond that, the shape of this line is the cumulative result of the team graphs above.

In Q3, we’ll be able to de-dupe this combined number as there are certainly contributors working across MoFo teams. In a good way, our total will be less than the sum of our parts.

Where do we go from here?

First, don’t panic. Influencing these trend lines is not like trying to shift a nation’s voting trends in the next election. Much of this is directly under our control, or if not ‘control’, then it’s something we can strongly influence. So long as we work on it.

Next, it’s important to note that this is the first time we’ve been able to see these trends, and the first time we can measure the impact of decisions we make around community building. Growing a community beyond a certain scale is not a passive thing. I’ve found David Boswell’s use of the term ‘intentional’ community building really helpful here. And much more tasteful than my marketing vocabulary!

These graphs show where we’re heading based on what we’re currently doing, and until now we didn’t know if we were doing well, or even improving at all. We didn’t have any feedback mechanism on decisions we’d make relating to community growth. Now we do.

Trend setting

Here are some initial steps that can help with the ‘measuring’ part of this community building task.

Going back to the marker pen exercise, take another imaginary color and rather than extrapolate the current trend, draw a positive line that gets you to your target by the end of the year. This doesn’t have to be a straight line; allow your planned activity to shape the growth you want to see. Then ask:

  • Where do you need to be in Aug, Sep, Oct, Nov, Dec?
  • How are you going to reach each of these smaller steps?

Schedule a regular check-in that focuses on growing your contributor community and check your dashboard:

  • Are your current actions getting you to your goals?
  • What are the next actions you’re going to take?

The first rule of fundraising is ‘Ask for money’. People often overlook this. By the same measure, are you asking for contributions?

  • How many people are you asking this week or month to get involved?
  • What percentage of them do you expect to say yes and do something?

Multiply those numbers together and see if that prediction can get you to your next step towards your goal.
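As a back-of-the-envelope sketch (the numbers are invented, and the ‘yes rate’ is whatever you honestly expect it to be), the sum looks like this, along with the reverse question of how many people you’d need to ask to close a given gap:

```python
import math


def expected_new_contributors(people_asked, expected_yes_rate):
    """People asked multiplied by the share you expect to say yes and do something."""
    return people_asked * expected_yes_rate


def asks_needed(gap, expected_yes_rate):
    """How many people to ask to close `gap`, at the same expected yes rate."""
    if not 0 < expected_yes_rate <= 1:
        raise ValueError("expected_yes_rate must be between 0 and 1")
    return math.ceil(gap / expected_yes_rate)


# Hypothetical month: ask 40 people, expect a quarter of them to get involved.
print(expected_new_contributors(40, 0.25))
print(asks_needed(50, 0.25))
```

If the second number is far bigger than the number of people you can realistically ask, that’s the signal to change the approach rather than the target.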

Asking these questions alone won’t get us to our goals, but it helps us to know if our current approach has the capacity to get there. If it doesn’t we need to adjust the approach.

Those are just the numbers

I could probably wrap up this check-in from a metrics point of view here, but this is not a numbers game. The Total Active Contributor number is a tool to help us understand scale beyond the face-to-face relationships we can store in our personal memories.

We’re lucky at Mozilla that so many people already care about the mission and want to get involved, but sitting and waiting for contributors to show up is not going to get us to our goals in 2014. Community building is an intentional act.

Here’s to setting new trends.

MoFo Contributor Dashboard(s) – switching to Plan A

tl;dr:

  • We’re wrapping up work on the MoFo Interim Dashboard
    • The only other data source we’ll add is the badge counts for webmaker mentors & hive community members
    • This is still our MoFo working document for the time being
  • Then, we’ll switch our development efforts into integrating with Project Baloo
    • Baloo is where we will de-dupe contributors across teams/tools etc.
    • areweamillionyet.org will become our working document in time (‘time’ is TBC)

Switching to Plan A

The MoFo contributor dashboard we’ve been working with this year is our *interim* counting solution, and just as we’re “completing” it we’re now in a position to switch from an interim solution to a fully integrated system which is properly integrated with MoCo. This is pretty good timing, but it’s a change in scope for our immediate work so is worth a status update.

Within MoFo we’ve deliberately been working on a Plan B solution. This is a relatively crude, not de-duped, not a single-source-of-truth dashboard, to give us the quickest visibility we could get into contributor trends. But as we’ve noted from the start, this solution keeps in mind Plan A so that we can transition to it when we’re ready.

We started our work with this Plan B because two existing projects were underway that could be our Plan A and we didn’t want to duplicate these efforts and/or sit around idle as we didn’t know for sure when either of these could be used ‘in anger’.

Project Baloo (formerly Wormhole/Blackhole) was the most closely aligned piece of work, but there was also work to issue contributor badges which was another possible place to count the number of people contributing. Both of these projects are complex because they span many teams and systems and the philosophically-quicksand-like question “what counts as contribution?”. So we didn’t know exactly when we could use either of these as our data-store.

As of last week, Baloo is a functioning data-warehouse with working aggregations setup for a number of MoCo teams. Which means we can begin our switch from Plan B to Plan A (or more accurately our *transition* to Plan A, as this will take time).

The dashboard at areweamillionyet.org which was being demo’d with purely github data when we launched it a couple of weeks ago, is now using data from Baloo’s de-duped aggregations across Github, Bugzilla and SUMO activity. For now, the output of this work is sitting in this Google Doc, and graphed with the help of a little node app.

There are many more teams and systems to integrate into Baloo going forward (across MoCo and MoFo), and there’s a fair bit of automation to work out, but this dashboard is a real example of how this can work. This is Plan A in action, and where we can focus our efforts next.

So, for MoFo colleagues, what does this mean in practical terms?:

We should:

  • Continue using the existing dashboard as our working document
  • Continue using the adhoc logger for counting contributions that don’t have a record anywhere else

I will limit the open work on the Interim Dashboard to:

  • Minimal update to the interface
  • Adding in numbers for Webmaker Mentors and Hive members via badges data
  • “Won’t-fix-ing” some bugs where the effort is now better spent on Plan A

I will file bugs and start working on:

  • Getting raw contribution data into Project Baloo
  • Documenting the conversion points in a way that’s more consistent with MoCo’s
  • Joining up the documentation for this with MoCo’s existing info

When ‘less than the sum of our parts’ is a good thing

areweamillionyet

Here’s a happy update about our combined Mozilla Foundation (MoFo) and Mozilla Corporation (MoCo) contributor dashboards.

TL;DR: There’s a demo All Mozilla Contributor Dashboard you can see at areweamillionyet.org

It’s a demo, but it’s also real, and to explain why this is exciting might need a little context.

Since January, I’ve been working on MoFo specific metrics. Mostly because that’s my job, but also because this/these organisations/projects/communities take a little while to understand, and getting to know MoFo was enough to keep me busy.

We also wanted to ship something quickly so we know where we stand against our MoFo goals, even if the data isn’t perfect. That’s what we’ve built in our *interim* dashboard. It’s a non de-duped aggregation of the numbers we could get out of our current systems without building a full integration database. It gives us a sense of scale and shows us trends. While not precise and high resolution yet, this has still been helpful to us. Data can sometimes look obvious once you see it, but before this we were a little blind.

So naturally, we want to make this dashboard as accurate as possible, and the next step is integrating and de-duping the data so we can know if the people who run Webmaker events are the people who commit code, are the people who file bugs, are the people who write articles for Source, are the people who teach Software Carpentry Bootcamps, etc, etc.

The reason we didn’t start this project by building a MoFo integration database, is because MoCo were already working on that. And in less than typical Mozilla style (as I’m coming to understand it), we didn’t just build our own version of this. ;) (though there might be some scope and value for integrating some of this data within the Foundation anyway, but that’s a separate thought).

The integration database in question is MoCo’s project Baloo, which Pierros and many people on the MoCo side have been working on. It’s a complex project influenced by more than just technical requirements. Scoping the system is the point at which many teams are first looking at their contributor data in detail and working out what their indicators of contribution look like.

Our plan is that our MoFo interim dashboard data-source can eventually be swapped out for the de-duped ‘single source of truth’ system, at which point it goes from being a fuzzy-interim thing to a finalized precise thing.

While MoCo and ‘Fo have been taking two different approaches to solving this problem, we’ve not worked in isolation. We meet regularly, follow each other’s progress and have been trying to find the point where these approaches can merge into a unified cross Mozilla solution.

The demo we shipped yesterday was the first point where we’ve joined up this work.

Dial-up modems

I want to throw in an internet based analogy here, for those who remember dial-up modems.

Let’s imagine this image shows us *all* the awesome Mozilla contributors and what they are doing. We want there to be 20k of them in 2014.

unknown

It’s not that we don’t know if we have contributors. We’ve seen individual contributors, and we’ve seen groups of contributors, but we haven’t seen them all in one place yet.

So to continue the dial-up modem analogy, let’s think of this big-picture view of contribution as a large uncompressed JPEG, which has been loading slowly for a few months.

The MoFo interim dashboard has been getting us part of this picture. Our approach has revealed the MoFo half of this picture with slowly increasing resolution and accuracy. It’s like an interlaced JPEG, and is about this accurate so far:

interlaced

The Project Baloo approach is precise and can show MoCo and MoFo data, but adds one data source at a time. It’s rolling out like a progressive JPEG. The areweamillionyet.org dashboard demo isn’t using Baloo yet, but the data it’s using is a good representation of how Baloo can work. What you can see in the demo dashboard is a picture like this:

progressive

(Original Photo Credit: Gen Kanai)

About areweamillionyet.org

This is commit data extracted from cross team/org/project repositories via github. Even though code contribution is only one part of the big-picture, seeing this much of the image tells us things we didn’t know before. It gives us scale, trends and ways to ask questions about how to effectively and intentionally grow the community.

The ‘commit history’ over time is also a fascinating data set, and I’ll follow up with a blog post on that soon.

Less than the sum of our parts? When 5 + 5 = 8

With the goal of 20k active contributors this year, shared between MoCo and MoFo, we’re thinking about 10k active contributors to MoCo and to MoFo. And if we counted each org in isolation we could both say “here’s 10k active contributors”, and this would be a significant achievement. But, if we de-dupe these two sets it would be really worrying if there wasn’t an overlap between the people who contribute to MoCo and the people who contribute to MoFo projects.

Though we want to engage many many individual contributors, I think a good measure of our combined community building effectiveness will be how much these ‘pots’ of contributors overlap. When 10k MoFo contributors + 10k MoCo contributors = 15k combined Mozilla contributors, we should definitely celebrate.
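The sum behind that is just: combined = MoFo + MoCo − overlap. Here’s a tiny sketch using the round numbers from this post (the function name is mine, and these are goals, not measurements):

```python
def implied_overlap(count_a, count_b, combined_deduped):
    """People counted in both groups, given each group's size and the
    de-duped combined total: overlap = a + b - combined."""
    return count_a + count_b - combined_deduped


print(implied_overlap(10_000, 10_000, 15_000))  # the 10k + 10k = 15k case
print(implied_overlap(5, 5, 8))                 # the 5 + 5 = 8 case
```

So a combined total of 15k would mean 5,000 people contributing on both sides of the org chart.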

That’s the thing I’m most excited about with regards to joining up the data. Understanding how contributors connect across projects; how they volunteer their time, energy and skills in many different ways, and understanding what ‘Many Voices, One Mozilla’ looks like. When we understand this, we can improve it, for the benefit of the project, and the individuals who care about the mission and want to find ways into the project so they can make a difference.

While legal processes define ‘Corporations’ and ‘Foundations’, the people who volunteer and contribute rarely give a **** about which org ‘owns’ the project they’re contributing to; they just want to build a better internet. Mozilla is bigger than the legal entities. And the legal entities are not what Mozilla is about, they are just one of the necessities for making it work.

So the org dashboards, and team dashboards we’re building can help us day-to-day with tactical and strategic decisions, but we always need to keep them in the context of the bigger picture. Even if the big picture takes a while to download.

Here’s to more cross org collaboration.


Contributors counting… contributors?

We now have a reasonably organized Mozilla Foundation Metrics Wiki Hub Page Thing.

While my priority to date this year has been working out how MoFo teams count their contributors, I thought I should also take the time to open up this metrics work in a way that contributors can get involved, if that’s what takes their fancy. After all, contributor metrics are only as good as the systems they help us improve, and in turn the contributors they help us empower. :)

As with many good things in the world of open source, this includes a mailing list.

So here’s my blurb if you’d consider signing up:

The mofo-metrics mailing list:

“An open community mailing list for volunteers and staff interested in Mozilla Foundation Metrics. What are the numbers, graphs and other data points that can help the Mozilla Foundation to better promote openness, innovation and participation on the Internet? Sign up and help us answer that question.”

I’m not 100% sure what contribution will look like in metrics-land, but I’m happy to find out and to try and make this work.

A quick update on the interim contributor dashboard

I’ve just updated the main wiki page tracking our contributor dashboard project, so I won’t repeat everything here.

The quick update is that the puzzle pieces that will make our interim contributor dashboard work are coming together now.

Which means we have a live dashboard front-end (with a few data-holes we need to plug!). This screenshot is just data from Github.

It may be missing loads of data, but what's there is real and updating automatically :)

Let’s gather some more numbers…