When ‘less than the sum of our parts’ is a good thing

Here’s a happy update about our combined Mozilla Foundation (MoFo) and Mozilla Corporation (MoCo) contributor dashboards.

TL;DR: There’s a demo All Mozilla Contributor Dashboard you can see at areweamillionyet.org

It’s a demo, but it’s also real, and explaining why this is exciting might need a little context.

Since January, I’ve been working on MoFo-specific metrics. Mostly because that’s my job, but also because this/these organisations/projects/communities take a little while to understand, and getting to know MoFo was enough to keep me busy.

We also wanted to ship something quickly so we know where we stand against our MoFo goals, even if the data isn’t perfect. That’s what we’ve built in our *interim* dashboard. It’s a non-de-duped aggregation of the numbers we could get out of our current systems without building a full integration database. It gives us a sense of scale and shows us trends. While it isn’t precise or high-resolution yet, it has still been helpful to us. Data can sometimes look obvious once you see it, but before this we were a little blind.

So naturally, we want to make this dashboard as accurate as possible, and the next step is integrating and de-duping the data so we can know if the people who run Webmaker events are the people who commit code, are the people who file bugs, are the people who write articles for Source, are the people who teach Software Carpentry Bootcamps, etc, etc.

The reason we didn’t start this project by building a MoFo integration database is that MoCo were already working on one. And in less-than-typical Mozilla style (as I’m coming to understand it), we didn’t just build our own version of this. 😉 (Though there might be some scope and value in integrating some of this data within the Foundation anyway, that’s a separate thought.)

The integration database in question is MoCo’s project Baloo, which Pierros, and many people on the MoCo side, have been working on. It’s a complex project influenced by more than just technical requirements. Scoping the system is the point at which many teams are first looking at their contributor data in detail and working out what their indicators of contribution look like.

Our plan is that our MoFo interim dashboard data-source can eventually be swapped out for the de-duped ‘single source of truth’ system, at which point it goes from being a fuzzy-interim thing to a finalized precise thing.

While MoCo and ‘Fo have been taking two different approaches to solving this problem, we’ve not worked in isolation. We meet regularly, follow each other’s progress and have been trying to find the point where these approaches can merge into a unified cross-Mozilla solution.

The demo we shipped yesterday was the first point where we’ve joined up this work.

Dial-up modems

I want to throw in an internet-based analogy here, for those who remember dial-up modems.

Let’s imagine this image shows us *all* the awesome Mozilla contributors and what they are doing. We want there to be 20k of them in 2014.

(Image: the complete picture of Mozilla contributors.)

It’s not that we don’t know if we have contributors. We’ve seen individual contributors, and we’ve seen groups of contributors, but we haven’t seen them all in one place yet.

So to continue the dial-up modem analogy, let’s think of this big-picture view of contribution as a large uncompressed JPEG, which has been loading slowly for a few months.

The MoFo interim dashboard has been getting us part of this picture. Our approach has revealed the MoFo half of this picture with slowly increasing resolution and accuracy. It’s like an interlaced JPEG, and is about this accurate so far:

(Image: the same picture loading as an interlaced JPEG, blurry all over but sharpening gradually.)

The Project Baloo approach is precise and can show MoCo and MoFo data, but adds one data source at a time. It’s rolling out like a progressive JPEG. The areweamillionyet.org dashboard demo isn’t using Baloo yet, but the data it’s using is a good representation of how Baloo can work. What you can see in the demo dashboard is a picture like this:

(Image: the same picture loading as a progressive JPEG, part of it fully sharp.)

(Original Photo Credit: Gen Kanai)

About areweamillionyet.org

This is commit data extracted from cross-team/org/project repositories via GitHub. Code contribution is only one part of the big picture, but seeing this much of the image tells us things we didn’t know before. It gives us scale, trends and ways to ask questions about how to effectively and intentionally grow the community.

The ‘commit history’ over time is also a fascinating data set, and I’ll follow up with a blog post on that soon.

Less than the sum of our parts? When 5 + 5 = 8

With the goal of 20k active contributors this year, shared between MoCo and MoFo, we’re thinking about 10k active contributors each. And if we counted each org in isolation we could both say “here’s 10k active contributors”, and this would be a significant achievement. But if we de-duped these two sets, it would be really worrying if there wasn’t an overlap between the people who contribute to MoCo and the people who contribute to MoFo projects.

Though we want to engage many, many individual contributors, I think a good measure of our combined community-building effectiveness will be how much these ‘pots’ of contributors overlap. When 10k MoFo contributors + 10k MoCo contributors = 15k combined Mozilla contributors, we should definitely celebrate.
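To make that arithmetic concrete, here’s a minimal Python sketch (the email addresses are made up, and email is an imperfect identity key anyway) of how de-duping changes the totals:

# Hypothetical contributor identifiers pulled from each org's systems.
moco = {"ada@example.com", "grace@example.com", "linus@example.com",
        "tim@example.com", "ken@example.com"}
mofo = {"grace@example.com", "linus@example.com", "mitch@example.com",
        "brendan@example.com", "dennis@example.com"}

print(len(moco) + len(mofo))  # 10: the naive total double-counts people
print(len(moco | mofo))       # 8: the de-duped total, i.e. 5 + 5 = 8
print(len(moco & mofo))       # 2: people active in both orgs (the good news)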

That’s the thing I’m most excited about with regards to joining up the data: understanding how contributors connect across projects, how they volunteer their time, energy and skills in many different ways, and understanding what ‘Many Voices, One Mozilla’ looks like. When we understand this, we can improve it, for the benefit of the project and the individuals who care about the mission and want to find ways into the project so they can make a difference.

While legal processes define ‘Corporations’ and ‘Foundations’, the people who volunteer and contribute rarely give a **** about which org ‘owns’ the project they’re contributing to; they just want to build a better internet. Mozilla is bigger than the legal entities. And the legal entities are not what Mozilla is about; they are just one of the necessities of making it work.

So the org dashboards, and team dashboards we’re building can help us day-to-day with tactical and strategic decisions, but we always need to keep them in the context of the bigger picture. Even if the big picture takes a while to download.

Here’s to more cross org collaboration.

Want to read more?

Contributors counting… contributors?

We now have a reasonably organized Mozilla Foundation Metrics Wiki Hub Page Thing.

While my priority to date this year has been working out how MoFo teams count their contributors, I thought I should also take the time to open up this metrics work in a way that contributors can get involved, if that’s what takes their fancy. After all, contributor metrics are only as good as the systems they help us improve, and in turn the contributors they help us empower. 🙂

As with many good things in the world of open source, this includes a mailing list.

So here’s my blurb, if you’d consider signing up:

The mofo-metrics mailing list:

“An open community mailing list for volunteers and staff interested in Mozilla Foundation Metrics. What are the numbers, graphs and other data points that can help the Mozilla Foundation to better promote openness, innovation and participation on the Internet? Sign up and help us answer that question.”

I’m not 100% sure what contribution will look like in metrics-land, but I’m happy to find out and to try and make this work.

Getting Bicho Running as a process on Heroku with a Scheduler

(Image: “Ou la lecture du grimoire” by Félicien Victor Joseph Rops (Belgium, Namur, 1833-1898) [Public domain], via Wikimedia Commons.)
For our almost-complete MoFo Interim Dashboard, I’m planning to use an issue-tracker parsing tool called Bicho to work out how many people are involved in the Webmaker project in Bugzilla. Bicho is part of a suite of tools called Metrics Grimoire, which I’ll explore in more detail in the near future. When combined with vizGrimoire, you can generate interesting things like this, which are very closely related to (but not exactly solving the same challenge as) our own contribution-tracking efforts.

I recently installed a local copy of Bicho, and ran this against some products on Bugzilla to test it out. It generates a nicely structured relational database including the things I want to count and feed into our contributor numbers.
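For the curious, here’s roughly how counting from that database can work. The table and column names are from my reading of the schema Bicho created locally and may differ between Bicho versions, so treat this as a sketch, not gospel:

# Count distinct issue-filers in the Bicho-generated MySQL database.
# NOTE: the table ('issues') and column ('submitted_by') names are
# assumptions from my local install; check your own schema first.
import MySQLdb

conn = MySQLdb.connect(host="yourdbhostname", user="yourdbusername",
                       passwd="yourdbuserpassword", db="yourdbdatabase")
cursor = conn.cursor()
cursor.execute("SELECT COUNT(DISTINCT submitted_by) FROM issues")
(issue_filers,) = cursor.fetchone()
print("Distinct people who filed issues: %d" % issue_filers)
conn.close()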

This morning I got this running on Heroku, which means it can run periodically and update a hosted DB, which can then feed numbers into our dashboard.

This was a bit of trial and error for me, as all the work I’ve done with Python was within Google App Engine’s setup, and my use of Heroku has been for Node apps, so these notes are to help me out some time in the future when I look back at this.

Getting this working on Heroku

$ pip freeze

generates a list of the requirements from your working local environment, e.g.

BeautifulSoup==3.2.1
MySQL-python==1.2.5
feedparser==5.1.3
python-dateutil==2.2
six==1.6.1
storm==0.20
wsgiref==0.1.2

Copy this into a requirements.txt file in the root of your project.

But remove the line Bicho==0.9 (otherwise pip tries to install Bicho itself, which fails).

Heroku’s notes on specifying dependencies.

You can now push this to Heroku.

Then, I ran:

$ heroku run python setup.py

But I’m actually not sure if that was required.

Then you can run Bicho remotely via heroku run commands:

$ heroku run python bin/bicho --db-user-out=yourdbusername --db-password-out=yourdbuserpassword --db-database-out=yourdbdatabase --db-hostname-out=yourdbhostname -d 5 -b bg --backend-user 'abugzilla@exampleuser.com' --backend-password 'bugzillapasswordexample' -u 'https://bugzillaurl.com?etc'

As a general precaution for anything like this, don’t use a user account that has any special privileges. I create duplicate logins that have the same level of access available to any member of the public.

Once you’ve got a command that works here, cancel the running script as it might have thousands of issues left to process.

Then set up a scheduler: https://devcenter.heroku.com/articles/scheduler

$ heroku addons:add scheduler:standard
$ heroku addons:open scheduler

Copy your working command into the scheduler, just without the ‘heroku run’ part:

python bin/bicho --db-user-out=yourdbusername --db-password-out=yourdbuserpassword --db-database-out=yourdbdatabase --db-hostname-out=yourdbhostname -d 5 -b bg --backend-user 'abugzilla@exampleuser.com' --backend-password 'bugzillapasswordexample' -u 'https://bugzillaurl.com?etc'

If you set this to run every 10 mins, the process will cycle and get killed periodically, but the logs usefully show you how the import is progressing.

I’m generally happy with this as a solution for counting contributors in Webmaker’s issue tracking history, but would need to work on some speed issues if this was of interest across Mozilla projects.

Currently, this is importing about 400 issues an hour, which would be problematic for processing the 1,000,000+ bugs in bugzilla.mozilla.org (at that rate, a full import would take roughly 2,500 hours, or over 100 days). But that’s not a problem to solve right now. And not necessarily the way you’d want to do that either.

Are we on track with our 2014 contributor goals?

I presented a version of this on the Mozilla Foundation staff call yesterday, and thought it’s worth a write-up for those who weren’t on the call and those in MoCo working on related things.

Some Context:

One of the cross Mozilla goals this year is to “10X” the number of active contributors, with a longer term goal of growing to a million Mozillians.

When the 10X goal was set we weren’t really sure what X was, for valid reasons; defining contribution is as much art as it is science, and the work plans for this year include building the tools to measure this in a systematic way. The goals justify the tools, and vice versa. Chicken and egg.

2,000 contributors were invited to the summit, so the target was set at 20k active contributors shared between MoCo and MoFo. MoFo have been working to a target of 10k contributors but in practice this isn’t going to be a clean 50/50 split and there will be overlap in contributors across teams, projects and MoFo/MoCo. For example, 10k MoCo contributors + 10k MoFo contributors could = 19k Mozilla contributors.

When I joined in January, each of the MoFo teams did some (slightly tedious) manual counting and estimated their contributor numbers for 2013, and we added these up to a theoretical 5,600 contributors. This was our baseline number. Useful to an order of magnitude, but not precise.

Based on these January estimates, this 5,600 number suggests that 10k contributors is quite far off a true 10X, but 10k is still going to be a challenging goal. At 10X we’d have been aiming for 50k+ contributors.

From the data that’s emerging, 10k active contributors to MoFo feels like a sane but stretching target.

With the recent forming of the Badge Alliance, some MoFo goals are now Badge Alliance goals, and the same goes for counting people contributing to parts of the Open Badge ecosystem. As a result, our theoretical 5,600 contributor number got smaller. It’s now 4,200.

So 4,200 is where we assumed we started this year, but we haven’t proved this yet. And making this measurement real has been our priority metrics project this year.

How are we doing so far?

We’ve been automating ways to count these ‘theoretical’ contributors, and feeding them into our dashboard.

But to date, as we’ve looked at the dashboard and the provable number was 1,000 or 2,000 or so, we would then say “but the real number is actually closer to 5,000”. Which means the dashboard hasn’t been very useful yet, as the theoretical number always trumped the provable but incomplete number.

This will change in the next few weeks.

We’re now counting nearly all of those theoretical pots of contribution ‘live’.

And the dashboard is at 2,800.

Once we add the Webmaker mentors who complete their training this year, and anything else that goes into the ad-hoc contribution logger, we’re basically at our real comparison point to that theoretical number, and we can drop the ‘theoretical’ bit.

If there are a thousand mentors and another four hundred contributors added to the ad-hoc logger, our theoretical estimate will be remarkably close to reality. Except that it’s six months behind where we thought it would be.

We’re getting close to that 4,200, but we expected (albeit pretty loosely) to be there in January.

This either means that:

(A) the growth shown on the graph to-date is an artifact of missing historical data, and we’re actually trending pretty flat.

(B) our 2013 estimates were too high and we started this year with fewer contributors than we thought, but we’ve been growing to date.

As we don’t have time-stamped historical data for some of these things, we’re not going to know which for sure. But either way, we now need to increase the rate at which we on-board new contributors to hit 10k by the end of the year.

There are plans in place for growing contribution numbers, but this is going to be active work.

Whether that’s converting new Webmaker users who join us through Maker Party, reducing barriers to contributing code, or actively going out and asking people if they want to contribute, growing that contributor number is going to be a combination of good processes and marketing.

Also to note

I’ll be making this MoFo total number smaller by X% when we integrate the data into a single location and de-dupe people across these activities. But we don’t know what X% is yet. That’s just something to be aware of.

In relation to the points above about there not being a clean MoCo/MoFo split in where people contribute, we’re much more directly connecting up the systems and processes now. We’ll have more to share on this in the coming weeks.

Tracking the status of the dashboard

Contributor Dashboard Status Update (‘busy work’?)

While I’m always itching to get on with doing the work that needs doing, I’ve spent this morning writing about it instead. Part of me hates this, but another part realizes this is valuable. Especially when you’re working remotely and the project status in your head is of no use to your colleagues scattered around the globe.

So here’s the updated status page on our Mozilla Foundation Contributor Dashboard, and some progress on my ‘working open’.

Filing bugs, linking them to each other, and editing wiki pages can be tedious work (especially wiki tables that link to bugs!) but the end result is very helpful, for me as well as those following and contributing to the project.

And a hat-tip to Pierros, whose hard work on the project Baloo wiki page directly inspired the formatting here.

Now, back to doing! 🙂

Progress on Contributor Dashboard(s)

We’re seeing real progress getting data into the MoFo Contributor Dashboard now, but we need to keep in mind that counting existing contributors and engaging new contributors are two separate tasks that will help us move this line upwards.

The gains we are seeing right now are counting gains rather than contribution gains.

Getting this dashboard fully populated with our existing contributor numbers will be an achievement, but growing our contributor numbers is the real end goal of this work.

40-50% done?

Using our back-of-the-napkin numbers from 2013 as a guide, the current data sources shown on the dashboard today capture about 40% of the numbers we’re expecting to see here. Depending on how good our estimates were, and how many new contributors have joined in Q1, we expect this will be near the 5k mark by the time it’s all hooked up.

Ad-hoc contribution?

We think about 10% of the contribution we want to count doesn’t currently exist in any database, so I’ve written a simple tool for logging this activity which will feed into this dashboard automatically.

MoFo staff can play with this tool now to test out the UX, but we can’t start logging data with this until it’s been through a security and privacy review (as this will store email addresses). Follow progress on the security review here.

The next biggest pot of data? Badges.

A significant chunk of the remaining 50% of current contributors we want to count, and a significant number of the new contributors we expect to engage this year, will be acknowledged by issuing badges.

This starts with Webmaker Super Mentor badges that will be issued through Badgekit.

My next task here is to work with the Badges team to expose a metrics API endpoint that lets us count the number of people we issue particular badges to. Initial thinking on this is here.
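To show the shape of the thing I’m asking for, here’s a toy sketch. To be clear, this is not Badgekit’s real API: the route, fields and in-memory ‘database’ are all made up for illustration.

# Hypothetical badge-count endpoint (Flask used purely for brevity).
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for the badge issuer's real datastore.
ISSUED_BADGES = [
    {"badge": "webmaker-super-mentor", "email": "a@example.com"},
    {"badge": "webmaker-super-mentor", "email": "b@example.com"},
    {"badge": "webmaker-super-mentor", "email": "a@example.com"},
]

@app.route("/metrics/badges/<badge_slug>/count")
def badge_count(badge_slug):
    # De-dupe on email so one person earning a badge twice counts once.
    people = {r["email"] for r in ISSUED_BADGES if r["badge"] == badge_slug}
    return jsonify({"badge": badge_slug, "unique_earners": len(people)})

if __name__ == "__main__":
    app.run()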

Along with the tooling to hook up the data, the badges will also need to be designed and issued before these contributors are counted.

Tracking progress?

The dashboard status wiki page is the best place to track current progress about data sources being added (this is linked from the top of the live dashboard). Also within that wiki page is a link to a Working Document with even more detail.

Trending with caution?

For some of the things we’re displaying on this dashboard, we are logging them now for the very first time, even if contributors may have been contributing for a while. This will unavoidably skew the graph in the short term.

This means our latest Active Contributor numbers will be meaningful, but the rate at which they increase may be misleading while we start counting some of these things for the first time.

Initial upward trends may slow as we finish logging existing contributors for the first time, though this may also be balanced out by the real increases that will occur as we engage new contributors.

What’s coming next?

Today, I’m working on front-end dashboards for each of the MoFo teams which may be more helpful on a week-to-week basis in team meetings.

This is only interim?

This dashboard is starting to look useful. For instance, we have the beginnings of our first contributor trend line (and the good news is it’s going upwards). But this is still only our interim solution.

We are adding up counts from a number of different data-sources but not joining up the data. This data is not de-duped, and when the data sources are all joined up we will need to add potential margin-of-error warnings to the final graph (the real numbers will be smaller than our graph shows).

This interim solution is a deliberate decision, because it’s more useful to have non-de-duped data now, to see trends quickly enough to inform our plans for the year, than it is to have everything joined up perfectly too late in the year to adjust our activities.

In parallel to this interim solution, we are working with MoCo on Project Baloo which will allow us to de-dupe contributors for more accurate counting in the long run.

Want to know more? (Or get involved)

Here are the components that we have wired together so far:

More updates to follow.

A quick update on the interim contributor dashboard

I’ve just updated the main wiki page tracking our contributor dashboard project, so I won’t repeat everything here.

The quick update is that the puzzle pieces that will make our interim contributor dashboard work are coming together now.

Which means we have a live dashboard front-end, (with a few data-holes we need to plug!). This screenshot is just data from Github.

It may be missing loads of data, but what’s there is real and updating automatically 🙂

Let’s gather some more numbers…

Who’s teaching this thing anyway?

This is an idea for Webmaker teacher dashboards, and some thoughts on metadata related to learning analytics.

This post stems from a few conversations around metrics for Webmaker and learning analytics and it proposes some potential product features which need to be challenged and considered. I’m sharing the idea here as it’s easy to pass this around, but this is very much just an idea right now.

For context, I’m approaching this from a metrics perspective, but I’m trying to solve the data gathering challenge by adding value for our users rather than asking them to do any extra work.

These are the kind of questions I want us to be able to answer

and that can inform future decision-making in a positive way…

  • How many people using Webmaker tools are mentors, students, or others?
  • Do mentors teach many times?
  • How many learners go on to become mentors?
  • What size groups do mentors typically work with?
  • How many mentors teach once, and then never again? (their feedback would be very useful)
  • How many learners come back to Webmaker tools several days after a lesson?
  • Which partnership programme reached the greatest number of learners?

And the particularly tricky area…

  • What data points show developing competencies in Web Literacy?

Flexible and organic data points to suit the Webmaker ecosystem

The Webmaker suite of tools is very open and flexible, and as a result gets used by people for many different things. Which, personally, I like a lot. However, this also makes understanding our users more difficult.

When looking at the data, how can we tell if a new Thimble Make has come from a teacher, a student, or even an experienced web developer who works at Mozilla and is using the tool to publish their birthday wishes to the web? The waters here are muddy.

We need a few additional indicators in the data to analyze it in a meaningful way, but these indicators have to work with the informal teaching models and practices that exist in the Webmaker ecosystem.

On the grounds that everyone has both something to teach and to learn, and that we want trainers to train trainers and so on, I propose that asking people to self-identify as mentors via a survey/check-box/preferences/etc will not yield accurate flags in the data.

The journey to identifying yourself as a mentor is personal and complex, and though that process is immensely interesting, there are simpler things we can measure.

The simplest measure is that someone who teaches something is a teacher. That sounds obvious, but it’s very slightly different from someone who thinks of themselves as a teacher.

If we build a really useful tool for teaching (I’m suggesting one idea below) and its use identifies Webmaker accounts as teacher(s) and/or learner(s), then we’d have useful metadata to answer almost all of the questions asked above.

When we know who the learners are we can better understand what learning looks like in terms of data (a crucial step in conversations about learning analytics).

If anyone can use this proposed tool as part of their teaching process, and students can engage with it as students, then anyone can teach or attend a lesson in any order, without having to update their account records to say “I first attended a Maker Party, then I taught a session on remixing for the web, and now I’m learning about CSS and next I want to teach about Privacy”.

A solution like this doesn’t need 100% use by all teachers and learners to be useful (which helps the solution remain flexible if it doesn’t suit everyone). It just needs enough people to use it that we have a meaningful sample of Webmaker teachers and learners flagged in the database.

With a decent sample we can see what teaching with Webmaker looks like at scale. And with this kind of data, continually improve the offering.

An idea: ‘Teacher Lesson Dashboards’

I think Teacher Lesson Dashboards would catch the metadata we need, and I’ll sketch this out here. Don’t get stuck on any naming I’ve made up right now; the general process for the teacher and the learner is the main thing to consider.

1. Starting with a teacher/mentor

User logs in to Webmaker.org

Clicks an option to “Create a new Lesson”

Gets an interface to ‘build-up’ a Lesson (a curation exercise)

Adds starter makes to the lesson (by searching for their own and/or others’ makes)

e.g. A ‘Lesson’ might include:

  • A teaching kit with discussion points, and a link to an X-Ray Goggles demo
  • A thimble make for students to remix
  • A (deliberately) broken thimble make for students to try and debug
  • A popcorn make to remix and report back what they have learned

They give their lesson a name

Add optional text and an image for the lesson

Save their new Lesson, and get a friendly short URL

Then point students to this at the beginning of the teaching session

2. The learner(s) then…

Go to the URL the mentor provides

Optionally, check-in to the lesson (and create a Webmaker account at the same time if required)

Have all the makes and activities they need in one place to get started

One click to view or remix any make in the Lesson

Can reference any written text to support the lesson

3. Then, going back to the mentor

Each ‘Lesson’ also has a dashboard showing:

  • Who has checked-in to the lesson
    • with quick links to their most recent makes
    • links to their public profile pages
    • Perhaps integrating together.js functionality if you’re running a lesson remotely?
  • Metrics that help with teaching (this is a whole other conversation, but it depends first on being able to identify who is teaching who)
  • Feedback on future makes created after the lesson (i.e. look what your session led to further down the line)

4. And to note…

‘Lessons’, as a kind of curated make, can also be remixed and shared in some way.

Useful?

I’m not on the front-lines using the tools right now, so this is a proposal very much from a person who wants flags in a database 🙂

  • Does this feel like it adds value to mentors and/or learners?
  • Do you think this is a good way to identify who’s teaching and who’s learning? (and who’s doing both, of course)

 

What I see in these graphs of Github contribution

Context: Last week I shared a few graphs (1, 2, 3, 4) looking at data from our repositories on Github, extracted using this Gitribution app thing, as part of our work to dashboard contributor numbers for the Mozilla Foundation.
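For anyone wanting to poke at similar data, here’s a minimal sketch of pulling commits for one repository straight from the GitHub API. Gitribution does more than this (and may work differently internally), and the repository name below is just an example:

# Fetch all commits for one repository via the GitHub API.
# Unauthenticated requests are heavily rate-limited; pass a token
# for anything beyond a quick test.
import requests

def fetch_commits(owner, repo, token=None):
    url = "https://api.github.com/repos/%s/%s/commits" % (owner, repo)
    headers = {"Authorization": "token %s" % token} if token else {}
    params = {"per_page": 100}
    commits = []
    while url:
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        commits.extend(resp.json())
        params = None  # the 'next' URL already carries its query string
        url = resp.links.get("next", {}).get("url")
    return commits

commits = fetch_commits("mozilla", "webmaker.org")
print("Commits fetched: %d" % len(commits))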

I didn’t comment on the graphs at the time because I wanted time for others to look at them without my opinions skewing what they might see. This follow up post is a walk-through of some things I see in the graphs/data.

The real value in looking at data is finding ways to make things better by challenging ourselves, and being honest about what the numbers show, so this will be as much about questions as answers…

Also, publishing this last week flagged up some missing repositories and identified some other members of staff, so these graphs are based on the latest version of the data (there was no impact on shapes, but some numbers will be different).

What time of day do people contribute (UTC)?

(Graph: contributions by hour of day, UTC.)

Our paid staff who are committing code are mostly in US/Canadian timezones, and it makes sense that most of their commits are during these hours (graphed by UTC). But what caught my attention here is that the volunteer contribution times follow the same shape; a sketch of this hour-of-day bucketing follows the questions below.

Questions to ask:

  • Do volunteer contributions follow the same shape because contributing code has a dependency on being able to talk in real time with staff? For example in IRC. If so, is this a bottleneck for contributing code?
  • If not, what is creating this shape for volunteer contributors? Perhaps it’s biased to timezones where more people are interested in the things we are building, and potentially biased by language? But looking at support for Maker Party and other activities there is a global audience for our tools.
  • What does a code contribution pathway look like for people in the 0300–1300 UTC window? Is there anything we can do to make things easier or more appealing?
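As mentioned above, here’s the kind of bucketing behind an hour-of-day graph, assuming commit objects shaped like the GitHub API’s output (the fake commits are only there to make it runnable):

# Bucket commit timestamps by UTC hour of day.
from collections import Counter
from datetime import datetime

def commits_by_hour(commits):
    hours = Counter()
    for c in commits:
        # GitHub timestamps are ISO 8601 UTC, e.g. "2014-05-01T14:03:00Z"
        stamp = c["commit"]["author"]["date"]
        hours[datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").hour] += 1
    return hours

fake_commits = [{"commit": {"author": {"date": "2014-05-01T14:03:00Z"}}},
                {"commit": {"author": {"date": "2014-05-02T03:41:00Z"}}},
                {"commit": {"author": {"date": "2014-05-02T14:10:00Z"}}}]
histogram = commits_by_hour(fake_commits)
for hour in range(24):
    print("%02d:00 UTC %s" % (hour, "#" * histogram[hour]))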

The shape of volunteer contributions

(Graph: the shape of volunteer contributions.)

The shape of this graph is pretty typical for any kind of volunteering or activity involving human interactions. It’s close to a power-law graph with a long tail.
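The data behind a graph like this is just a tally of contributions per person, then a count of how many people sit at each level. A minimal sketch, again assuming GitHub-API-shaped commits and using email as an (imperfect) identity key:

# Contributions per person, then the long-tail distribution.
from collections import Counter

def contribution_distribution(commits):
    per_person = Counter()
    for c in commits:
        per_person[c["commit"]["author"]["email"].lower()] += 1
    # Maps "number of contributions" -> "number of people at that level".
    return Counter(per_person.values())

fake = [{"commit": {"author": {"email": "a@example.com"}}},
        {"commit": {"author": {"email": "a@example.com"}}},
        {"commit": {"author": {"email": "b@example.com"}}}]
print(contribution_distribution(fake))  # one person at 2, one person at 1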

If you’ve not looked at a data set like this before, don’t panic that so many people only make a single contribution. At the same time, don’t use the knowledge that this is typical as a reason not to ask questions about how we can be better.

Lots of people want to get involved in volunteering projects but often their good intentions don’t align with their actual available free time. I say this as someone who signs up for more things than fit into my available hours for personal projects.

The two questions I want to ask of this graph are:

  1. Where could our efforts to support contributors best influence the overall shape?
  2. What does this look like at 10x scale?

So, starting with where we could influence shape… my opinion (no data here) says to think about people in this range.

(Graph: the same shape, with a middle band of contributors highlighted.)

To the left of this highlighted area, people are already making code contributions over and above even many staff. Shower them in endless gratitude! But I don’t think they need practical help from us. To the right of this highlighted area is the natural long tail. Supporting that bigger group of people for single-touch interactions is about clear documentation and easy-to-follow processes. But I think the group of people roughly highlighted in that graph are people we can reach out to. These people potentially have capacity to do more. We should find out what they are interested in, what they want to get out of contribution, and build relationships with them. In practical terms, we have finite time to invest in direct relationships with contributors. I think this is an effective place to invest some of that time.

I think the second question is more challenging. What does this look like at 10x scale?

In 2013, ~50 people made a one-time contribution.

  • What do we need in place for 500 people to make a one-time code contribution?
  • Do we have 500 suitable ‘first’ bugs for 2014?
  • Is the amount of setup work required to contribute to our tools appropriate for people making a single contribution?
  • If not, is that a blocker to growing contributor numbers?

In 2013, there were ~1,500 code commits by volunteers.

  • What do we need in place for 15,000 activities on top of planned staff activity?
  • How does this much activity align towards a common product roadmap?
  • How is it scheduled, allocated, reviewed and shipped?

When planning to work with 10x contributor numbers, possibly the biggest shift to consider is the ratio of staff to volunteers:

(Graph: the ratio of staff to volunteer contributors.)

  • How does this impact the time allocated for code reviews?
  • How do we write bugs?
  • How do we prioritize bugs? Etc.
  • Even, what does an IRC channel or a dev mailing list look like after this change?

Other questions to ask:

  • What do we think is the current ‘ceiling’ on our contributor numbers for people writing code?
    • Is it the number of developers who know about our tools and want to help? (i.e. a ‘marketing’ challenge to inspire more people)
    • Is it the amount of suitable work ready and available for people who want to help? (are we losing people who want to help because it’s too hard to get involved?)
    • Both? With any bias?

What do you think?

I’m only one set of eyes on this, so please challenge my observations and feel free to build on this too.

Also, as the data in here is publicly accessible already I think I can publish this Tableau view as an interactive tool you can play with, but I need to check the terms first.

Contribution Graphs part 4: Contributions by Contributors over time

I’m posting a quick series of these without much comment on my part as I’d love to know what you see in each of them.

This is looking at activity in Github (commits and issues), for the repositories listed here. It’s an initial dive into the data, so don’t be afraid to ask questions of it, or request other cuts of this. In the not so distant future, we’ll be able to look at this kind of data across our combined contribution activities, so this is a bit of a taster.

Click for the full-size images.

Contributions by Contributors over time

Last but not least for today, I think there are some stories in this one…

(Graph: contributions by contributors over time.)

Is anything here a surprise? What do you see in this?