I recently installed a local copy of Bicho, and ran this against some products on Bugzilla to test it out. It generates a nicely structured relational database including the things I want to count and feed into our contributor numbers.
This morning I got this running on Heroku, which means it can run periodically and update a hosted DB, which can then feed numbers into our dashboard.
This was a bit of trial and error for me, as all the work I’ve done with Python was within Google App Engine’s setup and my use of Heroku has been for Node apps. These notes are to help me out some time in the future when I look back at this.
Getting this working on Heroku
$ pip freeze
generates a list of the requirements from your working local environment.
Copy this into a requirements.txt file in the root of your project, but remove the line:
Bicho==0.9
(otherwise Heroku tries to install Bicho itself via pip, which fails).
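If you want to script that step, here is a minimal Python sketch of the filtering. The function name strip_local_package and the second sample line are made up for illustration; only the Bicho==0.9 pin comes from my actual pip freeze output.

```python
def strip_local_package(freeze_lines, package="Bicho"):
    """Return pip freeze lines without the pin for `package`.

    `package` defaults to Bicho because that is the pin that
    failed to install from pip on Heroku in my case.
    """
    prefix = package.lower() + "=="
    return [
        line
        for line in freeze_lines
        if not line.strip().lower().startswith(prefix)
    ]


# Hypothetical sample input: only the Bicho line is from the real output.
sample = ["Bicho==0.9", "example-package==1.0"]
for line in strip_local_package(sample):
    print(line)
```

In practice you would feed it the real lines from pip freeze and write the result to requirements.txt.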
You can now push this to Heroku.
Then, I ran:
$ heroku run python setup.py
But I’m actually not sure if that was required.
Then you can run Bicho remotely via heroku run commands:
$ heroku run python bin/bicho --db-user-out=yourdbusername --db-password-out=yourdbuserpassword --db-database-out=yourdbdatabase --db-hostname-out=yourdbhostname -d 5 -b bg --backend-user 'firstname.lastname@example.org' --backend-password 'bugzillapasswordexample' -u 'https://bugzillaurl.com?etc'
As a general precaution for anything like this, don’t use a user account that has any special privileges. I create duplicate logins that have the same level of access available to any member of the public.
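In the same cautious spirit, one option I have not tried is to keep the credentials out of the command text and build the argument list from environment variables. This is only a sketch: the variable names (BICHO_DB_USER and so on) are hypothetical, and the flags are just the ones from my working command above.

```python
import shlex


def build_bicho_args(env):
    """Build the Bicho command line from a mapping of settings.

    The keys below are hypothetical names; the flags mirror the
    command shown earlier in this post.
    """
    return [
        "python", "bin/bicho",
        "--db-user-out=" + env["BICHO_DB_USER"],
        "--db-password-out=" + env["BICHO_DB_PASSWORD"],
        "--db-database-out=" + env["BICHO_DB_NAME"],
        "--db-hostname-out=" + env["BICHO_DB_HOST"],
        "-d", "5",
        "-b", "bg",
        "--backend-user", env["BUGZILLA_USER"],
        "--backend-password", env["BUGZILLA_PASSWORD"],
        "-u", env["BUGZILLA_URL"],
    ]


# Placeholder values, matching the placeholders used in this post.
example_env = {
    "BICHO_DB_USER": "yourdbusername",
    "BICHO_DB_PASSWORD": "yourdbuserpassword",
    "BICHO_DB_NAME": "yourdbdatabase",
    "BICHO_DB_HOST": "yourdbhostname",
    "BUGZILLA_USER": "email@example.com",
    "BUGZILLA_PASSWORD": "bugzillapasswordexample",
    "BUGZILLA_URL": "https://bugzillaurl.com?etc",
}
args = build_bicho_args(example_env)
print(" ".join(shlex.quote(a) for a in args))
```

On Heroku you would pass os.environ instead of example_env and hand the list to subprocess, assuming the values have been set as config vars.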
Once you’ve got a command that works here, cancel the running script as it might have thousands of issues left to process.
Then set up a scheduler: https://devcenter.heroku.com/articles/scheduler
$ heroku addons:add scheduler:standard
$ heroku addons:open scheduler
Copy your working command into the scheduler, just without the ‘heroku run’ part:
python bin/bicho --db-user-out=yourdbusername --db-password-out=yourdbuserpassword --db-database-out=yourdbdatabase --db-hostname-out=yourdbhostname -d 5 -b bg --backend-user 'email@example.com' --backend-password 'bugzillapasswordexample' -u 'https://bugzillaurl.com?etc'
If you set this to run every 10 minutes, the process will cycle and get killed periodically, but the logs usefully show you how the import is progressing.
I’m generally happy with this as a solution for counting contributors in Webmaker’s issue tracking history, but would need to work on some speed issues if this was of interest across Mozilla projects.
Currently, this is importing about 400 issues an hour. At that rate, processing the 1,000,000+ bugs in bugzilla.mozilla.org would take at least 2,500 hours, which is more than 100 days of continuous running. But that’s not a problem to solve right now. And this is not necessarily the way you’d want to do that either.
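For a rough sense of scale, here is the back-of-envelope arithmetic as a small Python sketch. It assumes the import rate stays constant at the 400 issues an hour I am seeing, which it may well not.

```python
def hours_to_import(total_issues, issues_per_hour):
    """Estimate import time in hours at a constant rate."""
    return total_issues / issues_per_hour


# 1,000,000 bugs at roughly 400 issues an hour (both figures from above).
hours = hours_to_import(1_000_000, 400)
days = hours / 24
print("hours:", hours)
print("days:", round(days, 1))
```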