DevLog for Sep 10-11 2014

Back from an awesome vacation. Too awesome to write about, even :) Suffice it to say, England has some really pretty places.

Some Android app work, and lots of monitoring work

  • Fixed bugs preventing the Wikipedia Android Alpha from building properly. Now it builds properly whenever there is a new commit. Hooray! This was primarily caused by me forgetting to give it lots of RAM (8G VMEM) to execute the mvn build commands (https://gerrit.wikimedia.org/r/#/c/159482/) and also not cleaning up previous .alpha subfolders (https://gerrit.wikimedia.org/r/#/c/159481/) – this causes a chain of .alpha.alpha.alpha.* subfolders, breaking the build.
  • Added a patch to the Android alpha app itself that checks for updates every day or so and notifies you if there’s a new one. Was fairly trivial to write, although I was hoping to make it more seamless (i.e. download the apk myself and just pop it up for people to tap). It now requires 4 clicks to install it, should be able to bring it down to 2 at some point in the future if people care enough.
  • Added a method to our check_graphite code that lets you individually check a bunch of metrics for thresholds (https://gerrit.wikimedia.org/r/#/c/159473/). This makes it much simpler to do icinga checks on a bunch of metrics that are all measuring the same thing but from different machines. BetaLabs and ToolLabs checks use this.
  • Cleaned up a bunch of minor things with our check_graphite script. Also fucked up trying to replace all double quotes in it with single quotes for consistency – it replaced the double quotes being used inside single quotes, and caused all checks to fail. Fixed shortly by https://gerrit.wikimedia.org/r/#/c/159711/
  • Added more monitoring for betalabs! Now checks for stale puppet runs (https://gerrit.wikimedia.org/r/#/c/159701/) and low space on the root partition (https://gerrit.wikimedia.org/r/#/c/159694/). All are green now, thanks to some work from bd808.
  • Added monitoring for ToolLabs! Now checks for stale puppet runs, low space on root and /var, and puppet failure events (https://gerrit.wikimedia.org/r/#/c/159709/). Also checks for high sustained CPU usage (https://gerrit.wikimedia.org/r/#/c/159751/). Then spent some time (with help from scfc_de (whose nick I kept spelling as scfe_de until today)) cleaning up the puppet failures. They are all green now as well.
  • Did a bunch of cleaning up work around the graphite role, removing the realm branching (https://gerrit.wikimedia.org/r/#/c/159759/). Ori says every time realm branching code is removed, an angel gets its wings, so well done there.
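
To illustrate the per-metric threshold check mentioned above, here is a minimal Python sketch. It is not the real check_graphite code: the function name, the shape of the input data and the "judge each series by its latest datapoint" rule are all assumptions made for illustration. The status numbers follow the usual icinga/Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```python
# Hypothetical sketch, NOT the actual check_graphite implementation.
# Each series is judged on its own, so one unhealthy machine cannot
# hide behind a group of healthy ones.

OK, WARNING, CRITICAL = 0, 1, 2


def check_each_series(series, warning, critical):
    """Return (status, offenders) for a mapping of metric name -> datapoints.

    `series` maps a metric name to a list of numbers (None = missing point).
    A series is judged by its most recent non-missing datapoint.
    """
    status = OK
    offenders = []
    for name, points in sorted(series.items()):
        values = [p for p in points if p is not None]
        if not values:
            continue  # no data at all; a real check might want to flag this
        latest = values[-1]
        if latest >= critical:
            status = max(status, CRITICAL)
            offenders.append((name, latest))
        elif latest >= warning:
            status = max(status, WARNING)
            offenders.append((name, latest))
    return status, offenders


if __name__ == "__main__":
    # Made-up metric names and values, purely for demonstration.
    data = {
        "host-a.root_used_percent": [40, 45, None],
        "host-b.root_used_percent": [80, 91, 96],
    }
    print(check_each_series(data, warning=85, critical=95))
```

The overall status is the worst status of any single series, and the offenders list says which machines tripped the threshold, which is the useful part when the same metric comes from many hosts.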

Not a bad day, eh? I’ve been trying to wake up early, perhaps that is helping.

Shitty Web advertising #1

[image]

The message icon was flashing obnoxiously. I wonder how much money CNet made off that.

I don’t notice these on desktop thanks to Adblock. Should set it up on my phone too.

DevLog for 31 Aug, 1-4 Sep 2014

Missed DevLogging for a while.

Am in London now.

  • Started using a spare Majestouch Ninja 2 over my regular Kinesis Advantage. This is way more portable, and my hand does not seem to be hurting while using it (so far only about 4-5 hours). If this keeps up, I should be able to move to a similar smaller keyboard over the much bulkier Kinesis. There’s still a little bit of discomfort, so I’ll probably want a very portable and mechanical split keyboard. Sadly, I can’t seem to find any, though :( Maybe I should just build one with an arduino :)
  • Set up icinga checks for puppet failures and disk space issues on betalabs, and fixed a bunch of issues/docs in our icinga puppet code during that time. This still doesn’t properly work since our implementation of check_graphite does not support wildcard metrics properly – it should check thresholds for each series, but it seems to do that only across the entire series combined, which is kinda useless. Should fix that soon by adding more features to it. Also might try out other alternatives to icinga, since our icinga puppet code is a fuckball anyway.
  • Fixed a couple more Quarry bugs. There’s still a random bug where celery seems to be attempting to read data about a query run from mysql before the web has committed it, which is theoretically impossible (I do a commit before sending the task to celery with the id), so I suspect some mysql fuckery. Will need to debug that sooner rather than later, and also consider moving to postgres. But then Quarry will have to deal with SQLite (for result storage), MySQL (for connecting to labsdb) and with postgres for local data, which sounds insanely complex. I also added CORS support to resultsets, and Magnus is playing with it (wooohhooooo!!!). I’m going to add more features to make it easier for people to use results from quarry in their JS applications elsewhere. Should be fun.
  • Finished videos for the first week of the Coursera Data Analysis and Statistical Inference class I’m taking. Started poking around R since the labs for that class are from R, should be fun.
  • Chad has started his devlog. He does search stuff at Wikimedia and is a co-whiner about all things Java. Do check it out :)

Am away on ‘vacation’ till Wednesday, yay! :) Should disconnect well.

DevLog for 29, 30 Aug 2014

Chill weekend. Didn’t really do anything code related. Recovering from friday night party :)

Started reading Data + Design which seems quite nice. Also starting a coursera course on Data analysis and Statistical Inference on Sep 1, should be fun.

DevLog for 28 Aug 2014

Let’s see. I’m also going to attempt to include patch links wherever possible.

  • Cleaned up session handling bugs in Quarry. They were previously not closing properly, causing SQLAlchemy exceptions now and then that’ll let queries die in a ‘queued’ state. This should hopefully be fixed by https://gerrit.wikimedia.org/r/#/c/156909/.
  • Made https://graphite.wmflabs.org somewhat more stable again for use by Wikimedia BetaLabs folks by just blacklisting all other projects from sending data to it (with a local whitelist hack + https://gerrit.wikimedia.org/r/#/c/156966/). This would let them use it, and they said they would find it massively useful. All this is a hack until https://rt.wikimedia.org/Ticket/Display.html?id=8163 could be resolved and I can deploy graphite on a ‘real’ machine.
  • Minor patch to my Gerrit -> IRC bot to make it strip a common prefix (‘operations/’) from change messages posted to the operations channel – https://gerrit.wikimedia.org/r/#/c/156877/. Also made one that calls jenkins-bot jerkins-bot if it -1’s your patch (https://gerrit.wikimedia.org/r/#/c/156878/) but it seems to have a bug that I can’t be bothered to debug atm.
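
The prefix stripping is simple enough to sketch in a few lines of Python. This is only an illustration and not the bot's actual code; the function name and the default prefix argument are assumptions.

```python
# Hypothetical sketch of trimming a channel-specific prefix from a
# Gerrit project name before posting it to IRC. Not the bot's real code.
def strip_prefix(project, prefix="operations/"):
    """Drop `prefix` from the start of `project`, if it is there."""
    if project.startswith(prefix):
        return project[len(prefix):]
    return project


if __name__ == "__main__":
    print(strip_prefix("operations/puppet"))
    print(strip_prefix("mediawiki/core"))
```

Projects that do not start with the prefix pass through untouched, so the same function can be applied to every message without special-casing.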

Might spend some more time with Hive over the next few days – figured out an approach for using it from Python, and it should be fun to do so!

DevLog for 25 and 26 Aug 2014

Not very code heavy

  • Couple of pull requests (#1 and #2) for the Atom Autosave plugin. One adds a preference to not autosave when you are explicitly closing a window / pane, and the other just sets the ‘enabled’ preference to default. CoffeeScript isn’t too bad either! I should consider writing more plugins (I currently use Atom for CSS/JS/Puppet, should try other languages)
  • Added CSV, TSV and JSON download options to Quarry. There’s another Webinar tomorrow by the Grantmaking team, and J-Mo asked for it. Streaming TSV and CSV implemented in a neat way, will write a blog post tomorrow about it.
  • Started work on a ‘number of editors’ per country metric for WPDMZ, needs to be finished up.
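
One common way to stream CSV/TSV is a generator that formats one row at a time, so the whole resultset never has to sit in memory as a single string. The sketch below shows that general technique with the standard library only; it is not necessarily how Quarry does it, and the function name and arguments are assumptions.

```python
import csv
import io
import itertools


def stream_delimited(header, rows, delimiter=","):
    """Yield delimited text one row at a time.

    `rows` can be any iterable (for example a database cursor), so the
    full resultset never needs to be held in memory at once.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for row in itertools.chain([header], rows):
        writer.writerow(row)
        yield buffer.getvalue()
        # Reset the scratch buffer so the next row starts from empty.
        buffer.seek(0)
        buffer.truncate(0)


if __name__ == "__main__":
    rows = [(1, "Main Page"), (2, "Hello, world")]
    for chunk in stream_delimited(["id", "title"], rows):
        print(chunk, end="")
    # Same data as TSV: only the delimiter changes.
    for chunk in stream_delimited(["id", "title"], rows, delimiter="\t"):
        print(chunk, end="")
```

Most Python web frameworks accept a generator like this as a response body, and letting the csv module do the quoting avoids hand-rolled escaping bugs.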

I feel a bit exhausted (physically and mentally) from the intense coding over the last few weeks, might have a few chill days to recharge myself. I’m growing old! :(

DevLog for Aug 24, 2014

Wooooo!

  • Moved labsbooks (described in yesterday’s devlog) to use a shared readonly IPython virtualenv maintained by me. Also installed a bunch of modules people might want to use (SciPy, NumPy, Pandas, PyTables, matplotlib). Am considering just installing IPython notebook globally via puppet and using that, since that’ll enable users to just use the system packages. However, the version of IPython notebook from Ubuntu is ancient, so that’s probably a non-starter.
  • Have a basic version of the IPython publishing process working! Any toollabs user can create a notebook by:
    • Creating a ~/notebooks folder
    • Doing a chmod +x ~/notebooks
    • Doing a chmod +x ~
    • Putting IPython notebooks into ~/notebooks (as .ipynb files)
    • Going to https://tools.wmflabs.org/notebooks/<user-shell-name>/<path-to-ipynb-file>
    Will have to do a bit more work before it can be considered ‘production grade’ (such as user pages, a nicer theme, etc, etc) BUT YAY GOOD START. It already caches the html output in Redis and invalidates with the mtime of the file, so should be pretty quick.
  • Made the ssh tunneling process for labsbooks purely python, without requiring the ProxyCommand. This makes things simpler (and more portable!). I’ll need to work on securing this properly before I can publish it for broader use.
  • Wrote an email to the analytics mailing list about making public the ‘edits per country’ data. I hope to make this publicly available with enough granularity that not just me but others can use this for fun research as well.
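
The "cache the html, invalidate on mtime" idea can be sketched like this. It is an illustration only: a plain dict stands in for Redis, and the class name, the `render` callback and the cache layout are assumptions rather than the actual labsbooks code.

```python
import os


class MtimeCache:
    """Cache rendered output per file, reusing it until the file's mtime changes."""

    def __init__(self, render, store=None):
        self._render = render  # function: path -> html string
        # A dict stands in for Redis here; any get/set style store would do.
        self._store = {} if store is None else store

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._store.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # file untouched since the last render
        html = self._render(path)
        self._store[path] = (mtime, html)
        return html
```

Storing the mtime next to the rendered html means a stale entry is detected with a single `stat` call, which is far cheaper than converting the notebook again.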

I’ve been using Atom for puppet stuff, PyCharm for Python and IntelliJ for Java stuff, and that seems to be doing ok. They all have decent Vim keybindings as well, and good replacements for other functionality – and I might stick with Atom for a while to see how it goes :)

DevLog for 23 Aug 2014

Back in Glasgow! It was actually not very cold today, only cold! Progress.

Started working on my long-abandoned labsbook project, which aims to make Tool Labs a first class environment for people who want to run (and publish) iPython Notebooks while also being able to access the replica databases and the dumps. Doing this in a secure manner is kinda hard, but I think I’ve a neat solution that lets everyone run a personal iPython kernel on the Grid, access it from their local machine, and also publish it to the web from a standard location. So far, I’ve gotten my script to a point where it’ll set up an iPython environment for you if it doesn’t already exist, start the kernel if it isn’t already running, and tunnel the editing interface back to you to use! Things left to do include:

  1. Open up the browser when tunnel is open
  2. Find a sane way to kill a kernel that hasn’t been doing anything since forever
  3. Set up a shared iPython environment (just code, readonly) so people don’t have to set up their own environments every time (this is primarily a performance enhancement)
  4. Find a nice and simple way for iPython notebooks to be published. I’m currently thinking of a URL such as tools.wmflabs.org/notebooks/<username>/<notebookname> to display them, and an index at tools.wmflabs.org/notebooks/. This shouldn’t be too hard with appropriate permission munging.

I’m also using paramiko for this, which makes writing SSH related code with Python a breeze. It even supports proxycommand! Blunders I’ve done while getting up to this point include:

  1. Running jsub run.bash -mem 4G instead of jsub -mem 4G run.bash and wondering why my script kept getting killed with OOM.
  2. Trying to do a pip install on the Grid nodes (which don’t have build tools) instead of on tools-dev and wondering why running it from the commandline works but from jsub does not.
  3. Wondering why my SSH Tunnel kept dying and trying to debug that without realizing that it was dying because the iPython process was dying because it was OOMing because of my earlier jsub error
  4. Thinking that user accounts (rather than tool accounts) cannot submit jobs to the grid, while the problem was that I had not set the execute bit on my script.

Once this is done (I suspect tomorrow), I’ll work on getting the data from my work with WPDMZ into a form good enough for publicizing (removing ways of de-anonymization), and then use iPython notebooks to make graphs! This should be fun :)

Source so far available on Github. Needs more work / documentation / cleaning up.

Devlog for Aug 22 2014

Was in Edinburgh again, missed writing it as I went along. Oh well.

  • Ported the code for the Wikipedia Android app automated builds to Python, and you can see it in action at http://tools.wmflabs.org/wikipedia-android-builds/ now. It lets you download the latest build, and notes the last successful build time. Good enough :) It was originally in bash, and porting it to Python allowed me to create a ‘fake’ API (just JSON blobs written to known locations in the file system). Next step is to write a helper app.
  • The Atom experiment is coming along well. Am using it for most of my Puppet work these days. Should give LightTable another go as well.
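
A ‘fake’ API of the kind described above (JSON blobs at known paths, served as static files) can be sketched as follows. This is not the actual build script: the function name, file name and payload keys are made up for illustration. Writing to a temporary file and renaming it is one way to make sure clients never read a half-written blob.

```python
import json
import os
import tempfile


def publish_json(path, payload):
    """Write `payload` as JSON to `path`, replacing any old blob atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem and therefore happens in a single step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
        # mkstemp creates owner-only files; loosen that so a web server can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Hypothetical file name and keys, for demonstration only.
    publish_json("latest.json", {"status": "success", "commit": "abc123"})
```

A client then only needs a plain HTTP GET on the known path, so no server-side code is involved at all.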

That’s it!

Devlog for 21 Aug 2014

In Edinburgh! I’ve finally stopped spelling it as Edinborough!

  • Added ‘user group’ functionality to Quarry, and added a sudo user group that does what you would think it does. Will be assigned super, super sparingly.
  • Found out that I’d have to explicitly specify charset of the database when I’m creating it or MySQL will default to a stupid charset. Forced all tables and columns to utf8 and that seems to have fixed a bunch of unicode issues. Yay?
  • Still facing occasional “MySQL server has gone away” errors with SQLAlchemy for the local MySQL instance, despite asking SQLAlchemy to recycle connections every hour or so. Reduced the recycle time to 10m, hopefully that helps.
  • Read Tony Hoare’s Turing Award speech from 1980, titled “The Emperor’s Old Clothes”. I think I should read more of these papers / speeches, helps keep perspective and ‘learn from history’. Lots of warnings against complexity seem to be a very common theme, and one I’ve also personally encountered many times. Recommend reading :)
  • More DMZ work! Now running edits per country stats separated by mobile vs desktop for all countries for all wikipedias! EXCITING!

© 2014 Yuvi Panda
