Note: I’m trying to explicitly spend time on random side projects that are unrelated to whatever I’m actively working on as my main project.

A random thread started by Ironholds on a random mailing list I was wearily catching up on contained a joke from bearloga about malformed User Agents. This prompted me to write UAuliver (source), a Firefox extension that sets your user agent to a random string of emoji. This breaks a surprisingly large amount of software, I’m told! (GMail & Gerrit being the ones I explicitly remember)

Things I learnt from writing this:

  1. Writing Addons for Firefox is far easier to get started with than it was the last time I looked. Despite the confusing naming (Jetpack API == SDK != WordPress’ Jetpack API != Addons != Plugins != WebExtension), the documentation and tooling were nice enough that I could finish all of this in a few hours!
  2. I can still write syntactically correct JavaScript! \o/
  3. Generating a ‘string of emoji’ is easier/harder than you would think, depending on how you would like to define ‘emoji’. The fact that Unicode deals in blocks that, at least in this case, aren’t too split up made this quite easy (I used the list on Wikipedia to generate them). JS’s String.fromCodePoint turns a randomly generated codepoint into a string, but it only rejects values outside the valid Unicode range, so it cannot by itself tell you whether that codepoint is actually allocated.
  4. I don’t actually know how HTTP headers deal with encoding and Unicode. This is something I need to actually look up. Perhaps a re-read of the HTTP RFC is in order!
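The block-based approach from point 3 can be sketched roughly like this. This is a Python illustration, not the extension's actual JavaScript: the three block ranges are an assumption (they are common emoji-bearing Unicode blocks, not necessarily the ones UAuliver uses), and "allocated" here means "assigned in the Unicode database that ships with your Python".

```python
import random
import unicodedata

# Assumed block ranges (inclusive); not necessarily the ones UAuliver uses.
EMOJI_BLOCKS = [
    (0x1F300, 0x1F5FF),  # Miscellaneous Symbols and Pictographs
    (0x1F600, 0x1F64F),  # Emoticons
    (0x1F680, 0x1F6FF),  # Transport and Map Symbols
]


def is_assigned(codepoint):
    """True if this Python's Unicode database has a character here."""
    # General category "Cn" means "Other, not assigned".
    return unicodedata.category(chr(codepoint)) != "Cn"


def random_emoji(rng):
    """Pick a random assigned code point from one of the blocks."""
    while True:
        start, end = rng.choice(EMOJI_BLOCKS)
        codepoint = rng.randint(start, end)
        if is_assigned(codepoint):
            return chr(codepoint)


def random_emoji_string(length, rng=None):
    """Build a string of `length` random emoji-block characters."""
    if rng is None:
        rng = random.Random()
    return "".join(random_emoji(rng) for _ in range(length))
```

Calling `random_emoji_string(12)` gives a twelve-character candidate user agent. Retrying on unassigned code points matters mostly for blocks that still have holes in them.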

It was a fun exercise, and I might write more Firefox extensions in the future!

DNS servers, localhost and asynchronous code

localhost is always 127.0.0.1, right? Nope, it can also be ::1 if your system only has IPv6 (apparently).

Asking a DNS server for an A record for localhost should give you back 127.0.0.1, right? Nope – it varies wildly! One server gives me an NXDOMAIN, which means it tells you straight up THIS DOMAIN DOES NOT EXIST! Which is true, since localhost isn’t a domain. But if you ask the same thing of any dnsmasq server, it’ll tell you localhost is 127.0.0.1. Other servers vary wildly – I found one that returned an NXDOMAIN for AAAA but an address for A (which is pretty wild, since NXDOMAIN makes most software treat the domain as not existing and not attempt other lookups). So localhost and DNS servers don’t mix very well.

But why is this a problem, in general? Most DNS resolution happens via the gethostbyname libc call, which reads /etc/hosts properly, right? The problem is that there is popular software that’s completely asynchronous (cough nginx cough) that does not use gethostbyname (since that’s synchronous) and instead queries DNS servers directly (and asynchronously). This works perfectly well until you try to hit localhost and it tells you ‘no such thing!’.

I should probably file a bug with nginx to have them read /etc/hosts as well, and in the meantime work around it by giving nginx the loopback IP address directly rather than localhost.
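To make the ‘read /etc/hosts as well’ idea concrete, here is a minimal Python sketch of a resolver that checks hosts-file data before going to DNS. It is not how nginx works or would implement it; `SAMPLE_HOSTS` is made-up sample data rather than a real file, and `query_dns` is a hypothetical callable standing in for whatever asynchronous DNS client is in use.

```python
import ipaddress

# Sample contents in /etc/hosts format (illustrative, not read from disk).
SAMPLE_HOSTS = """\
# loopback entries
127.0.0.1   localhost
::1         localhost ip6-localhost
"""


def lookup_in_hosts(hosts_text, name):
    """Return every address that hosts_text maps to name, in file order."""
    addresses = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line, or an address with no names
        address, names = fields[0], fields[1:]
        if name.lower() in (n.lower() for n in names):
            addresses.append(ipaddress.ip_address(address))
    return addresses


def resolve(name, hosts_text, query_dns):
    """Check the hosts data first; fall back to DNS only if it has no entry."""
    found = lookup_in_hosts(hosts_text, name)
    if found:
        return found
    # query_dns is a hypothetical stand-in for the real DNS lookup.
    return query_dns(name)
```

With this ordering, `resolve("localhost", SAMPLE_HOSTS, query_dns)` never reaches the DNS server at all, so it does not matter whether that server answers NXDOMAIN.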

How did your Thursday go?

Simple python packaging for Debian / Ubuntu

(As requested by Jean-Fred)

One of the ‘pain points’ with deploying python stuff at Wikimedia is that pip and virtualenvs are banned on production, for some (what I now understand as) good reasons (the solid Signing / Security issues with PyPI, and the slightly less solid but nonetheless valid ‘If we use pip for python and gem for ruby and npm for node, we get an EXPLOSION OF PACKAGE MANAGERS, which makes things harder to manage’). I was whining about how hard debian packaging was for quite a while without checking how easy/hard it was to package python specifically, and when I finally did, it turned out to be not that hard.

Use python-stdeb.

Really, that is it. Ignore most other things (until you run into issues that require them :P). It can translate most python packages that are packaged for PyPI into .debs that mostly pass lintian checks. The simplest way to package, which I’ve been following for a while, is:

  1. Install python-stdeb (from pip or apt). Usually requires the packages python-all, fakeroot and build-essential, although for some reason these aren’t required by the debian package for stdeb. Make sure you’re on the same distro you are building the package for.
  2. git clone the package from its source
  3. Run python setup.py --command-packages=stdeb.command bdist_deb (or python3 if you want to make a Python 3 package)
  4. Run lintian on it. If it spots errors, go back and fix them, usually by editing the setup.py file (or sometimes a stdeb.cfg file). This is usually rather obvious and easy enough to fix.
  5. Run dpkg -i <package> to try to install the package. This will error out if it can’t find the packages that your package depends on, which means that they haven’t been packaged for debian yet. You can mostly fix this by finding that package, making a deb for it, and installing it as well (recursively making debs for packages as you need them). While this sounds onerous, the fact is that most python packages already exist as deb packages and stdeb will just work for them. You might have to do this more if you’re packaging for an older distro (cough cough precise cough cough), but it is much easier on newer distros.
  6. Put your package in a repository! If you want to use this on Wikimedia Labs, you should use Labsdebrepo. Other environments will have similar ways to make the package available via apt-get. Avoid the temptation to just dpkg -i it on machines manually :)
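Steps 3 to 5 can be strung together in a small script. The following is only a sketch, under a few assumptions: the project has a setup.py and uses the usual `python setup.py ...` invocation for stdeb, the `deb_path` you pass in is whatever .deb the build actually produced (the filename in the usage note is hypothetical), and dpkg is run through sudo. Nothing is executed until you call `run_steps`.

```python
import subprocess


def packaging_commands(deb_path, python="python"):
    """Return the argv lists for build, lint and install, in that order."""
    return [
        # Step 3: build the .deb (assumes the project has a setup.py).
        [python, "setup.py", "--command-packages=stdeb.command", "bdist_deb"],
        # Step 4: check the result with lintian.
        ["lintian", deb_path],
        # Step 5: try installing it; using sudo here is an assumption.
        ["sudo", "dpkg", "-i", deb_path],
    ]


def run_steps(commands, cwd=".", runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        # check=True makes subprocess.run raise CalledProcessError on failure.
        runner(argv, cwd=cwd, check=True)
```

Usage would look something like `run_steps(packaging_commands("deb_dist/python-example_0.1-1_all.deb", python="python3"), cwd="path/to/clone")`. In practice you will probably stop after the lintian step a few times to fix things before moving on to dpkg.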

That’s pretty much it! Much simpler than I originally expected, and without many confusing / conflicting docs. The docs for stdeb are pretty nice and complete, so do read them!

Will update the post as I learn more.

Paying for IRCCloud

I’ve started paying for IRCCloud.

It is the first ‘service’ I am paying for as a subscriber, I think. I’ve considered doing that for a long time, but ‘paying for IRC’ just felt… odd. I’ve been using ZNC + LimeChat. It’s decent, but sucks on Mobile. Keeping a socket open all the time on a phone just kills the battery, plus the UX on most Android clients sucks.

So after seeing Sam Smith use IRCCloud during Wikimania, I took the plunge and paid for IRCCloud. It is still connecting to my bouncer, so I have logs under my control. It also has a very usable Android client, syncs ‘read’ status across devices, and is quite fast.

Convenience and UX won over ‘Free Software’ this time.

“Write once, attempt to debug everywhere”

“Write once, attempt to debug everywhere” is probably way more accurate than “Write once, run anywhere” – at least for GUI apps. There’s something rather special about ideas that are theoretically amazingly wonderful yet end up being a major pain in the butt when you try to put them in practice, isn’t there?

The older Wikipedia App happened to be in PhoneGap, and I consider it one of my biggest blunders to not have torn it down on day 0 and rewritten it in something saner. I once started writing a blog post for the Wikimedia Blog about why we switched from PhoneGap to native apps, but it was killed for having too much profanity in it :) Someday, perhaps.

Wikipedia Apps reboot taking shape nicely


Prettier, faster and more feature packed! Native Wikipedia app for Android coming real soon!

PHP being insane, part 5832

Somehow: ( $a && $b ) || ( $b && $c ) || ( $a && $c )

Became: $a ? ( $b || $c ) : ( $b && $c )

Became: count( array_filter( array( $a, $b, $c ) ) ) >= 2

Became: "$a$b$c" > 1

God dammit PHP…

(from discussion among me, ^d, ori-l, bd808 and anomie on #mediawiki-core about how to represent ‘if at least 2 of three conditions are true’)
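For the skeptical, here is a quick truth-table check of the four forms, sketched in Python rather than PHP. The last function only emulates the PHP string trick for plain booleans, relying on PHP interpolating true as "1" and false as "" and then comparing the result as a number; it says nothing about how other PHP values would behave.

```python
from itertools import product


def pairwise(a, b, c):
    return (a and b) or (b and c) or (a and c)


def ternary(a, b, c):
    return (b or c) if a else (b and c)


def counted(a, b, c):
    # Python booleans sum as 0/1, like counting the truthy array entries.
    return sum([a, b, c]) >= 2


def php_string_trick(a, b, c):
    # Emulation of PHP's "$a$b$c" > 1 for booleans only:
    # true interpolates as "1", false as "", compared numerically.
    digits = "".join("1" if flag else "" for flag in (a, b, c))
    return int(digits or "0") > 1


FORMS = [pairwise, ternary, counted, php_string_trick]


def disagreements():
    """Return every input combination on which the forms differ."""
    return [
        combo
        for combo in product([False, True], repeat=3)
        if len({bool(form(*combo)) for form in FORMS}) != 1
    ]
```

`disagreements()` comes back empty, so for boolean inputs all four really do mean ‘at least two of three’.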

DevLog for Sun, Aug 25, 2013

I haven't been writing many DevLogs of late. Time to fix that!

WLM Android App

Spent some time reviving the WLM Android App. Wasn't too hard, and am surprised at how well it still runs :) Some work still needed to update the templates and other metadata to refer to WLM2013 rather than WLM2012 – but that should not be too hard. The fact that it is an issue at all is simply because I ripped out all the Campaign related APIs a few weeks ago with my UploadCampaign rewrite.

multichill was awesome in moving the Monuments API to Tool Labs – hence making it much faster! Initially we thought that the Tool Labs DB was too slow for writes – but this turned out to be a mistake, since apparently only the Replica Databases had slow writes, while tools-db itself was fine. There's a bug tracking this now. The Tool Labs version of the API still seems much faster to me than Toolserver's :)

UploadCampaigns API

MediaWiki sucks. Eeeew! Specifically, writing API modules – why can't we just be happy and have everything be JSON? Sigh!

I'm adding a patch that allows UploadCampaigns to be queried selectively, rather than just via the normal page APIs. Right now, this only lets us filter by enabled status – but in the future, this should be able to also filter on a vast array of other properties. Properties about Geographic location come to mind as the most useful. That patch still has a good way to go before it can be merged (continue support being the foremost missing piece), but it is getting there :)

The ickiest part of the patch is perhaps that it sends out raw JSON data as a… string. So no matter which format you are using on your client, you need to use a JSON parser to deal with the Campaigns data. This sort of makes sense, since that is how the data is stored anyway. Doesn't make it any less icky, though!
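From the client side, the ickiness looks roughly like the double parse below. This is a sketch with a made-up response shape: the `campaigns`, `name` and `config` field names are assumptions for illustration, not the real API output.

```python
import json

# Hypothetical response shape; the field names are made up for illustration.
RAW_RESPONSE = json.dumps({
    "campaigns": [
        {"name": "example-campaign", "config": json.dumps({"enabled": True})},
    ]
})


def campaign_configs(raw_response):
    """Map each campaign name to its decoded config."""
    outer = json.loads(raw_response)  # first parse: the API envelope
    return {
        # second parse: the campaign data arrives as a JSON string
        campaign["name"]: json.loads(campaign["config"])
        for campaign in outer["campaigns"]
    }
```

A client using a non-JSON output format would still need that second `json.loads` step for the campaign data.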

Not bad for a lazy Sunday, eh?

Update: After not being able to sleep, I also submitted a patch to make phpcs pass for UploadWizard, and also fought with the UploadCampaigns API patch to have it (brokenly?) support continuing. Yay?

Die ’80 cols or die!’ guidelines!

Python's PEP 8 has just been changed to no longer insist on sticking to 79 columns! The new text says:

Aim to limit all lines to a maximum of 79 characters, but up to 99 characters is acceptable when it improves readability.

It would be nice to not have any such set limits at all, and just depend on programmers not being insane, but this is still an improvement!
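Read literally, the new wording gives two thresholds instead of one. A tiny sketch of that reading, with labels that are my own and not anything from PEP 8 or an existing linter:

```python
SOFT_LIMIT = 79  # the length to aim for
HARD_LIMIT = 99  # acceptable when it improves readability


def classify_line(line):
    """Label a source line by length; the labels are illustrative only."""
    length = len(line.rstrip("\n"))
    if length <= SOFT_LIMIT:
        return "ok"
    if length <= HARD_LIMIT:
        return "acceptable"
    return "too long"
```

Whether a given 80 to 99 character line actually ‘improves readability’ is, of course, still up to the programmer not being insane.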

Sprinkling some Douchebaginess in code

After being frustrated at Java's lack of a generic 'callback' type, I created this interface:

    public interface ContributionUploadProgress {
        void onUploadStarted(Contribution contribution);
        boolean isJavaAPieceOfShit();
    }

And randomly threw this around (with onComplete implementing ContributionUploadProgress):

    assert onComplete.isJavaAPieceOfShit();

This, of course, is trivial to fix with an IDE. Should be more fun with a dynamic language :)

(And yes, I removed that code before committing)

