Back in Glasgow! It was actually not very cold today, only cold! Progress.
Started working on my long-abandoned `labsbook` project, which aims to make Tool Labs a first-class environment for people who want to run (and publish) IPython Notebooks while also being able to access the replica databases and the dumps. Doing this securely is kinda hard, but I think I’ve got a neat solution that lets everyone run a personal IPython kernel on the Grid, access it from their local machine, and also publish it to the web from a standard location. So far, I’ve gotten my script to the point where it’ll set up an IPython environment for you if one doesn’t already exist, start the kernel if it isn’t already running, and tunnel the editing interface back to you to use! Things left to do include:
- Open up the browser once the tunnel is open
- Find a sane way to kill a kernel that hasn’t been doing anything since forever
- Set up a shared IPython environment (just code, read-only) so people don’t have to set up their own environments every time (this is primarily a performance enhancement)
- Find a nice and simple way for IPython notebooks to be published. I’m currently thinking of a URL such as `tools.wmflabs.org/notebooks/<username>/<notebookname>` to display them, and an index at `tools.wmflabs.org/notebooks/`. This shouldn’t be too hard with appropriate permission munging.
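The first item on that list could be sketched roughly like this: poll the local end of the tunnel until it accepts connections, then hand the URL to the default browser. The function names, the `127.0.0.1` address, and the timeouts are all made up for illustration, and this is not what the script currently does.

```python
import socket
import time
import webbrowser


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait a bit and try again.
            time.sleep(interval)
    return False


def open_notebook_when_ready(port, timeout=30.0):
    """Open the local tunnel endpoint in the default browser once it is up.

    Returns the URL that was opened, or None if the port never came up.
    The port number is whatever the tunnel was bound to locally (assumed
    to be known to the caller).
    """
    if wait_for_port("127.0.0.1", port, timeout=timeout):
        url = "http://127.0.0.1:%d/" % port
        webbrowser.open(url)
        return url
    return None
```

Calling something like `open_notebook_when_ready(8888)` right after starting the tunnel would then avoid opening the browser on a port that isn’t listening yet (the `8888` is just an example port).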
I’m also using paramiko for this, which makes writing SSH-related code in Python a breeze. It even supports `proxycommand`!
Blunders I’ve made while getting to this point include:
- Running `jsub run.bash -mem 4G` instead of `jsub -mem 4G run.bash` and wondering why my script kept getting killed with OOM.
- Trying to do a `pip install` on the Grid nodes (which don’t have build tools) instead of on `tools-dev`, and wondering why running it from the command line works but from `jsub` does not.
- Wondering why my SSH tunnel kept dying and trying to debug that, without realizing that it was dying because the IPython process was dying because it was OOMing because of my earlier `jsub` blunder.
- Thinking that user accounts (rather than tool accounts) cannot submit jobs to the grid, when the real problem was that I had not set the execute bit on my script.
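Two of these blunders are cheap to guard against in code. Here is a rough sketch, assuming `jsub` reads its own options only when they come before the script path (which is what the first blunder suggests). The helper names `build_jsub_command` and `ensure_executable` are hypothetical and not part of the actual script.

```python
import os
import stat


def build_jsub_command(script, mem="4G"):
    """Build a jsub argument list with the options placed before the script."""
    # Assumption: anything after the script path is handed to the script
    # itself, so "-mem" has to come first for jsub to see it.
    return ["jsub", "-mem", mem, script]


def ensure_executable(path):
    """Set the owner execute bit on path if missing; return True if changed."""
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False
    os.chmod(path, mode | stat.S_IXUSR)
    return True
```

The list returned by `build_jsub_command("run.bash")` can be passed straight to `subprocess.run`, after calling `ensure_executable("run.bash")` so the job doesn’t fail on a missing execute bit.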
Once this is done (I suspect tomorrow), I’ll work on getting the data from my work with WPDMZ into a form good enough for publicizing (removing ways of de-anonymization), and then use IPython notebooks to make graphs! This should be fun :)
Source so far is available on GitHub. Needs more work / documentation / cleaning up.