Most UNIX users have heard of the nice utility used to run a command with a lower priority to make sure that it only runs when nothing more important is trying to get a hold of the CPU:
nice long_running_script.sh
That's only dealing with part of the problem though because the CPU is not all there is. A low priority command could still be interfering with other tasks by stealing valuable I/O cycles (e.g. accessing the hard drive).
Prioritizing I/O
Another Linux command, ionice, allows users to set the I/O priority to be lower than all other processes.
Here's how to make sure that a script doesn't get to do any I/O unless the resource it wants to use is idle:
sudo ionice -c3 hammer_disk.sh
The above only works as root, but the following is a pretty good approximation that works for non-root users as well:
ionice -n7 hammer_disk.sh
You may think that running a command with both nice and ionice would have
absolutely no impact on other tasks running on the same machine, but there is one
more aspect to consider, at least on machines with limited memory: the disk cache.
Polluting the disk cache
If you run a command (for example a program that goes through the entire file system checking various things), you will find that the kernel will start pulling more files into its cache and expunge cache entries used by other processes. This can have a very significant impact on a system as useful portions of memory are swapped out.
For example, on my laptop, the nightly debsums, rkhunter and tiger cron jobs essentially clear my disk cache of useful entries and force the system to slowly page everything back into memory as I unlock my screen saver in the morning.
Thankfully, there is now a solution for this in Debian: the nocache package.
This is what my long-running cron jobs now look like:
nocache ionice -c3 nice long_running.sh
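The line above assumes that all three tools are installed. As a minimal sketch (the function names are my own, not part of any package), a small wrapper can build the same chain while skipping whatever is missing on a given machine:

```shell
#!/bin/sh
# Print the chain of wrappers available on this machine, outermost first.
# Tools that are not installed are skipped instead of causing an error.
low_priority_prefix() {
    prefix=""
    for tool in nocache "ionice -c3" nice; do
        # ${tool%% *} strips the arguments, leaving just the command name.
        if command -v "${tool%% *}" >/dev/null 2>&1; then
            prefix="${prefix:+$prefix }$tool"
        fi
    done
    printf '%s\n' "$prefix"
}

# Run a command behind whatever wrappers were found.
run_low_priority() {
    # Intentionally unquoted: the prefix must be split back into words.
    $(low_priority_prefix) "$@"
}
```

With this in place, `run_low_priority long_running.sh` behaves like the cron job line above when everything is installed, and degrades to a plain invocation when nothing is.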
Turning off disk syncs
Another relatively unknown tool, which I would certainly not recommend for all cron jobs but is nevertheless related to I/O, is eatmydata.
If you wrap it around a command, it will run without bothering to periodically make sure that it flushes any changes to disk. This can speed things up significantly but it should obviously not be used for anything that has important side effects or that cannot be re-run in case of failure.
After all, its name is very appropriate. It will eat your data!
This will hopefully be useful to some of those who have purchased the Linux version of the Torchlight game as part of the 6th Humble Indie Bundle.
While crawling some late dungeon levels, the game crashed with the following error message on the command line:
*-*-* OGRE Initialising
*-*-* Version 1.6.5 (Shoggoth)
terminate called after throwing an instance of 'Ogre::RenderingAPIException'
what(): OGRE EXCEPTION(3:RenderingAPIException): Zero sized texture surface on texture MEDIA/PARTICLES/TEXTURES/TRAIL/TRAIL37.DDS face 0 mipmap 2. Probably,
the GL driver refused to create the texture. in GLTexture::_createSurfaceList at ../../../../RenderSystems/GL/src/OgreGLTexture.cpp (line 405)
Error: signal: 6
./Torchlight.bin.x86_64(_ZN10LinuxUtils13crash_handlerEi+0x25)[0x17eb6f5]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f3b1e20b4a0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f3b1e20b425]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7f3b1e20eb8b]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x11d)[0x7f3b1eb5d69d]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb5846)[0x7f3b1eb5b846]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb5873)[0x7f3b1eb5b873]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(__cxa_rethrow+0x46)[0x7f3b1eb5b9b6]
/opt/torchlight/lib64/libOgreMain-1.6.5.so(_ZN4Ogre8Resource4loadEb+0x38d)[0x7f3b218a6dad]
/opt/torchlight/lib64/libOgreMain-1.6.5.so(_ZN4Ogre15ResourceManager4loadERKSsS2_bPNS_20ManualResourceLoaderEPKSt3mapISsSsSt4lessISsESaISt4pairIS1_SsEEE+0x91)[0x7f3b218b5381]
./Torchlight.bin.x86_64(_ZN16ParticleUniverse17ParticleTechnique15setMaterialNameERKSs+0xc2)[0x176d936]
./Torchlight.bin.x86_64(_ZN20CParticleTechWrapper21createTextureMaterialEv+0x903)[0xfbbc1f]
./Torchlight.bin.x86_64(_ZN30CParticleTechWrapperDescriptor29DescriptorObjectHasBeenInitedEP17CEditorBaseObject+0x1c)[0xa2b090]
./Torchlight.bin.x86_64(_ZN12CEditorScene20loadCompressedLayoutER9CFileInfoP10TArrayListIP17CEditorBaseObjectEx+0x963)[0xab4919]
./Torchlight.bin.x86_64(_ZN12CEditorScene9loadSceneESbIwSt11char_traitsIwESaIwEEbP10TArrayListIP17CEditorBaseObjectExbP13CTimerStatics+0x2ac)[0xab0bfa]
./Torchlight.bin.x86_64(_ZN7CLayout14loadLayoutFileERKSbIwSt11char_traitsIwESaIwEEbP13CTimerStaticsbbj+0x6c6)[0xf02ad8]
./Torchlight.bin.x86_64(_ZN18CParticlePreloader12LoadParticleESbIwSt11char_traitsIwESaIwEES3_+0x25b)[0xfaf349]
./Torchlight.bin.x86_64(_ZN18CParticlePreloader12LoadParticleESbIwSt11char_traitsIwESaIwEE+0xdd)[0xfaed1b]
./Torchlight.bin.x86_64[0xf02041]
./Torchlight.bin.x86_64(_ZN7CLayout14loadLayoutFileERKSbIwSt11char_traitsIwESaIwEEbP13CTimerStaticsbbj+0x265)[0xf02677]
./Torchlight.bin.x86_64(_ZN17CLayoutDescriptor18Set_loadLayoutFileEP17CEditorBaseObjectP14UNIONDATA16BITj+0x7c)[0x855992]
./Torchlight.bin.x86_64(_ZN15CDescriptorProp15setDataOnObjectEP17CEditorBaseObjectPK14UNIONDATA32BITj+0x17a)[0xa55fe0]
./Torchlight.bin.x86_64(_ZN15CDescriptorProp7setDataEPK14UNIONDATA32BITjP17CEditorBaseObject+0x59)[0xa55d09]
./Torchlight.bin.x86_64(_ZN11CDescriptor24loadObjectFromBinaryFileER11COgreReaderP17CEditorBaseObjectR28CDescriptorLoadConfiguration+0x36f)[0xa42107]
./Torchlight.bin.x86_64(_Z26loadObjectByCompressedFileP12CEditorSceneR28CDescriptorLoadConfigurationR11COgreReaderP17CEditorBaseObjectb+0x122)[0xab39f2]
./Torchlight.bin.x86_64(_ZN12CEditorScene20loadCompressedLayoutER9CFileInfoP10TArrayListIP17CEditorBaseObjectEx+0x3b3)[0xab4369]
./Torchlight.bin.x86_64(_ZN12CEditorScene9loadSceneESbIwSt11char_traitsIwESaIwEEbP10TArrayListIP17CEditorBaseObjectExbP13CTimerStatics+0x2ac)[0xab0bfa]
./Torchlight.bin.x86_64(_ZN7CLayout14loadLayoutFileERKSbIwSt11char_traitsIwESaIwEEbP13CTimerStaticsbbj+0x6c6)[0xf02ad8]
./Torchlight.bin.x86_64(_ZN11CSkillEventC2EP6CSkillP14CSkillPropertyP16CResourceManagerP10CDataGroup+0x19a3)[0x1344eeb]
./Torchlight.bin.x86_64(_ZN14CSkillPropertyC1EP6CSkillP16CResourceManagerPS_P10CDataGroupj+0x34ff)[0x1386e4f]
./Torchlight.bin.x86_64(_ZN6CSkillC2EP16CResourceManagerP10CDataGroup+0x5e9)[0x1319c2d]
./Torchlight.bin.x86_64(_ZN12CSkillParser8getSkillEP16CResourceManagerRKSbIwSt11char_traitsIwESaIwEE+0xf3)[0x1375f9d]
./Torchlight.bin.x86_64(_ZN13CSkillManager8addSkillERKSbIwSt11char_traitsIwESaIwEEb+0xe6)[0x1363f7e]
./Torchlight.bin.x86_64(_ZN9CBaseUnit14addSkillByNameERKSbIwSt11char_traitsIwESaIwEEb+0x9c)[0xbf53c4]
./Torchlight.bin.x86_64(_ZN9CBaseUnit8unitInitEP10CDataGroupb+0x13fc)[0xbf756c]
./Torchlight.bin.x86_64(_ZN10CCharacter8unitInitEP10CDataGroupb+0x63)[0xc29de5]
./Torchlight.bin.x86_64(_ZN8CMonster8unitInitEP10CDataGroupb+0x2f)[0xd4ae41]
./Torchlight.bin.x86_64(_ZN10CCharacter16convertCharacterESbIwSt11char_traitsIwESaIwEEb+0x873)[0xc29241]
./Torchlight.bin.x86_64(_ZN10CCharacter18createNewCharacterESbIwSt11char_traitsIwESaIwEE+0x1cf)[0xc2897d]
./Torchlight.bin.x86_64(_ZN18CMonsterDescriptor22Set_createNewCharacterEP17CEditorBaseObjectP14UNIONDATA16BITj+0x5f)[0x8955ed]
At the time, a patch for this wasn't available, so I had to work around that bug in the following way:
- find pak.zip inside the game directory (/opt/torchlight/pak.zip on my machine)
- unpack this file using unzip (or my favourite: aunpack)
- find the offending texture (media/particles/textures/trail/trail37.dds as per the error message above)
- copy a similar one (I picked trail36.dds in the same directory) on top of it
- zip up the modified directory and replace the original pak.zip with this one
In theory, this hack could cause some visual glitches, but I didn't actually notice anything and was quite happy to be able to rescue my game.
(I ran into this bug on an Ubuntu 12.04 Precise laptop using an Intel GPU.)
After moving from a hard drive to an SSD on my work laptop, I decided to keep the hard drive spinning and use it as a backup for the SSD.
With the following setup, I can pull the SSD out of my laptop and it should still boot up normally with all of my data on the hard drive.
Manually setting up an encrypted root partition
Before setting up the synchronization between the two drives, I had to replicate the partition setup.
I used fdisk, cfdisk and gparted to create the following partitions:
Device Boot Start End Blocks Id System
/dev/sdb1 * 2048 499711 248832 83 Linux
/dev/sdb2 501760 500117503 249807872 5 Extended
/dev/sdb5 503808 500117503 249806848 83 Linux
and then loosely followed these instructions to create an encrypted root partition on /dev/sdb5:
$ cryptsetup luksFormat /dev/sdb5
$ cryptsetup luksOpen /dev/sdb5 sdb5_crypt
$ pvcreate /dev/mapper/sdb5_crypt
$ vgcreate akranes2 /dev/mapper/sdb5_crypt
$ vgchange -a y akranes2
$ lvcreate -L247329718272B -nroot akranes2
$ lvcreate -L8468299776B -nswap_1 akranes2
$ mkfs.ext4 /dev/akranes2/root
Finally, I added the new encrypted partition to the list of drives to bring up at boot time by looking up its UUID:
$ cryptsetup luksUUID /dev/sdb5
and creating a new entry for it in /etc/crypttab.
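For reference, a crypttab line has four fields: the mapped name, the source device, a key file (or none to be prompted for a passphrase) and options. A minimal sketch of a helper that prints such a line (the function name is hypothetical):

```shell
# Print an /etc/crypttab line for a LUKS volume unlocked with a passphrase.
#   $1: name of the mapped device (e.g. sdb5_crypt)
#   $2: UUID of the LUKS partition, as reported by cryptsetup
crypttab_entry() {
    printf '%s UUID=%s none luks\n' "$1" "$2"
}
```

For example, `crypttab_entry sdb5_crypt 01234567-89ab-cdef-0123-456789abcdef` (a placeholder UUID) prints a line that can be appended to /etc/crypttab as root.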
Copying the boot partition
Setting up the boot partition was much easier because it's not encrypted. All that was needed was to format it and then copy the files over:
$ mkfs.ext2 /dev/sdb1
$ mount /dev/sdb1 /mnt/boot
$ cp -a /boot/* /mnt/boot/
The only other thing to remember is to install the grub boot loader on
that drive. On modern Debian systems, that's usually just a matter of
running dpkg-reconfigure grub-pc and adding the second drive (/dev/sdb
in my case) to the list of drives to install grub on.
Sync scripts
To keep the contents of the SSD and the hard drive in sync, I set up a
regular rsync of the root and boot partitions using the following mount
points (as defined in /etc/fstab):
/dev/mapper/akranes-root / ext4 noatime,discard,errors=remount-ro 0 1
/dev/mapper/akranes2-root /mnt/root ext4 noatime,errors=remount-ro 0 2
UUID=0b9109d0-... /boot ext2 defaults 0 2
UUID=6e6f05fb-... /mnt/boot ext2 defaults 0 2
I use this script (/usr/local/sbin/ssd_boot_backup) for syncing the boot
partition:
#!/bin/sh
if [ ! -e /mnt/boot/hdd.mounted ] ; then
echo "The rotating hard drive is not mounted in /mnt/boot."
exit 1
fi
if [ ! -e /boot/ssd.mounted ] ; then
echo "The ssd is not the boot partition"
exit 1
fi
nice ionice -c3 rsync -aHx --delete --exclude=/ssd.mounted --exclude=/lost+found/* /boot /mnt
and a similar one (/usr/local/sbin/ssd_root_backup) for the root
partition:
#!/bin/sh
if [ ! -e /mnt/root/hdd.mounted ] ; then
echo "The rotating hard drive is not mounted in /mnt/root."
exit 1
fi
if [ ! -e /ssd.mounted ] ; then
echo "The ssd is not the root partition"
exit 1
fi
nice ionice -c3 rsync -aHx --delete --exclude=/dev/* --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* --exclude=/boot/* --exclude=/mnt/* --exclude=/lost+found/* --exclude=/media/* --exclude=/var/tmp/* --exclude=/ssd.mounted --exclude=/var/lib/lightdm/.gvfs --exclude=/home/francois/.gvfs /* /mnt/root/
To ensure that each drive is properly mounted before the scripts run, I
created empty ssd.mounted files in the root directory of each of the
partitions on the SSD, and empty hdd.mounted files in the root directory
of the hard drive partitions.
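Since both scripts repeat the same kind of check, it could be factored into a small helper. This is only a sketch; require_marker is a name I made up for illustration:

```shell
# Fail (and print the given message on stderr) unless the marker file exists.
#   $1: path to the marker file
#   $2: message to show when it is missing
require_marker() {
    marker=$1
    message=$2
    if [ ! -e "$marker" ]; then
        echo "$message" >&2
        return 1
    fi
}
```

The marker files themselves only need to be created once, as root and with both drives mounted: `touch /ssd.mounted /boot/ssd.mounted /mnt/root/hdd.mounted /mnt/boot/hdd.mounted`. A script would then start with something like `require_marker /mnt/boot/hdd.mounted "The rotating hard drive is not mounted in /mnt/boot." || exit 1`.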
Cron jobs
The sync scripts are run every couple of hours through this crontab:
10 */4 * * * root /usr/local/sbin/ssd_boot_backup
20 0,4,8,12,16,20 * * * root /usr/local/sbin/ssd_root_backup
20 2,6,10,14,18,22 * * * root /usr/bin/on_ac_power && /usr/local/sbin/ssd_root_backup
which includes a reduced frequency while running on battery to avoid spinning the hard drive up too much.
Taking advantage of a new hard drive, I decided to reinstall my Ubuntu (Precise 12.04.2) laptop from scratch so that I could easily enable full-disk encryption (a requirement for Mozilla work laptops).
Reinstalling and reconfiguring everything takes a bit of time though, so here's the procedure I followed to keep the configuration to a minimum.
Install Ubuntu/Debian on the new drive
While full-disk encryption is built into the graphical installer as of Ubuntu 12.10, it's only available through the alternate install CD in 12.04 and earlier.
Using that CD, install Ubuntu on the new drive, making sure you select the "Encrypted LVM" option when you get to the partitioning step. (The procedure is the same if you use a Debian CD.)
To make things easy, the first user you create should match exactly the one from your previous installation.
Copy your home directory
Once the OS is installed on the new drive, plug the old one back in (in an external enclosure if you need to) and mount its root partition as a read-only drive on /mnt.
Then log in as root (not just using sudo, actually log in as the root user) and copy your home directory onto the new drive:
rm -rf /home/*
cp -a /mnt/home/* /home/
Then you should be able to log in as your regular user.
Reinstall all packages
The next step is to reinstall all of the packages you had installed on the old OS. But first of all, let's avoid having to answer all of the debconf questions we've already answered in the past:
rm -rf /var/cache/debconf
cp -a /mnt/var/cache/debconf /var/cache/
and set the debconf priority to the one you usually use (medium in my case):
dpkg-reconfigure debconf
Next, make sure you have access to all of the necessary repositories:
cp -a /mnt/etc/apt/sources.list /etc/apt/
cp -a /mnt/etc/apt/sources.list.d/* /etc/apt/sources.list.d/
apt-get update
Getting and setting the list of installed packages
To get a list of the packages that were installed on the old drive, use dpkg on the old install:
mount -o remount,rw /mnt
chroot /mnt
dpkg --get-selections > /packages
exit
mount -o remount,ro -r /mnt
Use that list on the new install to reinstall everything:
dpkg --set-selections < /mnt/packages
apt-get dselect-upgrade
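To double-check that nothing was left behind, the old list can be compared with the selections on the new install. Here is a rough sketch (missing_packages is a hypothetical helper name) which assumes both files are in the format produced by dpkg --get-selections:

```shell
# List packages marked "install" in the first selections file
# that are not marked "install" in the second one.
missing_packages() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    # Keep only the package names whose state is "install", sorted for comm.
    awk '$2 == "install" { print $1 }' "$1" | sort > "$old_list"
    awk '$2 == "install" { print $1 }' "$2" | sort > "$new_list"
    # comm -23 prints the lines that only appear in the first file.
    comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

For example, after running `dpkg --get-selections > /tmp/new-packages` on the new install, `missing_packages /mnt/packages /tmp/new-packages` should print nothing if everything was reinstalled.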
Selectively copy configuration files over
Finally, once all packages are installed, you can selectively copy the config files from the old drive (in /mnt/etc) to the new (/etc/). In particular, make sure you include these ones:
cp -a /mnt/etc/alternatives/ /mnt/etc/default/ /etc/
(I chose not to just overwrite all config files with the old ones because I wanted to get rid of any cruft that had accumulated there and suspected that there might be some slight differences due to the fresh install of the distro.)
Reboot
Once that's done, you should really give that box a restart to ensure that all services are using the right config files, but otherwise, that's it. Your personal data and config are back, all of your packages are installed and configured the way they were, and everything is fully encrypted!
In order to move my blog to a free-as-in-freedom platform and support the great work that Joey (of git-annex fame) and Lars (of GTD for hackers fame) have put into their service, I decided to convert my Blogger blog to Ikiwiki and host it on Branchable.
While the Ikiwiki tips page points to some old instructions, they weren't particularly useful to me. Here are the steps I followed.
Exporting posts and comments from Blogger
Thanks to Google letting people export their own data from their services, I was able to get a full dump (posts, comments and metadata) of my blog in Atom format.
To do this, go into "Settings | Other" then look under "Blog tools" for the "Export blog" link.

Converting HTML posts to Markdown
Converting posts from HTML to Markdown involved a few steps:
- Converting the post content using a small conversion library to which I added a few hacks.
- Creating the file hierarchy that ikiwiki requires.
- Downloading images from Blogger and fixing their paths in the article text.
- Extracting comments and linking them to the right posts.
The Python script I wrote to do all of the above will hopefully be a good starting point for anybody wanting to migrate to Ikiwiki.
Maintaining old URLs
In order to make sure I wouldn't break any existing links pointing to my blog on Blogger, I got the above Python script to output a list of Apache redirect rules and then found out that I could simply email these rules to Joey and Lars to get them added to my blog.
My rules look like this:
# Tagged feeds
Redirect permanent /feeds/posts/default/-/debian http://feeding.cloud.geek.nz/tags/debian/index.rss
Redirect permanent /search/label/debian http://feeding.cloud.geek.nz/tags/debian
# Main feed (needs to come after the tagged feeds)
Redirect permanent /feeds/posts/default http://feeding.cloud.geek.nz/index.rss
# Articles
Redirect permanent /2012/12/keeping-gmail-in-separate-browser.html http://feeding.cloud.geek.nz/posts/keeping-gmail-in-separate-browser/
Redirect permanent /2012/11/prefetching-resources-to-prime-browser.html http://feeding.cloud.geek.nz/posts/prefetching-resources-to-prime-browser/
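My script generates these in Python, but the article rules are simple enough to sketch in a few lines of shell. This is an illustration only: the function name is made up and it assumes that every old path looks like /YYYY/MM/slug.html and maps to /posts/slug/ on the new site:

```shell
# Base URL of the posts on the new site (taken from the example rules).
NEW_BASE="http://feeding.cloud.geek.nz/posts"

# Turn an old Blogger path (/YYYY/MM/slug.html) into an Apache redirect rule.
redirect_rule() {
    old_path=$1
    slug=${old_path##*/}     # drop the /YYYY/MM/ prefix
    slug=${slug%.html}       # drop the .html suffix
    printf 'Redirect permanent %s %s/%s/\n' "$old_path" "$NEW_BASE" "$slug"
}
```

Calling `redirect_rule /2012/12/keeping-gmail-in-separate-browser.html` prints the first article rule shown above.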
Collecting analytics
Since I am no longer using Google Analytics on my blog, I decided to take advantage of the access log download feature that Joey recently added to Branchable.
Every night, I download my blog's access log and then process it using awstats. Here is the cron job I use:
#!/bin/bash
BASEDIR=/home/francois/documents/branchable-logs
LOGDIR=/var/log/feedingthecloud
# Download the current access log
LANG=C LC_PAPER= ssh -oIdentityFile=$BASEDIR/branchable-logbot b-feedingthecloud@feedingthecloud.branchable.com logdump > $LOGDIR/access.log
It uses a separate SSH key I added through the Branchable control panel and outputs to a file that gets overwritten every day.
Next, I installed the awstats Debian package, and configured it like this:
$ cat /etc/awstats/awstats.conf.local
SiteDomain=feedingthecloud.branchable.com
LogType=W
LogFormat=1
LogFile="/var/log/feedingthecloud/access.log"
Even if you're not interested in analytics, I recommend you keep an eye on the 404 errors for a little while after the move. This has helped me catch a critical redirection I had forgotten.
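If all you want is that list of 404 errors, awstats is not strictly necessary. Assuming the access log is in the Apache combined format (which is what LogFormat=1 means to awstats), a sketch like this one prints the most frequently requested missing paths (top_404s is a hypothetical name):

```shell
# Print the paths that most often returned a 404, with their hit counts.
#   $1: access log in Apache combined format
#   $2: number of entries to show (defaults to 20)
top_404s() {
    # In the combined format, field 7 is the path and field 9 the status code.
    awk '$9 == 404 { print $7 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-20}"
}
```

For example: `top_404s /var/log/feedingthecloud/access.log 10`.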
Limiting Planet feeds
One of the most common things that happens right after someone migrates to a new blogging platform is the flooding of any aggregator that subscribes to their blog, usually because the post identifiers have changed.
Unsurprisingly, Ikiwiki already had a few ways to avoid this problem. I chose to simply modify each tagged feed and limit them to the posts added after the move to Branchable.
Switching DNS
Having always hosted my blog on a domain I own, all I needed to do to
move over to the new platform without an outage was to change my CNAME to
point to feedingthecloud.branchable.com.
I've kept the Blogger blog alive and listening on feeding.cloud.geek.nz to
ensure that clients using a broken DNS resolver (which caches records for
longer than requested via the record's TTL) continue to see the old posts.
I wanted to be able to use the GMail web interface on my work machine, but for privacy reasons, I prefer not to be logged into my Google Account on my main browser.
Here's how I make use of a somewhat hidden Firefox feature to move GMail to a separate browser profile.
Creating a separate profile
The idea behind browser profiles is simple: each profile has separate history, settings, bookmarks, cookies, etc.
To create a new one, simply start Firefox with this option:
firefox -ProfileManager
to display a dialog which allows you to create new profiles:

Once you've created a new "GMail" profile, you can start it up from the profile manager or directly from the command-line:
firefox -no-remote -P GMail
(The -no-remote option ensures that a new browser process is created for it.)
To make this easier, I put the command above in a tiny gmail shell script that lives in my ~/bin/ directory. I can use it to start my "GMail browser" by simply typing gmail.
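The script is little more than the command itself. Here is a sketch of it written as a shell function; the FIREFOX variable is a hypothetical override I added for illustration, in case the browser binary has a different name on your system:

```shell
# Launch the dedicated "GMail" Firefox profile in its own browser process.
# FIREFOX is an optional override; the regular firefox binary is the default.
gmail() {
    "${FIREFOX:-firefox}" -no-remote -P GMail "$@"
}
```

Any extra arguments (a URL to open, for example) are passed straight to the browser.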
Tuning privacy settings for 2-step authentication
While I initially kept that browser profile in private browsing mode, this was forcing me to enter my 2-factor authentication credentials every time I started the browser. So to avoid having to use Google Authenticator (or its Firefox OS cousin) every day, I ended up switching to custom privacy settings and enabling all cookies:

It turns out however that there is a Firefox extension which can selectively delete unwanted cookies while keeping useful ones.
Once that add-on is installed and the browser restarted, simply add accounts.google.com to the whitelist and set it to clear cookies when the browser is closed:

Then log into GMail and tick the "Trust this computer" checkbox at the 2-factor prompt:

With these settings, your browsing history will be cleared and you will be logged out of GMail every time you close your browser but will still be able to skip the 2-factor step on that device.
One of the great ways to reduce the perceived load time of pages on your site is to prefetch resources that will be needed while users are busy reading or interacting with the current page.
There are a few ways to ensure that the browser will already have a page in its cache when users visit it. In this particular case, I wanted to improve the load time of the Persona dialog while users are busy looking at the third-party site they want to log into.
Hidden images and iframes
The classic way that many image galleries used to prefetch the "next" image in a series was to include it on the current page as a tiny 1x1 pixel image.
Ignoring the fact that this is now usually blocked because it looks like a web bug (2x2 is the way to go now), the problem with this technique is that the browser now has to pull two images at once, which, for slower connections, means that the load time for the initial page will be longer.
A variant of this technique, with the same problem, is to load the next page in a hidden iframe:
<iframe style="display: none" src="/prefetch.html"></iframe>
XHR fetch after the page load
A more modern version of the hidden image technique is to use an XHR request to download the resources to the cache once the page has finished loading.
It looks like this:
<script>
window.onload = function () {
var xhr = new XMLHttpRequest();
xhr.open('GET', '/css/styles.css', true); // asynchronous, so the page stays responsive
xhr.send(null);
};
</script>
The main downside of this solution is that you can only prefetch resources on the same server because of browsers' same-origin restrictions (unless you can use HTTP access controls (CORS)). In the case of Persona though, I wanted third-party sites to trigger the prefetching of resources available from a different host: login.persona.org.
HTML5's prefetch links
HTML5 has defined a prefetch relation type for links and similar tags. All you need to do to use it is to put something like this in your page head:
<link rel="prefetch" href="/img/background.jpg">
<link rel="prefetch" href="/css/styles.css">
<link rel="prefetch" href="/js/navigation.js">
These prefetching hints work on HTTPS pages and across domains, but only get loaded when the browser is idle and believes that the network can handle it. In other words, with the exception of a few extra bytes added to the HTML, these hints will not affect the loading time of the page they are on.
Prefetching will not parse HTML or execute Javascript though (i.e. it's not recursive). So if you want to put a page and all of its resources into the cache ahead of time, you'll need to have link tags for each child resource in the page head.
While this has been implemented in Chrome, it got disabled when they added support for the non-standard prerender relation type. So right now, you may need to use both. Note that either of these relations includes DNS resolution, which makes dns-prefetch relations unnecessary.
Dynamic list of resources to prefetch
If you're trying to use prefetch links to get other sites to pre-cache your resources, you're essentially limited to a static list of resources that doesn't change. Unfortunately, that's not how modern sites work. Many sites now host each version of a static file on a unique (e.g. versioned or timestamped) URL with very long expire headers.
Ideally, we would give the browser a list of resources to prefetch. While that feature doesn't exist yet, we can approximate it by loading a hidden iframe containing the necessary prefetch links. That way, third-party sites can hardcode the iframe on their pages and you can update the list of resources to prefetch anytime you want.
The great thing about the way Persona works is that we already use an iframe as a communication channel between the site and the code on login.persona.org. So all I had to do in the end was to add prefetch links to that existing iframe. The result of this is that every site using the Observer API gets the benefit of prefetching without having to change anything!
Testing
If you want to make sure that your prefetching works:
- Clear your browser cache.
- Load the page that has prefetch links.
- Take a look at the contents of your cache (about:cache in Firefox and chrome://cache/ in Chrome) and look for the resources that should be prefetched.
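To know what to look for in the cache, it helps to have the list of URLs that the page asks the browser to prefetch. A minimal sketch (the function name is made up) that extracts them from a saved copy of the page, assuming the tags are written exactly like the examples above, with rel before href and double quotes:

```shell
# Print the href of every <link rel="prefetch"> tag found in an HTML file.
prefetch_urls() {
    grep -Eo '<link rel="prefetch" href="[^"]*"' "$1" | sed 's/.*href="//; s/"$//'
}
```

Each URL it prints should show up in the cache listing after step 2.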
While trying to add gzip compression to a custom node.js reverse proxy server through connect's compress middleware, I ran into a really strange problem: my browser would fetch the first 5 resources without problems, then it would stall for 2 minutes before getting the next 5 resources, stall for another 2 minutes for the next 5, and so on.
If I waited long enough, all of the resources would be loaded correctly and the page would look fine.
This is what I saw in Firebug:

The Firebug documentation explains that the "blocking" state is the time that the browser spends waiting for a connection. Given that this was an HTTP connection (i.e. no SSL), I wasn't sure what was causing this but it looked like some kind of problem with the backend.
HTTP Keep-alive and the Content-Length header
Thanks to a brilliant co-worker who suggested that I check the content-length headers, it turned out that the problem had to do with persistent connections (HTTP keep-alive) being enabled and the middleware not adjusting the length of the body after compression.
In particular, what was happening is that the browser would request the first batch of resources (it appears to keep a maximum of 5 connections open at once) and receive gzipped resources tagged with the size of the original uncompressed resources.
In most cases, the compressed resources are smaller than the original ones and so the browser would sit on the connection, waiting for the remaining bytes until it timed out. Hence the fixed 2-minute delay for each batch of requests.
When using a persistent connection, browsers need a way to know when a given response is done and when to request the next one. It can be done through the content-length header, but in the case of compression, that would mean buffering the whole resource before sending it to the client (i.e. no streaming). An alternative is for the server to return the resource using the chunked transfer encoding. This is what compress does by default.
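A quick way to see which of these two mechanisms a server is using is to look at the response headers from the command line. The following is only a sketch (response_framing is a hypothetical helper) which reads raw headers on stdin and reports how the body is delimited:

```shell
# Read HTTP response headers on stdin and print one of:
#   "chunked", "content-length N" or "none"
response_framing() {
    # Strip the carriage returns, then split each header into name and value.
    tr -d '\r' | awk -F': *' '
        tolower($1) == "transfer-encoding" && tolower($2) ~ /chunked/ { chunked = 1 }
        tolower($1) == "content-length" { len = $2 }
        END {
            if (chunked)        print "chunked"
            else if (len != "") print "content-length " len
            else                print "none"
        }'
}
```

For example, with a made-up local URL: `curl -s --max-time 10 -D - -o /dev/null -H 'Accept-Encoding: gzip' http://localhost:8080/css/styles.css | response_framing`. A gzipped response that still reports the content-length of the uncompressed file is the symptom described above.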
Solution for connect.compress() and http-proxy
My original goal was to enable a compression middleware inside a simple application that proxied HTTP content using the http-proxy node.js module.
The http-proxy documentation claims that it is possible and comes with an example program that didn't work for me. So I decided to replace connect-gzip in that example with the standard compress middleware that is now bundled with connect.
Unfortunately, because of the fact that the compress middleware needs to run before the rest of the response code, it would take a look at headers before they were written out and refuse to compress anything.
Once I worked around this, I discovered that compress would attempt to remove the content-length header from the proxied responses and switch to a chunked transfer encoding. However, because the header had not been written yet, this would have no effect and the old content-length would stick around and break keep-alive.
The fix was simple: I had to make sure that compress has everything it needs from the headers before it starts compressing anything.
A few months ago, a collection of essays called Open Advice was released. It consists of the things that seasoned Free and Open Source developers wish they had known when they got started.
The LWN book review is the best way to get a good feel for what it's about, but here are the notes I took while reading it:
- Write code first: a project doesn't exist before there is code for it.
- Listen to your initial users and prioritise their bugs / feature requests: they will tell you how to turn your pet project into something that others will want to use.
- Documentation is not a waste of time: at some point, someone will need to take over from you. You're not just coding for yourself.
- When you first come to a project, you can make valuable contributions to the project's on-ramping process and documentation (other developers can't see or have forgotten about this stuff).
- Many potential contributors think they're not good enough to contribute to your project: personally invite them to do a specific task.
- Validated backups give you the freedom to do what's needed.
- Writing a library is a good way to enable cross-project collaboration.
- For many newcomers, documentation is a gateway into the FOSS community.
- A real programmer's virtues: laziness, patience and humility (politeness is critical).
- Documentation can occur in sprints.
- Professional writers should focus on teaching and mentoring, not writing documentation.
- Watching real users use your software is the best way to improve its usability.
- The best designs evolve slowly and get refactored over time using well-known patterns.
- Outsiders can inject different viewpoints into the discussion, particularly around prioritization.
- Community managers should be good listeners and learn from their peers.
- Have a willingness to accept that you can and will be wrong sometimes.
- Sometimes you need to walk away from a project or argument, allow a day for things to settle down and for you to get a chance to breathe.
- Do not ask people what they think; state that you will do something by a date pending any objections from others.
- When creating a community team, never assume that people will stay fully committed the whole time. They will be in for a while, then disappear for longer periods and then come back.
Libravatar recently upgraded its support for the Persona authentication system (formerly BrowserID).
Here are some notes on what was involved in migrating to the Observer API for those who want to do the same on their sites.
Moving away from hidden forms
Libravatar used to POST the user's assertion to the server-side verification code through a hidden HTML form, just like the example Python CGI from the BrowserID cookbook.
This was a reasonable solution when the Persona code was only needed on a handful of pages, but the new API recommends loading the code on all pages where users can be logged in. Therefore, instead of copying this hidden form into the base template and including it on every page, I decided to switch to a jQuery.post()-based solution prior to making any other changes.
As a side-effect of interacting with the backend in an AJAX call, the error pages were converted to JSON structures and are now displayed in a popup alert.
From .get() to .watch() and .request()
By far the biggest change that the new API requires is the move from navigator.id.get() to navigator.id.watch() and navigator.id.request(). Instead of asking for an assertion to verify, two callbacks are registered through watch() and identification is triggered through request() (which triggers the onlogin callback).
In the case of Libravatar, this involved:
- including the Persona JavaScript shim on every page
- moving the assertion verification code from the get() callback to the new onlogin callback
- adding a redirection to the existing logout page from the new onlogout callback
- sharing part of the session state (i.e. which user is currently logged in, if any) with Persona through the loggedInEmail option to watch()
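The steps above can be sketched roughly as follows. This is not Libravatar's actual code: `currentEmail`, `verify`, `redirect` and the `/logout` URL are hypothetical names standing in for the site's own session state and handlers, and `navigator.id` is passed in as a parameter so the wiring can be tested with a stub.

```javascript
// Sketch only: currentEmail, verify, redirect and '/logout' are hypothetical
// stand-ins for the site's own session state, handlers and logout page.
function setupPersona(navigatorId, currentEmail, verify, redirect) {
  navigatorId.watch({
    // The email the site believes is logged in, or null if nobody is.
    loggedInEmail: currentEmail,
    // Called with an assertion when Persona thinks a user should be logged in.
    onlogin: function (assertion) {
      verify(assertion);
    },
    // Called when Persona thinks the user should be logged out.
    onlogout: function () {
      redirect('/logout');
    }
  });
}

// In a browser, this might be wired up as:
//   setupPersona(navigator.id, null, verifyOnServer,
//                function (url) { window.location = url; });
//   loginButton.onclick = function () { navigator.id.request(); };
```

Here `verifyOnServer` and `loginButton` are likewise placeholders for whatever the page already uses to verify assertions and trigger a login.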
One thing to note is that while loggedInEmail is going to be renamed to loggedInUser, this change hasn't hit the production version of Persona yet and so I reverted to the old name after noticing that onlogin was unnecessarily called on every page load (a fairly expensive operation given the need to transmit and verify the assertion server-side).
Simplifying Content Security Policy headers
The CSP headers that Libravatar used to set on the pages that made use of the Persona JavaScript shim now need to be set on every page, which is actually a nice simplification of our Apache config.
Note that if your CSP headers still refer to browserid.org, you must change them to login.persona.org.
Letting Persona know about changes in login state
One important change with respect to the old API is that Persona now keeps track of the login state for your site. If Persona finds a discrepancy between its idea of what your state should be and what you are advertising, it will trigger the appropriate callback (onlogin or onlogout) and attempt to resolve the conflict.
This is a very important feature since it will enable features like global logout and persistent cross-device logins, but it does mean that you have to notify Persona whenever your login state changes. If you forget to do this, your state will be automatically changed to match what Persona expects to see.
In Libravatar, this means that when users delete their account, we need to kill their session and tell Persona about it (through navigator.id.logout()). Otherwise, Persona will log them in again, which will of course cause a new account to be provisioned.
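As a rough illustration of the account deletion case, the sketch below calls navigator.id.logout() once the account is gone. `deleteAccount` is a hypothetical helper assumed to remove the account, kill the server-side session and then invoke its callback; it is not part of Persona or of Libravatar's code.

```javascript
// Sketch only: deleteAccount is a hypothetical helper that removes the
// account, kills the server-side session, and then calls its callback.
function deleteAccountAndLogout(navigatorId, deleteAccount, done) {
  deleteAccount(function () {
    // Tell Persona that nobody is logged in any more, so that it does not
    // fire onlogin again and cause a new account to be provisioned.
    navigatorId.logout();
    done();
  });
}
```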
Working around the internal login state
The most complicated part of this migration to the new API was around our "add email" functionality, which lets users add extra emails to their existing Libravatar account.
With the old get() API, adding emails was as easy as requesting additional assertions and verifying them. Under the Observer API, requesting an assertion also changes the internal state that Persona keeps for that website. In practice, it means that after adding a new email in Libravatar, we need to update the "logged in" identifier to match the new one. Failure to do this will prompt Persona to invoke the onlogin callback with a different email, which will cause the email to get added to a new Libravatar account instead.
There are also two corner cases where Libravatar needs to fall back to its manual authentication backend and tell Persona that nobody is logged in:
- when users remove from their account the email address that their Persona session is tied to
- when users unsuccessfully attempt to add an email that's already claimed by another account
In any case, despite these hacks, I got it all working in the end, which is why I'm hopeful that we'll find a way to support this use case.
Taking advantage of the new features
The most visible feature that the new API brings (as options to request()) is the ability to add your name and logo to the Persona popup window.
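A small sketch of how the name and logo might be passed to request(), assuming the `siteName` and `siteLogo` option names from the Persona documentation. The values shown are placeholders, not Libravatar's actual settings.

```javascript
// Sketch only: the site name and logo path are placeholder values.
function requestLogin(navigatorId) {
  navigatorId.request({
    siteName: 'Example Site',    // name displayed in the Persona popup
    siteLogo: '/images/logo.png' // path to a logo hosted on the site itself
  });
}

// In a browser: loginButton.onclick = function () { requestLogin(navigator.id); };
```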
The second feature I tried to enable on Libravatar is the new redirectTo request() option. Unfortunately, I had to revert this change since in our case, going straight to the profile page causes the @login_required Django decorator to run before the onlogin callback has a chance to set the session cookie.
In any case, redirecting to the login page already worked and so Libravatar probably doesn't need to make use of this Persona feature.
Conclusion
This migration was harder than I was expecting, but I'm confident that it will become easier in the next few weeks as the implementation is polished and documentation refreshed. I'm very excited about the Observer API because of the new security features and native integration it will enable.
If you use Persona on your site, make sure you sign up to the new service announcement list.