I wondered if I could get a measure of server availability as a single number, automatically (for calculating things like how tragically few nines of uptime my own servers have).
So I wrote a tool called long-uptime, which you use like this:
The first time you run the code, initialise the counter. You can specify your estimate, or let it default to 0:
$ long-uptime --init
and then every minute in a cronjob run this:
$ long-uptime
0.8974271427587808
which means that the site has 89.7% uptime.
It computes an exponentially weighted average with a decay constant (which is a bit like a half-life) of a month. This is how Unix load averages (the last three values that come out of the uptime command) are calculated, though with much shorter decay constants of 1, 5, and 15 minutes.
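To make the averaging concrete, here is a minimal Python sketch of one exponentially weighted update per cron run. It is not the code from long-uptime itself: the names update, TAU_SECONDS and STEP_SECONDS are made up for illustration, and I am assuming "a month" means 30 days and that each run counts as a sample of 1.

```python
import math

TAU_SECONDS = 30 * 24 * 60 * 60  # assumed decay constant: roughly one month
STEP_SECONDS = 60                # the cron job runs once a minute


def update(average, sample, step=STEP_SECONDS, tau=TAU_SECONDS):
    """Blend one new sample (1 = up, 0 = down) into the running average."""
    decay = math.exp(-step / tau)  # weight kept by the old average
    return average * decay + sample * (1.0 - decay)


average = 0.0  # the default starting estimate
for _ in range(24 * 60):  # one day of once-a-minute "up" samples
    average = update(average, 1.0)
print(average)
```

Starting from 0, a full day of uptime only moves the average to roughly 0.03 in this sketch, which is why supplying a realistic initial estimate can be worthwhile.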
When the machine is up (that is, you are running long-uptime in a cron job), the average moves towards 1. When the machine is down (that is, you are not running long-uptime), the average moves towards 0. Or rather, the first time you run long-uptime after a break, it realises you haven't run it during the downtime and recomputes the average as if it had been accumulating 0 scores.
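The catch-up after a break can be sketched in Python as below. This is an illustration under my own assumptions, not the tool's actual code: the function names are hypothetical, the decay constant is taken as 30 days, and the gap is assumed to be known in seconds (for example from a stored timestamp). Because every missed sample is 0, the whole gap collapses into a single multiplication, and the loop is only there to show that it matches accumulating the zeros one by one.

```python
import math

TAU_SECONDS = 30 * 24 * 60 * 60  # assumed decay constant: roughly one month


def decay_over_gap(average, gap_seconds, tau=TAU_SECONDS):
    """Apply a whole outage in one step: all missed samples were 0."""
    return average * math.exp(-gap_seconds / tau)


def decay_step_by_step(average, missed_runs, step=60, tau=TAU_SECONDS):
    """Same result, computed by feeding in one 0 sample per missed cron run."""
    decay = math.exp(-step / tau)
    for _ in range(missed_runs):
        average = average * decay + 0.0 * (1.0 - decay)
    return average


before = 0.9
after_closed = decay_over_gap(before, 24 * 60 * 60)  # one day of downtime
after_loop = decay_step_by_step(before, 24 * 60)     # 1440 missed minutes
print(after_closed, after_loop)
```

Both values agree to within floating-point error, so nothing needs to run during the outage for the downtime to be counted.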
Download the code:
$ wget http://www.hawaga.org.uk/tmp/long-uptime-0.1.tar.gz
$ tar xzvf long-uptime-0.1.tar.gz
$ cabal install
$ long-uptime --init