Showing posts with label unix.

29 January, 2018

Yellow Pages for Modern Times

Early on in my career it was pretty common to use clusters made up of a pile of heterogeneous unix systems: some Sun boxes, some Linux machines, maybe IRIX and AIX in there too.

The thing that made them into a single cluster was that your user account existed on them all, and you had the same home directory on them all: make files on one machine, and they're visible on the other machines; but you still had access to the machine-specific features of each host.

The technology of the time often used Network File System (NFS) and Network Information Service (NIS) (formerly known as Yellow Pages, with that name living on in the yp prefix of commands like yppasswd).

Fast-forward a decade or two and things look different: virtual machines are a thing now, and more recently, containers. It's now very common to custom-build a virtual machine or a container, both with something approximating an entire OS, specifically for running one application, or for running just a single piece of one application.

So maybe you'd connect these pieces - virtual machines or containers - with some kind of socket connection: a web front end exposing HTTP and talking to a PostgreSQL database in another container with no shared files between them.
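
As a rough sketch of that kind of setup (the my-web-frontend image name and the ports here are just illustrative, not a real configuration of mine):

$ docker network create appnet
$ docker run -d --name db  --network appnet -e POSTGRES_PASSWORD=secret postgres
$ docker run -d --name web --network appnet -p 80:8080 my-web-frontend
# web reaches the database over a socket (db:5432); no files are shared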

I did a bunch of stuff this way, and it was great: you can install separate package stacks in isolation from each other. Want this weird version of a library or compiler? Or to run some curl | sudo script without messing up the rest of your system? Or stick with an old distribution of your OS just for one tool? Easy.

But it was a pain getting files between different places. Got my text editor and version control set up in one place, but need to compile in another? There are all sorts of different ways to get files between those places: for example, commit regularly to version control; or rsync.
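
The rsync variant looks something like this (host and path names invented for illustration):

$ rsync -av ~/src/myproject/ build-vm:myproject/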

Docker running on one host has options for mounting pieces of the host file system inside containers; but I lacked a good idea of what to mount where.
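
The basic bind-mount syntax looks like this (paths invented for illustration):

$ docker run -it -v /home/me/projects:/projects ubuntu bash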

It was all so simple before: you had ~ everywhere, and nothing else.

So I started using the unix cluster model, described at the top of this post, to guide how I set up a lot of my containers and virtual machines.

The actual technology (NFS, docker volume mounts, YP, LDAP, HESIOD, ...) isn't massively relevant: I've used different mechanisms in different places.

What really matters is: all the (regular human) users get their home directory, mounted at the same place (under /home).

With most ways of sharing files, that also means the unix user id for each user should be the same everywhere.

I've implemented this basic model in a few different ways. For a couple of VMs inside the same physical server, I use a traditional NFS and OpenLDAP setup (NFS for file sharing, LDAP for distributing account details), which is a more modern replacement for NFS/NIS. On my laptop and some of my physical servers, I've got a wrapper around docker called cue which creates exactly one user (the invoking user) inside the container, and mounts their home directory appropriately. And I have some ad-hoc docker server containers (eg inbound SMTP) where the whole of /home is volume-mounted, and then OpenLDAP is used to share accounts.
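
cue itself isn't shown here, but the effect is roughly this kind of invocation (a sketch of the idea, not cue's actual code):

$ docker run -it \
    --user "$(id -u):$(id -g)" \
    -v "$HOME":"$HOME" \
    -e HOME="$HOME" \
    -w "$HOME" \
    ubuntu bash
# same user id and same home directory path inside and outside the container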

There are plenty of downsides: for example, your whole home directory is accessible in more places than it needs to be and so is more vulnerable; you can't access files outside your home directory, so ~ is now a specially magic directory; posix filesystems work badly in distributed systems. For lots of what I want, these downsides are outweighed by the upsides.

One twist that doesn't happen so much with a cluster of physical machines: a server such as a mail server is now a container which has a mail queue that I want to persist across rebuilds. This would be unusual in the physical machine model because you don't usually rebuild physical servers often. So where should that persistent data go? Inside a specific /home directory? In a /container-data directory that is mounted too, like an alternate version of /home? What user-id should own the queue? Different builds of a container might assign different user-ids to the mail server.

10 September, 2017

Unix exit codes as an indicator of tooling (im)maturity.

If your compiler for your new language, or your test runner, or whatever, doesn't return a non-zero unix exit code when it exits with an error - that's something that annoys me - and it's an indicator that no one is using your tool seriously, for example in an automated build system.
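
A build system only has the exit status to go on. With a hypothetical tool called mycompiler, this is the check that matters:

$ mycompiler broken-input.src ; echo "exit status: $?"
exit status: 1
# anything non-zero is fine; printing errors but exiting 0 breaks automation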

I've hit this a couple of times at least in the last year. grr.

02 October, 2012

cd

# cd
bash: cd: write error: Success

05 June, 2012

slight niggle with permissions.

On most unixes, you don't need to own a file to delete it. Instead, you need write permission on the containing directory (and if you don't have write permission on the directory, you can't delete a file even if you own it).

That's not true for directories though. If a directory (c) has files in it, the owner of the containing directory (..) can't delete it, because they can't (necessarily) delete the contents of the directory (c/*). And the owner of the directory (c) can't necessarily delete it unless they have write permission on the parent (..).
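
A quick illustration, with hypothetical users alice and bob:

$ ls -ld parent parent/c
drwxr-xr-x  3 alice alice 4096 Jun  5 10:00 parent
drwxr-xr-x  2 bob   bob   4096 Jun  5 10:00 parent/c
# alice can delete bob's plain files sitting directly inside parent,
# but rm -r parent/c fails for her: with no write permission on c she
# can't delete c/*, and a non-empty directory can't be removed.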

I've only just noticed that difference in behaviour between files and directories. It's never been a problem (of course, I have root on most systems where it would be, so it's easy to work around). So I guess this counts as obscure?

15 January, 2012

server availability like uptime

I wondered if I could get a measure of server availability as a single number, automatically (for calculating things like how tragically few nines of uptime my own servers have).

So, I wrote a tool called long-uptime which you use like this:

The first time you run the code, initialise the counter. You can specify your estimate, or let it default to 0:

$ long-uptime --init

and then every minute in a cronjob run this:

$ long-uptime
0.8974271427587808

which means that the site has 89.7% uptime.
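
(The crontab entry for that might look something like this, depending on where cabal put the binary:)

* * * * * $HOME/.cabal/bin/long-uptime >/dev/null 2>&1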

It computes an exponentially weighted average with a decay constant (which is a bit like a half life) of a month. This is how unix load averages (the last three values that come out of the uptime command) are calculated, though with much shorter decay constants of 1, 5, and 15 minutes.
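
As a rough sketch of the arithmetic (not the tool's actual source), assuming one sample per minute and a decay constant of a month (43200 minutes), each run does something like:

$ old=0.95      # the previously stored average
$ sample=1      # 1 = the machine was up this minute, 0 = it was down
$ alpha=$(echo "scale=10; e(-1/43200)" | bc -l)
$ echo "scale=10; $alpha * $old + (1 - $alpha) * $sample" | bc -l
# the stored average is nudged slightly towards the new sample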

When the machine is up (that is, you are running long-uptime in a cron job), then the average moves towards 1. When the machine is down (that is, you are not running long-uptime), then the average moves towards 0. Or rather, the first time you run long-uptime after a break, it realises you haven't run it during the downtime and recomputes the average as if it had been accumulating 0 scores.

Download the code:

$ wget http://www.hawaga.org.uk/tmp/long-uptime-0.1.tar.gz
$ tar xzvf long-uptime-0.1.tar.gz
$ cd long-uptime-0.1
$ cabal install
$ long-uptime --init

22 August, 2010

rsync --fake-super

Every now and then I discover new options on old utilities.

One I am very happy to have discovered in rsync is the --fake-super option.

Scenario:

I have machine A. I want to back up (some portion of) the file system onto machine B. I want to include permissions and ownership (for example, because I am backing up /home).

I can run rsync on machine A as root from a cron job. OK. But then (traditionally) it needs root access to machine B in order to set permissions and ownership of the files it creates. I can't have it connect to machine B as some normal user because of that. Additionally, the user-id and group-id name/number spaces on both machines need to match up somewhat so that users on machine B don't get access to files they shouldn't have access to.

--fake-super changes that. When rsync tries to change the permission or ownership of a file and finds that it cannot do that, it instead stores that information in extended attributes on that file. So now access to machine B can be through some normal user account without special privileges.
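
For example (hostname, account and paths invented for illustration), pushing from root on machine A to an unprivileged account on machine B, with --fake-super applied on the receiving side:

$ rsync -a --rsync-path="rsync --fake-super" /home/ backup@machineB:backups/home/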

A downside is that if some user has an account on both sides, they don't get full privilege access to the backups of their own files.

Another use I found for this is on my laptop under OS X, where one of my external hard-drives is now mounted with an option that prevents user and group IDs being changed, or makes them ignored somehow (presumably for a better experience when using the hard-drive on multiple machines). Incremental rsync backups were experiencing an inability to change group ownership on files, which meant that instead of being hard-linked (using --link-dest) they were being copied afresh each time. This was fixed by --fake-super too - instead of being changed on the external HD filesystem, the ownership details are added to the extended attributes.
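
Those incremental backups look roughly like this (drive and directory names invented for illustration):

$ rsync -a --fake-super --link-dest=/Volumes/backup/2010-08-21 \
    "$HOME"/ /Volumes/backup/2010-08-22/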

15 July, 2010

autoconf and portable programs: the joke

This is not my rant, but I like it:

I have a simple request for you: each time you run into software that does
this idiotic kind of test, please interact with the idiots upstream for
whom all the world is linux, and try to get them to replace their "joke"
of an autoconf macro with actual genuine tests that actually CHECK FOR THE
FUCKING FEATURE.

http://marc.info/?l=openbsd-ports&m=126805847322995&w=2

21 May, 2010

For reasons which might not be intuitively obvious, the broken behavior is required

I previously regarded /usr/bin/env as being portable onto any sane unix platform.

But now I've had to work with FreeBSD and my beliefs are dashed.

FreeBSD /usr/bin/env implements what posix says, rather than what everyone else does. That leads to trouble when you want to pass a parameter in a shebang, like this:

#!/usr/bin/env perl -w

which works almost everywhere, but not in FreeBSD :(
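
The usual workaround (sketched here for the perl case) is to keep the shebang line argument-free and turn the option on inside the script instead:

#!/usr/bin/env perl
use warnings;   # roughly what -w on the shebang line would have done
print "portable enough\n";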