Showing posts with label rsync. Show all posts

29 November, 2011

mysql rsync backup doh!

Well, I moved a big live database from one server to another for a customer. I was planning on taking the site offline first and doing an SQL dump/move/restore, but someone accidentally deleted the old server before they were meant to (oops) - leaving me with some rsync backups that I'd made of the whole server over the previous days, to guard against such an eventuality.

Turns out rsyncing a live mysql database is not the right way to do things - I ended up with a database that actually crashed mysqld when I tried to dump out its contents. I managed to fix this by dropping the tables that it was crashing on: it conveniently told me which ones they were, and it let me drop them, just not dump them. Luckily they were all cache tables, so there didn't seem to be any harm in dropping them and recreating them empty.

It also turns out (obviously, post facto) that mysql databases don't play nicely with rsync's --link-dest option, which is supposed to reduce disk usage on the target file system when making multiple snapshots. It only optimises away duplicate data for files that are completely unchanged, but mysql stores the entire database (to a first approximation) in a single file, hiding its internal structure from rsync, and so the whole db gets duplicated each time.
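To make the per-file behaviour concrete, here is a minimal Python sketch of what a --link-dest style snapshot does. It is a simplified model, not rsync's actual implementation: the function name `snapshot` and the size-plus-mtime comparison are my own assumptions for illustration.

```python
import os
import shutil


def snapshot(src, prev, dest):
    """Copy src into dest, hard-linking files that look unchanged since prev.

    Simplified model of `rsync -a --link-dest=prev src/ dest/`: the decision
    is made per file (size and mtime here), never per block inside a file.
    Returns (bytes_linked, bytes_copied).
    """
    linked = copied = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        os.makedirs(os.path.normpath(os.path.join(dest, rel)), exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            p = os.path.normpath(os.path.join(prev, rel, name))
            d = os.path.normpath(os.path.join(dest, rel, name))
            st = os.stat(s)
            unchanged = False
            if os.path.isfile(p):
                pst = os.stat(p)
                unchanged = (pst.st_size == st.st_size
                             and int(pst.st_mtime) == int(st.st_mtime))
            if unchanged:
                os.link(p, d)        # no extra disk space used
                linked += st.st_size
            else:
                shutil.copy2(s, d)   # the whole file is stored again
                copied += st.st_size
    return linked, copied
```

With a small text file and one large database file in src, appending a single byte to the database file between snapshots means the second snapshot links only the text file and stores the entire database file again.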

Taking an SQL dump should solve the first problem. Not entirely sure what the right way to do an incremental backup of the database is. Maybe I shouldn't even try.
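For the dump part, something like the following Python sketch is what I have in mind. The database name, the `backup` user and the helper names are hypothetical, and it assumes the password comes from an option file such as ~/.my.cnf. Note that --single-transaction only gives a consistent snapshot for transactional engines such as InnoDB.

```python
import subprocess


def dump_command(database, user="backup"):
    """Build a mysqldump command line (the user name is a made-up example)."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        "--user=" + user,
        database,
    ]


def dump_to_file(database, out_path, user="backup"):
    """Write the dump to out_path; raises CalledProcessError if mysqldump fails."""
    with open(out_path, "wb") as out:
        subprocess.run(dump_command(database, user), stdout=out, check=True)
```

The resulting .sql file is an ordinary file that can sit inside the tree being rsynced, instead of the live data directory.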

22 August, 2010

rsync --fake-super

Every now and then I discover new options on old utilities.

One I am very happy to have discovered in rsync is the --fake-super option.

Scenario:

I have machine A. I want to back up (some portion of) the file system onto machine B. I want to include permissions and ownership (for example, because I am backing up /home).

I can run rsync on machine A as root from a cron job. OK. But then (traditionally) it needs root access to machine B in order to set permissions and ownership of the files it creates. I can't have it connect to machine B as some normal user because of that. Additionally, the user-id and group-id name/number spaces on both machines need to match up somewhat so that users on machine B don't get access to files they shouldn't have access to.

--fake-super changes that. With it in effect on the receiving side, rsync doesn't need the privilege to change the ownership (or other root-only attributes) of a file; it stores that information in extended attributes on that file instead. So now access to machine B can be through some normal user account without special privileges, as long as the file system there supports extended attributes.

A downside is that if some user has an account on both sides, they don't get full privilege access to the backups of their own files.
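As a rough sketch of the scenario above, these Python helpers build the rsync command lines for backing up to machine B and restoring from it. The host, account and paths are placeholders I made up; the -M (--remote-option) form is how the rsync man page suggests applying --fake-super to the remote side only.

```python
def backup_command(src, remote, dest):
    """rsync run as root on machine A, pushing to an unprivileged account on B."""
    # -M--fake-super applies --fake-super on the remote (receiving) side only
    return ["rsync", "-a", "-M--fake-super", src, remote + ":" + dest]


def restore_command(remote, src, dest):
    """Pull a backup from B; the remote sender reads ownership from the xattrs."""
    return ["rsync", "-a", "-M--fake-super", remote + ":" + src, dest]
```

For example, `backup_command("/home/", "backup@machineB", "/backups/home/")` gives a list that can be handed to `subprocess.run(..., check=True)` from the cron job. The restore needs to run as root on machine A so that the real ownership can be set again.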

Another use I found for this is on my laptop under OS X, where one of my external hard-drives is now mounted with an option that prevents user and group IDs from being changed, or causes them to be ignored somehow (presumably for a better experience when using the hard-drive on multiple machines).
Incremental rsync backups were unable to change group ownership on files, which meant that instead of being hard-linked (using --link-dest) the files were being copied afresh each time. This was fixed by --fake-super too - instead of changing ownership on the external HD filesystem, the ownership details are added to the extended attributes.
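For this local case, the command would look roughly like the one built below. The paths and the function name are hypothetical. According to the rsync man page, plain --fake-super on a local copy affects both source and destination, while -M--fake-super limits it to the destination, which is what is wanted here.

```python
def local_snapshot_command(src, prev, dest):
    """Snapshot src onto the external drive, hard-linking against prev."""
    return [
        "rsync", "-a",
        "-M--fake-super",       # store ownership in xattrs on the destination only
        "--link-dest=" + prev,  # previous snapshot, given as an absolute path
        src, dest,
    ]
```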

24 April, 2010

google chart venn diagram of disk usage



This compares disk usage of a backup I made of /home two years ago with one made this month. There's a lot of overlap - more than I expected.

Method: when creating the second backup, I used rsync with --link-dest specified; for measuring, I used du -hsc old new; du -hsc new old. du counts a hard-linked file only once, under whichever directory it reaches first, so the size of the overlap is the size of new given by the second du minus the size of new given by the first du.
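The same subtraction can be sketched in Python. This is only an approximation of du, under my own assumptions: it looks at regular files only and adds up apparent byte sizes rather than disk blocks, and the function names are made up.

```python
import os


def du_totals(*trees):
    """Per-tree byte totals, counting each inode once at first sight (like du)."""
    seen = set()
    totals = {}
    for tree in trees:
        total = 0
        for root, _dirs, files in os.walk(tree):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # hard link to something already counted
                seen.add(key)
                total += st.st_size
        totals[tree] = total
    return totals


def overlap(old, new):
    """Bytes in new that are hard-linked to files in old."""
    new_full = du_totals(new, old)[new]    # new listed first: counted in full
    new_unique = du_totals(old, new)[new]  # old listed first: only unshared files
    return new_full - new_unique
```

Calling `overlap("old", "new")` then gives the size of the middle region of the Venn diagram in bytes.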

I'm trying to figure out ways of showing overlapping disk usage for more than two or three rsync trees created in this way (for example, to visualise monthly, weekly and daily backups over a long time period).
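One possible starting point for the data side, as a Python sketch rather than a finished tool: group every inode by the set of snapshots it appears in, which gives one number per region of an N-way Venn diagram. The function name and the snapshot labels are hypothetical, and as before it counts apparent sizes of regular files only.

```python
import os
from collections import defaultdict


def shared_regions(trees):
    """Map each combination of snapshot names to the bytes found in exactly
    that combination. `trees` maps a snapshot name to its directory."""
    owners = defaultdict(set)  # inode -> names of the snapshots containing it
    sizes = {}
    for name, path in trees.items():
        for root, _dirs, files in os.walk(path):
            for fname in files:
                st = os.lstat(os.path.join(root, fname))
                key = (st.st_dev, st.st_ino)
                owners[key].add(name)
                sizes[key] = st.st_size
    regions = defaultdict(int)
    for key, names in owners.items():
        regions[frozenset(names)] += sizes[key]
    return dict(regions)
```

For three trees labelled daily, weekly and monthly, the result has up to seven entries, such as the bytes shared by all three and the bytes that exist only in the monthly snapshot. How best to draw that for many snapshots is still the open question.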