
31 July, 2012

setting up live resize on ec2

ec2 doesn't let you do a live resize on an attached elastic block store, and the procedure for resizing offline is a bit awkward - make a snapshot, then restore that snapshot into a bigger EBS volume (here's a stack overflow article about that).
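
For reference, that offline procedure is only a couple of commands with Amazon's tooling. Here's a sketch using the newer aws CLI - not what I ran, and the IDs and size are placeholders:

$ aws ec2 create-snapshot --volume-id vol-xxxxxxxx --description "pre-resize"
$ aws ec2 create-volume --snapshot-id snap-xxxxxxxx --size 200 --availability-zone eu-west-1a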

LVM lets you add space to a volume dynamically, and ext2 can cope with live resizing of a filesystem now. So if I were using LVM, I think I'd be able to do this live.

So what I'm going to do is:

  • firstly, move this volume to LVM without resizing. This will involve downtime, as it is roughly a variant of the above-mentioned "go offline and restore to a different volume".
  • secondly, use LVM to add more space: by adding another EBS volume to use in addition to (rather than as a replacement for) my existing space; adding that to LVM; and live resizing the ext2 partition.

First, move this volume to LVM without resizing.

The configuration at the start is that I have a large data volume mounted at /backup, directly on an attached EBS device, /dev/xvdf.

$ df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/xvdf              99G   48G   52G  49% /backup

In the AWS web console, create a volume that is a little bit bigger than the volume I already have - so 105 GB, with no snapshot. Make sure it's in the same availability zone as the instance and the existing volume.

Attach the volume to the instance, also in the AWS console.
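
(If you'd rather not click around the console: the same two steps with Amazon's command line tooling would look something like this. A sketch using the newer aws CLI - not what I actually ran - with placeholder IDs.)

$ aws ec2 create-volume --size 105 --availability-zone eu-west-1a
$ aws ec2 attach-volume --volume-id vol-xxxxxxxx --instance-id i-xxxxxxxx --device /dev/sdg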

On the Linux instance, it should now appear:

$ dmesg | tail
[15755792.707506] blkfront: regular deviceid=0x860 major,minor=8,96, assuming parts/disk=16
[15755792.708148]  xvdg: unknown partition table
$ cat /proc/partitions 
major minor  #blocks  name
 202        1    8388608 xvda1
 202       80  104857600 xvdf
 202       96  110100480 xvdg

xvdg is the new EBS device.

Despite that dmesg warning, screw having a partition table - I'm using this as a raw device. It might suit your tastes to create partitions at this point, but it really doesn't matter.

Now I'm going to make that 105 GB on xvdg into some LVM space (there's a nice LVM tutorial here if you want someone else's more detailed take):

# pvcreate /dev/xvdg
  Physical volume "/dev/xvdg" successfully created
# vgcreate backups /dev/xvdg
  Volume group "backups" successfully created

Now we've created a volume group, backups, which contains one physical volume - /dev/xvdg. Later on we'll add more space to this backups volume group, but for now we'll carve it into a logical volume that we can put a file system onto:

# vgdisplay | grep 'VG Size'
  VG Size               105.00 GiB
So we have 105.00 GiB available - the size of the whole new EBS volume created earlier. It turns out that's not quite all usable, so I'll create a logical volume with only 104 GB of space. What's a wasted partial-gigabyte in the 21st century?

# lvcreate --name backup backups --size 105g
  Volume group "backups" has insufficient free space (26879 extents): 26880 required.
# lvcreate --name backup backups --size 104g
  Logical volume "backup" created

Now that new logical volume has appeared and can be used for a file system:

$ cat /proc/partitions 
major minor  #blocks  name

 202        1    8388608 xvda1
 202       80  104857600 xvdf
 202       96  110100480 xvdg
 253        0  109051904 dm-0
# ls -l /dev/backups/backup
lrwxrwxrwx 1 root root 7 Jul 25 20:35 /dev/backups/backup -> ../dm-0

It appears both as /dev/dm-0 and as /dev/backups/backup - the second name is based on the parameters we supplied to vgcreate and lvcreate.

Now we'll do the bit that involves offline-ness: I'm going to take the /backup volume (which is /dev/xvdf at the moment) offline and copy it into this new space, /dev/dm-0.

# umount /backup
# dd if=/dev/xvdf of=/dev/dm-0

This dd takes quite a while (hours) - it's copying 100 GB of data. While I was waiting, I discovered that you can send SIGUSR1 to a dd process on Linux to get I/O stats (thanks mdm):

$ sudo killall -USR1 dd
$ 41304+0 records in
41303+0 records out
43309334528 bytes (43 GB) copied, 4303.97 s, 10.1 MB/s
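
As an aside, dd's default 512-byte block size is slow for a bulk copy like this; a bigger block size usually helps a lot. A sketch of what I'd try next time (bs=4M is an arbitrary choice; much newer versions of GNU dd also accept status=progress, but not the one on this host):

# dd if=/dev/xvdf of=/dev/dm-0 bs=4M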

Once that is finished, we can mount the copied volume:

# mount /dev/backups/backup /backup
# df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                       99G   68G   32G  69% /backup

Now we have the same sized volume, with the same data on it, but inside LVM.

Second, add more space

Now we've got our filesystem inside LVM, we can start doing interesting things.

The first thing I'm going to do is reuse the old space on /dev/xvdf as additional space.

To do that, add it as a physical volume; add that physical volume to the volume group; allocate that new space to the logical volume; and then resize the ext2 filesystem.

These commands add the old space into the volume group:

# pvcreate /dev/xvdf
  Physical volume "/dev/xvdf" successfully created
# vgextend backups /dev/xvdf
  Volume group "backups" successfully extended

... and these commands show how much space is available (by trying to allocate too much) and then add that space:

# lvresize /dev/backups/backup -L+500G
  Extending logical volume backup to 604.00 GiB
  Insufficient free space: 128000 extents needed, but only 25854 available
# lvresize /dev/backups/backup -l+25854
  Rounding up size to full physical extent 25.25 GiB
  Extending logical volume backup to 129.25 GiB
  Logical volume backup successfully resized

Even though we've now made the dm-0 / /dev/backups/backup device much bigger, the filesystem on it is still the same size:

$ df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                       99G   68G   32G  69% /backup

But not for long...

Unfortunately:

# resize2fs /dev/backups/backup
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/backups/backup is mounted on /backup; on-line resizing required
old desc_blocks = 7, new_desc_blocks = 9
resize2fs: Kernel does not support online resizing

The version of the kernel on this host doesn't allow online resizing (some do), so I'll have to unmount the filesystem briefly to resize:

# umount /backup
# resize2fs /dev/backups/backup
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/backups/backup to 33882112 (4k) blocks.
The filesystem on /dev/backups/backup is now 33882112 blocks long.

# mount /dev/backups/backup /backup
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                      128G   68G   60G  53% /backup

So there's the bigger fs - though not as big as I had expected: I only seem to have got an extra 30 GB of storage, not the 100 GB I was expecting.

Well, it turns out that not all the space was allocated to this LV, even though I thought I'd done that:

# vgdisplay
...
  Alloc PE / Size       33088 / 129.25 GiB
  Free  PE / Size       19390 / 75.74 GiB
...

But no matter - I can repeat the procedure a second time without too much trouble (indeed, being able to do this easily is the whole reason I want LVM in the first place).
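
There's a shortcut that would have avoided this entirely: lvresize accepts a percentage of the volume group's remaining free space, so (assuming LVM2, which is what's installed here) a single command allocates everything that's left, and then one resize2fs finishes the job:

# lvresize -l +100%FREE /dev/backups/backup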

Having done that, I end up with the expected bigger filesystem:

# df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                      202G   68G  135G  34% /backup

Now whenever I want to add more space, I can repeat step 2 with just a tiny bit of downtime for that particular filesystem; and if I get round to putting on a kernel with online resizing (my Raspberry Pi has it, why doesn't this?) then I won't need downtime at all...
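
For next time, the whole of step 2 condenses into a short script. This is just a sketch - a hypothetical helper assuming the new EBS device is already attached, and that the backups/backup names from above are in use:

#!/bin/sh
# grow /backup by one freshly attached EBS device
set -e
DEV="$1"     # e.g. /dev/xvdh - whatever the new device appeared as
pvcreate "$DEV"
vgextend backups "$DEV"
lvresize -l +100%FREE /dev/backups/backup
umount /backup               # only needed while this kernel lacks online resizing
resize2fs /dev/backups/backup
mount /dev/backups/backup /backup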

21 May, 2011

ruby cloudwatch -> mrtg interface

I use mrtg to gather historical data on some of my servers. One of those servers lives in Amazon's Elastic Compute Cloud (EC2) and so is also monitored by Amazon CloudWatch.

Can I get cloudwatch data into mrtg?

mrtg has a fairly straightforward interface for plugging in arbitrary unix executables to collect data, so my first attempt was to use the main Java-based cloudwatch client to get the data. That attempt started up one JVM for each metric collected, which massively overloaded my EC2 micro instance, keeping the load average around 3. Pretty lame.

Amazon also provides a ruby interface. I had never programmed in ruby before, but it's often interesting to learn a new language.

Here's what I ended up with.

First the config block for mrtg, which calls out to the ruby-mrtg-cloudwatch3 program that I wrote:

Target[cloudwatch_network]: `/home/mrtg/ruby-mrtg-cloudwatch/ruby-mrtg-cloudwatch3 NetworkIn NetworkOut AWS/EC2 InstanceId=i-26bcaf51`
Title[cloudwatch_network]: Network traffic according to cloudwatch
options[cloudwatch_network]: growright,absolute,logscale
MaxBytes[cloudwatch_network]: 100000000

This gives a graph of network traffic according to cloudwatch. I can compare that alongside the network traffic graph for eth0, gathered from the local interface statistics. They should roughly match up, and they do (well, hopefully they still do by the time you read this - these are live images):

[graph: network traffic according to the on-host network interface]

[graph: network traffic according to cloudwatch]
Now the actual ruby code:

#!/usr/bin/ruby1.8

require 'rubygems'
require 'AWS'
require 'time' # Time.parse, used below, lives here

The two cloudwatch metric names, one that measures output data and one that measures input data, are given on the command line:
metrico=ARGV[0]
metrici=ARGV[1]

My code has hardcoded access keys at the moment, which is a bit shitty:
ACCESS_KEY_ID='foo'
SECRET_ACCESS_KEY='bar'


Using the above credentials, a new cloudwatch object is made, @cw.

@cw = AWS::Cloudwatch::Base.new(:access_key_id => ACCESS_KEY_ID, :secret_access_key => SECRET_ACCESS_KEY, :server => "eu-west-1.monitoring.amazonaws.com" )

Each of the two metrics will be probed with the probe function. This uses a state file, named after the metric, to fetch only readings which have not already been seen by this script. The two metrics use separate state files because cloudwatch doesn't give an atomic read for multiple metrics at once. The state file stores the time of the last seen reading. If there is no state file, we have to invent a time. There is a subtlety here: data does not appear in cloudwatch until around 5 minutes after its timestamp, so using the current time as an initial value results in never seeing any results. Instead, I go back about 15 minutes the first time, which seems to be far enough back to get something.

def probe(metric)

  et = Time.now()

  statusfn="cloudwatch-"+ARGV[3]+"-"+metric+".status"
  if FileTest.exist?(statusfn) then

    f = File.new(statusfn, "r")
    tstring = f.gets
    ts = Time.parse(tstring)
    f.close
  else
    ts = et - 900 # needs to be more than 5 mins because otherwise we never get any data.
  end

  res = @cw.get_metric_statistics(:measure_name => metric, :statistics => 'Average,Sum', :namespace => ARGV[2], :period => 300, :start_time => ts, :end_time => et, :dimensions => ARGV[3])


Now we're going to look at the rows that come back. Usually only one row will come back, if we're running this at about the same rate that cloudwatch is adding readings, but sometimes there will be more, or fewer.

In the case of network traffic, I want to return the sum of all readings for this metric. In other cases, such as disk usage, I would want to return the mean. This distinction is the same as default vs gauge measurements in MRTG.

  samples = 0
  sum = 0
  avgsum = 0

  datapoints = res["GetMetricStatisticsResult"]["Datapoints"]

  lt = ts
  if datapoints.nil? then
   # nop
  else
    rows = datapoints["member"]

    rows.each { |r|
      nlt = Time.parse(r["Timestamp"])
      if(nlt < ts) then
        # nop - time was before requested start
      else
        samples += Float(r["Samples"])
        avgsum += Float(r["Average"])
        sum += Float(r["Sum"])
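        # step one second past this reading so the timestamp saved to the state file excludes it next run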
        nlt += 1
        if(nlt > lt) then
          lt = nlt
        end
      end
    } 

Now we can write out the new state file:

    f = File.new(statusfn, "w")
    f.puts(lt)
    f.close
  end
  return sum
end


and finally output the MRTG format information:

sumo=probe(metrico)
sumi=probe(metrici)

# output mrtg format
puts sumo
puts sumi
puts 0
puts "cloudwatch: "+metrico+" and "+metrici


The end.

14 January, 2011

IPv6 in Amazon EC2

Amazon declares that IPv6 is unsupported on EC2 (the Elastic Compute Cloud), but I wanted it anyway.

I tried two ways, one which worked well, and one which did not.
The first way I tried, which did not work well, was using 6to4; the way which did work was a tunnel from Hurricane Electric.

I did everything below on one of the free Linux micro-instances supplied by Amazon, with an elastic IP address attached so that I'd have a permanent address.

The first approach I tried was using 6to4. This is a protocol which automatically gives a large range of IPv6 addresses to anyone with a single static IPv4 address, through a decentralised network of protocol gateways.

In another blog post, I described how to get 6to4 running on Linux in 5 command lines. I ran those commands on my EC2 instance and ended up with my own IPv6 address configured.
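
From memory, that recipe looks roughly like this (a sketch, not EC2-specific: the 2002:xxxx:xxxx:: prefix is derived from your public IPv4 address, 10.0.0.1 stands in for the instance's private address - which is what the tunnel must use locally, since the elastic IP is NATed - and ::192.88.99.1 is the standard 6to4 anycast relay):

# ip tunnel add tun6to4 mode sit remote any local 10.0.0.1 ttl 64
# ip link set dev tun6to4 up
# ip addr add 2002:xxxx:xxxx::1/16 dev tun6to4
# ip -6 route add ::/0 via ::192.88.99.1 dev tun6to4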

But having an IPv6 address configured is not the same as having an IPv6 address reachable from the internet. There were severe reachability problems. The problems boil down to two causes:

  • The decentralised management of 6to4 leads to lots of gateways being broken in a way that makes them act as blackholes; and the protocol makes it hard to discover which gateways those are in order to fix them. This is a problem for any host, not just for EC2 instances.
  • Even with the EC2 firewall turned off as much as possible (i.e. no firewall rules at all), EC2 doesn't cleanly deliver IP traffic to instances. For most traffic, this is not a problem; but it interacts with 6to4 in a terrible way:
    your instance can only receive traffic from a particular 6to4 gateway if it has recently sent at least one packet to that gateway; there is a global network of 6to4 gateways, any of which can send traffic to you; but you send your 6to4 traffic only to your closest gateway. As a result, basically no 6to4 gateway can ever send traffic to you.

That second bullet point was especially hard to debug before I figured out what the EC2 firewall was doing - because certain pings and probes to test reachability caused the firewall to open up, at which point IPv6 traffic would flow between some sets of hosts until a few minutes later when it would mysteriously stop working again.

6to4 is basically useless on ec2.


The second method I tried, with much greater success [to the extent that, a year later, I regard this as production quality], is a manually configured tunnel via Hurricane Electric. HE have been around a long time; they have a good reputation; and I've used them before, years ago.

The configuration at their end is a set of fairly straightforward web forms. I was allocated a /64 prefix, but they have options for more.

That web form also gives example configuration instructions for a variety of platforms. The Linux-net-tools instructions are the ones I used. I pasted the 4 given commands literally into a root prompt on my EC2 machine, and those configured the interface correctly (at least until reboot).
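
From memory, they look roughly like this in HE's net-tools style - the tunnel server address and the client prefix here are placeholders for whatever your tunnel details page shows:

# ifconfig sit0 up
# ifconfig sit0 inet6 tunnel ::216.66.80.26
# ifconfig sit1 up
# ifconfig sit1 inet6 add 2001:470:xxxx:xxxx::2/64
# route -A inet6 add ::/0 dev sit1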


I get similar "stateful firewall" behaviour from EC2 as I mentioned above, but the difference here is that the connection is only between me and the HE tunnel endpoint, rather than to arbitrary 6to4 gateways around the network. As long as *anything* goes over the tunnel every few minutes, connectivity with the entire IPv6 world stays up. Compared to 6to4, when that tunnel is up, the connectivity from machines D and P seems *much* better. I can ping both ways without any mysterious losses. I need a ping to the tunnel endpoint (or anywhere on the IPv6 internet, really) every minute or so. That's no big deal - I have MRTG set up to measure some IPv6 latencies anyway, and that generates this traffic as a side-effect.
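
A hypothetical crontab entry covers that keepalive if nothing else is generating traffic - one ping a minute to HE's side of the tunnel (the address is a placeholder for your tunnel's server-side IPv6 address):

* * * * * ping6 -c 1 2001:470:xxxx:xxxx::1 > /dev/null 2>&1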

So HE is a little bit (a few web forms) more effort to set up. But the connectivity is much much better. I recommend HE over 6to4 for this.

Other links: aco wrote about getting IPv6 on EC2 using sixxs; and if you're interested in getting a shell account on this machine (barwen.ch) to try for yourself: www.barwen.ch. I found this online ping tool useful during testing.


Modified: 2011-04-19 Rephrasing a bit based on ongoing experience, and some more hyperlinks
Modified: 2012-05-05 Rephrasing some more.

