Preparing EC2 Instance Store with cloud-init
Most Amazon Machine Images (AMIs) are backed by an Elastic Block Store (EBS) volume. This volume houses the operating system and any additional software added to the machine image. When you launch an instance of an EBS-backed AMI, the resulting EC2 instance usually includes some amount of instance store as well. Instance store is fast (relative to EBS), but it is also temporary and physically attached to the virtual machine host.
Unprepared Instance Store
Instance store is associated with an EC2 instance via a block device mapping. Usually, instance store mappings carry a virtual device name of ephemeral0 to ephemeralN and are pre-formatted as ext3. Unfortunately, no formatted ext3 file system exists if you’re using SSD-based instance store with TRIM support (only r3.* and i2.* instances right now).
If you’re dealing with instance store that’s not pre-formatted, or you want to use a file system other than ext3, how do you remedy that elegantly inside of EC2? One possible answer is a set of cloud-init directives via EC2 user data.
User Data and cloud-init
Before launching an EC2 instance, you can provide it with a bit of user data. User data can either be a shell script or a set of cloud-init directives.
Using the fs_setup cloud-init module, formatting a pair of SSD volumes looks something like:
fs_setup:
  - label: ephemeral0
    filesystem: ext3
    extra_opts: [ "-E", "nodiscard" ]
    device: ephemeral0
    partition: auto
  - label: ephemeral1
    filesystem: ext3
    extra_opts: [ "-E", "nodiscard" ]
    device: ephemeral1
    partition: auto
After the volumes are formatted, you probably also want to mount them somewhere. The mounts module can handle that:
mounts:
  - [ ephemeral0, null ] # Override any default EC2 mounting behavior
  - [ ephemeral1, null ] # Override any default EC2 mounting behavior
  - [ ephemeral0, "/media/ephemeral0", "ext3", "defaults,nobootwait,discard", "0", "2" ]
  - [ ephemeral1, "/media/ephemeral1", "ext3", "defaults,nobootwait,discard", "0", "2" ]
Lastly, we can change the user and group for these mounts with runcmd so that users other than root (here I’m using hdfs) can read and write to them:
runcmd:
  - [ chown, hdfs, "/media/ephemeral0" ]
  - [ chgrp, hdfs, "/media/ephemeral0" ]
  - [ chown, hdfs, "/media/ephemeral1" ]
  - [ chgrp, hdfs, "/media/ephemeral1" ]
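As an aside, chown accepts an owner:group pair and more than one path, so the four commands above can be collapsed into a single call. This is only a sketch of an alternative to the runcmd block above (use one or the other, not both), and it assumes that a user and a group both named hdfs already exist on the image:

```yaml
runcmd:
  # Assumption: user "hdfs" and group "hdfs" already exist on the AMI.
  # A single chown sets owner and group on both mount points.
  - [ chown, "hdfs:hdfs", "/media/ephemeral0", "/media/ephemeral1" ]
```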
After putting all of these snippets together inside of a .yml file with #cloud-config at the top, it’s ready to be fed through the launch process of new EC2 instances via user data. With any luck, the end result is a few nicely formatted and mounted volumes of instance store.
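For reference, here is roughly what the assembled file might look like. It is a sketch that simply stacks the three snippets from this post under the #cloud-config header; the two-volume layout, the /media/ephemeralN mount points, and the hdfs user are the same assumptions made above, so adjust them to match your instance type and image.

```yaml
#cloud-config

# Format both instance store volumes as ext3.
# "-E nodiscard" is passed through to mkfs so it skips discarding blocks at format time.
fs_setup:
  - label: ephemeral0
    filesystem: ext3
    extra_opts: [ "-E", "nodiscard" ]
    device: ephemeral0
    partition: auto
  - label: ephemeral1
    filesystem: ext3
    extra_opts: [ "-E", "nodiscard" ]
    device: ephemeral1
    partition: auto

# Clear any default EC2 mounts, then mount each volume with discard enabled.
mounts:
  - [ ephemeral0, null ]
  - [ ephemeral1, null ]
  - [ ephemeral0, "/media/ephemeral0", "ext3", "defaults,nobootwait,discard", "0", "2" ]
  - [ ephemeral1, "/media/ephemeral1", "ext3", "defaults,nobootwait,discard", "0", "2" ]

# Hand the mount points over to a non-root user (hdfs is assumed to exist).
runcmd:
  - [ chown, hdfs, "/media/ephemeral0" ]
  - [ chgrp, hdfs, "/media/ephemeral0" ]
  - [ chown, hdfs, "/media/ephemeral1" ]
  - [ chgrp, hdfs, "/media/ephemeral1" ]
```

One caveat worth checking against your own AMI: as far as I know, nobootwait is specific to older Upstart-based Ubuntu releases, and systemd-based images generally expect nofail in its place.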