Preparing EC2 Instance Store with cloud-init
Most Amazon Machine Images (AMIs) are backed by an Elastic Block Store (EBS) volume. This volume houses the operating system and any additional software added to the machine image. When you launch an instance of an EBS-backed AMI, the resulting EC2 instance usually includes some amount of instance store storage as well. Instance store is fast (relative to EBS), but it is also temporary and physically attached to the virtual machine host.
Unprepared Instance Store
Instance store is associated with an EC2 instance via a block device mapping. Usually, instance store mappings carry a virtual device name of ephemeralN and are pre-formatted as ext3. Unfortunately, no formatted ext3 file system exists if you’re using SSD-based instance store with TRIM support (only i2.* instances right now).
If you’re dealing with instance store that’s not pre-formatted, or you want to use a filesystem other than ext3, how do you remedy that elegantly inside of EC2? One possible answer is a set of cloud-init directives via EC2 user data.
User Data and cloud-init
Before launching an EC2 instance, you can provide it with a bit of user data. User data can either be a shell script or a set of cloud-init directives. Using the fs_setup cloud-init module, formatting a pair of SSD volumes looks something like:
    fs_setup:
      - label: ephemeral0
        filesystem: ext3
        extra_opts: [ "-E", "nodiscard" ]
        device: ephemeral0
        partition: auto
      - label: ephemeral1
        filesystem: ext3
        extra_opts: [ "-E", "nodiscard" ]
        device: ephemeral1
        partition: auto
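To relate each fs_setup entry to a command you could run by hand, here is a rough Python sketch that builds a comparable mkfs invocation. The device path /dev/xvdb is an assumption (the real name depends on the instance type and AMI), and the argument order is illustrative, not a record of what cloud-init runs internally.

```python
def mkfs_command(entry, device_path):
    """Build an mkfs argv list comparable to one fs_setup entry.

    device_path is a hypothetical, already-resolved device such as /dev/xvdb;
    cloud-init resolves virtual names like ephemeral0 on its own.
    """
    cmd = ["mkfs." + entry["filesystem"]]
    if entry.get("label"):
        cmd += ["-L", entry["label"]]  # filesystem label
    # extra_opts are passed through as-is; "-E nodiscard" tells mke2fs
    # not to discard blocks at format time.
    cmd += entry.get("extra_opts", [])
    cmd.append(device_path)
    return cmd


entry = {
    "label": "ephemeral0",
    "filesystem": "ext3",
    "extra_opts": ["-E", "nodiscard"],
    "device": "ephemeral0",
    "partition": "auto",
}
print(" ".join(mkfs_command(entry, "/dev/xvdb")))
```

Skipping the discard pass at format time keeps mkfs quick on large SSD volumes, while the discard mount option used later still enables TRIM during normal operation.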
After the volumes are formatted, you probably also want to mount them somewhere. The mounts module can handle that:
    mounts:
      - [ ephemeral0, null ]  # Override any default EC2 mounting behavior
      - [ ephemeral1, null ]  # Override any default EC2 mounting behavior
      - [ ephemeral0, "/media/ephemeral0", "ext3", "defaults,nobootwait,discard", "0", "2" ]
      - [ ephemeral1, "/media/ephemeral1", "ext3", "defaults,nobootwait,discard", "0", "2" ]
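The six positions in each full mounts entry follow the /etc/fstab field order: device, mount point, filesystem type, mount options, dump frequency, and fsck pass number. As a minimal sketch of that mapping, the Python below renders one entry as an fstab-style line. The device_map values are assumptions for illustration, and the two-element null entries are deliberately not handled.

```python
FSTAB_FIELDS = ("fs_spec", "fs_file", "fs_vfstype", "fs_mntops", "fs_freq", "fs_passno")


def fstab_line(mount, device_map):
    """Render one six-field mounts entry as an /etc/fstab-style line."""
    if len(mount) != len(FSTAB_FIELDS):
        raise ValueError("expected six fields, got %d" % len(mount))
    fields = dict(zip(FSTAB_FIELDS, mount))
    # Swap the virtual name (ephemeral0) for a concrete device path if known.
    fields["fs_spec"] = device_map.get(fields["fs_spec"], fields["fs_spec"])
    return "\t".join(fields[name] for name in FSTAB_FIELDS)


# Hypothetical mapping; actual device names vary by instance type and AMI.
device_map = {"ephemeral0": "/dev/xvdb", "ephemeral1": "/dev/xvdc"}
mount = ["ephemeral0", "/media/ephemeral0", "ext3", "defaults,nobootwait,discard", "0", "2"]
print(fstab_line(mount, device_map))
```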
Lastly, we can change the user and group for these mounts with runcmd so that users other than root (here I’m using hdfs) can read and write to them:
    runcmd:
      - [ chown, hdfs, "/media/ephemeral0" ]
      - [ chgrp, hdfs, "/media/ephemeral0" ]
      - [ chown, hdfs, "/media/ephemeral1" ]
      - [ chgrp, hdfs, "/media/ephemeral1" ]
Put all of these snippets together in a .yml file with #cloud-config at the top, and it’s ready to be passed as user data when launching new EC2 instances. With luck, the result is a few nicely formatted and mounted volumes of instance store.
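Before launching anything, it can be worth sanity-checking the assembled file. The Python sketch below is one hypothetical pre-flight check, not part of cloud-init: it confirms the #cloud-config header, the 16 KB limit EC2 places on raw user data, and the presence of the three top-level keys used in this post. The function name and the abbreviated sample document are made up for illustration.

```python
MAX_USER_DATA_BYTES = 16 * 1024  # EC2 limit on raw (pre-base64) user data


def check_cloud_config(text):
    """Return a list of problems found in a cloud-config document."""
    problems = []
    lines = text.splitlines()
    if not lines or lines[0].strip() != "#cloud-config":
        problems.append("first line must be #cloud-config")
    if len(text.encode("utf-8")) > MAX_USER_DATA_BYTES:
        problems.append("user data exceeds 16 KB")
    # Plain text scan for the keys this post relies on; no YAML parsing.
    for key in ("fs_setup", "mounts", "runcmd"):
        if not any(line.startswith(key + ":") for line in lines):
            problems.append("missing top-level key: " + key)
    return problems


# Abbreviated sample; a real file would carry the full entries.
sample = """#cloud-config
fs_setup:
  - label: ephemeral0
mounts:
  - [ ephemeral0, "/media/ephemeral0" ]
runcmd:
  - [ chown, hdfs, "/media/ephemeral0" ]
"""
print(check_cloud_config(sample))
```

The check is a plain text scan, so it catches a missing header or an oversized file but will not notice YAML syntax errors.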