Recently, we at Infopark have been exploring another way to speed up our integration tests.
We run our test suite on EC2 instances created by AWS OpsWorks, and building an instance using only custom Chef recipes took quite a bit of time. Since our recipes rarely change, we decided to switch to a snapshot-based approach using Amazon Machine Images (AMIs).
OpsWorks recommends using EBS-backed instances for creating custom AMIs, but we are using instance store-based EC2 machines, so the documentation on the subject doesn't exactly fit our use case. I went the trial-and-error route, and here's how I did it, reducing our instance start-up time from 17 to 6 minutes:
The Ubuntu 12 package
ec2-ami-tools has a bug with Ruby 1.9, which taught me why Amazon recommends using their latest download.
ec2-bundle-vol by default excludes far too many files (notably certificates for HTTPS downloads done internally by OpsWorks) to get a working OpsWorks instance from the generated AMI. I resolved this by using
--no-filter, and specifying
apache2 service needs to be stopped.
I had to take extra precautions to time the snapshot properly (i.e. so that an instance starts successfully from the resulting AMI). Your mileage may vary on these:
- Wait for the instance to be 'ready'. In our case I had to wait until our Monit-controlled processes were no longer in the
Initializing state. (This is just a wild guess; it may be hiding a completely unrelated timing issue.)
- Don't wait too long after issuing the image preparation commands from the OpsWorks guide. After pausing a few minutes,
mnt directories controlled by OpsWorks via
automount gave me trouble.
- Seeing this mentioned in some example scripts, I added a
sync;sync before vol bundling (but didn't double-check whether this is needed at all).
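The waiting step from the list above could be scripted roughly like this. It is only a sketch: wait_for_monit and the MONIT_STATUS_CMD override are made-up names, and it assumes that monit summary prints the word Initializing for processes that are not up yet, which may differ between Monit versions.

```shell
# Poll Monit until no process is reported as "Initializing".
# MONIT_STATUS_CMD is a hypothetical hook for swapping in another status command.
wait_for_monit() {
    max_tries=${1:-30}   # polls before giving up
    delay=${2:-10}       # seconds between polls
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Ready only if the status command itself succeeded
        # and none of its output lines mention "Initializing".
        if status=$(${MONIT_STATUS_CMD:-monit summary} 2>/dev/null) &&
           ! printf '%s\n' "$status" | grep -q 'Initializing'; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1   # still initializing (or Monit unreachable) after all polls
}
```

Calling something like wait_for_monit 30 10 && sync && sync right before bundling would cover both the waiting and the flushing precaution, without proving that either is strictly necessary.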
Update: After stumbling upon another hidden OpsWorks artefact, cleaning the Autofs config before creating the image solved an intermittent failure (OpsWorks missing
/var/log/apache2 on instance start):
sed -i /opsworks/d /etc/auto.master
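To try that clean-up on a copy before touching the real file, the same sed call can be wrapped in a small function. This is a sketch only: clean_autofs_master and its optional path argument are my own naming, the backup step is an addition, and sed -i is used in its GNU form as found on Ubuntu.

```shell
# Delete every line mentioning "opsworks" from an autofs master map.
# The optional path argument is a hypothetical addition for testing on a copy.
clean_autofs_master() {
    master=${1:-/etc/auto.master}
    # Keep a backup next to the file in case the pattern removes too much.
    cp -p "$master" "$master.bak" || return 1
    sed -i '/opsworks/d' "$master"
}
```

Running clean_autofs_master /tmp/auto.master.copy and comparing the result with the generated .bak file shows exactly which entries would disappear.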
Our snapshots are made on demand during a test run. The preparation steps listed in the documentation assume the machine will be shut down and thus won't keep the instance in a well-behaving state. So I replaced the
rm -rf commands with
--exclude arguments to
ec2-bundle-vol. After taking the snapshot I can now just restart Apache, and do a
rm -rf /root/.monit.state && service monit start to restore a happily running OpsWorks instance.
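As a rough illustration of swapping deletions for exclusions, the helper below turns a list of paths into the comma-separated value that --exclude takes and prints the resulting command instead of running it. The function names, the key and certificate file names, and the ACCOUNT_ID placeholder are assumptions for the sake of the example; the actual paths to exclude are whatever the OpsWorks guide tells you to delete.

```shell
# Join paths into the comma-separated form expected by --exclude.
join_excludes() {
    list=
    for dir in "$@"; do
        list="${list:+$list,}$dir"
    done
    printf '%s\n' "$list"
}

# Print (do not run) a bundling command for review.
# Key, certificate, and account id are placeholders.
bundle_command() {
    excludes=$(join_excludes "$@")
    printf 'ec2-bundle-vol --no-filter --exclude %s -k private_key.pem -c cert.pem -u %s\n' \
        "$excludes" "${AWS_ACCOUNT_ID:-ACCOUNT_ID}"
}
```

For instance, bundle_command /var/lib/aws/opsworks /var/chef prints a command line with both paths excluded; those two paths are purely illustrative and not a verified list.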
apt-get clean saves us about 25 percent in image size.
Since we're using our AMIs solely for private purposes, the private key and certificate required by
ec2-bundle-vol are less relevant. So I just generate throw-away keys with
openssl req -x509 -newkey rsa:2048 -keyout private_key.pem -out cert.pem -days 365 -nodes -batch.