Creating an Instance Store-Backed AMI from a Running OpsWorks Instance

Recently, we at Infopark have been exploring another way to speed up our integration tests. We run our test suite on EC2 instances created by AWS OpsWorks, and building an instance using only custom Chef recipes took quite a bit of time. Since our recipes rarely change, we decided to switch to a snapshot-based approach using Amazon Machine Images (AMIs).

The challenge

OpsWorks recommends using EBS-backed instances for creating custom AMIs. We are using instance store-backed EC2 machines, so the documentation on the subject doesn't exactly fit our use case. I had a go, went the trial-and-error route, and here's how I did it, reducing our instance start-up time from 17 to 6 minutes.

Basics

As described by Amazon, I'm using the AMI tools ec2-bundle-vol and ec2-upload-bundle to make an instance store-backed snapshot. The final registration of the resulting image is done by an API call.

Pitfalls

- The Ubuntu 12 package ec2-ami-tools has a bug with Ruby 1.9. So I learnt why Amazon recommends using their latest download.
- By default, ec2-bundle-vol excludes far too many files (notably certificates for HTTPS downloads done internally by OpsWorks) to get a working OpsWorks instance from the generated AMI. I resolved this by using --no-filter and specifying the --exclude list manually.
- The apache2 service needs to be stopped.

Extra precautions

I had to take extra precautions to time the snapshot properly (i.e. so that an instance starts successfully from the resulting AMI). Your mileage may vary on these:

- Wait for the instance to be 'ready'. In our case I had to wait until our Monit-controlled processes were no longer in the Initializing state. (This is just a wild guess; it may be hiding a completely unrelated timing issue.)
- Don't wait too long after issuing the image preparation commands from the OpsWorks guide. After pausing for a few minutes, the mnt directories controlled by OpsWorks via automount gave me trouble.
- Seeing this mentioned in some example scripts, I added a sync;sync before bundling the volume (but didn't double-check whether this is needed at all).

Update: After stumbling upon another hidden OpsWorks artefact, cleaning the Autofs config before creating the image solved an intermittent failure (OpsWorks missing /var/log/apache2 on instance start):

    sed -i /opsworks/d /etc/auto.master

Amendments

Our snapshots are made on demand during a test run. The preparation steps listed in the documentation assume the machine will be shut down afterwards, and thus won't keep the instance in a well-behaved state. So I replaced the rm -rf commands with --exclude arguments to ec2-bundle-vol. After taking the snapshot I can now just restart Apache and restore a happily running OpsWorks instance with:

    rm -rf /root/.monit.state && service monit start

Bonus points

- apt-get clean saves us about 25 percent in image size.
- Since we're using our AMIs solely for private purposes, the private key and certificate required by ec2-bundle-vol are less relevant. So I just generate throw-away keys with:

    openssl req -x509 -newkey rsa:2048 -keyout private_key.pem -out cert.pem -days 365 -nodes -batch
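
Putting the pieces together, the whole procedure might look roughly like the sketch below. This is not our actual script, just a minimal outline under stated assumptions: the account ID, bucket name, image prefix and bundle directory are hypothetical placeholders, the excluded paths are an assumed stand-in for whatever the rm -rf commands in the OpsWorks guide list, and the final registration uses the aws CLI as one possible way to make the API call. The helper functions run, join_excludes and wait_for_monit are made up for illustration. By default the script only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch only. Prints commands by default; run with DRY_RUN=0 (as root) to execute.
set -eu
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical placeholders; replace with your own values.
ACCOUNT_ID="${ACCOUNT_ID:-123456789012}"
BUCKET="${BUCKET:-my-ami-bucket}"
PREFIX="${PREFIX:-opsworks-test-image}"
BUNDLE_DIR="${BUNDLE_DIR:-/mnt/bundle}"

# Echo the command in dry-run mode, execute it otherwise.
# Note: the dry run echoes all arguments, including credentials.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Join all arguments with commas, the list format --exclude expects.
# The parentheses run the body in a subshell, so IFS is not changed globally.
join_excludes() (
  IFS=,
  printf '%s\n' "$*"
)

# Block until no Monit-controlled process reports the Initializing state.
wait_for_monit() {
  while monit summary | grep -qi 'initializing'; do
    sleep 5
  done
}

# Assumed paths: take the real list from the rm -rf step of the OpsWorks guide.
EXCLUDES=$(join_excludes /etc/aws/opsworks /opt/aws/opsworks \
  /var/log/aws/opsworks /var/lib/aws/opsworks /etc/chef "$BUNDLE_DIR")

# Throw-away key and certificate, sufficient for private AMIs.
run openssl req -x509 -newkey rsa:2048 -keyout private_key.pem \
  -out cert.pem -days 365 -nodes -batch

[ "$DRY_RUN" = "1" ] || wait_for_monit

# Prepare the instance, then bundle without further delay.
run apt-get clean
run service apache2 stop
run sed -i /opsworks/d /etc/auto.master
run mkdir -p "$BUNDLE_DIR"
run sync
run sync
run ec2-bundle-vol --no-filter --exclude "$EXCLUDES" \
  -k private_key.pem -c cert.pem -u "$ACCOUNT_ID" -r x86_64 \
  -d "$BUNDLE_DIR" -p "$PREFIX"

# Bring the running instance back to a healthy state.
run service apache2 start
run rm -rf /root/.monit.state
run service monit start

# Upload the bundle and register the image (aws CLI is an assumption here).
run ec2-upload-bundle -b "$BUCKET" -m "$BUNDLE_DIR/$PREFIX.manifest.xml" \
  -a "${AWS_ACCESS_KEY_ID:-ACCESS_KEY_PLACEHOLDER}" \
  -s "${AWS_SECRET_ACCESS_KEY:-SECRET_KEY_PLACEHOLDER}"
run aws ec2 register-image --name "$PREFIX" \
  --image-location "$BUCKET/$PREFIX.manifest.xml"
```

Running it once in dry-run mode is a cheap way to review the exact ec2-bundle-vol invocation before touching a real instance. Restoring Apache and Monit right after bundling, rather than after the upload, keeps the time the test instance spends in a degraded state short.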