When setting up a deployment process on the cloud, one faces a variety of choices. In this post, we’ll share what we’ve learned from our extensive experience in guiding clients toward the best choice for their needs.
Suppose we want a deployment process for a small startup with fewer than 20 developers. The startup hosts a webapp that’s gaining traction and is expected to grow rapidly. Its requirements are as follows:
Autoscaling support to handle expected surges in demand.
Maximizing developer efficiency by automating tedious tasks and improving dev flow.
Encouraging mature processes for building a stable foundation as the codebase grows.
Maintaining flexibility and agility to handle hotfixes of a relatively immature codebase.
Relying on as few external sources as possible, since the failure of any one of them can cause a deployment to fail. Imagine GitHub going down or a required plugin becoming unavailable.
Narrowing the focus a bit more, let’s assume the codebase is using Ruby on Rails, as is the case for several of our clients. We’ll examine various deployment choices in detail, walk through a thorough analysis and then provide recommendations for anyone who fits our sample client profile.
1. The Plain Vanilla AMI Method
This proven deployment is a well-tested Amazon OpsWorks Standard recommendation. Each time a new node comes up fresh, it must run all Chef recipes. To automate this process, Cloud-init runs scripts that handle the code and environment updates that occur while nodes are running.
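As a rough illustration of the boot-time flow described above, the sketch below builds a user-data shell script of the kind cloud-init executes on first boot. It assumes Chef has already been installed and configured on the node before chef-client runs. The function name, recipe names and attributes path are hypothetical placeholders, not part of any particular OpsWorks setup.

```python
import json


def build_user_data(run_list, attrs_path="/etc/chef/first-boot.json"):
    """Return a shell script for cloud-init to run when a fresh node boots."""
    first_boot = json.dumps({"run_list": run_list}, indent=2)
    lines = [
        "#!/bin/bash",
        "set -e  # stop the bringup at the first failing step",
        "# Assumption: Chef is installed and /etc/chef/client.rb is configured by this point.",
        f"cat > {attrs_path} <<'EOF'",
        first_boot,
        "EOF",
        "# Every recipe runs from scratch on every new instance.",
        f"chef-client -j {attrs_path}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical recipe names for a Rails stack.
user_data = build_user_data(["recipe[nginx]", "recipe[ruby]", "recipe[myapp::deploy]"])
print(user_data)
```

Every step in that script runs on every bringup, which is where both the slowness and the number of possible failure points come from.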
Pros: This approach requires no AMI management. The process is straightforward, self-documenting and brings up a clean environment every time. Updates and patches are applied very quickly.
Cons: Bringing up new instances is extremely slow. There are many moving parts, so the risk of failure is high.
Our Recommendation: While this is a clean solution, the high failure rate and the time needed for bringup make the Plain Vanilla AMI impractical for a use case with autoscaling.
2. The Bake-Everything AMI Method
This deployment option is proven to work at Amazon Video and Netflix. One runs all Chef recipes once, fetches the codebase, bakes the AMI and then launches instances from it. Every change, whether to code or to the environment, requires a new AMI and a replacement Auto Scaling group (ASG) behind the Elastic Load Balancer (ELB).
Keep in mind that the environment and configuration management parts of the deployment still need automation using tools like Chef and Puppet. Otherwise, a lack of automation can make AMI management a nightmare, as one tends to lose track of what the environment inside the AMI actually looks like.
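One way to keep track of what is inside each image is to derive the AMI name from everything that went into it. The sketch below is only an illustration of that idea, not the tooling used at Netflix or Amazon. The function names, the naming scheme and the revision strings are all assumptions.

```python
import hashlib


def ami_name(app, code_rev, env_rev):
    """Derive a deterministic AMI name from the code and environment revisions."""
    digest = hashlib.sha256(f"{code_rev}:{env_rev}".encode()).hexdigest()[:12]
    return f"{app}-{digest}"


def needs_rebake(current_ami, app, code_rev, env_rev):
    """With bake-everything, any change to code or environment means a new AMI."""
    return current_ami != ami_name(app, code_rev, env_rev)


# Hypothetical revisions: a git commit and a cookbook version.
current = ami_name("webapp", "a1b2c3d", "cookbooks-1.4.0")
print(needs_rebake(current, "webapp", "a1b2c3d", "cookbooks-1.4.0"))  # False: nothing changed
print(needs_rebake(current, "webapp", "e4f5a6b", "cookbooks-1.4.0"))  # True: code changed
print(needs_rebake(current, "webapp", "a1b2c3d", "cookbooks-1.5.0"))  # True: environment changed
```

Because the name is a pure function of its inputs, the same code and environment always map to the same image, and an image with an unexpected name is a sign that something was changed by hand.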
Pros: Provides the fastest bringup, requires no installation and includes the fewest moving parts, so error rates are very low.
Cons: Each code deployment requires baking a new AMI. This requires a lot of effort to ensure that the process is as fast as possible in order to avoid developer bottlenecks. This setup also makes it harder to deploy hotfixes.
Our Recommendation: This is generally a best practice, but requires a certain level of codebase maturity and a high level of infrastructure sophistication. For example, Netflix has spent a lot of time speeding up the process of baking AMIs by using their Aminator project.
3. A Hybrid Method Using Chef to Handle Complete Deployment
This method strikes a balance between the Plain Vanilla AMI and the Bake-Everything AMI. An AMI is baked using Chef for configuration and environment, but the bake doesn’t check out the codebase or deploy the app. Chef does both once the node is brought up.
Pros: Since all packages are pre-installed, this method is significantly faster than using a Plain Vanilla AMI. Also, since the code is pulled once a node is commissioned, the ability to provide hotfixes is improved.
Cons: Since we’re relying on Chef in production, there’s a dependency on the repository, and pulling from the repository may fail.
Our Recommendation: We consider this to be a medium-risk implementation due to its reliance on Chef.
4. A Hybrid Method Using Capistrano to Handle Code Deployment
This is similar to the hybrid Chef deployment approach, but with code deployed through Capistrano. Capistrano is a mature platform for deploying Rails code that includes several features and fail-safe mechanisms that make it better suited than Chef to code deployment. In particular, if a pull from the repo fails, Capistrano deploys an older revision from its backups.
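The fallback behavior described above can be sketched in a few lines. This is a simplified model written for illustration, not Capistrano’s own code. It borrows Capistrano’s layout of a releases directory plus a current symlink, while the deploy function, the fetch callbacks and the release names are hypothetical.

```python
import os
import tempfile


def deploy(deploy_to, release_name, fetch_code):
    """Fetch a new release and point `current` at it; reuse the newest old release on failure."""
    releases = os.path.join(deploy_to, "releases")
    os.makedirs(releases, exist_ok=True)
    previous = sorted(os.listdir(releases))  # releases already on disk, oldest first
    release = os.path.join(releases, release_name)
    try:
        fetch_code(release)  # e.g. a git checkout into the new release directory
    except OSError:
        if not previous:
            raise  # nothing to fall back to on a brand-new node
        release = os.path.join(releases, previous[-1])
    # Build the new symlink beside the old one, then rename it into place.
    current = os.path.join(deploy_to, "current")
    tmp_link = current + ".tmp"
    os.symlink(release, tmp_link)
    os.replace(tmp_link, current)
    return release


def good_fetch(path):
    os.makedirs(path)
    with open(os.path.join(path, "REVISION"), "w") as f:
        f.write("abc123")


def failing_fetch(path):
    raise ConnectionError("repository unreachable")  # ConnectionError is an OSError


root = tempfile.mkdtemp()
first = deploy(root, "20140101000000", good_fetch)
second = deploy(root, "20140102000000", failing_fetch)
print(second == first)  # True: the failed pull fell back to the earlier release
```

The point of the layout is that a failed pull never leaves the node without a deployable copy of the app, as long as at least one earlier release is still on disk.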
Pros: The same as for the Chef hybrid, except that Capistrano is more mature than Chef, especially in handling repository failures.
Cons: It requires two tools instead of one, which increases management overhead even though they’re tied together. In addition, the gap between environment and code is wider, and managing the tools separately is difficult.
Our Recommendation: Capistrano is a better Rails solution for code deployment than Chef, and the ability to apply fixes quickly may make it the best solution.
5. The AMI-Bake and CRON-based Chef-client Method
This deployment method resembles the hybrids. However, it adds auto-propagation of changes, because each node launched from the AMI runs chef-client every N minutes. New AMIs are baked only for major changes. This can provide continuous deployment, but continuous deployment is an aggressive tactic that requires excellent continuous integration on the backend.
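To make the "every N minutes" part concrete, here is a minimal sketch that generates a system crontab entry in the /etc/cron.d format, which includes a user field. The function name and the chef-client path are assumptions; adjust the path to wherever chef-client is installed on the image.

```python
def chef_client_cron(interval_minutes, user="root", chef_client="/usr/bin/chef-client"):
    """Return an /etc/cron.d entry that runs chef-client every N minutes."""
    if not 1 <= interval_minutes <= 59:
        raise ValueError("interval must be between 1 and 59 minutes")
    # Note: */N restarts at minute 0 of each hour, so the gap is only uniform
    # when N divides 60 evenly.
    return f"*/{interval_minutes} * * * * {user} {chef_client}\n"


print(chef_client_cron(15), end="")  # */15 * * * * root /usr/bin/chef-client
```

A shorter interval propagates changes faster, but it also propagates a bad change faster, which is why this method leans so heavily on continuous integration.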
Pros: Allows continuous code deployment.
Cons: It’s prone to errors if Continuous Integration is not stable. In addition, Chef rebootstraps aren’t reliable and may fail.
Our Recommendation: Not recommended unless CI is solid.
6. The Cloud-Init and Docker Method
We’re now developing this setup at Flux7 Labs. Although it’s still in an experimental phase, we believe Docker is the best choice for this use case and that it will come close to a bake-everything solution while getting around bake-everything’s biggest drawbacks. AMIs are baked once and rarely change after that. Both the environment and the app code are contained inside an LXC container, with each AMI hosting one container. Upon code deployment, a new container is simply pushed, which provides deployment-process flexibility.
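A code deployment in this model amounts to swapping the running container for one built from a newer image. The sketch below lists the Docker CLI commands such a swap might run on each node. It is an assumption about how the steps could be wired together, not a description of the Flux7 setup, and the registry address, image name, tag, container name and ports are all hypothetical.

```python
import shlex


def container_swap_commands(image, new_tag, name="webapp", host_port=80, container_port=3000):
    """Return the docker commands that replace the running container with a new image."""
    ref = f"{image}:{new_tag}"
    return [
        ["docker", "pull", ref],   # fetch only the image layers that changed
        ["docker", "stop", name],  # fails harmlessly on a first deploy with no container
        ["docker", "rm", name],
        ["docker", "run", "-d", "--name", name, "-p", f"{host_port}:{container_port}", ref],
    ]


for cmd in container_swap_commands("registry.example.com/webapp", "build-42"):
    print(shlex.join(cmd))
```

The AMI itself only needs Docker and a way to trigger these commands, which is why it rarely has to be rebaked.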
Pros: Docker containers provide a history with which one can compare containers, which helps with issues of undocumented steps in image creation. Code and environment are tied together. The repository structure of containers leads to faster deployment than does baking a new AMI. Docker also helps to create a local environment similar to the production environment.
Cons: Docker is still in the early phases of development and suffers from several issues, including bugs, a limited tools ecosystem, app compatibility issues and a limited feature set.
Our Recommendation: If you adopt this approach, you’ll be doing considerable trailblazing. There’s very little information available, so comparing notes with other pioneers will be helpful. At Flux7, we’d love to discuss your experiences, so feel free to contact us at firstname.lastname@example.org.