Checklist for Validating A DevOps Architecture | Part 2
Last week we explored how business goals should inform every good DevOps strategy. This week we’ll discuss how to use those goals to validate your DevOps architecture. From our experience at Flux7, the best way to do this is to define the workflows of key users.
To ensure that an architecture will meet a client’s business goals, we ask ourselves the following questions:
What is the developer workflow and how will we enable it?
How will we handle mirroring environments for disasters?
How will we handle scaling up and down?
How will we update the environment?
How will we update the code?
How will we keep the code and environment aligned?
How will we make changes to the infrastructure?
To illustrate how these questions inform our work, we’ll walk you through them using our setup from the previous post, “The Best Way To Deploy Ruby On Rails in AWS”, which was as follows:
Chef used to deploy and bake the environment.
Capistrano used to handle code deployments.
Git repository on GitHub used to store code.
CloudFormation templates used for infrastructure deployment.
Now let’s examine how this setup addressed the seven questions above.
What was the developer workflow and how did we enable it?
Using CloudFormation templates to orchestrate infrastructure deployment, the developers selected a pre-baked AMI with the correct environment setup. Even though we deployed the code with Capistrano, we also created a Chef recipe for deployment.
How did we handle mirroring environments for disasters?
Our Ruby on Rails deployment was a real-time experience for a startup client. They could afford a cold DR setup, provided the right alerts were in place to monitor the website. It’s a good idea to make regular production-AMI backups to S3 and to copy them to the DR region. In case of disaster, the environment can be restored by launching the CloudFormation template with the latest AMI in the new region and then updating Route 53 to point to the new region.
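To make the recovery steps concrete, here is a minimal sketch in Python of the two decisions involved: picking the latest AMI copy in the DR region and building the DNS change that repoints the site. It does not call AWS. The image records mirror the `ImageId` and `CreationDate` fields that EC2 returns when describing images, and the dictionary follows the `ChangeBatch` shape that Route 53 accepts for record changes. The AMI IDs, host names, and function names are hypothetical.

```python
from typing import Dict, List


def latest_ami(images: List[Dict[str, str]]) -> str:
    """Return the ImageId of the most recently created image."""
    if not images:
        raise ValueError("no AMI backups available in the DR region")
    # CreationDate is an ISO 8601 UTC string, so string order is time order.
    newest = max(images, key=lambda image: image["CreationDate"])
    return newest["ImageId"]


def failover_change_batch(record_name: str, dr_endpoint: str, ttl: int = 60) -> dict:
    """Build a Route 53 ChangeBatch that repoints record_name at the DR stack."""
    return {
        "Comment": "Fail over to DR region",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": dr_endpoint}],
                },
            }
        ],
    }


# Hypothetical data standing in for the AMI copies found in the DR region.
dr_images = [
    {"ImageId": "ami-11111111", "CreationDate": "2014-03-01T04:00:00.000Z"},
    {"ImageId": "ami-22222222", "CreationDate": "2014-03-08T04:00:00.000Z"},
]
ami_id = latest_ami(dr_images)
batch = failover_change_batch("www.example.com.", "dr-elb.example.com")
```

In a real failover, `ami_id` would be passed as a parameter to the CloudFormation template in the DR region, and `batch` would be submitted to Route 53 once the new stack is healthy. A short TTL on the record keeps the switchover time low.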
How did we handle scaling up and down?
We implemented autoscaling. It’s important that an app server is “hot” (ready to serve traffic) as soon as it comes online, with no manual intervention. This may require scripting because the same AMI needs to work in several different environments.
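One way to script the “hot” check is a small readiness loop that a boot script runs before the instance is put into service. The sketch below is an illustration, not the script we used: the function names, the retry counts, and the idea of probing an HTTP URL are all assumptions.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> Callable[[], bool]:
    """Return a probe that reports True when the URL answers with HTTP 200."""
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    return probe


def wait_until_hot(probe: Callable[[], bool], attempts: int = 10, delay: float = 0.0) -> bool:
    """Call probe up to `attempts` times; return True as soon as it succeeds."""
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example with a fake probe that succeeds on the third call.
fake_results = iter([False, False, True])
is_hot = wait_until_hot(lambda: next(fake_results), attempts=5)
```

On a real instance the call would look like `wait_until_hot(http_probe(url), attempts=30, delay=2.0)`, where `url` is whatever health endpoint the app exposes in that environment.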
How did we update the environment?
We edited the Chef recipe, checked that it functioned properly, and then baked the AMI. To shorten the Chef recipe debug loop, we experimented with running recipes inside a Docker container. This approach let us revert quickly to a previous state if a recipe failed.
How did we update the code?
We pushed the code from the dev branch to the master branch and ran the Capistrano recipe. Capistrano connected to the GitHub account and checked out the latest copy of the required code revision. Since the code was pulled at deployment, rather than being baked into the AMI, baking a new AMI for each code update wasn’t needed. This approach is particularly suitable for hotfixes.
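Capistrano itself is a Ruby tool, but the pull-at-deploy idea is easy to model. Capistrano keeps each deployment in its own timestamped directory under `releases` and points a `current` symlink at the live one. The Python sketch below is a simplified model of that layout, not Capistrano’s code: it skips the actual Git checkout, and the function name and release names are hypothetical.

```python
import os
import tempfile


def activate_release(deploy_root: str, release_name: str) -> str:
    """Point deploy_root/current at releases/<release_name> and return its path."""
    release_dir = os.path.join(deploy_root, "releases", release_name)
    # A real deployment would check out the requested code revision here.
    os.makedirs(release_dir, exist_ok=True)

    current = os.path.join(deploy_root, "current")
    staging = current + ".tmp"
    if os.path.lexists(staging):
        os.remove(staging)
    os.symlink(release_dir, staging)
    # Renaming the staged link over the live one swaps releases atomically on POSIX.
    os.replace(staging, current)
    return release_dir


deploy_root = tempfile.mkdtemp()
first = activate_release(deploy_root, "20140301040000")
second = activate_release(deploy_root, "20140308040000")
live = os.path.realpath(os.path.join(deploy_root, "current"))
```

Because the AMI only has to provide the environment, a hotfix is just another release directory and a symlink flip, and rolling back means pointing `current` at the earlier release again.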
How did we keep the code and environment aligned?
We relied on manual checks to make sure that the deployed code worked in its respective environment. Docker may come in handy in such cases since it versions both code and environment, but we haven’t yet tried this approach.
How did we make changes to the infrastructure?
We updated the CloudFormation template, deployed the environment and code, checked that everything functioned properly, and qualified the template changes. We then assessed the outage the template update would cause and, depending on its extent, either updated the previous stack or created a new stack, and transitioned to S3 when completed.
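The choice between updating in place and standing up a new stack can be written down as a simple rule. The sketch below is only an illustration of that decision: the threshold, the function name, and the idea of expressing the outage in seconds are assumptions, not values from our project.

```python
def plan_stack_change(expected_outage_seconds: float, tolerated_outage_seconds: float) -> str:
    """Choose how to roll out a qualified CloudFormation template change."""
    if expected_outage_seconds < 0 or tolerated_outage_seconds < 0:
        raise ValueError("outage durations must be non-negative")
    if expected_outage_seconds <= tolerated_outage_seconds:
        # The outage is acceptable, so update the running stack in place.
        return "update-existing-stack"
    # The outage is too long, so build a new stack alongside and switch over.
    return "create-new-stack"


# Hypothetical numbers: a change that replaces instances versus a minor tweak.
big_change = plan_stack_change(expected_outage_seconds=600, tolerated_outage_seconds=60)
small_change = plan_stack_change(expected_outage_seconds=0, tolerated_outage_seconds=60)
```

Writing the rule down, even this crudely, forces the team to agree on how much outage is tolerable before a template change goes out.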
Given the wide variety of needs for various organizations, there’s no right or wrong approach to developing your DevOps architecture. But it’s always best to make small iterative-but-real improvements because a huge project that tries to accomplish everything is far more likely to fail. The key to success is not to prevent failure, but rather to maintain a low failure cost.