Håkon presenting AWS Well-Architected Framework Reviews

Move fast and avoid surprises – Be Well-Architected

In the series “Move fast and avoid surprises” I’m elaborating on topics and concepts I’ve learned that have had a major impact in real-life situations. In the modern software industry, being fast is not sufficient on its own if you get caught up in problems by breaking things along the way. I would rather avoid surprises.

To make progress, we can’t accept the status quo. To keep up with trends and competition, you need to move at least as fast as those you compare yourself to. Otherwise someone more lightweight and nimble may suddenly overtake you!

To create a flywheel for innovation with short feedback loops, it’s vital to stay in the flow and avoid disruptions. Surprises are also synonymous with risk. How can we manage risk, be aware of potential pitfalls and make data-driven decisions to achieve the business outcomes we have set?

Be Well-Architected

The first concept that I would like to share is how you can move fast and avoid surprises by following the principles of the AWS Well-Architected Framework, which describes key concepts, design principles, and architectural best practices for succeeding with workloads in the cloud.

The consistent process of learning AWS best practices, measuring your architecture against them, identifying architectural risks and creating an improvement plan to address them is called a Well-Architected Framework Review (WAFR). By adopting WAFR in your project, using it as a gap analysis (and actually closing the gaps), you should be able to move faster.

Figure – Well-Architected Framework review cycle courtesy of AWS

But doing a review for the sake of the review is not the point. The ultimate goal is to improve your cloud-based architecture and organizational setup so that you achieve the desired business outcomes faster.

“Creating a software system is a lot like constructing a building. If the foundation is not solid, structural problems can undermine the integrity and function of the building. When architecting technology solutions, if you neglect the six pillars of Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization and Sustainability, it can become challenging to build a system that delivers on your expectations and requirements.

Incorporating these pillars into your architecture will help you produce stable and efficient systems. This will allow you to focus on the other aspects of design, such as functional requirements.”

Source: https://docs.aws.amazon.com/wellarchitected/latest/framework/the-pillars-of-the-framework.html

So, to be able to move fast and deliver business value, we need to ensure that the basics are taken care of. Gaps at the infrastructure level can be treated much like technical debt at the application layer. Having some level of technical debt is perfectly acceptable as long as you know where it is and how much there is. We can make an informed decision to reduce effort in one area in order to deliver and validate a piece of functionality, provided we also commit to a plan for paying down the debt we introduced and dedicate time for that work in the not too distant future.

How to plan a review (learn)

What are the main business goals your project is expected to achieve within the next 6 – 12 months? Try to see if there are one or two pillars which can support the most important objectives.

Is your project or service ramping up on costs? Plan a review with the Cost Optimization pillar as the top priority.

Is your service not meeting reliability goals (SLA/SLO), or do you have customer complaints about performance? Plan a review with the Reliability and Performance Efficiency pillars as the primary focus.

Do you have a workload which is processing sensitive or personally identifiable information (PII), and/or are you preparing for an upcoming security audit? Plan a review focusing on the Security pillar.

Are you developing a new service that you plan to build sustainably from day one? Organize a workshop around the Sustainability pillar and look to the technical implementation guidance.

If you are getting ready for a big launch, you want it to go smoothly. A review can help you understand any problems you might have missed. Even if it might be challenging to address all of the improvement items, at least you’ve learned about the risks and you can prepare some operational procedures to handle them, should they materialize. At least they should not surprise you now.

As you can see, various situations can benefit from leveraging the Well-Architected toolbox for systematic gap analysis and guidance on best practices. I highly recommend setting some S.M.A.R.T. goals that describe the desired outcomes of the review process, so that you can measure the success of the improvement actions. Some examples:

  • Cost per conversion / customer is reduced by 30%.
  • Reliability for key business transactions is now at >= 99.95%.
  • End-user response time for browsing product recommendations is improved by 200 ms.
  • There are no Critical or High unresolved findings in AWS Security Hub.
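To make goals like these measurable, it can help to write each one down as a metric, a target and a direction. The following is a minimal Python sketch under assumptions: the `Goal` class, the metric names and the measured values are hypothetical illustrations and do not come from any AWS service. It also shows the arithmetic behind an availability target, since 99.95% over a 30-day period leaves a downtime budget of about 21.6 minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """A measurable review goal (hypothetical structure for illustration)."""
    name: str
    target: float
    higher_is_better: bool

    def is_met(self, measured: float) -> bool:
        # Compare in the direction that counts as an improvement.
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Downtime budget implied by an availability target over a period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


goals = [
    Goal("availability_pct", 99.95, higher_is_better=True),
    Goal("open_critical_or_high_findings", 0, higher_is_better=False),
]

# Made-up measurements, standing in for data from your monitoring tools.
measured = {"availability_pct": 99.97, "open_critical_or_high_findings": 2}

for goal in goals:
    status = "met" if goal.is_met(measured[goal.name]) else "not met"
    print(f"{goal.name}: {status}")

print(f"{allowed_downtime_minutes(99.95):.1f} minutes of downtime per 30 days")
```

Keeping the direction explicit avoids a common mistake where a "lower is better" metric such as open findings is compared the wrong way around.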

How to conduct a review (measure)

When you have identified the desired result or set of business outcomes the project is going to achieve, you can start to prepare the review: identify the relevant people to invite and create the review instance in the Well-Architected Tool. It’s important to involve representatives of all roles in a DevOps team, to build ownership and trust, as well as the service/project owner, who can help prioritize afterwards.
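The review instance (a "workload" in the Well-Architected Tool) can be created in the console, but it can also be scripted. Below is a rough Python sketch, assuming the boto3 `wellarchitected` client and its `create_workload` operation. The workload name, description, region, owner address and the chosen pillar identifiers are hypothetical, and the parameter names should be verified against the current API reference before use. The call that talks to AWS is wrapped in a function that is not executed here, since it requires boto3 and valid credentials.

```python
import uuid


def build_workload_request(name, description, regions, pillar_priorities,
                           review_owner, environment="PRODUCTION"):
    """Assemble keyword arguments for a Well-Architected Tool workload."""
    return {
        "WorkloadName": name,
        "Description": description,
        "Environment": environment,        # "PRODUCTION" or "PREPRODUCTION"
        "AwsRegions": regions,
        "ReviewOwner": review_owner,
        "Lenses": ["wellarchitected"],     # alias of the core framework lens
        "PillarPriorities": pillar_priorities,
        "ClientRequestToken": str(uuid.uuid4()),  # idempotency token
    }


def create_review(request):
    """Create the workload. Needs boto3 and AWS credentials; not called here."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("wellarchitected")
    return client.create_workload(**request)["WorkloadId"]


# Hypothetical values, with the pillars from the planning phase listed first.
request = build_workload_request(
    name="webshop-checkout",
    description="Checkout service, review ahead of the launch",
    regions=["eu-north-1"],
    pillar_priorities=["costOptimization", "reliability"],
    review_owner="team-lead@example.com",
)
print(request["WorkloadName"], request["PillarPriorities"])
```

Setting the pillar priorities at creation time carries the decisions from the planning phase into the tool, so the improvement plan is ordered accordingly.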

The review process should not be viewed as an audit, with right or wrong answers, but rather as a constructive conversation among technical peers to assess the state of the current implementation and identify areas that could be improved. Teams may have made decisions with best intentions according to the information they had at the time of implementation, but the situation may have changed and it could be time to evolve the architecture. Be respectful.

I won’t go through all the details in this article, so to move forward I highly recommend reading about the review process in the Well-Architected Documentation. Then check out Ebrahim Khiyami’s excellent AWS Blog series on “How to perform a Well-Architected Framework Review”.

How to close gaps (improve)

After you have performed a review with the AWS Well-Architected Tool, you are left with a list of improvement items categorized as High or Medium risk. Look back to the pillar priority defined in the planning phase to identify the items that are most applicable to your project in the relevant time frame.

One approach to classifying improvement items is to use an Eisenhower-style decision matrix, with implementation effort on one axis and impact on the other. Ideally you would first tackle the items that have high impact and require low effort.

Figure – WAFR prioritization matrix courtesy of Duncan Bell and Johnny Hanley, AWS
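The effort/impact classification can be sketched in a few lines of Python. This is only an illustration under assumptions: the 1 to 5 scoring scale, the threshold, the quadrant names and the example items are all hypothetical, and your team would agree on its own scores during the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    impact: int  # 1 (low) to 5 (high), scored by the team
    effort: int  # 1 (low) to 5 (high), scored by the team


# Quadrants in the order we would like to work through them.
ORDER = ["quick win", "major project", "fill-in", "reconsider"]


def quadrant(item: Item, threshold: int = 3) -> str:
    """Place an item in one of four effort/impact quadrants."""
    high_impact = item.impact >= threshold
    low_effort = item.effort < threshold
    if high_impact and low_effort:
        return "quick win"
    if high_impact:
        return "major project"
    if low_effort:
        return "fill-in"
    return "reconsider"


def prioritize(items):
    # Sort by quadrant, then higher impact first, then lower effort first.
    return sorted(
        items,
        key=lambda i: (ORDER.index(quadrant(i)), -i.impact, i.effort),
    )


items = [
    Item("Rewrite nightly batch job", impact=2, effort=5),
    Item("Multi-AZ database failover", impact=5, effort=4),
    Item("Enable MFA for the root user", impact=5, effort=1),
    Item("Tag resources for cost allocation", impact=2, effort=2),
]

for item in prioritize(items):
    print(f"{quadrant(item):14} {item.title}")
```

The high-impact, low-effort items end up at the top of the list, which matches the order in which they would go into the backlog.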

Look at the whitepaper pillar documentation for technical implementation guidance, like SEC01-BP02 Secure account root user and properties. Here you can see contextual information, the desired outcomes of performing the described actions, some common anti-patterns and detailed implementation steps.

Then create User Stories with a specified Definition of Done and put them into your project backlog (Jira et al.). Consider these important non-functional requirements. I recommend viewing them as general improvements or technical debt, to be paid down systematically over time alongside regular software development items, instead of stopping all feature development for a certain period. You don’t need to tackle everything at once; start at the top of the list. Make it a habit to include one or two items in every sprint.

If you are unsure about priority, the AWS Foundational Security Best Practices (FSBP) standard in AWS Security Hub can help you with the security-related items (Hint: Start with Critical findings, then High, and so on). AWS Cost Explorer can help you with cost optimization. Your monitoring tools can probably provide some input regarding Reliability and Performance Efficiency.
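The "Critical first, then High" ordering can be sketched as a small sort. The example assumes findings shaped like the AWS Security Finding Format, where each finding carries a `Severity.Label` field. The sample findings below are made up, and fetching real ones from Security Hub is left out of this sketch.

```python
# Severity labels from most to least urgent.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFORMATIONAL"]


def by_severity(findings):
    """Order findings so that the most severe ones come first."""
    rank = {label: position for position, label in enumerate(SEVERITY_ORDER)}
    # Unknown labels sort last; sorted() is stable, so ties keep their order.
    return sorted(
        findings,
        key=lambda f: rank.get(f["Severity"]["Label"], len(SEVERITY_ORDER)),
    )


# Hypothetical findings for illustration only.
findings = [
    {"Title": "S3 bucket without access logging", "Severity": {"Label": "MEDIUM"}},
    {"Title": "Root user has access keys", "Severity": {"Label": "CRITICAL"}},
    {"Title": "Security group open on port 22", "Severity": {"Label": "HIGH"}},
]

for finding in by_severity(findings):
    print(finding["Severity"]["Label"], "-", finding["Title"])
```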

Every 3-6 months, or after substantial changes in requirements or architecture, you can perform a lightweight milestone review in the Well-Architected Tool where the team checks off the remediated best practices. As time passes and the project/service matures, the number of detected risks should decrease and the team can dedicate more of its focus higher up in the stack, because they know, having performed a systematic evaluation, that the underlying foundation is shipshape and serving them well.
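To check that the number of risks really does decrease between milestones, you can compare the High and Medium risk counts from one milestone to the next. The sketch below assumes you have noted the counts per milestone yourself. The milestone names and numbers are invented, and the simple list of tuples is a hypothetical structure, not the tool's own export format.

```python
RISK_LEVELS = ("HIGH", "MEDIUM")


def risk_deltas(milestones):
    """Change in risk counts between each milestone and the one before it."""
    deltas = []
    for (_, previous), (name, current) in zip(milestones, milestones[1:]):
        change = {
            level: current.get(level, 0) - previous.get(level, 0)
            for level in RISK_LEVELS
        }
        deltas.append((name, change))
    return deltas


# Invented numbers for illustration.
milestones = [
    ("Initial review", {"HIGH": 12, "MEDIUM": 9}),
    ("Milestone 1", {"HIGH": 7, "MEDIUM": 8}),
    ("Milestone 2", {"HIGH": 3, "MEDIUM": 5}),
]

for name, change in risk_deltas(milestones):
    print(f"{name}: HIGH {change['HIGH']:+d}, MEDIUM {change['MEDIUM']:+d}")
```

A negative number means risks were closed since the previous milestone; a positive number is worth a conversation in the next review.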

Conclusion

In this article we explored some practical steps for adopting Well-Architected as a toolbox in your project to avoid unplanned work, issues and outages, and to accelerate business outcomes instead. By making an initial investment upfront, ensuring good quality at the infrastructure level and leveraging a systematic process to learn, measure and improve, you will be in a favorable position where you can really pick up the pace, develop those cool new features and overtake your competitors.
