How to measure Well-Architected maturity

Introduction

“I always think that you should be asking yourself:
Are you Well-Architected?”

Dr. Werner Vogels, AWS re:Invent keynote 2018

Being Well-Architected means that you have taken care of the basics and your foundation is solid, so that you can move fast and focus on business requirements, with decreased risk of surprises. But how can we answer Werner’s question with confidence? How can we measure Well-Architected maturity in a tangible manner? And what counts as Well-Architected enough for your project phase?

As the first part of the journey, I suggest planning and conducting a Well-Architected Framework Review (WAFR): a conversation about the solution architecture and how the different best practices from AWS could apply in your context (Measure).

In most review conversations, I observe teams spending a substantial amount of time trying to work out whether the designed or provisioned resources meet the configurations AWS recommends as best practice. “Did we set up alerting for this scenario?”, “Did we configure encryption at rest for the database cluster?”, “Did anyone get any alerts about spiking costs?”, “Is our documentation still up to date?” and so on.

During a conversation, spending less effort on Measure could give us more time to Learn about best practices and align on opportunities for Improvement. Personally, I don’t believe that automation and AI capabilities will fully replace the WAFR lifecycle, but they may help accelerate it, reducing the Mean Time To Deployed Improvement (MTTDI) for your users [yes, I just invented that term!].

AWS WAFR process
Figure – Well-Architected Framework review cycle courtesy of AWS

Challenge

There are many ways of gaining insight into how cloud resources are configured compared with AWS best practices.

Some of these options may not be available during a Well-Architected Framework Review due to company policies on changes potentially affecting the entire AWS Organization, cost concerns, security validations or procurement.

But what if AWS-native technology, provisioned for a limited time period, would be acceptable? In this article I will share a possible approach to measuring Well-Architected maturity in the form of AWS Config Conformance Packs.

Measuring Well-Architected maturity with Terraform and AWS Config

Recently I have been working on a Terraform module which can be utilized in scenarios with constraints as described above. A particular reflection I’ve made is that most tools focus primarily on security and reliability. Some dedicated offerings focus solely on Cloud Financial Management and Cost Optimization (also called FinOps), but finding one complete COTS Solution To Rule Them All (that doesn’t charge a premium) is not that likely.

If we roll up our sleeves and develop our own solution with our own custom logic, we can also cover other aspects such as Cost Optimization, Performance Efficiency and Sustainability.

This Terraform module deploys AWS Config Conformance Packs mapped to pillars in the Well-Architected Framework.

For relevant pillars in the AWS Well-Architected Framework, each best practice that is specific enough to be detected is reported as COMPLIANT or NON_COMPLIANT. Some best practices are harder to measure, or depend on a subjective judgment of whether the team is happy with how things are or sees room for improvement:

  • How a team evaluates culture and priorities.
  • How satisfied a team is with insight into their workload(s) or business continuity and disaster recovery planning.
  • How to practice cloud financial management.
  • To which degree performance and cost are factored in when choosing components.

Best practices in Operational Excellence are not straightforward to detect, as opinions on room for improvement in observability are subjective, and observability may be implemented with third-party tools. The main outcome of this module is to accelerate the Well-Architected Framework Review conversation, not to replace it with automation. Our hope is to shift the focus from “how did we configure this?” to “this is our common understanding of where we are today, what could we do to improve for the future?”, thus freeing up valuable time for busy teams.

In addition, the Notes field in the Well-Architected Tool can be populated directly with AWS Config resource compliance check results, leaving you with more insight to discuss improvement actions.

Well-Architected Framework Pillar    Status as of April 2025
Operational Excellence               0 checks
Security (majority of checks)        128 checks
Reliability                          69 checks
Performance Efficiency               0 checks
Cost Optimization                    6 checks
Sustainability                       0 checks

Functional flow of the solution

Figure – Flow sequence for measuring Well-Architected maturity and accelerating review conversations

Conceptual AWS architecture diagram

  1. AWS Config Configuration Recorder
    • Records configuration changes for resources in your local AWS account (no impact or dependencies on AWS Organizations Config recorder(s))
    • Set to record either daily or continuously (configurable)
    • Stores configuration snapshots in a dedicated Amazon S3 bucket
  2. Amazon S3 Bucket
    • Stores AWS Config configuration snapshots
    • Stores CloudFormation templates for conformance packs
    • Encrypted with a dedicated KMS key
  3. AWS Config Conformance Packs
    • Well-Architected-Security
    • Well-Architected-Reliability
    • Well-Architected-Cost-Optimization
    • Well-Architected-IAM (optional, subset of Security checks)
  4. Custom Lambda Functions
    • Cost Optimization checks:
      • Account structure implementation
      • AWS Budgets configuration
      • AWS Cost Anomaly Detection
      • Organization information in cost and usage
      • EC2 instances without Auto Scaling Groups
  5. Well-Architected Tool Updater Lambda function
    • Retrieves compliance data from AWS Config
    • Maps compliance results to specific Well-Architected Framework best practices
    • Updates Notes fields in Well-Architected Tool

How to deploy and utilize

  1. At least two days before your planned review, deploy the module as suggested in examples/main.tf and described below. Compliance checks will update on a daily basis, to optimize costs for AWS Config Evaluations.
  2. Right before the review, trigger the Lambda function well_architected_tool_updater to update the Well-Architected Tool workload notes sections based on AWS Config Conformance packs compliance status.
  3. Run the review and use the data in the notes fields as input for the discussion. No checked/answered questions are modified, since those answers are up to the team’s subjective evaluation.

Example module call:

provider "aws" {
  region = "eu-west-1"  # Change to your preferred region
}

module "well_architected_conformance" {
  source = "git::https://github.com/soprasteria/terraform-aws-wellarchitected-conformance.git?ref=c006f439fc07d2e898cc7f67c5e7bcad1dcbd2e8"

  # AWS Config recording configuration
  recording_frequency = "DAILY"  # Use DAILY to reduce costs

  # Deploy conformance packs
  deploy_security_conformance_pack          = true
  deploy_reliability_conformance_pack       = true
  deploy_cost_optimization_conformance_pack = true
  deploy_iam_conformance_pack               = true
}

Viewing measurement insights in AWS Console

Navigating to AWS Config – Conformance packs will present a dashboard with packs for the Security, Reliability and Cost Optimization Pillars by default, plus IAM for Identity and Access Management, if enabled.

You can view the compliance score trend for each pillar/pack. In my example, compliance for Cost Optimization increased to 83%.

You can also view the compliance status for each check, prefixed with the related best practice question, mapped to the AWS Well-Architected Framework whitepaper.
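
For intuition about the compliance score, here is a minimal Python sketch. It assumes the score is the share of compliant rule-resource combinations among all evaluated combinations, which is my understanding of how AWS Config describes it. The function name and the sample data are hypothetical and not part of the module.

```python
from collections import Counter


def compliance_score(evaluations):
    """Return the percentage of COMPLIANT rule-resource combinations.

    `evaluations` is an iterable of (rule_name, resource_id, status) tuples,
    where status is "COMPLIANT" or "NON_COMPLIANT". Other statuses
    (for example "INSUFFICIENT_DATA") are ignored in this sketch.
    """
    counts = Counter(status for _, _, status in evaluations)
    total = counts["COMPLIANT"] + counts["NON_COMPLIANT"]
    if total == 0:
        return None  # nothing evaluated yet
    return round(100 * counts["COMPLIANT"] / total)


# Hypothetical sample: three compliant combinations and one non-compliant.
sample = [
    ("rule-a", "resource-1", "COMPLIANT"),
    ("rule-a", "resource-2", "COMPLIANT"),
    ("rule-b", "resource-1", "NON_COMPLIANT"),
    ("rule-c", "resource-3", "COMPLIANT"),
]
print(compliance_score(sample))  # 75
```

A rule that evaluates many resources weighs more in this calculation than a rule that evaluates a single account-level setting, which is worth keeping in mind when comparing pillars.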

Well-Architected Tool integration

This module can also automatically update Well-Architected Tool workloads with compliance data from the AWS Config Conformance Packs.

The Lambda function well_architected_tool_updater will:

  1. Process each conformance pack (Security, Reliability, Cost Optimization).
  2. Loop through all rules in sequence (SEC01, SEC02, REL01, REL02, COST01, etc.).
  3. For each rule, list the resource type, resource ID, and compliance status in the Notes field of the corresponding best practice question of your Well-Architected Tool workload.
    • The notes field is limited to a maximum of 2084 characters. When more resources are discovered than there is room for, the resources are summarized.
  4. Overwrite old data if triggered more than once.
  5. If you would like to erase all contents in all notes fields, set the clean_notes input parameter to 1.
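
The length handling in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the module's actual implementation (see src/wa_tool_updater for that): the function name, the shape of the result dictionaries and the summary wording are all hypothetical.

```python
MAX_NOTES_LENGTH = 2084  # Notes field limit mentioned above


def format_notes(results, limit=MAX_NOTES_LENGTH):
    """Render compliance results as notes text that fits within `limit`.

    `results` is a list of dicts with the keys "resource_type",
    "resource_id" and "compliance" (hypothetical shape for this sketch).
    """
    lines = [
        f'{r["resource_type"]} {r["resource_id"]}: {r["compliance"]}'
        for r in results
    ]
    detailed = "\n".join(lines)
    if len(detailed) <= limit:
        return detailed

    # Too many resources for the field: fall back to counts per status.
    counts = {}
    for r in results:
        counts[r["compliance"]] = counts.get(r["compliance"], 0) + 1
    summary = ", ".join(f"{status}: {n}" for status, n in sorted(counts.items()))
    return f"{len(results)} resources evaluated ({summary})"[:limit]
```

With a handful of resources the notes list each one on its own line; with hundreds, the text collapses to a one-line count per compliance status.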

The source code for the Lambda function is located in the src/wa_tool_updater directory.

To trigger the Well-Architected Tool updater, go to Well-Architected Tool and extract the Workload ID (not the full resource ARN).

Then go to AWS Lambda and find the function well_architected_tool_updater. Create a test event JSON definition as follows (Console or CLI):

Event JSON examples for dry_run/live mode

Extract the Well-Architected Tool Workload ID from Properties – ARN. This example with dry_run = 1 will find relevant compliance data and log to CloudWatch Logs. No changes or updates will be performed.

{
  "workload_id": "141970ea95fd5b4329cea05202659f39",
  "dry_run": 1,
  "clean_notes": 0
}

Setting dry_run to 0 will perform updates of the notes fields. No checked/answered questions will be modified.

{
  "workload_id": "141970ea95fd5b4329cea05202659f39",
  "dry_run": 0,
  "clean_notes": 0
}
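
If you prefer scripting over the console, the events above can be built and sent from Python. This is a minimal sketch: the helper names are hypothetical, the ARN in the usage example uses a placeholder account ID, and invoke_updater assumes boto3 is installed and AWS credentials with permission to invoke the function are available.

```python
import json


def workload_id_from_arn(arn):
    """Return the workload ID, i.e. the part after 'workload/' in the ARN."""
    _, sep, workload_id = arn.rpartition("workload/")
    if not sep or not workload_id:
        raise ValueError(f"Not a Well-Architected workload ARN: {arn}")
    return workload_id


def build_event(workload_id, dry_run=True, clean_notes=False):
    """Build the event payload in the shape shown above (1/0 flags)."""
    return {
        "workload_id": workload_id,
        "dry_run": 1 if dry_run else 0,
        "clean_notes": 1 if clean_notes else 0,
    }


def invoke_updater(event, function_name="well_architected_tool_updater"):
    """Invoke the updater Lambda synchronously and return its decoded response."""
    import boto3  # imported lazily so the helpers above work without boto3

    response = boto3.client("lambda").invoke(
        FunctionName=function_name,
        Payload=json.dumps(event).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())


# Placeholder account ID; the workload ID is the one used in the examples above.
arn = "arn:aws:wellarchitected:eu-west-1:123456789012:workload/141970ea95fd5b4329cea05202659f39"
event = build_event(workload_id_from_arn(arn), dry_run=True)
print(json.dumps(event, indent=2))
```

Defaulting dry_run to True is a deliberate choice in this sketch, so that a live update always has to be requested explicitly.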

Event JSON for cleaning notes fields for all questions

If you end up with a lot of mess and would like a fresh start, setting clean_notes to 1 will clean the notes field for all questions and return. No further changes to checked/answered questions or compliance data updates will be performed.

{
  "workload_id": "141970ea95fd5b4329cea05202659f39",
  "dry_run": 1,
  "clean_notes": 1
}

Expected output is as follows. Full log output is available in CloudWatch Logs.

Back in Well-Architected Tool, the notes field will now be updated with detected compliance for SEC 4. How do you detect and investigate security events?

Notice about compliance checks and automation

Check data is based on all resources in the current AWS account. Tag-based filtering is currently not supported. Keep this in mind if you have multiple workloads in the same AWS account, as the results will cover all of them.

Cost of AWS Config evaluations

According to the AWS Config pricing page: “With AWS Config, you are charged based on the number of configuration items recorded, the number of active AWS Config rule evaluations, and the number of conformance pack evaluations in your account. A configuration item is a record of the configuration state of a resource in your AWS account. An AWS Config rule evaluation is a compliance state evaluation of a resource by an AWS Config rule in your AWS account. A conformance pack evaluation is the evaluation of a resource by an AWS Config rule within the conformance pack”.

AWS Config supports Continuous recording and Daily recording. You can choose between Daily or Continuous by setting the desired value for the variable recording_frequency, which defaults to DAILY.
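
To get a rough feel for the bill, the three charge dimensions from the pricing quote can be added up as below. This is a back-of-the-envelope sketch: the function name is hypothetical, and all unit prices are parameters because they vary by region and over time, so look them up on the pricing page rather than trusting any number in an example.

```python
def estimate_config_cost(config_items, rule_evaluations, pack_evaluations,
                         price_per_item, price_per_rule_eval, price_per_pack_eval):
    """Sum the three charge dimensions named on the AWS Config pricing page.

    Counts are per billing period. Prices are passed in on purpose: use the
    current unit prices for your region.
    """
    return (
        config_items * price_per_item
        + rule_evaluations * price_per_rule_eval
        + pack_evaluations * price_per_pack_eval
    )


# Usage with placeholder unit prices (NOT actual AWS prices):
# estimate_config_cost(1000, 500, 2000, 0.5, 0.25, 0.125)
```

Daily recording mainly reduces the first term, since fewer configuration items are recorded for resources that change often.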

How to remove and decommission after use

Some might see this solution as valuable long-term; others might have other, overlapping tools coming in.

As this Terraform module deploys an S3 bucket for storing Config evaluations, the bucket must be emptied before it can be deleted. Terraform takes care of this through the force_destroy parameter, which is set on the module provisioning the S3 bucket. Make sure you no longer need the data before you proceed, because the AWS Config evaluation data will not be recoverable.

All you have to do is:

  1. Remove the Terraform module call declaration from your code base.
  2. Trigger your CI/CD pipeline.

Behind the scenes

AWS Config general resources

To avoid dependencies or conflicts with existing AWS Organization based AWS Config, this module deploys a dedicated AWS Config Recorder, which has to be started after provisioning.

# Excerpts for illustration, not complete example, see main.tf

# AWS Config Delivery Channel to S3
resource "aws_config_delivery_channel" "well_architected" {
  name           = "well_architected_config_delivery_channel"
  s3_bucket_name = module.aws_config_well_architected_recorder_s3_bucket.s3_bucket_id
  depends_on     = [aws_config_configuration_recorder.well_architected]
}

# AWS Config Configuration Recorder with recording_frequency set by input variable
resource "aws_config_configuration_recorder" "well_architected" {
  name     = "well-architected"
  role_arn = aws_iam_role.config_role.arn

  recording_group {
    all_supported                 = true
    include_global_resource_types = true
  }

  recording_mode {
    recording_frequency = var.recording_frequency
  }
}

# AWS Config retention configuration: Number of days AWS Config stores your historical information.
resource "aws_config_retention_configuration" "example" {
  retention_period_in_days = 400
}

# Manages status (recording / stopped) of an AWS Config Configuration Recorder.
resource "aws_config_configuration_recorder_status" "well_architected" {
  name       = aws_config_configuration_recorder.well_architected.name
  is_enabled = true
  depends_on = [aws_config_delivery_channel.well_architected]
}

AWS Config Conformance Packs

Security, Reliability and IAM conformance packs are based on AWS’ library of Conformance Pack Sample Templates for AWS Config (in CloudFormation format).

The underlying checks are AWS Config Managed Rules and cannot be edited. The CloudFormation templates are imported in Terraform as data objects. ConfigRuleNames are replaced to suit the particular Well-Architected Framework Pillar and best practice.

# Excerpts for illustration, not complete example
locals {
  url_template_body_wa_security_pillar = "https://raw.githubusercontent.com/awslabs/aws-config-rules/refs/heads/master/aws-config-conformance-packs/Operational-Best-Practices-for-AWS-Well-Architected-Security-Pillar.yaml"
}

data "http" "template_body_wa_security_pillar" {
  url = local.url_template_body_wa_security_pillar
}

data "util_replace" "transformed_wa_security_pillar" {
  content = data.http.template_body_wa_security_pillar.response_body
  replacements = {
    "account-part-of-organizations" : "SEC01-securely-operate_bp_account-part-of-organizations",
    "ec2-instance-managed-by-systems-manager" : "SEC01-securely-operate_bp_ec2-instance-managed-by-systems-manager",
    "codebuild-project-envvar-awscred-check" : "SEC01-securely-operate_bp_codebuild-project-envvar-awscred-check",
    "mfa-enabled-for-iam-console-access" : "SEC02-identities_bp_mfa-enabled-for-iam-console-access"
    # ... and so on
  }
}

# Render templates to file on S3 to avoid template_body file limitation of 51,200 bytes
resource "aws_s3_object" "cloudformation_wa_config_security_template" {
  bucket       = module.aws_config_well_architected_recorder_s3_bucket.s3_bucket_id
  key          = "Cloudformation/wa-config-security.yaml"
  content      = data.util_replace.transformed_wa_security_pillar.replaced
  content_type = "application/yaml"
}

# Takes the source CloudFormation file from S3 and generates an AWS Config Conformance Pack,
# which behind the scenes creates an AWS managed CloudFormation stack.
resource "aws_config_conformance_pack" "well_architected_conformance_pack_security" {
  count           = var.deploy_security_conformance_pack ? 1 : 0
  name            = "Well-Architected-Security"
  template_s3_uri = "s3://${module.aws_config_well_architected_recorder_s3_bucket.s3_bucket_id}/${aws_s3_object.cloudformation_wa_config_security_template.key}"
  depends_on      = [aws_config_configuration_recorder.well_architected]
}
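
The renamed rules follow a pattern of question ID, topic, the marker _bp_ and the original managed rule name. A minimal Python sketch of how such names could be split back into their parts is shown below. The function name and the returned keys are hypothetical, and the pattern is inferred only from the example names in this article. Rule names deployed through a conformance pack may also carry an extra suffix added by AWS Config, which this sketch does not strip.

```python
import re

# <pillar letters><two digits>-<topic>_bp_<original managed rule name>
RULE_NAME_PATTERN = re.compile(
    r"^(?P<pillar>[A-Za-z]+)(?P<number>\d{2})-(?P<topic>.+?)_bp_(?P<check>.+)$"
)


def parse_rule_name(rule_name):
    """Split a renamed rule into question ID, topic and check name.

    Returns None when the name does not follow the naming pattern.
    """
    match = RULE_NAME_PATTERN.match(rule_name)
    if match is None:
        return None
    return {
        "question": f'{match["pillar"].upper()}{match["number"]}',
        "topic": match["topic"],
        "check": match["check"],
    }


print(parse_rule_name("SEC01-securely-operate_bp_account-part-of-organizations"))
```

Grouping by the question key is what makes it possible to attach each result to the matching best practice question in the Well-Architected Tool.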

The Cost Optimization Conformance Pack is built from scratch. Custom Lambda Rules may be implemented like this:

# AWS Lambda function based on module from terraform-aws-modules
module "lambda_function_wa_conformance_cost_03_aws_budgets" {
  source                            = "git::https://github.com/terraform-aws-modules/terraform-aws-lambda.git?ref=f7866811bc1429ce224bf6a35448cb44aa5155e7"
  trigger_on_package_timestamp      = false
  function_name                     = "WA-COST03-BP05-AWS-Budgets"
  description                       = "AWS Config Custom Rule which checks for AWS Budgets setup according to WAF COST03-BP05."
  handler                           = "index.lambda_handler"
  runtime                           = var.lambda_python_runtime
  source_path                       = "../../local-modules/wa-config-conformance/src/cost03_aws_budgets/index.py"
  attach_policy_statements          = true
  timeout                           = var.lambda_timeout
  cloudwatch_logs_retention_in_days = var.lambda_cloudwatch_logs_retention_in_days
  policy_statements = {
    statement = {
      effect = "Allow"
      actions = [
        "budgets:DescribeBudgets",
        "budgets:ViewBudget",
        "config:PutEvaluations"
      ]
      resources = ["*"]
    }
  }

  tags = {
    Name = "Well-Architected-Conformance-COST03-BP05-AWS-Budgets"
  }
}

resource "aws_config_config_rule" "cost_01_aws_budgets" {
  name        = "cost01-cloud-financial-management_bp_aws-budgets"
  description = "Checks for AWS Budgets setup according to WAF COST01-BP05 Report and notify on cost optimization."

  source {
    owner             = "CUSTOM_LAMBDA"
    source_identifier = module.lambda_function_wa_conformance_cost_03_aws_budgets.lambda_function_arn

    source_detail {
      message_type                = "ScheduledNotification"
      maximum_execution_frequency = var.scheduled_config_custom_lambda_periodic_trigger_interval
    }
  }

  depends_on = [module.lambda_function_wa_conformance_cost_03_aws_budgets]
}

# Lambda permissions for all AWS Config Custom Lambda Rules
resource "aws_lambda_permission" "config_permissions" {
  for_each = toset([
    module.lambda_function_wa_conformance_cost_02_account_structure_implemented.lambda_function_name,
    module.lambda_function_wa_conformance_cost_03_aws_budgets.lambda_function_name,
    module.lambda_function_wa_conformance_cost_03_aws_cost_anomaly_detection.lambda_function_name,
    module.lambda_function_wa_conformance_cost_03_add_organization_information_to_cost_and_usage.lambda_function_name,
    module.lambda_function_wa_conformance_cost_04_ec2_instances_without_auto_scaling.lambda_function_name
  ])

  statement_id   = "AllowConfigInvoke"
  action         = "lambda:InvokeFunction"
  function_name  = each.value
  principal      = "config.amazonaws.com"
  source_account = local.aws_account_id
}
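
On the Python side, a periodic custom rule like the AWS Budgets check could look roughly like the sketch below. This is not the module's actual index.py: the compliance criterion (at least one budget exists), the annotation text and the helper name build_evaluation are assumptions for illustration. The handler reads accountId and resultToken from the AWS Config event and reports one account-level evaluation. It does not paginate describe_budgets, which is enough to tell zero budgets from some.

```python
from datetime import datetime, timezone


def build_evaluation(account_id, budget_count, timestamp):
    """Build one account-level evaluation for AWS Config."""
    compliant = budget_count > 0
    return {
        "ComplianceResourceType": "AWS::::Account",
        "ComplianceResourceId": account_id,
        "ComplianceType": "COMPLIANT" if compliant else "NON_COMPLIANT",
        "Annotation": f"{budget_count} AWS Budgets found in the account.",
        "OrderingTimestamp": timestamp,
    }


def lambda_handler(event, context):
    """Entry point for a periodic (ScheduledNotification) custom rule."""
    import boto3  # available in the AWS Lambda Python runtime

    account_id = event["accountId"]
    budgets = boto3.client("budgets").describe_budgets(AccountId=account_id)
    evaluation = build_evaluation(
        account_id,
        len(budgets.get("Budgets", [])),
        datetime.now(timezone.utc),
    )
    # Report the result back to AWS Config using the token from the event.
    boto3.client("config").put_evaluations(
        Evaluations=[evaluation],
        ResultToken=event["resultToken"],
    )
    return evaluation["ComplianceType"]
```

Keeping the decision logic in a pure function makes it testable without AWS access; only the handler touches boto3.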

Feedback and contributions

If you have any feedback, please let me know through your preferred medium of contact.

If you would like to contribute with bugfixes, additional functionality or check coverage, pull requests are welcome!
