
AWS FinOps in 6 quarters: where mid-market actually finds the money

By Kootechnikel Solutions Β· 7 min read

The conversation we keep having

A mid-market CFO asks why the AWS bill keeps growing. The CTO points at the new product launch. The VP Engineering points at customer growth. The DevOps lead points at the dev team's habit of leaving environments running over the weekend. Everyone is partially right. Nobody owns the answer.

This is the situation that produces the 18-30% AWS bill reduction we typically deliver in the first year of an engagement β€” without touching application code, without architectural rewrites, without telling anyone they cannot have what they need to do their job. The savings come from operational discipline applied in a specific order. Below is the order.

Layer 1: Tagging discipline (weeks 1-2)

You cannot allocate costs to teams, products, or environments without consistent resource tags. You cannot make informed cost decisions without cost allocation. Therefore: tagging is the foundation. Every other layer assumes it.

Most mid-market AWS shops we audit have 30-50% of resources with inconsistent or missing tags at engagement start. The fix is a two-week project:

  1. Document the required tags (typically: Owner, Environment, Service, CostCenter β€” sometimes Project for grant-funded workloads).
  2. Apply a Service Control Policy at the OU level that prevents creation of untagged resources in non-production accounts.
  3. Run a clean-up script against existing untagged resources β€” assigning Owner based on the IAM principal that created them, Environment based on the account, and a placeholder for Service/CostCenter that the owner has 30 days to update before the resource auto-deletes in non-prod. A minimal version of that script follows.
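
A sketch of the clean-up pass, assuming boto3 credentials with tagging permissions. The CloudTrail lookup for the creating principal and the auto-delete timer are out of scope here, so every missing key just gets a placeholder:

```python
import boto3

REQUIRED_TAGS = {"Owner", "Environment", "Service", "CostCenter"}

tagging = boto3.client("resourcegroupstaggingapi")

# Walk every taggable resource in the account/region.
paginator = tagging.get_paginator("get_resources")
for page in paginator.paginate(ResourcesPerPage=100):
    for resource in page["ResourceTagMappingList"]:
        arn = resource["ResourceARN"]
        present = {tag["Key"] for tag in resource.get("Tags", [])}
        missing = REQUIRED_TAGS - present
        if not missing:
            continue
        # Placeholder values only; a real pass derives Owner from CloudTrail
        # and Environment from the account the resource lives in.
        tagging.tag_resources(
            ResourceARNList=[arn],
            Tags={key: "UNASSIGNED-FIX-WITHIN-30-DAYS" for key in missing},
        )
        print(f"{arn}: placeholder-tagged {sorted(missing)}")
```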

The resistance you will hit: engineers do not love writing tags. The way through is automation β€” Terraform / CDK templates that bake the tags in, IAM policies that block resource creation without tags. Make the right thing easy.
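
On the automation point, a minimal CDK (Python) sketch of baking the tags in, assuming aws-cdk-lib is installed; the stack name and tag values here are hypothetical:

```python
from aws_cdk import App, Stack, Tags

app = App()
stack = Stack(app, "PaymentsApi")  # hypothetical stack

# Tags.of() cascades: every taggable resource created inside the stack
# inherits these keys, so engineers never write them by hand.
Tags.of(stack).add("Owner", "payments-team")
Tags.of(stack).add("Environment", "staging")
Tags.of(stack).add("Service", "payments-api")
Tags.of(stack).add("CostCenter", "CC-1042")

app.synth()
```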

Typical year-one savings unlocked by tagging: 0% directly. But every subsequent layer depends on it.


Layer 2: Idle resources (week 1, in parallel with tagging)

The fastest wins. EBS volumes detached from terminated instances. Snapshots from 2019 that nobody can identify. Idle Elastic Load Balancers from a service that got decommissioned. RDS instances stopped for months but still billing for storage. EC2 instances that have been "temporarily off" since Q1 2024.

Trusted Advisor + Compute Optimizer surface 80% of this in the first hour. The remaining 20% takes a week of digging through Cost Explorer. The criteria for killing something:

  • Resource has not been touched in 90+ days.
  • The IAM principal that created it cannot identify what it is for.
  • The team it is tagged to (or, if untagged, the team in whose account it lives) cannot identify what it is for.

If all three checks come up empty, kill it. Snapshot first if it is data, in case someone screams.
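
For the most common offender, orphaned EBS volumes, a minimal sketch of the snapshot-then-delete pass, assuming boto3 credentials with EC2 permissions and that the three checks above have already been run:

```python
import boto3

ec2 = boto3.client("ec2")

# "available" status means the volume is not attached to any instance.
volumes = ec2.describe_volumes(
    Filters=[{"Name": "status", "Values": ["available"]}]
)["Volumes"]

for vol in volumes:
    vol_id = vol["VolumeId"]
    # Snapshot first, in case someone screams.
    snap = ec2.create_snapshot(
        VolumeId=vol_id,
        Description=f"Pre-deletion snapshot of orphaned volume {vol_id}",
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])
    ec2.delete_volume(VolumeId=vol_id)
    print(f"{vol_id}: snapshotted as {snap['SnapshotId']}, deleted")
```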

Typical year-one savings: 5-15% of the bill recovered in week one from killing things nothing depends on.

Layer 3: Compute Savings Plans rebalancing (month 1)

Most mid-market shops have legacy Reserved Instances locked to specific instance families that have since changed. The classic example: a 3-year all-upfront RI for the m5 family bought in 2022, now half-utilized because the team has since migrated to m7g (Graviton).

The fix: move from Reserved Instances to Compute Savings Plans, which apply across instance families and regions. The math has to be done β€” there are workloads where instance-specific RIs still beat Savings Plans, but they are increasingly rare in 2026.

The pattern: model 60-80% of baseline as covered by Savings Plans, with the remainder on-demand to absorb growth. A quarterly portfolio review keeps coverage tight without over-committing, and new Savings Plans are purchased monthly to ladder the commitments.
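
A minimal sketch of the coverage check against that 60-80% band, assuming boto3 credentials with Cost Explorer access:

```python
from datetime import date, timedelta

import boto3

ce = boto3.client("ce")  # Cost Explorer

end = date.today()
start = end - timedelta(days=90)

resp = ce.get_savings_plans_coverage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="MONTHLY",
)

for period in resp["SavingsPlansCoverages"]:
    pct = float(period["Coverage"]["CoveragePercentage"])
    month = period["TimePeriod"]["Start"]
    if pct < 60:
        print(f"{month}: {pct:.1f}% covered - room to ladder a new plan")
    elif pct > 80:
        print(f"{month}: {pct:.1f}% covered - over-committed, pause purchases")
    else:
        print(f"{month}: {pct:.1f}% covered - in the target band")
```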

Typical year-one savings: 8-15% additional from converting RIs to Savings Plans + rightsizing coverage.


Layer 4: EBS storage class + S3 lifecycle (month 2)

Two trivial changes that produce real money:

EBS gp2 β†’ gp3. gp3 is faster AND cheaper than gp2 for nearly every workload. AWS announced this in late 2020. Most mid-market shops still have meaningful gp2 inventory because nobody migrated. The migration is in-place (no downtime), takes a few hours of engineering time per environment, and saves 15-20% on EBS spend.
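
A minimal sketch of the migration loop, assuming boto3 credentials with EC2 permissions; run once per region:

```python
import boto3

ec2 = boto3.client("ec2")

paginator = ec2.get_paginator("describe_volumes")
pages = paginator.paginate(Filters=[{"Name": "volume-type", "Values": ["gp2"]}])

for page in pages:
    for vol in page["Volumes"]:
        # gp3 defaults to 3000 IOPS / 125 MiB/s; check I/O-heavy volumes
        # (especially large gp2 volumes with higher baselines) before migrating.
        ec2.modify_volume(VolumeId=vol["VolumeId"], VolumeType="gp3")
        print(f"{vol['VolumeId']}: migrating to gp3 in place")
```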

S3 lifecycle policies. Standard β†’ Intelligent-Tiering for data with unpredictable access patterns. Standard β†’ Glacier Instant Retrieval for data accessed once a quarter or less. Standard β†’ Glacier Deep Archive for data accessed once a year or less (compliance retention, mostly). Configure once. Save indefinitely.
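
A minimal sketch of those three rules on one bucket, assuming boto3; the bucket name, prefixes, and day thresholds are hypothetical placeholders to adjust per data set:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-data-bucket",  # hypothetical
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "unpredictable-access-to-intelligent-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": "analytics/"},
                "Transitions": [{"Days": 30, "StorageClass": "INTELLIGENT_TIERING"}],
            },
            {
                "ID": "quarterly-access-to-glacier-ir",
                "Status": "Enabled",
                "Filter": {"Prefix": "reports/"},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER_IR"}],
            },
            {
                "ID": "compliance-to-deep-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": "compliance/"},
                "Transitions": [{"Days": 365, "StorageClass": "DEEP_ARCHIVE"}],
            },
        ]
    },
)
```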

Typical year-one savings: 3-8% on storage with zero application changes.

Layer 5: Right-sizing compute (months 2-3)

Compute Optimizer identifies over-provisioned instances based on actual utilization. Engineers default to the next-larger size "to be safe" β€” the same way restaurant servings have grown over time. Compute Optimizer typically finds 20-30% of instances over-sized.

The fix is per-environment (a sketch pulling the findings to action follows the list):

  • Dev / staging: Right-size aggressively. Anything under 30% sustained CPU/memory utilization gets dropped one size.
  • Production: Right-size conservatively. Anything under 20% sustained utilization with no burst pattern gets dropped one size, with a rollback runbook ready.

Coordinate the changes with the engineering teams that own the workloads. The savings show up as either smaller instances at the same Savings Plan coverage (price drop) or wider Savings Plan coverage at the same instance count (better economics).

Typical year-one savings: 5-12% saved by matching instance size to actual usage.

Layer 6: Workload-level architecture (months 3-12)

The architectural changes that produce compounding returns over multiple years:

  • Spot instances for batch / dev / non-critical workloads. The 60-90% discount is real; the interruption pattern is well-understood for the workloads that fit.
  • Aurora Serverless v2 for variable-load databases. Pay for what you use; scales to zero on idle.
  • Lambda for event-driven workloads instead of always-on EC2. The price model rewards low-volume usage.
  • CloudFront in front of S3 + dynamic content. Reduces egress costs at the application layer.

These are larger changes with longer payback. They typically come from the second-year FinOps cycle, after the first-year low-hanging fruit has been picked.

Typical year-two-and-beyond savings: where the compounding 5-15% additional reductions come from.

The cadence question

The companies that maintain low AWS spend over years run quarterly FinOps reviews with named accountability. The companies that drift do not.

The cadence has three parts:

  1. Monthly cost review. Cost Explorer dashboard reviewed by engineering leads. Anomalies flagged for follow-up (a sketch pulling them programmatically follows this list).
  2. Quarterly portfolio rebalancing. Savings Plans coverage reviewed against baseline. New commitments purchased. Rightsizing recommendations actioned.
  3. Annual architecture review. Workload-level architecture revisited against current AWS service catalog. Spot adoption, Serverless migration, CloudFront expansion, instance family modernization (Graviton).
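
For the anomaly flagging in step 1, a minimal sketch assuming boto3 with Cost Explorer access and that Cost Anomaly Detection monitors are already configured:

```python
from datetime import date, timedelta

import boto3

ce = boto3.client("ce")  # Cost Explorer

end = date.today()
start = end - timedelta(days=30)

resp = ce.get_anomalies(
    DateInterval={"StartDate": start.isoformat(), "EndDate": end.isoformat()}
)

for anomaly in resp["Anomalies"]:
    total = anomaly["Impact"]["TotalImpact"]
    print(f"{anomaly['AnomalyId']}: ${total:.2f} of unexpected spend to review")
```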

The cadence is the work. The savings are the result.

What this looks like in practice

A typical mid-market engagement we run:

  • Month 1: tagging cleanup + idle resources killed β†’ 8-12% reduction visible.
  • Month 2-3: Savings Plans rebalanced + EBS migrated + S3 lifecycle deployed β†’ another 8-15% reduction.
  • Month 3-6: Right-sizing executed across environments β†’ another 5-10% reduction.
  • Total year-one reduction: 18-30% in most cases. Some shops with extreme accumulated debt go higher; some shops with already-disciplined practices go lower.

The savings are not magical. They are the result of a six-layer pattern applied with discipline, in order.

The work, and the offer

The free 90-minute IT health check we run for prospective clients includes an AWS cost audit: Cost Explorer review, Trusted Advisor + Compute Optimizer scan, Savings Plans coverage analysis, and a prioritized first-year FinOps roadmap. Yours to keep either way.

The full AWS positioning lives at /aws. The Well-Architected approach is at /aws/well-architected. The case-study gallery is at /aws/case-studies.

The 18-30% reduction is real. The work is operational, not glamorous. The cadence is the moat.

Related Topics

AWS Β· FinOps Β· Cost Optimization Β· Cloud