The FinOps Architecture Review: Design Cost Controls Into the Infrastructure

Published

19 Jun 2026

Author
Renji Yonjan


A company's cloud bill doubled over six months. Not because traffic surged or because the product added a wave of new features — the workload profile was largely unchanged. The growth came from auto-scaling groups with no upper limits, database instances left at their launch-day sizing long after the workload stabilised, and storage tiers that nobody had revisited since the original provisioning. Nobody reviewed the spend until the quarterly invoice arrived, and by then the trajectory was already locked in. The conversation that followed was the same one that happens in every organisation where cost management is a monthly review exercise rather than an architectural decision: "How did this happen?" followed by "Who's going to fix it?"

The answer, almost always, is that nobody designed the infrastructure to control costs. They designed it for availability, for performance, for deployment speed — all legitimate priorities. But cost was treated as an output to be reviewed, not a constraint to be engineered. The cloud bill was something that happened to the organisation, not something the organisation controlled.

A FinOps architecture review changes when cost enters the conversation. Instead of reviewing spend after it accumulates, you embed cost controls (budget alerts, auto-scaling limits, right-sizing rules, resource lifecycle policies) into the infrastructure itself. Across 900+ projects delivered, EB Pearls has seen the distinction play out consistently: cloud costs controlled by architecture don't spiral, while cloud costs controlled only by monthly reviews tend to. The review is not a cost-cutting exercise. It is the practice of making cost a first-class architectural constraint alongside performance, security, and reliability.

Why Reactive Cost Management Fails at Scale

Most organisations manage cloud costs the same way: someone pulls the monthly report, identifies the largest line items, asks a few questions, and flags anything that looks unusual. Occasionally this catches something — an instance left running after a load test, a logging volume that spiked. But the fundamental problem with reactive cost management is that it can only catch what's already been spent. By the time the anomaly appears in a monthly report, weeks of excess spend have already occurred.

Three structural dynamics make reactive management increasingly inadequate as cloud environments grow.

Auto-scaling without cost boundaries. Auto-scaling is designed to add resources when demand increases. Without explicit limits, it does exactly that, indefinitely. A traffic spike, a bot crawl, or a misconfigured health check that triggers scaling events can each push resource counts well beyond what the workload requires. The scaling policy responds to a performance signal. It has no awareness of cost. The FinOps Foundation's State of FinOps report lists managing commitment-based discounts and optimising compute among the top challenges for FinOps practitioners, and scaling decisions that are decoupled from cost awareness are one reason compute is hard to optimise.

Right-sizing neglect. Infrastructure is typically provisioned for anticipated peak load with a safety margin. This is prudent at launch. Six months later, when actual usage patterns are established and the workload runs at a fraction of provisioned capacity for most of the day, the original sizing persists. Nobody revisits it because the application performs well — and performance, not cost, is what engineering teams are measured on. The result is systematic over-provisioning that compounds with every new service.

No budget signal in the deployment pipeline. Engineers deploy infrastructure through CI/CD pipelines that validate syntax, run tests, and check security policies. Cost is rarely part of that validation. A developer can provision a database instance class that costs ten times what the workload requires, and the pipeline will approve it because it's syntactically correct and compliant with security rules. The cost implication surfaces in next month's report — long after the resource is running and the deployment is forgotten.

These aren't failures of discipline. They're failures of design. When cost controls aren't built into the infrastructure, cost management depends entirely on human vigilance applied after the fact. That vigilance degrades over time, because the people reviewing costs aren't the same people making provisioning decisions, and the feedback loop between spending and detection is measured in weeks, not minutes.

What a FinOps Architecture Review Covers

A FinOps architecture review is a structured examination of how cost is — or isn't — controlled at the infrastructure level. It evaluates the architecture not for what it costs today, but for whether the design inherently limits, alerts on, and optimises cost over time. The review typically covers five areas.

Budget Alerts and Spend Boundaries

The most basic cost control is also the most frequently absent: automated alerts when spending exceeds defined thresholds. A FinOps architecture review evaluates whether budget alerts exist for every account, project, and team; whether the thresholds are set at meaningful levels (not just a single alert at 100% of budget, but graduated alerts at 50%, 75%, and 90%); and whether the alerts route to people who can act on them.
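The graduated thresholds can be sketched in a few lines of Python. This is an illustration rather than any cloud provider's budget API: the `Budget` fields, the recipient names, and the message format are all assumptions made for the example.

```python
from dataclasses import dataclass

# Graduated thresholds from the text, as fractions of the monthly budget.
THRESHOLDS = (0.50, 0.75, 0.90)


@dataclass(frozen=True)
class Budget:
    owner: str            # account, project, or team (hypothetical field)
    monthly_limit: float  # budget in your billing currency
    notify: tuple         # people who can act on the alert


def crossed_thresholds(spend, budget, thresholds=THRESHOLDS):
    """Return every threshold that month-to-date spend has reached."""
    if budget.monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    ratio = spend / budget.monthly_limit
    return [t for t in sorted(thresholds) if ratio >= t]


def alert_messages(spend, budget):
    """Build one message per crossed threshold and recipient."""
    return [
        f"{budget.owner}: spend at {t:.0%} of budget -> {person}"
        for t in crossed_thresholds(spend, budget)
        for person in budget.notify
    ]
```

In practice the provider's native budget service would send the alerts. The value of a sketch like this is in making the thresholds and recipients explicit enough to review.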

Beyond alerts, the review assesses whether hard spending limits are technically possible and appropriately configured. Some cloud services support spending caps. Others don't, which means the architectural response must be different — auto-scaling limits, resource quotas, or lifecycle policies that prevent unbounded growth.

Auto-Scaling Cost Controls

Auto-scaling is essential for handling variable workloads, but unconstrained auto-scaling is a cost risk. The review examines every auto-scaling configuration for maximum instance limits, scaling cooldown periods, and scale-down aggressiveness. A scaling policy that adds instances quickly but removes them slowly is functionally a cost leak — it responds to spikes but doesn't reclaim capacity once the spike passes.
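A minimal sketch of that examination, assuming scaling configurations have already been exported into plain Python objects. The `ScalingPolicy` fields and the 4x ratio used to flag slow scale-down are illustrative assumptions, not a provider schema or an industry standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingPolicy:
    name: str
    max_instances: Optional[int]  # None means no upper limit is set
    scale_up_cooldown_s: int      # wait after adding capacity
    scale_down_cooldown_s: int    # wait before removing capacity


def audit_policy(policy, slow_scale_down_ratio=4.0):
    """Return a list of cost-risk findings for one scaling policy."""
    findings = []
    if policy.max_instances is None:
        findings.append("no maximum instance limit")
    if policy.scale_up_cooldown_s <= 0:
        findings.append("no scale-up cooldown")
    # Assumed heuristic: scale-down that waits far longer than scale-up
    # holds on to capacity after a spike has passed.
    baseline = max(policy.scale_up_cooldown_s, 1)
    if policy.scale_down_cooldown_s > slow_scale_down_ratio * baseline:
        findings.append("scale-down much slower than scale-up")
    return findings
```

For example, `audit_policy(ScalingPolicy("web", None, 60, 900))` reports both a missing maximum and a slow scale-down.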

The review also evaluates whether scaling policies are workload-appropriate. Not every service needs auto-scaling. Batch processing jobs, background workers, and internal tools with predictable usage patterns may be better served by fixed capacity that's right-sized for the actual load.

Right-Sizing Architecture

Right-sizing is the practice of matching resource specifications — instance types, database classes, storage tiers, network throughput allocations — to actual workload requirements. The review analyses utilisation data across compute, database, and storage resources to identify where provisioned capacity significantly exceeds actual usage.

This is not a one-time exercise. The review evaluates whether the architecture includes mechanisms for ongoing right-sizing: utilisation monitoring with automated recommendations, scheduled reviews of instance sizing, and engineering processes for acting on right-sizing data. An architecture that was right-sized at launch but has no mechanism for re-evaluation will drift back to over-provisioning within months as workloads evolve.
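One way to keep right-sizing continuous is a scheduled check over utilisation history. The sketch below assumes utilisation samples are available as fractions between 0 and 1. The 40% peak ceiling and the minimum sample count are placeholder values to tune per workload, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    utilisation: list  # samples between 0.0 and 1.0 over the review window


def rightsizing_candidates(resources, peak_ceiling=0.40, min_samples=4):
    """Return (name, peak) pairs where even peak usage stays under the ceiling."""
    candidates = []
    for resource in resources:
        if len(resource.utilisation) < min_samples:
            continue  # not enough history to judge safely
        peak = max(resource.utilisation)
        if peak < peak_ceiling:
            candidates.append((resource.name, peak))
    # Lowest peak first: the strongest candidates for a smaller size.
    return sorted(candidates, key=lambda pair: pair[1])
```

Using the peak rather than the average is a deliberately conservative choice here: a resource is only flagged when it never approached its provisioned capacity during the window.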

Resource Lifecycle Policies

Cloud resources are easy to create and easy to forget. The review examines whether the architecture enforces lifecycle management: automatic deletion of unused resources after a defined period, retention policies for snapshots and backups, expiration of temporary environments, and tagging requirements that enable identification of orphaned resources.

The absence of lifecycle policies is one of the most common findings in a FinOps architecture review. Snapshots accumulate indefinitely. Development environments persist long after the feature branch is merged. Storage volumes remain attached to terminated instances. Each individual resource is inexpensive. Collectively, they represent a significant and growing portion of the bill.
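Two of these findings are straightforward to detect once an inventory exists. The sketch below assumes the inventory is a list of plain dictionaries. The keys `attached_to` and `branch_merged_on`, and the seven-day grace period, are hypothetical choices for illustration.

```python
from datetime import date, timedelta


def orphaned_volumes(volumes, running_instance_ids):
    """Volumes that are unattached or attached to an instance that is not running."""
    running = set(running_instance_ids)
    return [v["id"] for v in volumes if v.get("attached_to") not in running]


def stale_environments(environments, today, grace_days=7):
    """Development environments still alive after their branch was merged."""
    cutoff = today - timedelta(days=grace_days)
    return [
        env["name"]
        for env in environments
        if env.get("branch_merged_on") is not None
        and env["branch_merged_on"] <= cutoff
    ]
```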

Cost Visibility in the Deployment Pipeline

The final area evaluates whether cost information is available at the point where provisioning decisions are made. This means cost estimation in the CI/CD pipeline — tools like Infracost that show the monthly cost impact of a Terraform change before it's applied. It also means tagging enforcement so every new resource is attributable to a team, project, and environment from the moment it's created.

When cost is visible at deployment time, provisioning decisions change. An engineer who sees that their database instance choice costs $800 per month when an alternative that meets the workload requirements costs $200 will often choose differently. The cost signal doesn't need to be a blocker — it needs to be visible.
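As a rough illustration of what such a pipeline signal computes, the sketch below compares the monthly cost of a planned change against the current state and lists resources missing attribution tags. It does not call Infracost or any real pricing API. The price table reuses the $800 and $200 figures from the example above, and the instance class names and resource dictionaries are hypothetical.

```python
# Tags every resource needs for attribution, per the text.
REQUIRED_TAGS = ("team", "project", "environment")

# Hypothetical monthly prices per instance class, for illustration only.
MONTHLY_PRICE = {"db.large": 200.0, "db.4xlarge": 800.0}


def monthly_cost(resources, prices=MONTHLY_PRICE):
    """Sum the monthly price of every resource in a plan."""
    return sum(prices[r["class"]] * r.get("count", 1) for r in resources)


def review_change(before, after, prices=MONTHLY_PRICE):
    """Report the monthly cost delta and any resources missing required tags."""
    delta = monthly_cost(after, prices) - monthly_cost(before, prices)
    untagged = [
        r["name"]
        for r in after
        if any(not r.get("tags", {}).get(tag) for tag in REQUIRED_TAGS)
    ]
    return {"monthly_delta": delta, "untagged": untagged}
```

The function returns information rather than failing the build, which matches the point above: the signal needs to be visible, not necessarily blocking.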

How to Conduct a FinOps Architecture Review

The review follows a structured process that can be completed in two to three weeks. It produces a set of architectural recommendations, not just a list of things to turn off. Here's how we approach it within our delivery framework.

Step 1: Map the current cost architecture. Document every existing cost control: budget alerts, scaling limits, lifecycle policies, tagging enforcement, cost estimation in pipelines. For most organisations, this step reveals that cost controls are sparse, inconsistent, or absent entirely. The map shows not just what exists, but what's missing.

Step 2: Analyse scaling and sizing. Pull utilisation data for all compute, database, and storage resources over the preceding three to six months. Identify auto-scaling configurations and their limits. Map the gap between provisioned capacity and actual usage. The output is a resource-by-resource assessment of where cost is structurally embedded in sizing decisions.

Step 3: Evaluate resource lifecycle management. Catalogue resources without lifecycle policies — snapshots without retention limits, environments without expiration dates, storage without tiering rules. Quantify the monthly cost of resources that should have been decommissioned or transitioned to lower-cost tiers.

Step 4: Assess pipeline cost integration. Review the deployment pipeline for cost signals: cost estimation, tagging validation, resource quota checks. Identify where provisioning decisions are made without cost data. This step often reveals that the pipeline validates everything except cost.

Step 5: Produce the architectural recommendations. The output is not a report that says "spend less." It is a set of specific architectural changes: budget alerts at defined thresholds for each account and team, maximum instance limits for each auto-scaling group, right-sizing recommendations for each over-provisioned resource, lifecycle policies for each resource category, and cost estimation integration in the deployment pipeline.

Our ISO 9001 and ISO 27001-certified processes ensure that each recommendation is documented with its expected cost impact, implementation effort, and priority.

The Bill That Doubled — and the Controls That Would Have Caught It

Return to the company from the opening. Their cloud costs doubled over six months. A FinOps architecture review conducted before that growth would have identified every contributing factor in the first week.

The auto-scaling groups had no maximum instance limits. During a sustained period of elevated traffic — not a spike, just a gradual increase in baseline load — the scaling policies added instances incrementally. Each addition was small. Collectively, over six months, compute costs grew substantially. A maximum instance limit, paired with an alert when scaling approached the cap, would have surfaced the trend within days.
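The pairing of a cap with an early warning could look something like this. The 80% warning level is an assumed default, and the status strings are placeholders for whatever the alerting system expects.

```python
def scaling_headroom_status(current, maximum, warn_at=0.8):
    """Classify how close an auto-scaling group is to its instance cap."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    usage = current / maximum
    if usage >= 1.0:
        return "at-cap"
    if usage >= warn_at:
        return "approaching-cap"
    return "ok"
```

A group that sits at "approaching-cap" for days is the kind of gradual baseline growth this scenario describes, and it is a prompt to decide deliberately whether the cap should move.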

The database instances were still running on the instance classes chosen during initial provisioning. Usage data showed that average utilisation sat well below the provisioned capacity for all but a brief daily peak. Right-sizing these instances to match actual workload patterns would have reduced database costs meaningfully — not by degrading performance, but by eliminating capacity that was never used.

Storage costs grew silently. Automated snapshots accumulated daily with no retention policy. Log data remained on high-performance storage tiers months after it was last accessed. Development environment data persisted indefinitely. Lifecycle policies — snapshot retention at thirty days, log tiering after fourteen days, development resource expiration after sprint completion — would have prevented the accumulation entirely.
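The retention and tiering rules named above translate almost directly into code. This sketch uses the thirty-day and fourteen-day figures from the text, while the dictionary keys (`created`, `last_accessed`, `tier`) and the tier name `"hot"` are assumptions about how an inventory might be represented.

```python
from datetime import date, timedelta

SNAPSHOT_RETENTION = timedelta(days=30)  # snapshot retention from the text
LOG_HOT_TIER = timedelta(days=14)        # log tiering window from the text


def expired_snapshots(snapshots, today):
    """Snapshots older than the retention window."""
    return [s["id"] for s in snapshots if today - s["created"] > SNAPSHOT_RETENTION]


def logs_to_tier_down(log_objects, today):
    """Log objects on the high-performance tier not accessed within the window."""
    return [
        obj["key"]
        for obj in log_objects
        if obj["tier"] == "hot" and today - obj["last_accessed"] > LOG_HOT_TIER
    ]
```

Most object stores and snapshot services can enforce rules like these natively. A script is mainly useful for quantifying the backlog before the policies are switched on.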

None of these controls required new tooling or significant engineering effort. They required architectural intent — the decision to treat cost as a design constraint at the same level as performance and availability. The FinOps architecture review produces that intent as a set of implementable specifications.

When a FinOps Architecture Review Matters Most — and When It Can Wait

Invest now if your cloud spend has grown by more than 20% over two consecutive quarters without a corresponding increase in workload or users. That growth pattern indicates structural cost accumulation — resources that grow without governance — rather than legitimate business-driven scaling. The review will identify where controls are missing and what they should be.

Invest now if you're planning to scale. An architecture that's reasonably efficient at current load can become expensive quickly when traffic doubles or triples. Building cost controls before the scaling event means the architecture scales efficiently. Adding controls after the scaling event means reversing decisions that have already locked in spend. This is particularly relevant for organisations tracking development trends and preparing for growth.

It can wait if your cloud bill is small, stable, and fully understood. An organisation spending a few thousand dollars a month on a single application with predictable usage doesn't need architectural cost controls — the overhead would exceed the savings. But the moment the environment includes multiple services, multiple teams, or auto-scaling of any kind, the review becomes the difference between costs that track business growth and costs that outpace it.

The indicator is simple: if nobody in the organisation can answer the question "What prevents our cloud costs from doubling next quarter?" with specific architectural controls, the answer is "nothing" — and the review is overdue.
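The first signal in this section, spend growing by more than 20% over two quarters without matching workload growth, can be checked with simple arithmetic. The sketch below reads "corresponding" as spend growth exceeding workload growth by more than the threshold. That reading is an assumption, as is the choice of workload metric (requests, active users, or similar).

```python
def growth(series):
    """Fractional growth across the last two quarters of a quarterly series."""
    if len(series) < 3:
        raise ValueError("need at least three quarterly values")
    if series[-3] <= 0:
        raise ValueError("baseline value must be positive")
    return series[-1] / series[-3] - 1


def structural_growth(spend_by_quarter, workload_by_quarter, threshold=0.20):
    """True when spend growth exceeds the threshold and outpaces workload growth."""
    spend_growth = growth(spend_by_quarter)
    workload_growth = growth(workload_by_quarter)
    return spend_growth > threshold and (spend_growth - workload_growth) > threshold
```

For example, spend of 100, 115, then 130 against a workload of 1000, 1010, then 1020 is flagged, while spend and workload that both grow by 50% are not.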

What to Do Next

Open your cloud cost dashboard. Look at the last six months. If the trend line slopes upward and you can't point to specific architectural controls — budget alerts, scaling limits, lifecycle policies — that govern it, the architecture is permitting uncontrolled growth.

When you're ready to design cost controls into your infrastructure rather than reviewing spend after the fact, talk to our DevOps team. We build the architectural constraints that keep cloud costs proportional to business value, not proportional to time.

Frequently Asked Questions

What is a FinOps architecture review?

A FinOps architecture review is a structured assessment of how cost is controlled — or not controlled — at the infrastructure level. It examines budget alerts, auto-scaling limits, resource right-sizing, lifecycle policies, and cost visibility in deployment pipelines. The review produces architectural recommendations that embed cost controls into the infrastructure itself, rather than relying on periodic manual reviews to catch spending anomalies after they occur.

How is a FinOps architecture review different from a FinOps baseline assessment?

A baseline assessment maps where your cloud money is going today — it's diagnostic, focused on visibility and attribution. An architecture review evaluates whether the infrastructure is designed to control costs over time. The baseline tells you what you're spending. The architecture review tells you whether anything in the design prevents that spend from growing unchecked. Most organisations benefit from both: the baseline first to establish visibility, then the architecture review to embed controls.

What budget alerts should we set up?

At minimum, graduated alerts at 50%, 75%, and 90% of budget for each account, project, or team. Alerts should route to both the engineering team that owns the resources and the finance stakeholder who manages the budget. The specific thresholds depend on your organisation's tolerance for variance, but the principle is consistent: multiple early warnings are better than a single alert at 100% that arrives too late to prevent the overage.

How do we right-size without risking performance?

Right-sizing recommendations are based on utilisation data collected over weeks or months, not a point-in-time snapshot. The process identifies resources where peak utilisation consistently falls well below provisioned capacity. Changes are implemented incrementally — one instance class reduction at a time — with monitoring to confirm that performance remains within acceptable thresholds. The goal is eliminating unused capacity, not reducing headroom to zero.

What are auto-scaling cost controls?

Auto-scaling cost controls are architectural constraints on how scaling policies operate. They include maximum instance limits (preventing unbounded horizontal scaling), cooldown periods (preventing rapid, oscillating scale-up and scale-down), scale-down aggressiveness (ensuring capacity is reclaimed promptly after demand drops), and workload-appropriateness assessments (determining whether a service needs auto-scaling at all or would be better served by right-sized fixed capacity).

How often should we repeat a FinOps architecture review?

Annually for stable environments, or whenever the architecture changes significantly: new services, new cloud providers, migration to containers or serverless, or a planned scaling event. The review should also be triggered by cost anomalies: if the bill increases unexpectedly despite existing controls, the controls need re-evaluation. In EB Pearls' experience across 1400+ businesses, many organisations integrate architecture cost reviews into their quarterly planning cycle.
