Security Architecture Review: Build Security In, Don't Bolt It On

Published: 15 Jun 2026

Author: Renji Yonjan



Six weeks before launch, somebody on the team books a penetration test. The report comes back two weeks later with a list of findings: hardcoded credentials in environment variables, an authentication flow that does not survive a replay attack, a database without encryption at rest, IAM roles with broad permissions because tightening them up was always going to happen later. The team triages frantically, ships the patches they can fit into a fortnight, files the rest as post-launch security debt, and goes live with a quietly compromised posture that nobody outside engineering knows about. The pattern is so consistent across products we audit on inheritance that it has become the industry's silent default.

A security architecture review prevents this. It runs at design time (before code is written, before infrastructure is provisioned, before the authentication flow is committed) and produces a documented set of threat scenarios, control commitments, and architectural decisions the build is then held to. It is not a penetration test or a compliance checklist. It is the engineering conversation that decides what the system must defend against, where the trust boundaries sit, and which controls are non-negotiable before sprint one. Security designed in at this stage costs days of design time. Security retrofitted after launch costs months of rework.

In the Built to Last™ 2.0 framework — the AI-native delivery system EB Pearls runs across every engagement — the review sits inside P02 The Right Infrastructure. It pairs with the Production Readiness Review™ that gates go-live and the DevSecOps build standard that scans every commit. The review is not a heavy ceremony — typically a structured half-day to two-day exercise depending on system complexity — and the output is a short, dense set of design decisions that subsequent sprints are built against. This article walks through what it covers, who needs to be in the room, what gets documented, the failure modes that survive even when the review is run, and how the commercial conversation changes when threat modelling moves from a pre-launch panic to a design-time decision.

The Cost of Bolting Security On Later

The damage from skipping a design-time security review rarely surfaces at launch. It surfaces three months later, when a near-miss incident — an unexpected access pattern, a credential found in a public Git history — forces a rotation nobody had planned for. It surfaces six months later, when the first compliance audit lands and the controls that exist in policy are not the controls that exist in code. It surfaces eighteen months later, when someone traces an incident back to a design decision that was never written down and has since been built on top of in ways that make it expensive to unwind.

The cost compounds along four axes. First, direct rework: retrofitting encryption at rest into a live database is not the same job as designing it in. Re-platforming an authentication flow built before MFA was on the requirements list is engineering work in the same range as the original feature. Second, audit cost: SOC 2, ISO 27001, HIPAA, and PCI-DSS ask not just "is this control present" but "is it documented, evidenced, and reviewed". A control bolted on under deadline pressure rarely carries the documentation an auditor accepts on first review.

Third, incident cost: when a breach lands, the bulk of the cost is not the technical response but the regulatory disclosure, customer notification, legal review, and brand impact. Fourth, velocity cost: a team carrying a known but unfixed security finding from launch begins making delivery decisions around it. Features get scoped around the gap instead of the gap being fixed. The system grows around the wound. These are the costs that surface when a new client hands us a codebase built without a security architecture review and asks us to bring it up to the standard their next investor, regulator, or enterprise customer will require.

What a Security Architecture Review Actually Covers

The review is a working session, not a sign-off ceremony. It runs before the architecture session locks the technology stack and the data model, and the two feed each other — security decisions constrain architecture decisions, and vice versa. The input is the product's intended functionality, the regulatory environment, the data classes involved, and the threat actors plausibly interested in the system. The output is a Security Architecture Record that becomes part of the canonical project documentation. The agenda covers seven decision areas, run in order because each constrains the next.

Threat modelling. The team identifies the assets the system must protect — user data, customer funds, intellectual property, system availability — and the threat actors plausibly interested in each. The aim is a ranked list of attack scenarios the design must defend against. STRIDE (spoofing, tampering, repudiation, information disclosure, denial of service, elevation of privilege) is the most common framing, and we use it because it covers the dimensions most teams forget under deadline pressure.
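To make the ranked list concrete, here is a minimal sketch of a threat register in Python. The 1 to 5 likelihood and impact scales, the multiplication used as a risk score, and the example actors are assumptions for illustration; the article does not prescribe a scoring scheme.

```python
from dataclasses import dataclass
from enum import Enum


class Stride(Enum):
    SPOOFING = "spoofing"
    TAMPERING = "tampering"
    REPUDIATION = "repudiation"
    INFORMATION_DISCLOSURE = "information disclosure"
    DENIAL_OF_SERVICE = "denial of service"
    ELEVATION_OF_PRIVILEGE = "elevation of privilege"


@dataclass(frozen=True)
class Threat:
    asset: str
    actor: str
    category: Stride
    likelihood: int  # 1 (rare) to 5 (expected); hypothetical scale
    impact: int      # 1 (minor) to 5 (severe); hypothetical scale

    @property
    def risk(self) -> int:
        # Simple likelihood x impact score; an assumed convention
        return self.likelihood * self.impact


def rank_threats(threats: list[Threat]) -> list[Threat]:
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


def uncovered_categories(threats: list[Threat]) -> set[Stride]:
    """STRIDE categories with no scenario yet, as a prompt for the session."""
    return set(Stride) - {t.category for t in threats}


threats = [
    Threat("system availability", "botnet operator", Stride.DENIAL_OF_SERVICE, 4, 2),
    Threat("user data", "external attacker", Stride.INFORMATION_DISCLOSURE, 3, 5),
    Threat("customer funds", "malicious insider", Stride.TAMPERING, 2, 5),
]

for t in rank_threats(threats):
    print(f"{t.risk:>2}  {t.category.value}: {t.actor} -> {t.asset}")
print("not yet considered:", sorted(c.value for c in uncovered_categories(threats)))
```

The `uncovered_categories` helper is the part that reflects why STRIDE is useful here: it lists the dimensions nobody has raised yet, which is where forgotten scenarios tend to sit.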

Authentication and authorisation design. Who can authenticate, by what method, and what privileges does each role carry? MFA enforcement, session lifecycle, token storage, password reset flows, account recovery, and service-to-service authentication are decided here. Many of the security failures we audit on inheritance trace back to an authentication flow designed for the first user role and extended ad hoc for the second and third. Designing for the full role set at this stage is far cheaper than refactoring later.
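One design question that belongs in this area is how service-to-service requests resist replay. The sketch below shows one common approach, assuming a shared HMAC key, a client-supplied nonce and timestamp, and a five-minute freshness window. The class name, the message layout, and the in-memory nonce store are all hypothetical; a real deployment would need a shared store with expiry and a key pulled from the secrets manager.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed freshness window


class ReplayGuard:
    """Verifies signed requests and rejects stale or repeated ones."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        # nonce -> timestamp; in-memory for illustration only
        self._seen: dict[str, int] = {}

    def sign(self, body: bytes, nonce: str, timestamp: int) -> str:
        # Nonce is assumed to contain no "." (for example, a hex string)
        message = f"{timestamp}.{nonce}.".encode() + body
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def verify(self, body: bytes, nonce: str, timestamp: int,
               signature: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expected = self.sign(body, nonce, timestamp)
        # Constant-time comparison avoids leaking how much of the signature matched
        if not hmac.compare_digest(expected, signature):
            return False
        if abs(now - timestamp) > MAX_SKEW_SECONDS:
            return False  # too old (or too far in the future) to accept
        if nonce in self._seen:
            return False  # same request delivered twice: treat as replay
        self._seen[nonce] = timestamp
        return True


guard = ReplayGuard(key=b"demo-key-not-for-production")
body = b'{"amount": 100}'
sig = guard.sign(body, nonce="a1b2c3", timestamp=1_750_000_000)
print(guard.verify(body, "a1b2c3", 1_750_000_000, sig, now=1_750_000_010))  # True
print(guard.verify(body, "a1b2c3", 1_750_000_000, sig, now=1_750_000_020))  # False
```

The timestamp bounds how long nonces must be remembered, and the nonce covers the window the timestamp leaves open. Deciding both at design time is what keeps the flow from failing the replay finding described in the opening.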

Encryption and data protection. Encryption in transit and at rest. Key management — where keys live, how they are rotated, who can access them. Data classification — which classes need stronger protection, and what controls follow from each. Whether to encrypt at the field, database, or disk level is decided here, with the trade-offs explicit.
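A rotation policy is easier to hold a build to when it can be checked mechanically. As a rough sketch, assuming three data classes and rotation intervals that are purely illustrative (the real values are an output of the review), a key inventory can be tested against the policy like this:

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical data classes and maximum key ages; not values from the article
MAX_KEY_AGE = {
    "restricted": timedelta(days=90),
    "confidential": timedelta(days=180),
    "internal": timedelta(days=365),
}


@dataclass(frozen=True)
class KeyRecord:
    key_id: str
    data_class: str
    last_rotated: date


def overdue_keys(keys: list[KeyRecord], today: date) -> list[str]:
    """Return the ids of keys older than the limit for their data class."""
    return [
        k.key_id
        for k in keys
        if today - k.last_rotated > MAX_KEY_AGE[k.data_class]
    ]


inventory = [
    KeyRecord("payments-db", "restricted", date(2026, 1, 10)),
    KeyRecord("audit-logs", "confidential", date(2026, 3, 1)),
]
print(overdue_keys(inventory, today=date(2026, 6, 15)))
```

Tying the interval to the data class keeps the classification decision and the key-management decision in one place, so a change to one is visible in the other.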

Secrets management. Where secrets live, how they get into the runtime environment, how they are rotated, who has access. The single most consistent finding when we audit inherited codebases is that secrets sit in environment variables, in Git history, or in shared documents — because no one decided up front where they would live, and the path of least resistance won.

Network architecture as a security boundary. Network topology, segmentation, egress controls, private connectivity to third parties, zero-trust patterns where they fit the workload. The network is one of the cheapest places to enforce security if designed that way, and one of the most expensive to fix if designed as connectivity wiring with controls bolted on later.

Compliance alignment. SOC 2, ISO 27001, HIPAA, PCI-DSS, the Australian Privacy Principles, GDPR — the relevant regimes are identified, and the controls that apply are mapped to the design choices being made. The aim is to make sure the design supports the audit before the build commits to a design that does not.

Incident response architecture. What happens when something goes wrong. Logging design — what gets logged, what doesn't, where the logs live, who can read them — plus audit trails, alerting paths, escalation contacts, and the runbook templates the on-call engineer will reach for at 2am. This is the section most teams skip, and the section that most distinguishes a system that survives an incident from one that doesn't.
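"What doesn't get logged" can be enforced in code as well as in policy. The sketch below uses Python's standard `logging.Filter` to redact values before a handler writes them. The list of sensitive field names and the `key=value` message shape are assumptions for the example; structured logging would need a different redaction step.

```python
import io
import logging
import re

# Assumed field names; extend to match the data classes the review identifies
SENSITIVE = re.compile(r"(?i)\b(password|token|secret|authorization)=(\S+)")


class RedactingFilter(logging.Filter):
    """Replaces sensitive key=value pairs in the rendered log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then redact the result
        record.msg = SENSITIVE.sub(r"\1=[REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactingFilter())

logger = logging.getLogger("audit_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info("login ok user=%s token=%s", "alice", "abc123")
print(stream.getvalue(), end="")
```

Attaching the filter to the handler means every record that reaches that destination is redacted, whichever part of the codebase emitted it.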

Beyond the agenda, two things matter as much as the decisions. Who is in the room: the technical lead, the architect who will own the system, the product owner accountable for the commercial outcome, and, where the system touches regulated data, a privacy or compliance representative. What gets documented: a Security Architecture Record, typically four to ten pages, covering each of the seven decision areas, naming the decisions made, the alternatives considered, and the rationale. This document is referenced in every subsequent architecture review, every code review for security-critical paths, and every compliance audit the system goes through.
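The record's structure is simple enough to model directly. The following sketch, with hypothetical field names and an example decision invented for illustration, represents each entry as a decision, its alternatives, and its rationale, and reports which of the seven areas are still incomplete.

```python
from dataclasses import dataclass

DECISION_AREAS = (
    "threat modelling",
    "authentication and authorisation",
    "encryption and data protection",
    "secrets management",
    "network architecture",
    "compliance alignment",
    "incident response",
)


@dataclass
class Decision:
    area: str
    decision: str
    alternatives: list[str]
    rationale: str


def record_gaps(record: list[Decision]) -> list[str]:
    """List missing areas and entries lacking alternatives or rationale."""
    problems = []
    covered = {d.area for d in record}
    for area in DECISION_AREAS:
        if area not in covered:
            problems.append(f"{area}: no decision recorded")
    for d in record:
        if not d.alternatives:
            problems.append(f"{d.area}: no alternatives considered")
        if not d.rationale.strip():
            problems.append(f"{d.area}: no rationale given")
    return problems


record = [
    Decision(
        area="secrets management",
        decision="All secrets in a dedicated manager",
        alternatives=["environment variables", "encrypted config file"],
        rationale="Central rotation and access logging",
    ),
    Decision("incident response", "Central log store", [], ""),
]
for problem in record_gaps(record):
    print(problem)
```

The same check works for the one-day audit of a system already in flight: an empty result means every area has a decision with its reasoning attached, not that the decisions are good ones.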

A review can be run and still fail. The most common failure is the review that produces decisions nobody enforces — the document exists, the controls are not in the pipeline, and nobody notices until audit. The second is the review that treats threat modelling as a one-time event — the model is built before the first major feature shifts the threat surface, and is never updated. The third is the review run with only engineering in the room — the design fits the engineering reality but not the regulatory one, and the gap surfaces six months later. The discipline that prevents each is enforcement: controls in the pipeline, the threat model reviewed every quarter, and the compliance voice in the room.
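"Controls in the pipeline" means a decision from the review becomes a check that can fail a build. As one hedged example, assuming policies written in the AWS IAM JSON format and a team rule that wildcard grants are not allowed, a gate for the scoped-IAM decision might look like the sketch below. The function names and the queue ARN are hypothetical, and the rule is far cruder than a real policy analyser.

```python
def _as_list(value):
    """IAM allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def wildcard_findings(policy: dict) -> list[str]:
    """Flag Allow statements that grant wildcard actions or resources."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append(f"statement {index}: wildcard action")
        if "*" in resources:
            findings.append(f"statement {index}: wildcard resource")
    return findings


def gate(policy: dict) -> int:
    """Return a process exit code: 0 to pass the build, 1 to fail it."""
    return 1 if wildcard_findings(policy) else 0


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["sqs:SendMessage"],
            "Resource": "arn:aws:sqs:ap-southeast-2:111122223333:payments-queue",
        },
    ],
}
for finding in wildcard_findings(policy):
    print(finding)
print("exit code:", gate(policy))
```

In a CI job the return value of `gate` would be passed to `sys.exit`, so a broad permission added under deadline pressure fails the build instead of waiting for the audit to find it.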

Running the Review on Your Project

If you are mid-build and a security architecture review has not happened, the first useful step is the threat model. You can do it in half a day with a whiteboard and the people who can answer "who would attack this system, and what would they gain?" You do not need a dedicated security specialist in the room; most of the value is in the conversation. If it surfaces decisions that should have been made and were not, the rest of the review structure follows naturally.

If you are pre-build, the review is one of the cheapest exercises you will run on the project. Schedule it after the architecture session has narrowed the technology choices but before the data model is locked. Allow a day for a single-team product and two to three days for a multi-system integration or a regulated workload. The time cost is recovered many times over by the controls that don't have to be retrofitted.

Prerequisites are the ones the Discovery Workshop™ should have produced: a clear statement of what the system does, who uses it, what data classes it handles, and the regulatory environment it operates in. If those are not documented, run discovery first — there is no useful security architecture review for a system whose commercial purpose is not yet clear.

What to do this week. If the system is in design, schedule the review and identify participants. Send the seven-area agenda a few days in advance so people arrive with context. If the system is mid-build, run an abbreviated review focused on the three highest-risk areas — almost always authentication, secrets management, and data protection. The smaller scope makes it possible to fit a useful review into a single working day with people already on the project.

What to avoid. Do not run the review as a sign-off ceremony — it is a working session that produces design decisions, not a gate. The sign-off mentality produces a document that looks complete on paper and is uncoupled from the build. Do not skip the compliance voice when the system handles regulated data. And do not let "we will come back to this" become the answer to any threat scenario with a non-trivial cost of failure — the cost of coming back to it later is one of the things the review exists to surface.

Implementation depends on other parts of the framework. The DevSecOps build standard turns the review's decisions into pipeline gates. The IAM Least-Privilege Architecture and Secrets Management Standard pick up specific commitments. The Production Readiness Review checks that what was decided here is what got built.

A Financial Product, Two Starts

A financial product we were brought in to audit had launched eighteen months earlier with hardcoded secrets in environment variables, a single-factor authentication flow, and broad IAM permissions on every service. The team was competent and the engineering generally strong, but timeline pressure at launch meant the security work had been deferred and then deferred again. The trigger for the audit was a near-miss six months in: a credential found in a public Git history, rotated within hours of detection, with no evidence of exploitation but no certainty of non-exploitation either. The post-incident review identified seven control gaps a design-time security architecture review would have caught. Closing them retroactively took the better part of three months across two engineers, on top of the regulatory disclosure work the near-miss had forced.

Contrast that with a comparable financial product we worked on from inception. Same risk patterns: sensitive financial data, regulatory exposure, a payment integration with the same trust assumptions. The security architecture review at design time named the threats explicitly: credential leakage, replay attacks against the authentication flow, broad IAM roles in the payments service. The decisions that fell out of the review (secrets in a dedicated manager from the first environment onward, MFA on every administrative role, scoped IAM policies per service, encryption keys under a documented rotation policy) cost the team a few additional design days and zero rework. Same risk profile, same regulatory environment, and a security posture that has so far required no retroactive remediation.

The lesson the inherited codebase taught its team was not that security is hard. It was that designing it in is paid in days, and retrofitting it is paid in months — and the price difference does not bend.

When the Review Is Critical, and When It Can Wait

The review is non-negotiable for systems that handle financial data, personal health information, government records, or regulated decisions about credit, employment, or access. It is non-negotiable when the system will undergo a SOC 2 or ISO 27001 audit, when it will be sold into enterprise environments with their own security review gates, or when the commercial relationship depends on the customer trusting the system's posture. It is non-negotiable for any system whose breach would produce a notifiable data incident under the Australian Privacy Principles or GDPR.

The review can be lighter — though not absent — for internal tools, throwaway prototypes, and proofs-of-concept that will demonstrably never carry production data. Even there, a thirty-minute version of the threat model is worth running, because the failure pattern we see most often is the prototype that was never meant to go to production and went to production anyway.

The review can defer specific compliance mappings when the regulatory environment is not yet final. It cannot defer threat modelling, authentication design, secrets management, or encryption strategy — those decisions get baked into early code, and unbaking them is the expensive thing. If they are not decided at design time, they are decided by default, and the default is rarely what a serious team would choose deliberately.

What to Do Next

If you have a system in design and the review has not happened, schedule it this week. The seven-area agenda above is enough to start with. If you have a system in flight and are not sure whether the review's substance has been covered, run a one-day audit against the seven decision areas — threat model, authentication, encryption, secrets, network, compliance, incident response — and identify the gaps. The gaps you find now are cheaper than the gaps you find at audit.

For the broader engineering discipline this sits inside, see how we approach building custom software end to end.

Frequently Asked Questions

What security risks haven't we considered?

At least the ones your threat model does not name. The discipline that surfaces them is structured threat modelling — typically STRIDE — run with the people who understand the system, the data, and the regulatory environment in the same room. The gaps we see most often on inheritance are: service-to-service authentication that was assumed rather than designed, secrets that live in environment variables because nowhere else was specified, IAM roles whose scopes grew over time without review, and incident response that exists in the on-call engineer's head rather than in a runbook.

When should we do threat modelling?

At design time, before the architecture session locks the data model and before development begins. Then again at every significant change to the threat surface — a new integration, a new data class, a new user role, a new regulatory regime. Treat the threat model as a living document. The teams that run it once and never update it end up defending the system they built rather than the system they have.

How do we balance security with delivery speed?

The framing is almost always wrong. Security designed in costs days; security retrofitted costs months. The trade-off teams perceive between security and speed is usually a trade-off between thinking now and rework later — and rework later is always slower. Where genuine trade-offs exist — a control that materially slows a critical user journey — the review is the conversation to have them in, with the right people present and the rationale documented.

What's the cost of fixing security later?

It depends on what is being fixed. Encryption at rest retrofitted into a live database is engineering work measured in weeks per data store, plus migration and validation effort. Rebuilding an authentication flow to support MFA after launch is in the same range. IAM tightening across a live system can take quarters to do safely. Those are the technical costs alone — the audit, regulatory, and incident-response costs retrofitting attracts typically exceed the technical work itself.

Is a penetration test a substitute for a security architecture review?

No. A pen test is necessary, but it tests the system that was built, not the system that should have been designed. It surfaces specific vulnerabilities in specific code paths. It does not tell you whether the trust boundaries are in the right places, whether the authentication model fits the role structure, or whether the data classification is right. The two are complementary; running only the pen test means finding out about design errors after they have been built into the system.

Who should be in the room?

At minimum, the technical lead, the architect, the product owner accountable for the commercial outcome, and, where the system handles regulated data, a privacy or compliance voice. For client engagements, the client's security or platform engineering function joins. The decisions made are commercial and technical at once, so they need both kinds of authority present.

How does this fit with our compliance obligations?

The review is where the controls your compliance regime requires are mapped to the design decisions that will support them. SOC 2, ISO 27001, HIPAA, PCI-DSS, the Australian Privacy Principles, and GDPR ask similar architectural questions in different vocabulary — the review answers them once, at design time, in a form that survives the build and supports the audit. Continuous compliance validation later picks up where the review leaves off.