Performance Optimisation: Fast Before Users Decide You're Slow

Published

17 Jun 2026

Author
Akash Shakya

Every team that ships a slow product believed it was fast enough. They tested it on their machines, over their office Wi-Fi, with their test data — a few hundred records, not a few hundred thousand. The API felt responsive. The screens loaded. Nobody complained during QA because nobody was using it the way real users would: on a 4G connection, with a cold cache, in the middle of a commute, with exactly enough patience to wait three seconds before closing the app and opening a competitor's.

Performance problems discovered in production are always more expensive than performance designed in. Not slightly more expensive — categorically. A slow API endpoint found during load testing costs a developer a day to fix. The same endpoint found after launch costs you the users who left, the app store reviews they wrote on the way out, and the retention curve that never recovered. The fix is identical. The cost isn't even comparable.

This is what performance optimisation pre-launch actually means: defining what "fast enough" looks like before a single user touches the product, testing against those benchmarks under realistic conditions, and treating performance regressions with the same urgency as functional bugs. Not as a polish phase. Not as a nice-to-have sprint before launch. As a continuous engineering discipline that starts in sprint one. Across 900+ projects and 600+ products shipped, we've seen enough mobile apps and custom software builds to know that the teams who set performance benchmarks during build don't end up rebuilding after launch.

Three Seconds Is the Entire Window

Users don't measure your app's performance in milliseconds. They measure it in feelings: fast, fine, or frustrating. Google's research on mobile page speed found that as load time goes from one second to three seconds, the probability of a user bouncing increases by 32%. From one second to five seconds, it increases by 90%. For mobile apps, expectations are even less forgiving because native apps set the baseline. Users compare your app to the ones built by teams with thousands of engineers and global CDN infrastructure.

The consequences aren't just about first impressions. Slow performance compounds into every metric that matters.

Retention degrades silently. Users don't complain about performance — they just stop opening the app. You see a declining DAU/MAU ratio, attribute it to engagement, and start building features to win users back. But the features load slowly too, because the underlying performance problem was never addressed. You're adding weight to a vehicle that's already struggling to accelerate.

Support costs inflate. When an app is slow, users interpret it as broken. "The app crashed" often means "the app took so long to respond I assumed it had crashed." Support tickets increase, and your team spends time investigating phantom crashes that are actually timeout behaviours.

Development velocity drops. Slow test environments, slow CI pipelines, slow staging deploys — performance debt doesn't stay contained to the user-facing product. It bleeds into every environment and slows down the team building the product. A build pipeline that takes forty minutes instead of ten means every developer loses half an hour per cycle, multiplied across every commit.

The teams that avoid this aren't lucky. They defined performance benchmarks before writing code and treated those benchmarks as acceptance criteria throughout the build.

What Performance Optimisation Pre-Launch Actually Covers

Performance optimisation pre-launch is the practice of defining measurable performance targets, building testing infrastructure to validate those targets, and treating performance regressions as blocking issues — all before a single real user interacts with the product. It's not about making things faster after they're built. It's about building things fast from the start.

Through our Production Readiness Review™ process, performance benchmarks are established during architecture, not discovered in production complaints. Here's what that covers.

API Response Time Benchmarks

Every API endpoint gets a response time budget. For most mobile applications, the targets look like this: p50 (median) under 200ms, p95 under 500ms, p99 under 1 second. These aren't aspirational — they're acceptance criteria. An endpoint that exceeds its budget gets the same treatment as one that returns incorrect data: it doesn't ship.

The benchmarks account for realistic conditions. Testing an endpoint with an empty database tells you nothing. Testing it with the data volume you expect at six months tells you whether your query patterns and indexing strategy will hold. Testing it at projected peak concurrency tells you whether your connection pooling and caching layers are sized correctly.
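As a rough illustration, the budgets above could be checked against a set of measured latencies like this. It is a minimal sketch that assumes the samples have already been collected in milliseconds; the function names and the sample data are hypothetical, and only the budget values come from the text.

```python
import statistics

# Budgets quoted in the text, in milliseconds.
BUDGETS_MS = {"p50": 200, "p95": 500, "p99": 1000}

def percentiles(samples_ms):
    """Return p50, p95 and p99 for a list of latency samples."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1 to 99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

def budget_violations(samples_ms, budgets=BUDGETS_MS):
    """Return the percentiles that exceed their budget, with measured values."""
    measured = percentiles(samples_ms)
    return {name: value for name, value in measured.items() if value > budgets[name]}

# Illustrative data: mostly fast responses with a slow tail.
samples = [100] * 90 + [300] * 8 + [1500] * 2
violations = budget_violations(samples)
```

With this sample set the median and p95 are within budget, and only p99 is reported as a violation. That is the kind of slow tail an average would hide.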

Database Query Performance

Most API slowness originates in the database. An unindexed query that returns in 5ms against a test dataset can take 2 seconds against production-scale data, because a full table scan gets slower with every row added. The fix, adding an index, takes minutes. Finding the problem after launch, when it's buried in a stack of slow endpoints and user complaints, takes days.

Pre-launch database performance work means: running explain plans on every query path, indexing for your actual read patterns, identifying N+1 query problems before they hit production, and load testing with realistic data volumes. It also means setting up slow query logging from day one so regressions are caught immediately, not months later when someone finally investigates why the app feels sluggish.
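To make "running explain plans" concrete, here is a small sketch using SQLite from Python's standard library. The `orders` table, its columns, and the index name are invented for the example, and other databases expose the same idea through their own `EXPLAIN` syntax and output format.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable plan lines SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the readable description.
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

sql = "SELECT id, total FROM orders WHERE user_id = ?"

# Without an index on user_id, the plan is a scan of the whole table.
plan_before = query_plan(conn, sql, (42,))

conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")

# With the index, the plan becomes a search that uses idx_orders_user_id.
plan_after = query_plan(conn, sql, (42,))
```

Comparing `plan_before` and `plan_after` shows the query moving from a table scan to an index search. That difference is invisible at a few hundred rows and dominant at a few hundred thousand.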

Front-End Rendering and Mobile Performance

API response time is half the equation. The other half is what happens after the data arrives. Front-end rendering performance covers: time to interactive (how quickly the user can actually use the screen), frame rate during scrolling and transitions (at 60fps each frame has roughly 16.7ms to render, and anything slower feels janky), bundle size and its impact on initial load, and memory consumption on mid-range devices.

Mobile performance specifically demands testing on real devices, not simulators. A screen that renders smoothly on the latest flagship phone stutters on the two-year-old mid-range device that 60% of your users actually carry. Testing on the bottom quartile of your target device range is what separates performance that works in the lab from performance that works in the field.

Performance Benchmarks as CI Gates

Individual performance tests are useful. Automated performance benchmarks that block deploys are transformational. When your CI pipeline includes a performance test suite that fails the build if response times exceed their budgets, performance regressions can't silently accumulate. Every merge request that degrades performance gets flagged before it reaches staging.

This requires investment in test infrastructure: a dedicated performance testing environment with production-like resources, automated load generation, and baseline measurements that the pipeline compares against. The DevOps cost of setting this up is measured in days. The cost of not having it is measured in the slow accumulation of performance debt that eventually requires a dedicated optimisation sprint — or worse, a rewrite.
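The comparison step of such a gate can be quite small. The sketch below assumes the pipeline has already produced p95 values per endpoint for both the stored baseline and the current build. The endpoint names, the numbers, and the 10% tolerance are all assumptions made for illustration.

```python
def find_regressions(baseline_ms, current_ms, tolerance=0.10):
    """Return {endpoint: (baseline, current)} for every p95 that got worse
    by more than `tolerance` (a fraction, so 0.10 means 10%)."""
    regressions = {}
    for endpoint, base in baseline_ms.items():
        now = current_ms.get(endpoint)
        if now is not None and now > base * (1 + tolerance):
            regressions[endpoint] = (base, now)
    return regressions

def gate_exit_code(baseline_ms, current_ms, tolerance=0.10):
    """Return 0 to let the pipeline continue, 1 to fail the build."""
    regressions = find_regressions(baseline_ms, current_ms, tolerance)
    for endpoint, (base, now) in sorted(regressions.items()):
        print(f"REGRESSION {endpoint}: p95 {base}ms -> {now}ms")
    return 1 if regressions else 0

# Hypothetical p95 values in milliseconds.
baseline = {"/feed": 400, "/login": 150}
current = {"/feed": 480, "/login": 155}
exit_code = gate_exit_code(baseline, current)
```

In a CI job, the script would finish by passing `exit_code` to `sys.exit`, so a non-zero result fails the build. A small tolerance keeps normal measurement noise from blocking merges while still catching real regressions.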

Where It Breaks Down

Performance optimisation pre-launch fails when benchmarks are set without understanding real usage patterns. A team that optimises for throughput when their actual bottleneck is latency wastes effort. A team that load tests with uniform request patterns when their real traffic is bursty gets false confidence.
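The gap between uniform and bursty traffic is easy to see with a little arithmetic. The sketch below builds two request schedules with the same total volume and compares their busiest second. The traffic shape and all the numbers are invented for illustration.

```python
from collections import Counter

def uniform_arrivals(total, duration_s):
    """Evenly spaced request timestamps across the test window."""
    return [i * duration_s / total for i in range(total)]

def bursty_arrivals(total, duration_s, bursts, burst_width_s):
    """The same request count, packed into a few short bursts."""
    per_burst = total // bursts
    gap = duration_s / bursts
    times = []
    for b in range(bursts):
        start = b * gap
        for i in range(per_burst):
            times.append(start + burst_width_s * i / per_burst)
    return times

def peak_per_second(times):
    """Highest number of requests landing in any one-second bucket."""
    return max(Counter(int(t) for t in times).values())

# 600 requests over 60 seconds in both cases.
uniform_peak = peak_per_second(uniform_arrivals(600, 60))
bursty_peak = peak_per_second(bursty_arrivals(600, 60, bursts=6, burst_width_s=2))
```

Both schedules average 10 requests per second, but the uniform one peaks at 10 and the bursty one peaks at 50. A system sized from the uniform test has never seen the load the bursty pattern puts on it.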

It also fails when treated as a one-time gate rather than a continuous practice. Performance established at launch degrades over time as features are added, data grows, and dependencies change. The benchmarks set pre-launch become the baseline that ongoing monitoring compares against — not a box that was checked and forgotten.

How to Build Performance In From Sprint One

You don't need a separate performance optimisation phase. You need performance to be part of every phase. Here's how this works when it's built into the delivery process from the start.

During architecture: set the benchmarks. Define response time budgets for every major user flow. Identify the data volumes you expect at one month, six months, and twelve months. Choose your caching strategy, your database indexing approach, and your CDN configuration based on those projections, not on defaults.

During development: test against benchmarks every sprint. Include performance tests in your definition of done. Every API endpoint gets a load test. Every screen gets a rendering performance test on a target device. Regressions are caught within the sprint they're introduced, when the developer who wrote the code still has full context.

Before launch: run full load tests. Simulate your expected peak traffic against a production-mirror environment. Identify bottlenecks under concurrency — connection pool limits, database lock contention, memory leaks under sustained load. Fix them before users find them. According to web.dev, establishing and tracking Core Web Vitals benchmarks pre-launch is now a baseline expectation, not a competitive advantage.

After launch: monitor against baselines. The pre-launch benchmarks become your monitoring thresholds. When p95 response time drifts above its budget, an alert fires. When database query times increase, the slow query log catches it. Performance monitoring isn't a feature you add later — it's the continuation of the discipline you established during build.

The App That Lost a Month of Users

A mobile app in the health and wellness space launched with feature parity to its competitors and a strong marketing push. The product was functionally complete. Every user story had been accepted. QA had signed off. What nobody had tested under load was the API layer.

Average API response times in production were 2.8 seconds. On a 4G connection with real-world latency, screens took four to five seconds to populate. The home feed — the first screen every user saw after login — made six sequential API calls. Users opened the app, watched a spinner, and closed it.

Retention dropped 30% in the first month compared to projections. App store reviews mentioned "slow" and "laggy" within the first week. The team scrambled. Over three weeks, they implemented query indexing that should have been in place from the start, added a caching layer for the most-requested endpoints, compressed API payloads that were returning full database objects when the client needed three fields, and parallelised the home feed's API calls.

Response times dropped to 400ms average. The app felt fast. Retention for new users acquired after the fix matched projections. But the users lost in month one — roughly 40% of the launch cohort — never came back. They'd already formed their opinion. They'd already written their reviews. The product had one chance to be fast, and it wasn't.

The optimisation work itself was straightforward. Query indexing, caching, payload compression, call parallelisation — none of it was novel engineering. Every fix could have been implemented during the build at a fraction of the cost. The technical debt wasn't complex. It was just expensive because of when it was discovered.
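Call parallelisation, the last of those fixes, is sketched below with Python's asyncio. This is not the team's actual code: the six section names are hypothetical, and `fetch` sleeps in place of a real network request. It also assumes the calls are independent of each other, which is the only case where they can run concurrently.

```python
import asyncio
import time

# Hypothetical names for the six calls behind a home feed.
SECTIONS = ["profile", "feed", "goals", "streak", "notifications", "recommendations"]

async def fetch(name, delay_s=0.05):
    """Stand-in for one API call; sleeps instead of doing network I/O."""
    await asyncio.sleep(delay_s)
    return name

async def load_sequential():
    # Each call waits for the previous one, so the delays add up.
    return [await fetch(name) for name in SECTIONS]

async def load_parallel():
    # All calls start together, so the total is close to the slowest one.
    return await asyncio.gather(*(fetch(name) for name in SECTIONS))

def timed(coro_fn):
    """Run a coroutine function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

sequential_result, sequential_s = timed(load_sequential)
parallel_result, parallel_s = timed(load_parallel)
```

The sequential version takes roughly six times the per-call delay, while the parallel version takes roughly one. The same principle applies on a mobile client, using whatever concurrency primitives the platform provides.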

When Performance Optimisation Is Critical — and When It Can Wait

Invest now if your product is user-facing and competing for attention. Any mobile app that users open by choice — not by requirement — is competing with apps built by teams that treat performance as a feature. If your product handles transactions, real-time data, or any flow where delay equals abandonment, performance benchmarks are non-negotiable pre-launch. The same applies to any product where first impressions drive retention: marketplace apps, social platforms, on-demand services.

It can wait if you're building an internal tool where users have no alternative and high tolerance for latency. A back-office admin panel used by a trained team of ten can tolerate a two-second page load in ways that a consumer app cannot. It can also wait during early prototyping when you're validating whether anyone wants the product at all — but set a clear trigger for when performance work begins. The cost of building an app increases when performance retrofits are needed under production load.

Watch the transition. The moment your prototype has real users, performance becomes a product quality issue. Don't wait for complaints. Users who complain are a minority. Users who leave silently are the majority.

What to Do Next

Pick your three most critical user flows — the paths your users take most often. Measure the end-to-end time for each one on the lowest-spec device in your target audience, over a mobile connection. If any of them exceed three seconds, you have a performance problem that your users have already noticed, even if they haven't told you.

When you're ready to set performance benchmarks that hold from launch day forward, talk to our engineering team. We build performance into the delivery process so your product is fast before users decide it isn't.

What are the right performance benchmarks for a mobile app?

For most mobile applications, target API response times of under 200ms at p50 (median), under 500ms at p95, and under 1 second at p99. Time to interactive for any screen should be under 2 seconds on a mid-range device over a 4G connection. Scroll frame rate should maintain 60fps. These aren't universal — they're starting points. Your actual benchmarks should reflect your specific user flows, data volumes, and competitive landscape.

How do we test performance before launch if we don't have real traffic?

You simulate it. Load testing tools like k6, Gatling, or Locust generate synthetic traffic that mimics real user behaviour — concurrent requests, varied endpoints, realistic data payloads. The key is testing against a production-mirror environment with production-scale data, not against a development instance with a seed dataset. Complement automated load tests with manual testing on real devices over throttled network connections to catch front-end performance issues that server-side load tests miss.
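The core of what those tools do can be sketched with the standard library alone: issue requests with a fixed level of concurrency and record each latency. This is only an illustration of the shape, not a substitute for a real load testing tool. `fake_endpoint` is a hypothetical stand-in for an HTTP request to a production-mirror environment.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def timed_call(target):
    """Run one request and return its latency in milliseconds."""
    start = time.perf_counter()
    target()
    return (time.perf_counter() - start) * 1000

def run_load(target, total_requests, concurrency):
    """Issue `total_requests` calls with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_call(target), range(total_requests)))

def fake_endpoint():
    # Hypothetical stand-in for a real request; sleeps for about 10ms.
    time.sleep(0.01)

latencies_ms = run_load(fake_endpoint, total_requests=40, concurrency=8)
```

The resulting list of latencies is what percentile budgets are evaluated against. Dedicated tools add what this sketch leaves out, such as ramp-up profiles, varied endpoints, and realistic payloads.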

What's the difference between load testing and performance testing?

Performance testing is the broader discipline — it includes measuring response times, rendering speed, memory usage, and resource consumption under normal conditions. Load testing is a subset that specifically measures how the system behaves under expected and peak traffic volumes. You need both. Performance testing catches slow endpoints and rendering issues. Load testing catches concurrency problems — connection pool exhaustion, database lock contention, memory leaks that only surface under sustained load.

How much does performance optimisation add to the development timeline?

When built into the process from sprint one, it adds roughly 10-15% to development effort. This covers writing performance tests, running them in CI, and addressing regressions as they appear. When done as a retrofit after launch, it typically consumes two to four weeks of dedicated engineering time, plus the indirect cost of lost users and degraded metrics during the period the product was slow. The retrofit approach costs more in every dimension.

Can we fix performance problems after launch?

Technically, yes. Every performance problem has a technical fix. The issue is what you lose while finding and implementing that fix. Users who experience a slow product in its first week form an opinion that's very difficult to reverse. App store ratings written during a slow launch persist long after the fix. Retention lost to poor performance is rarely recoverable — the users who left aren't monitoring your release notes for performance improvements.

What tools should we use for performance monitoring?

For API monitoring: Datadog, New Relic, or Grafana with Prometheus for response time percentiles, error rates, and throughput. For database performance: built-in slow query logs plus monitoring of connection pool utilisation. For front-end and mobile performance: Firebase Performance Monitoring or native profiling tools (Xcode Instruments for iOS, Android Profiler for Android). For synthetic monitoring: tools like Lighthouse CI integrated into your build pipeline to catch regressions before deploy.

How do we prevent performance from degrading after launch?

Set your pre-launch benchmarks as monitoring thresholds and alert on them. When p95 response time for a critical endpoint drifts above its budget, the alert fires the same day — not after users notice. Include performance tests in your CI pipeline so every merge request is tested against baselines. Review performance metrics in sprint retrospectives alongside functional metrics. Performance isn't a launch milestone — it's an ongoing operational concern that needs the same visibility as uptime and error rates.

 
