
Test Pass Rate: Measuring and Improving Test Suite Reliability

Learn how to track test pass rate, identify flaky tests, set reliability benchmarks, and build a trustworthy test suite that accelerates software delivery.

Last updated: 7 March 2026

Test pass rate measures the percentage of automated test runs that pass successfully on the first attempt. A high, stable pass rate is essential for maintaining developer trust in your test suite and ensuring that test failures signal genuine issues rather than infrastructure noise.

What Is Test Pass Rate?

Test pass rate is calculated as the number of successful test runs divided by the total number of test runs over a given period. It can be measured at the individual test level, the test suite level, or the pipeline level. A pass rate of ninety-eight percent means that two out of every hundred test runs fail, regardless of whether those failures indicate real bugs or flaky infrastructure.
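The calculation above can be sketched in a few lines. This is a minimal illustration of the formula, not any particular tool's implementation:

```python
def pass_rate(passed_runs: int, total_runs: int) -> float:
    """Percentage of test runs that passed on the first attempt."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    return 100.0 * passed_runs / total_runs

# 98 passing runs out of 100 gives the 98% rate described above:
# two out of every hundred runs fail.
print(pass_rate(98, 100))  # 98.0
```

The same function works at any granularity: feed it counts for a single test, a suite, or the whole pipeline.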

The distinction between genuine failures and flaky failures is critical. Genuine failures indicate real bugs in the code and are valuable signals. Flaky failures (tests that pass and fail intermittently without code changes) erode trust in the test suite. When developers learn to expect random failures, they begin ignoring test results entirely, which defeats the purpose of automated testing.

Test pass rate is closely linked to developer productivity and deployment confidence. A reliable test suite gives developers confidence to refactor code, merge changes quickly, and deploy frequently. An unreliable suite creates friction at every step, slowing development and increasing the temptation to bypass testing altogether.

How to Measure Test Pass Rate

Track test pass rate at two levels: the overall pipeline pass rate and the individual test pass rate. Pipeline pass rate tells you how often your entire CI/CD pipeline succeeds, while individual test pass rate helps you identify specific problematic tests. Both metrics are important for different purposes.

Most CI/CD platforms provide test result data that can be aggregated for pass rate calculations. Tools like Datadog, Buildkite Analytics, and Launchable specialise in test analytics and can automatically identify flaky tests, track pass rate trends, and correlate failures with specific code changes.

  • Track both pipeline-level and individual test-level pass rates
  • Separate genuine failures from flaky failures in your analysis
  • Monitor pass rate trends over time to catch degradation early
  • Use test analytics tools to automatically identify and quarantine flaky tests
  • Measure the impact of flaky tests on developer productivity and pipeline throughput
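As a sketch of the per-test aggregation described above, the snippet below assumes test results can be exported as simple `(test_name, passed)` records; the record format and the ninety-nine percent threshold are assumptions taken from the benchmarks discussed in this article, not a specific CI provider's API:

```python
from collections import defaultdict

def per_test_pass_rates(runs):
    """Aggregate (test_name, passed) records into per-test pass rates."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for name, ok in runs:
        total[name] += 1
        if ok:
            passed[name] += 1
    return {name: passed[name] / total[name] for name in total}

def flaky_tests(rates, threshold=0.99):
    """Tests that sometimes fail but are not consistently broken.

    A rate of exactly 0.0 is excluded: a test that always fails is
    simply broken, not flaky.
    """
    return [name for name, rate in rates.items() if 0.0 < rate < threshold]
```

Pipeline-level pass rate falls out of the same data by counting whole runs instead of individual tests.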

Test Pass Rate Benchmarks

High-performing engineering organisations maintain pipeline pass rates above ninety-five percent, with many targeting ninety-eight percent or higher. At the individual test level, each test should have a pass rate of at least ninety-nine percent when the underlying code has not changed. Any test that fails more than one percent of the time without code changes is considered flaky.

Google's internal research found that flaky tests account for approximately sixteen percent of all test failures, and that the cost of investigating and managing flaky tests consumes significant engineering time. Their target is to keep the flaky test rate below two percent of the total test suite.

If your pipeline pass rate is below ninety percent, your team is likely wasting substantial time investigating false failures and re-running pipelines. Every percentage point of improvement in pass rate translates directly to faster delivery cycles and higher developer satisfaction.

Strategies for Improving Test Pass Rate

The first step is identifying and quarantining flaky tests. Run your test suite multiple times without code changes and flag any tests that produce inconsistent results. Move flaky tests to a separate quarantine suite that does not block the main pipeline. Then address the root causes: timing dependencies, shared state, external service calls, and race conditions are the most common culprits.

Improve test isolation to prevent tests from depending on shared state or execution order. Each test should set up its own preconditions and clean up after itself. Use mocking and stubbing to eliminate dependencies on external services, databases, and file systems that introduce variability into test execution.
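To illustrate the mocking approach, the sketch below uses Python's standard `unittest.mock`. The `price_in_usd` function and its rate-service client are hypothetical examples invented for this illustration:

```python
import unittest
from unittest.mock import Mock

def price_in_usd(amount, currency, rate_client):
    """Convert a price using an injected rate-service client.

    Accepting the client as a parameter (rather than calling a
    network service directly) is what makes the function testable
    in isolation.
    """
    return amount * rate_client.get_rate(currency)

class PriceInUsdTest(unittest.TestCase):
    def test_conversion_with_stubbed_client(self):
        # A Mock stands in for the external rate service, so the test
        # never touches the network and is fully deterministic.
        client = Mock()
        client.get_rate.return_value = 1.25
        self.assertEqual(price_in_usd(100, "EUR", client), 125.0)
        client.get_rate.assert_called_once_with("EUR")

if __name__ == "__main__":
    unittest.main()
```

The test sets up its own preconditions (the stubbed client) and leaves no shared state behind, so it can run in any order alongside other tests.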

  • Identify and quarantine flaky tests to restore trust in the pipeline immediately
  • Address root causes of flakiness: timing issues, shared state, and external dependencies
  • Improve test isolation so each test is independent and deterministic
  • Use retry mechanisms sparingly; they mask flakiness rather than fixing it
  • Invest in test infrastructure reliability including stable CI runners and consistent environments

Building a Culture of Test Reliability

Treat flaky tests as high-priority bugs. When a test becomes flaky, it should be fixed or quarantined within a defined timeframe, not ignored. Establish a rotation where team members take turns investigating and resolving flaky tests. This distributes the burden and ensures that test reliability receives consistent attention.

Make test reliability metrics visible to the entire team. Include test pass rate in your sprint dashboards and discuss it during retrospectives. When developers see the impact of flaky tests on pipeline throughput and team productivity, they become more invested in writing reliable tests from the start.

Include test quality in your code review process. Reviewers should check that new tests are deterministic, well-isolated, and do not introduce timing dependencies. Catching potential flakiness during review is far cheaper than debugging intermittent failures in the CI pipeline.

Key Takeaways

  • Test pass rate measures how often your test suite passes on the first attempt; aim for ninety-five percent or higher at the pipeline level
  • Flaky tests erode developer trust and waste significant time investigating false failures
  • Quarantine flaky tests immediately to restore pipeline reliability, then fix root causes systematically
  • Common causes of flakiness include timing dependencies, shared state, and external service calls
  • Treat flaky tests as high-priority bugs and make test reliability visible to the entire team

Frequently Asked Questions

How do we identify flaky tests in our suite?
Run your test suite multiple times without code changes and flag any tests that produce different results. Many CI/CD analytics tools can automatically detect flaky tests by tracking historical pass rates for each test. Tests with pass rates between one and ninety-nine percent when the code has not changed are flaky: they are neither reliably passing nor consistently broken.
Should we use automatic retries for failed tests?
Retries can be a pragmatic short-term solution to prevent flaky tests from blocking deployments, but they should not be a permanent fix. Every retry masks an underlying reliability issue and adds time to your pipeline. Use retries as a temporary measure while you investigate and fix the root causes of flakiness.
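One lightweight way to apply this advice is a retry decorator that keeps each extra attempt visible rather than silently swallowing it. This is a hypothetical sketch, not a recommendation over dedicated tooling such as CI-level retry features:

```python
import functools
import time

def retry(attempts=3, delay=0.1):
    """Retry a test function up to `attempts` times.

    A stopgap only: every retry masks an underlying reliability
    issue, so the except branch is the place to log the failure
    and keep the flakiness visible while the root cause is fixed.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: surface the failure
                    time.sleep(delay)
        return wrapper
    return decorator
```

Note that each retry also adds `delay` (plus the test's own runtime) to the pipeline, which is the overhead cost described above.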
How do flaky tests affect deployment frequency?
Flaky tests directly reduce deployment frequency by causing pipeline failures that require investigation and re-runs. If your pipeline fails twenty percent of the time due to flaky tests, you are effectively adding twenty percent overhead to every deployment. Fixing flaky tests is one of the fastest ways to improve deployment frequency.

Explore Quality Engineering Tools

Our engineering management tools include test reliability dashboards and quality metrics tracking to help you build and maintain a trustworthy test suite.

Learn More