How to Analyze Test Failures in Azure DevOps
A systematic guide to analyzing test failures in Azure DevOps. Covers pipeline test analytics, flakiness detection, failure categorisation, root cause investigation, and building a culture of fast test failure resolution.
A test failure is not the end of the story — it's the beginning of an investigation. How quickly and accurately you diagnose failures determines whether your test suite is a trusted quality signal or background noise that everyone ignores.
The failure analysis workflow
1. Detect → Pipeline fails → notification to QA team
2. Classify → Bug? Environment? Flakiness? Test code?
3. Investigate → Use pipeline logs, screenshots, traces
4. Resolve → Fix code, fix test, or quarantine flaky test
5. Verify → Confirm fix — pipeline passes consistently
6. Learn → Update runbook, add monitoring if needed
Step 1: Using Azure DevOps test analytics
Go to Pipelines → [Pipeline] → Analytics → Tests.
Key views:
Top failing tests — sorted by failure count. Tests appearing here consistently are either genuinely broken or reliably flaky. Both need immediate attention.
Test flakiness — tests flagged as flaky (pass sometimes, fail other times). Sort by failure rate. Tests with 10–50% failure rate are most likely timing issues.
Slowest tests — tests taking > 30 seconds are candidates for optimisation. Slow tests often become flaky when the environment is under load.
New failures — tests that started failing after a specific build. Correlate with commit history to identify the regression commit.
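The "Top failing tests" view can also be reproduced from raw results, for example when exporting run data for custom reporting. A minimal sketch of the aggregation, assuming a simplified result shape (the `TestResult` interface here is an illustration, not the exact Azure DevOps REST API schema):

```typescript
// Sketch: aggregate per-test failure counts from exported run results.
// TestResult is a simplified assumption, not the Azure DevOps API schema.
interface TestResult {
  testName: string;
  outcome: "Passed" | "Failed";
}

function topFailingTests(
  results: TestResult[],
  limit = 5
): { testName: string; failures: number }[] {
  const counts = new Map<string, number>();
  for (const r of results) {
    if (r.outcome === "Failed") {
      counts.set(r.testName, (counts.get(r.testName) ?? 0) + 1);
    }
  }
  // Sort descending by failure count, mirroring the analytics view
  return [...counts.entries()]
    .map(([testName, failures]) => ({ testName, failures }))
    .sort((a, b) => b.failures - a.failures)
    .slice(0, limit);
}

// Example: checkout fails twice, login once
const sample: TestResult[] = [
  { testName: "checkout", outcome: "Failed" },
  { testName: "checkout", outcome: "Failed" },
  { testName: "login", outcome: "Failed" },
  { testName: "login", outcome: "Passed" },
];
console.log(topFailingTests(sample));
```

Tests that appear at the top of this list across many runs are the same ones the analytics view surfaces; the point is that "top failing" is just a failure count sorted descending.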
Step 2: Classifying the failure
Open a failing test in the Tests tab. Read the error message and classify:
| Error pattern | Classification | Action |
|---|---|---|
| AssertionError: Expected 'X' but got 'Y' | Product bug or test data issue | Investigate the business logic |
| TimeoutError: exceeded 30000ms | Timing/environment issue | Check environment, add wait |
| Error: net::ERR_CONNECTION_REFUSED | Environment down | Check staging status |
| ElementNotFoundError | Selector changed or race condition | Update selector or add wait |
| 401 Unauthorized | Auth credentials expired | Rotate test credentials |
| Fails then passes on re-run | Flaky test | Quarantine and investigate |
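The table above lends itself to a first-pass automatic triage before a human reads the log. A sketch of such a classifier; the category names and regex patterns are illustrative choices, not a standard:

```typescript
// Sketch: first-pass classification of a failure message using the
// patterns from the table above. Category names are illustrative.
type FailureClass =
  | "product-bug-or-data"
  | "timing-environment"
  | "environment-down"
  | "selector-or-race"
  | "auth-expired"
  | "unknown";

const rules: [RegExp, FailureClass][] = [
  [/AssertionError/, "product-bug-or-data"],
  [/TimeoutError/, "timing-environment"],
  [/net::ERR_CONNECTION_REFUSED/, "environment-down"],
  [/ElementNotFoundError/, "selector-or-race"],
  [/401 Unauthorized/, "auth-expired"],
];

function classifyFailure(message: string): FailureClass {
  for (const [pattern, cls] of rules) {
    if (pattern.test(message)) return cls;
  }
  return "unknown"; // no pattern matched: needs a human look
}

console.log(classifyFailure("TimeoutError: exceeded 30000ms")); // timing-environment
```

Re-run detection (the last table row) cannot be done from a single message; it needs outcome history across retries, which is covered in the flakiness step below.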
Step 3: Investigating with traces and logs
Playwright trace viewer
For Playwright tests configured with trace: 'on-first-retry':
- Download the trace artifact from the pipeline run
- Run:
npx playwright show-trace trace.zip
- The trace shows every action, network request, console error, and DOM snapshot at each step
// playwright.config.ts — enable traces for CI
use: {
trace: 'on-first-retry',
screenshot: 'only-on-failure',
video: 'on-first-retry',
}

Pipeline logs
For any pipeline step:
- Click the failed step in the pipeline run
- The full stdout/stderr is shown
- Look for the first error line — subsequent errors are often cascades from the first
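Scanning for that first error line is easy to script when logs are long. A minimal sketch; the error-matching pattern is a loose assumption and will need tuning per toolchain:

```typescript
// Sketch: locate the first error-looking line in a captured step log,
// since later errors are usually cascades. The regex is an assumption.
function firstErrorLine(
  log: string
): { lineNumber: number; text: string } | null {
  const lines = log.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (/error|failed/i.test(lines[i])) {
      // Report 1-based line numbers, matching what log viewers show
      return { lineNumber: i + 1, text: lines[i].trim() };
    }
  }
  return null;
}

const log = [
  "Step started",
  "Fetching dependencies",
  "Error: net::ERR_CONNECTION_REFUSED",
  "Error: retry 1 failed",
].join("\n");
console.log(firstErrorLine(log)); // { lineNumber: 3, text: "Error: net::ERR_CONNECTION_REFUSED" }
```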
Add verbose logging for critical test steps:
// Log API responses on failure
test('Checkout completes', async ({ page, request }) => {
const response = await request.post('/api/checkout', { data: checkoutData })
if (!response.ok()) {
console.error('Checkout API failed:', {
status: response.status(),
body: await response.text(),
headers: Object.fromEntries(response.headers())
})
}
expect(response.status()).toBe(200)
})

Step 4: Flakiness deep-dive
For a test that fails intermittently:
# Run the test 10 times to characterise the flakiness rate
- script: |
    for i in {1..10}; do
      npx playwright test tests/checkout.spec.ts --retries=0
      echo "Run $i exit code: $?"
    done
  displayName: Flakiness characterisation

Common flakiness causes and fixes:
Race condition (element not ready):
// Bad — assumes element is immediately ready
await page.click('[data-testid="submit"]')
// Good — wait for element to be actionable
await page.waitForSelector('[data-testid="submit"]', { state: 'visible' })
await page.click('[data-testid="submit"]')

Shared test data (tests step on each other):
// Bad — tests share the same user ID
const userId = 'fixed-user-123'
// Good — each test creates unique data
const userId = crypto.randomUUID()

Network timing (API not finished before assertion):
// Bad — asserts before API response arrives
await page.click('[data-testid="save"]')
expect(await page.textContent('.status')).toBe('Saved')
// Good — wait for the network request to complete
await Promise.all([
page.waitForResponse(resp => resp.url().includes('/api/save')),
page.click('[data-testid="save"]')
])
expect(await page.textContent('.status')).toBe('Saved')

Step 5: Building a failure resolution culture
Target metrics:
- Mean time to classify failure: < 30 minutes
- Mean time to resolve test failure: < 4 hours
- Flaky test count: < 3% of total suite
- Quarantined test resolution: < 5 working days
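The flaky-test budget above can be checked mechanically from recent outcome history. A sketch, assuming you have collected a pass/fail series per test (the data shape is an assumption):

```typescript
// Sketch: compute which tests are flaky (mixed outcomes) and what share
// of the suite they make up, for the < 3% budget above.
// The outcome-history shape is an assumption about your own tooling.
function flakeStats(outcomes: Record<string, boolean[]>): {
  flaky: string[];
  flakyShare: number;
} {
  const flaky: string[] = [];
  for (const [test, runs] of Object.entries(outcomes)) {
    const passes = runs.filter(Boolean).length;
    // Flaky = mixed outcomes: some passes AND some failures.
    // A test that always fails is broken, not flaky.
    if (passes > 0 && passes < runs.length) flaky.push(test);
  }
  const total = Object.keys(outcomes).length;
  return { flaky, flakyShare: total === 0 ? 0 : flaky.length / total };
}

const history: Record<string, boolean[]> = {
  checkout: [true, false, true, true],  // intermittent -> flaky
  login: [true, true, true, true],      // stable
  search: [false, false, false, false], // always fails -> broken, not flaky
};
console.log(flakeStats(history));
```

Note that `search` is excluded: this mirrors how the analytics flakiness detection works, which only flags tests with at least two distinct outcomes.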
Create a test health backlog:
In Azure Boards, create a dedicated area for test maintenance:
Test Health Backlog
├── Quarantined tests (priority: High — must fix within 5 days)
├── Slow tests (priority: Medium — optimise when sprint allows)
└── Outdated test cases (priority: Low — review quarterly)
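Backlog items like these can be created programmatically through the Azure DevOps work item REST API, which accepts a JSON Patch body (content type `application/json-patch+json`). A sketch of building that body for a quarantined test; the title, tag, and description conventions are assumptions, only the `System.*` field paths are standard:

```typescript
// Sketch: build the JSON Patch body the Azure DevOps work item creation
// API expects. Title/tag wording here is a convention we made up;
// System.Title, System.Tags and System.Description are real fields.
interface PatchOp {
  op: "add";
  path: string;
  value: string;
}

function quarantineWorkItem(testName: string, pipeline: string): PatchOp[] {
  return [
    {
      op: "add",
      path: "/fields/System.Title",
      value: `Quarantined flaky test: ${testName}`,
    },
    {
      op: "add",
      path: "/fields/System.Tags",
      value: "test-health; quarantined",
    },
    {
      op: "add",
      path: "/fields/System.Description",
      value: `Flaky in pipeline ${pipeline}. Must be fixed within 5 working days.`,
    },
  ];
}

console.log(JSON.stringify(quarantineWorkItem("checkout.spec.ts", "web-e2e"), null, 2));
```

The body would be POSTed to the organisation's work item creation endpoint; wiring up authentication and the exact URL is left out here.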
Treat test health work with the same urgency as product bugs. A flaky test suite erodes trust in 3–6 months even if the product quality is high.
Common errors and fixes
Error: Test analytics shows data for wrong pipeline
Fix: Ensure PublishTestResults uses testRunTitle that includes the pipeline name. The analytics tab shows data per pipeline definition — verify you're viewing the correct pipeline.
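A publish step with the pipeline name embedded in the run title might look like this (a sketch; the results path is a placeholder for your own layout):

```yaml
- task: PublishTestResults@2
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/test-results/*.xml'
    # Embed the pipeline name so analytics attributes results correctly
    testRunTitle: '$(Build.DefinitionName) - E2E'
  # Publish results even when the test step failed
  condition: succeededOrFailed()
```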
Error: Traces are empty or corrupted
Fix: The trace zip file requires Playwright 1.30+ to view. Update both the test code and the local Playwright CLI: npm install -g playwright@latest.
Error: Flakiness analytics shows 0 flaky tests despite known intermittent failures
Fix: Flakiness detection requires at least 2 different outcomes (pass and fail) across multiple runs. If a test always fails, it's not flagged as flaky — it's broken. Check the failure mode carefully.