
How to Analyze Test Failures in Azure DevOps

A systematic guide to analyzing test failures in Azure DevOps. Covers pipeline test analytics, flakiness detection, failure categorisation, root cause investigation, and building a culture of fast test failure resolution.

InnovateBits · 5 min read

A test failure is not the end of the story — it's the beginning of an investigation. How quickly and accurately you diagnose failures determines whether your test suite is a trusted quality signal or background noise that everyone ignores.


The failure analysis workflow

1. Detect     → Pipeline fails → notification to QA team
2. Classify   → Bug? Environment? Flakiness? Test code?
3. Investigate → Use pipeline logs, screenshots, traces
4. Resolve    → Fix code, fix test, or quarantine flaky test
5. Verify     → Confirm fix — pipeline passes consistently
6. Learn      → Update runbook, add monitoring if needed

Step 1: Using Azure DevOps test analytics

Go to Pipelines → [Pipeline] → Analytics → Tests.

Key views:

Top failing tests — sorted by failure count. Tests appearing here consistently are either genuinely broken or reliably flaky. Both need immediate attention.

Test flakiness — tests flagged as flaky (pass sometimes, fail other times). Sort by failure rate. Tests with 10–50% failure rate are most likely timing issues.

Slowest tests — tests taking > 30 seconds are candidates for optimisation. Slow tests often become flaky when the environment is under load.

New failures — tests that started failing after a specific build. Correlate with commit history to identify the regression commit.
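The same data behind these views can be pulled programmatically, for example to export it or cross-check a dashboard. A minimal sketch using the Azure DevOps Test Runs REST API; the organisation and project names are placeholders, and the personal access token (PAT) needs at least Test Management (read) scope:

```typescript
// Build the Test Runs list endpoint for a given organisation/project.
function testRunsUrl(org: string, project: string): string {
  return `https://dev.azure.com/${org}/${project}/_apis/test/runs?api-version=7.0`
}

// Fetch recent test runs, authenticating with a PAT via Basic auth
// (empty username, PAT as password — the standard Azure DevOps pattern).
async function fetchTestRuns(org: string, project: string, pat: string) {
  const auth = Buffer.from(`:${pat}`).toString('base64')
  const response = await fetch(testRunsUrl(org, project), {
    headers: { Authorization: `Basic ${auth}` },
  })
  if (!response.ok) throw new Error(`Azure DevOps API returned ${response.status}`)
  return response.json()
}
```

Each run in the response carries pass/fail counts, which is enough to rebuild the "top failing" and "new failures" views outside the portal.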


Step 2: Classifying the failure

Open a failing test in the Tests tab. Read the error message and classify:

Error pattern                              | Classification                      | Action
AssertionError: Expected 'X' but got 'Y'   | Product bug or test data issue      | Investigate the business logic
TimeoutError: exceeded 30000ms             | Timing/environment issue            | Check environment, add wait
Error: net::ERR_CONNECTION_REFUSED         | Environment down                    | Check staging status
ElementNotFoundError                       | Selector changed or race condition  | Update selector or add wait
401 Unauthorized                           | Auth credentials expired            | Rotate test credentials
Fails then passes on re-run                | Flaky test                          | Quarantine and investigate
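The pattern-to-classification mapping above can be encoded as a first-pass triage script that tags failures before a human looks at them. A sketch; the category names are illustrative, not part of any Azure DevOps API:

```typescript
// First-pass failure triage based on the error-pattern table above.
type FailureClass =
  | 'product-bug'
  | 'timing'
  | 'environment'
  | 'selector'
  | 'auth'
  | 'unknown'

// Ordered rules: first matching pattern wins.
const rules: Array<[RegExp, FailureClass]> = [
  [/AssertionError/, 'product-bug'],
  [/TimeoutError/, 'timing'],
  [/ERR_CONNECTION_REFUSED/, 'environment'],
  [/ElementNotFoundError/, 'selector'],
  [/401 Unauthorized/, 'auth'],
]

function classifyFailure(message: string): FailureClass {
  for (const [pattern, cls] of rules) {
    if (pattern.test(message)) return cls
  }
  // No pattern matched: needs a human, and a re-run to rule out flakiness.
  return 'unknown'
}
```

Flakiness is deliberately absent from the rules: it is diagnosed from re-run history, not from a single error message.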

Step 3: Investigating with traces and logs

Playwright trace viewer

For Playwright tests configured with trace: 'on-first-retry':

  1. Download the trace artifact from the pipeline run
  2. Run: npx playwright show-trace trace.zip
  3. The trace shows: every action, network request, console error, DOM snapshot at each step
// playwright.config.ts — enable traces for CI
import { defineConfig } from '@playwright/test'

export default defineConfig({
  use: {
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'on-first-retry',
  },
})

Pipeline logs

For any pipeline step:

  1. Click the failed step in the pipeline run
  2. The full stdout/stderr is shown
  3. Look for the first error line — subsequent errors are often cascades from the first
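Finding that first error line can be automated when logs run to thousands of lines. A small sketch; the keyword list is a heuristic assumption, not an Azure Pipelines convention:

```typescript
// Scan a pipeline log and return the first line that looks like an
// error — later errors are usually cascades from this one.
function firstErrorLine(log: string): string | undefined {
  return log
    .split('\n')
    .find(line => /\b(error|failed|exception)\b/i.test(line))
}
```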

Add verbose logging for critical test steps:

// Log API responses on failure
test('Checkout completes', async ({ page, request }) => {
  const response = await request.post('/api/checkout', { data: checkoutData })
  
  if (!response.ok()) {
    console.error('Checkout API failed:', {
      status: response.status(),
      body: await response.text(),
      headers: Object.fromEntries(response.headers())
    })
  }
  
  expect(response.status()).toBe(200)
})

Step 4: Flakiness deep-dive

For a test that fails intermittently:

# Run the test 10 times to characterise the flakiness rate
- script: |
    for i in {1..10}; do
      npx playwright test tests/checkout.spec.ts --retries=0
      echo "Run $i exit code: $?"
    done
  displayName: Flakiness characterisation
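The exit codes collected by that loop can then be summarised to decide whether the test is stable, flaky, or outright broken. A minimal sketch:

```typescript
// Summarise pass/fail outcomes from repeated runs. "Flaky" requires
// BOTH outcomes to have been observed; always-failing is broken.
interface FlakinessReport {
  failureRate: number
  verdict: 'stable' | 'flaky' | 'broken'
}

function characteriseFlakiness(passed: boolean[]): FlakinessReport {
  const failures = passed.filter(p => !p).length
  const failureRate = failures / passed.length
  const verdict =
    failures === 0 ? 'stable'
    : failures === passed.length ? 'broken'
    : 'flaky'
  return { failureRate, verdict }
}
```

This mirrors how the analytics tab reasons about flakiness, which is why an always-failing test never shows up in the flaky list.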

Common flakiness causes and fixes:

Race condition (element not ready):

// Bad — assumes element is immediately ready
await page.click('[data-testid="submit"]')
 
// Good — wait for element to be actionable
await page.waitForSelector('[data-testid="submit"]', { state: 'visible' })
await page.click('[data-testid="submit"]')

Shared test data (tests step on each other):

// Bad — tests share the same user ID
const userId = 'fixed-user-123'
 
// Good — each test creates unique data
const userId = crypto.randomUUID()

Network timing (API not finished before assertion):

// Bad — asserts before API response arrives
await page.click('[data-testid="save"]')
expect(await page.textContent('.status')).toBe('Saved')
 
// Good — wait for the network request to complete
await Promise.all([
  page.waitForResponse(resp => resp.url().includes('/api/save')),
  page.click('[data-testid="save"]')
])
expect(await page.textContent('.status')).toBe('Saved')

Step 5: Building a failure resolution culture

Target metrics:

  • Mean time to classify failure: < 30 minutes
  • Mean time to resolve test failure: < 4 hours
  • Flaky test count: < 3% of total suite
  • Quarantined test resolution: < 5 working days
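If you record a timestamp when each failure is detected, classified, and resolved, the first two target metrics fall out of simple arithmetic. A sketch with hypothetical field names (this is not an Azure Boards schema):

```typescript
// Compute mean-time metrics from failure records. Field names are
// hypothetical — adapt them to however your team logs failures.
interface FailureRecord {
  detectedAt: Date
  classifiedAt: Date
  resolvedAt: Date
}

const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length

function meanTimeToClassifyMinutes(records: FailureRecord[]): number {
  return mean(records.map(r => (r.classifiedAt.getTime() - r.detectedAt.getTime()) / 60_000))
}

function meanTimeToResolveHours(records: FailureRecord[]): number {
  return mean(records.map(r => (r.resolvedAt.getTime() - r.detectedAt.getTime()) / 3_600_000))
}
```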

Create a test health backlog:

In Azure Boards, create a dedicated area for test maintenance:

Test Health Backlog
  ├── Quarantined tests (priority: High — must fix within 5 days)
  ├── Slow tests (priority: Medium — optimise when sprint allows)
  └── Outdated test cases (priority: Low — review quarterly)

Treat test health work with the same urgency as product bugs: a flaky test suite erodes trust within 3–6 months, even when product quality is high.


Common errors and fixes

Error: Test analytics shows data for the wrong pipeline
Fix: Ensure PublishTestResults uses a testRunTitle that includes the pipeline name. The analytics tab shows data per pipeline definition, so verify you're viewing the correct pipeline.

Error: Traces are empty or corrupted
Fix: The trace zip file requires Playwright 1.30+ to view. Update both the test code and the local Playwright CLI: npm install -g playwright@latest.

Error: Flakiness analytics shows 0 flaky tests despite known intermittent failures
Fix: Flakiness detection requires at least two different outcomes (pass and fail) across multiple runs. If a test always fails, it isn't flagged as flaky: it's broken. Check the failure mode carefully.
