How to Analyze Test Failures in Azure DevOps
A systematic guide to analyzing test failures in Azure DevOps. Covers pipeline test analytics, flakiness detection, failure categorisation, root cause investigation, and building a culture of fast test failure resolution.
A test failure is not the end of the story — it's the beginning of an investigation. How quickly and accurately you diagnose failures determines whether your test suite is a trusted quality signal or background noise that everyone ignores.
The failure analysis workflow
1. Detect → Pipeline fails → notification to QA team
2. Classify → Bug? Environment? Flakiness? Test code?
3. Investigate → Use pipeline logs, screenshots, traces
4. Resolve → Fix code, fix test, or quarantine flaky test
5. Verify → Confirm fix — pipeline passes consistently
6. Learn → Update runbook, add monitoring if needed
Step 1: Using Azure DevOps test analytics
Go to Pipelines → [Pipeline] → Analytics → Tests.
Key views:
Top failing tests — sorted by failure count. Tests appearing here consistently are either genuinely broken or reliably flaky. Both need immediate attention.
Test flakiness — tests flagged as flaky (pass sometimes, fail other times). Sort by failure rate. Tests with 10–50% failure rate are most likely timing issues.
Slowest tests — tests taking > 30 seconds are candidates for optimisation. Slow tests often become flaky when the environment is under load.
New failures — tests that started failing after a specific build. Correlate with commit history to identify the regression commit.
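The "Top failing tests" view can also be reproduced from raw results, for example when exporting run data for custom reporting. A minimal sketch of the aggregation, assuming a simplified result shape (the `TestResult` interface here is an illustration, not the exact Azure DevOps REST API schema):

```typescript
// Sketch: aggregate per-test failure counts from exported run results.
// TestResult is a simplified assumption, not the Azure DevOps API schema.
interface TestResult {
  testName: string;
  outcome: "Passed" | "Failed";
}

function topFailingTests(
  results: TestResult[],
  limit = 5
): { testName: string; failures: number }[] {
  const counts = new Map<string, number>();
  for (const r of results) {
    if (r.outcome === "Failed") {
      counts.set(r.testName, (counts.get(r.testName) ?? 0) + 1);
    }
  }
  // Sort descending by failure count, mirroring the analytics view
  return [...counts.entries()]
    .map(([testName, failures]) => ({ testName, failures }))
    .sort((a, b) => b.failures - a.failures)
    .slice(0, limit);
}

// Example: checkout fails twice, login once
const sample: TestResult[] = [
  { testName: "checkout", outcome: "Failed" },
  { testName: "checkout", outcome: "Failed" },
  { testName: "login", outcome: "Failed" },
  { testName: "login", outcome: "Passed" },
];
console.log(topFailingTests(sample));
```

Tests that appear at the top of this list across many runs are the same ones the analytics view surfaces; the point is that "top failing" is just a failure count sorted descending.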
Step 2: Classifying the failure
Open a failing test in the Tests tab. Read the error message and classify:
| Error pattern | Classification | Action |
|---|---|---|
| AssertionError: Expected 'X' but got 'Y' | Product bug or test data issue | Investigate the business logic |
| TimeoutError: exceeded 30000ms | Timing/environment issue | Check environment, add wait |
| Error: net::ERR_CONNECTION_REFUSED | Environment down | Check staging status |
| ElementNotFoundError | Selector changed or race condition | Update selector or add wait |
| 401 Unauthorized | Auth credentials expired | Rotate test credentials |
| Fails then passes on re-run | Flaky test | Quarantine and investigate |
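The table above lends itself to a first-pass automatic triage before a human reads the log. A sketch of such a classifier; the category names and regex patterns are illustrative choices, not a standard:

```typescript
// Sketch: first-pass classification of a failure message using the
// patterns from the table above. Category names are illustrative.
type FailureClass =
  | "product-bug-or-data"
  | "timing-environment"
  | "environment-down"
  | "selector-or-race"
  | "auth-expired"
  | "unknown";

const rules: [RegExp, FailureClass][] = [
  [/AssertionError/, "product-bug-or-data"],
  [/TimeoutError/, "timing-environment"],
  [/net::ERR_CONNECTION_REFUSED/, "environment-down"],
  [/ElementNotFoundError/, "selector-or-race"],
  [/401 Unauthorized/, "auth-expired"],
];

function classifyFailure(message: string): FailureClass {
  for (const [pattern, cls] of rules) {
    if (pattern.test(message)) return cls;
  }
  return "unknown"; // no pattern matched: needs a human look
}

console.log(classifyFailure("TimeoutError: exceeded 30000ms")); // timing-environment
```

Re-run detection (the last table row) cannot be done from a single message; it needs outcome history across retries, which is covered in the flakiness step below.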
Step 3: Investigating with traces and logs
Playwright trace viewer
For Playwright tests configured with trace: 'on-first-retry':
- Download the trace artifact from the pipeline run
- Run:
npx playwright show-trace trace.zip
- The trace shows every action, network request, console error, and DOM snapshot at each step
// playwright.config.ts — enable traces for CI
use: {
trace: 'on-first-retry',
screenshot: 'only-on-failure',
video: 'on-first-retry',
}

Pipeline logs
For any pipeline step:
- Click the failed step in the pipeline run
- The full stdout/stderr is shown
- Look for the first error line — subsequent errors are often cascades from the first
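Scanning for that first error line is easy to script when logs are long. A minimal sketch; the error-matching pattern is a loose assumption and will need tuning per toolchain:

```typescript
// Sketch: locate the first error-looking line in a captured step log,
// since later errors are usually cascades. The regex is an assumption.
function firstErrorLine(
  log: string
): { lineNumber: number; text: string } | null {
  const lines = log.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (/error|failed/i.test(lines[i])) {
      // Report 1-based line numbers, matching what log viewers show
      return { lineNumber: i + 1, text: lines[i].trim() };
    }
  }
  return null;
}

const log = [
  "Step started",
  "Fetching dependencies",
  "Error: net::ERR_CONNECTION_REFUSED",
  "Error: retry 1 failed",
].join("\n");
console.log(firstErrorLine(log)); // { lineNumber: 3, text: "Error: net::ERR_CONNECTION_REFUSED" }
```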
Add verbose logging for critical test steps:
// Log API responses on failure
test('Checkout completes', async ({ page, request }) => {
const response = await request.post('/api/checkout', { data: checkoutData })
if (!response.ok()) {
console.error('Checkout API failed:', {
status: response.status(),
body: await response.text(),
headers: Object.fromEntries(response.headers())
})
}
expect(response.status()).toBe(200)
})

Step 4: Flakiness deep-dive
For a test that fails intermittently:
# Run the test 10 times to characterise the flakiness rate
- script: |
    for i in {1..10}; do
      npx playwright test tests/checkout.spec.ts --retries=0
      echo "Run $i exit code: $?"
    done
  displayName: Flakiness characterisation

Common flakiness causes and fixes:
Race condition (element not ready):
// Bad — assumes element is immediately ready
await page.click('[data-testid="submit"]')
// Good — wait for element to be actionable
await page.waitForSelector('[data-testid="submit"]', { state: 'visible' })
await page.click('[data-testid="submit"]')

Shared test data (tests step on each other):
// Bad — tests share the same user ID
const userId = 'fixed-user-123'
// Good — each test creates unique data
const userId = crypto.randomUUID()

Network timing (API not finished before assertion):
// Bad — asserts before API response arrives
await page.click('[data-testid="save"]')
expect(await page.textContent('.status')).toBe('Saved')
// Good — wait for the network request to complete
await Promise.all([
page.waitForResponse(resp => resp.url().includes('/api/save')),
page.click('[data-testid="save"]')
])
expect(await page.textContent('.status')).toBe('Saved')

Step 5: Building a failure resolution culture
Target metrics:
- Mean time to classify failure: < 30 minutes
- Mean time to resolve test failure: < 4 hours
- Flaky test count: < 3% of total suite
- Quarantined test resolution: < 5 working days
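The flaky-test budget above can be checked mechanically from recent outcome history. A sketch, assuming you have collected a pass/fail series per test (the data shape is an assumption):

```typescript
// Sketch: compute which tests are flaky (mixed outcomes) and what share
// of the suite they make up, for the < 3% budget above.
// The outcome-history shape is an assumption about your own tooling.
function flakeStats(outcomes: Record<string, boolean[]>): {
  flaky: string[];
  flakyShare: number;
} {
  const flaky: string[] = [];
  for (const [test, runs] of Object.entries(outcomes)) {
    const passes = runs.filter(Boolean).length;
    // Flaky = mixed outcomes: some passes AND some failures.
    // A test that always fails is broken, not flaky.
    if (passes > 0 && passes < runs.length) flaky.push(test);
  }
  const total = Object.keys(outcomes).length;
  return { flaky, flakyShare: total === 0 ? 0 : flaky.length / total };
}

const history: Record<string, boolean[]> = {
  checkout: [true, false, true, true],  // intermittent -> flaky
  login: [true, true, true, true],      // stable
  search: [false, false, false, false], // always fails -> broken, not flaky
};
console.log(flakeStats(history));
```

Note that `search` is excluded: this mirrors how the analytics flakiness detection works, which only flags tests with at least two distinct outcomes.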
Create a test health backlog:
In Azure Boards, create a dedicated area for test maintenance:
Test Health Backlog
├── Quarantined tests (priority: High — must fix within 5 days)
├── Slow tests (priority: Medium — optimise when sprint allows)
└── Outdated test cases (priority: Low — review quarterly)
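Backlog items like these can be created programmatically through the Azure DevOps work item REST API, which accepts a JSON Patch body (content type `application/json-patch+json`). A sketch of building that body for a quarantined test; the title, tag, and description conventions are assumptions, only the `System.*` field paths are standard:

```typescript
// Sketch: build the JSON Patch body the Azure DevOps work item creation
// API expects. Title/tag wording here is a convention we made up;
// System.Title, System.Tags and System.Description are real fields.
interface PatchOp {
  op: "add";
  path: string;
  value: string;
}

function quarantineWorkItem(testName: string, pipeline: string): PatchOp[] {
  return [
    {
      op: "add",
      path: "/fields/System.Title",
      value: `Quarantined flaky test: ${testName}`,
    },
    {
      op: "add",
      path: "/fields/System.Tags",
      value: "test-health; quarantined",
    },
    {
      op: "add",
      path: "/fields/System.Description",
      value: `Flaky in pipeline ${pipeline}. Must be fixed within 5 working days.`,
    },
  ];
}

console.log(JSON.stringify(quarantineWorkItem("checkout.spec.ts", "web-e2e"), null, 2));
```

The body would be POSTed to the organisation's work item creation endpoint; wiring up authentication and the exact URL is left out here.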
Treat test health work with the same urgency as product bugs. A flaky test suite erodes trust in 3–6 months even if the product quality is high.
Common errors and fixes
Error: Test analytics shows data for wrong pipeline
Fix: Ensure PublishTestResults uses testRunTitle that includes the pipeline name. The analytics tab shows data per pipeline definition — verify you're viewing the correct pipeline.
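A publish step with the pipeline name embedded in the run title might look like this (a sketch; the results path is a placeholder for your own layout):

```yaml
- task: PublishTestResults@2
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/test-results/*.xml'
    # Embed the pipeline name so analytics attributes results correctly
    testRunTitle: '$(Build.DefinitionName) - E2E'
  # Publish results even when the test step failed
  condition: succeededOrFailed()
```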
Error: Traces are empty or corrupted
Fix: The trace zip file requires Playwright 1.30+ to view. Update both the test code and the local Playwright CLI: npm install -g playwright@latest.
Error: Flakiness analytics shows 0 flaky tests despite known intermittent failures
Fix: Flakiness detection requires at least 2 different outcomes (pass and fail) across multiple runs. If a test always fails, it's not flagged as flaky — it's broken. Check the failure mode carefully.