Debug Failed Tests in Azure DevOps Pipelines
A systematic guide to debugging failed tests in Azure DevOps pipelines. Learn to diagnose environment issues, flaky tests, and authentication failures.
A pipeline failure at 2 AM blocks the morning deployment. Effective debugging is the skill that separates a QA engineer who resolves issues quickly from one who raises a ticket and waits. This guide gives you a systematic debugging process for pipeline test failures.
The debugging decision tree
Test fails in pipeline
          │
          ▼
Does it fail locally?
   │             │
  YES            NO
   │             │
   ▼             ▼
Code bug    Environment difference?
               │             │
              YES            NO
               │             │
               ▼             ▼
         Config/creds   Timing/resources
         difference     issue (flakiness)
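The same tree can be read as a small function. This is only an illustrative sketch: the `Observation` shape and the category names are assumptions introduced here, not part of any Azure DevOps or Playwright API.

```typescript
// Hypothetical types for illustration; not part of any real API.
type Diagnosis = 'code-bug' | 'config-or-credentials' | 'timing-or-resources'

interface Observation {
  failsLocally: boolean       // Does the test also fail on your machine?
  environmentDiffers: boolean // Do config, URLs, or credentials differ from local?
}

// Walk the decision tree from top to bottom.
function diagnose(obs: Observation): Diagnosis {
  if (obs.failsLocally) return 'code-bug'
  if (obs.environmentDiffers) return 'config-or-credentials'
  return 'timing-or-resources'
}

console.log(diagnose({ failsLocally: false, environmentDiffers: true }))
// prints "config-or-credentials"
```

The order of the checks matters: a test that fails locally is treated as a code bug before any environment question is asked, which mirrors the tree.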
Step 1: Read the failure message
Go to Pipelines → [Run] → [Job] → [Failed step].
Each step shows its full stdout/stderr. The failure message is usually near the bottom:
Error: expect(received).toBe(expected)
Expected: "Welcome, Alice"
Received: "Please sign in"
at LoginTest (tests/auth.spec.ts:34:5)
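This tells you the test expected a logged-in page but got the sign-in prompt, so the login did not take effect in the pipeline. If the same test passes locally, the likely cause is that the test user credentials don't work in the pipeline environment.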
Step 2: Classify the failure type
| Symptom | Likely cause |
|---|---|
| Cannot connect to... / ECONNREFUSED | Wrong URL, environment down, VPN required |
| 401 Unauthorized / 403 Forbidden | Wrong credentials, expired token |
| Timeout of 30000ms exceeded | Slow environment, element not appearing, race condition |
| Element not found / not visible | Selector changed, feature flag off, race condition |
| Expected X, received Y | Data state differs from expected, logic bug |
| Cannot read properties of undefined | API returned different structure than expected |
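If you triage many failures, the table can be turned into a first-pass classifier. The sketch below is a rough heuristic under stated assumptions: `classifyFailure` and the regular expressions are hypothetical, and real log messages may need different patterns.

```typescript
// Hypothetical helper: map a failure message to the likely causes from the table.
interface Rule {
  pattern: RegExp
  causes: string
}

// Rules are checked in order; the first matching pattern wins.
const rules: Rule[] = [
  { pattern: /cannot connect to|ECONNREFUSED/i, causes: 'Wrong URL, environment down, VPN required' },
  { pattern: /\b401\b|\b403\b|unauthorized|forbidden/i, causes: 'Wrong credentials, expired token' },
  { pattern: /timeout of \d+ms exceeded/i, causes: 'Slow environment, element not appearing, race condition' },
  { pattern: /element not found|not visible/i, causes: 'Selector changed, feature flag off, race condition' },
  { pattern: /cannot read properties of undefined/i, causes: 'API returned different structure than expected' },
  // Assertion mismatches are checked last because they are the most generic.
  { pattern: /expected:?[\s\S]*received:?/i, causes: 'Data state differs from expected, logic bug' },
]

function classifyFailure(message: string): string {
  const match = rules.find((rule) => rule.pattern.test(message))
  return match ? match.causes : 'Unclassified: read the full log'
}

console.log(classifyFailure('connect ECONNREFUSED 127.0.0.1:3000'))
// prints "Wrong URL, environment down, VPN required"
```

Treat the result as a starting hypothesis, not a verdict: a timeout, for example, can also be a downstream symptom of an authentication failure.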
Step 3: Compare local vs pipeline environment
Most pipeline failures come from environment differences. Checklist:
☐ Is BASE_URL set correctly in the pipeline? (check the variable)
☐ Is the test database in the expected state?
☐ Are test user accounts created and active in staging?
☐ Is the feature being tested deployed to staging?
☐ Is there a feature flag that's off in staging but on locally?
☐ Does staging have a different config than local (e.g., different timeout)?
☐ Are SSL certificates valid on staging? (curl's --insecure flag or Playwright's ignoreHTTPSErrors option may be needed)
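Part of this checklist can be automated so the run fails fast with a readable message. The sketch below is a minimal example, assuming the variable names used elsewhere in this guide (`BASE_URL`, `TEST_EMAIL`, `TEST_PASSWORD`); `checkEnvironment` is a hypothetical helper, and it only covers the variable checks, not database state or feature flags.

```typescript
// Variable names taken from this guide; adjust them to your project.
const REQUIRED_VARS = ['BASE_URL', 'TEST_EMAIL', 'TEST_PASSWORD']

// Return a list of human-readable problems; an empty list means the checks passed.
function checkEnvironment(env: Record<string, string | undefined>): string[] {
  const problems: string[] = []
  for (const name of REQUIRED_VARS) {
    if (!env[name]) problems.push(`${name} is not set`)
  }
  const baseUrl = env['BASE_URL']
  if (baseUrl && !/^https?:\/\//.test(baseUrl)) {
    problems.push('BASE_URL must start with http:// or https://')
  }
  return problems
}

console.log(checkEnvironment({ BASE_URL: 'staging.example.com' }))
```

You could call `checkEnvironment(process.env)` at the top of a global setup file and throw if the list is not empty. Keep in mind that secret pipeline variables in Azure DevOps are not exposed to scripts as environment variables unless you map them in the step's `env` block, which is a common reason for a variable being unset only in the pipeline.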
Step 4: Add diagnostic logging
When the error message isn't clear, add temporary diagnostic steps:
```yaml
# Add before the failing test step
- script: |
    echo "=== Environment Debug ==="
    echo "BASE_URL: $(BASE_URL)"
    echo "Node version: $(node --version)"
    echo "NPM version: $(npm --version)"
    curl -v $(BASE_URL)/health || echo "Health check failed"
  displayName: Debug environment
```
For Playwright, enable verbose tracing:
```typescript
// playwright.config.ts: enable for debugging
import { defineConfig } from '@playwright/test'

export default defineConfig({
  use: {
    trace: 'on', // Capture for every test (expensive but thorough)
    screenshot: 'on',
    video: 'on',
  },
})
```
Download the trace artifact and open it locally:
```bash
npx playwright show-trace trace.zip
```
Step 5: Identify flaky tests
Flaky tests fail intermittently without code changes. Signs of flakiness:
- Test fails in pipeline, passes when you re-run without code changes
- Test fails on one shard but passes on others
- Test fails at night (scheduled run) but passes in PR pipeline
Quarantine flaky tests immediately — they destroy trust in the suite:
```typescript
import { test } from '@playwright/test'

// Quarantine while investigating: test.fixme() skips the test and flags it as needing a fix
test.fixme('TC-204: Wishlist limit — needs investigation', async ({ page }) => {
  // ...
})
```
```yaml
# Pipeline: add --retries to catch flakiness
- script: npx playwright test --retries=3
```
Track retry statistics to identify patterns. Tests that need 3 retries every run have a systemic issue (race condition, timing dependency).
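One way to track retry statistics is to aggregate how often each test needed a retry across runs. The sketch below assumes you have already collected per-run records in some form; the `TestRun` shape and `retryStats` are hypothetical and not a Playwright or Azure DevOps type.

```typescript
// Hypothetical record of one test in one pipeline run.
interface TestRun {
  title: string
  retriesUsed: number // 0 means the test passed on the first attempt
}

interface RetryStats {
  runs: number
  runsWithRetry: number
  retryRate: number // share of runs that needed at least one retry
}

// Aggregate retries per test title across many runs.
function retryStats(history: TestRun[]): Map<string, RetryStats> {
  const stats = new Map<string, RetryStats>()
  for (const run of history) {
    const entry = stats.get(run.title) ?? { runs: 0, runsWithRetry: 0, retryRate: 0 }
    entry.runs += 1
    if (run.retriesUsed > 0) entry.runsWithRetry += 1
    entry.retryRate = entry.runsWithRetry / entry.runs
    stats.set(run.title, entry)
  }
  return stats
}

const stats = retryStats([
  { title: 'TC-204', retriesUsed: 3 },
  { title: 'TC-204', retriesUsed: 3 },
  { title: 'TC-101', retriesUsed: 0 },
  { title: 'TC-101', retriesUsed: 1 },
])
console.log(stats.get('TC-204')) // retryRate is 1: it needed retries in every run
```

A retry rate near 1 points to a systemic issue, while a low but non-zero rate suggests occasional timing or resource contention.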
Step 6: Debug authentication failures
The most common pipeline-specific failure: tests pass locally because you're already logged in; in CI, the session starts fresh.
```typescript
// Create a reusable auth state
// setup/auth.ts
import { chromium } from '@playwright/test'

async function globalSetup() {
  const browser = await chromium.launch()
  const page = await browser.newPage()

  await page.goto(process.env.BASE_URL + '/login')
  await page.fill('[name="email"]', process.env.TEST_EMAIL!)
  await page.fill('[name="password"]', process.env.TEST_PASSWORD!)
  await page.click('[type="submit"]')
  await page.waitForURL('**/dashboard')

  // Save auth state
  await page.context().storageState({ path: 'auth-state.json' })
  await browser.close()
}

export default globalSetup
```
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test'

export default defineConfig({
  globalSetup: './setup/auth.ts',
  use: {
    storageState: 'auth-state.json', // Reuse in all tests
  },
})
```
Step 7: Use re-run diagnostics
Azure DevOps shows run history per test case:
- Go to pipeline run → Tests tab
- Click a failed test → History tab
- See: how many times this test has failed in the last N runs
A test that fails about 1 run in 10 is flaky. A test that fails consistently after a specific commit points to a regression introduced by that commit.
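The distinction between flaky and regressed can be expressed as a simple rule over a test's pass/fail history. This is a heuristic sketch, not something Azure DevOps computes for you: `classifyHistory` and the two-failure threshold are assumptions chosen for illustration.

```typescript
type Verdict = 'stable' | 'flaky' | 'regression'

// results: outcomes for one test, oldest first; true = passed, false = failed.
function classifyHistory(results: boolean[]): Verdict {
  const firstFailure = results.indexOf(false)
  if (firstFailure === -1) return 'stable'

  // Did every run from the first failure onward fail?
  const failsFromThenOn = results.slice(firstFailure).every((passed) => !passed)

  // Assumption: require at least two consecutive failures before calling it
  // a regression; a single trailing failure is treated as flaky until re-run.
  if (failsFromThenOn && results.length - firstFailure >= 2) return 'regression'
  return 'flaky'
}

console.log(classifyHistory([true, true, true, false, false, false])) // prints "regression"
console.log(classifyHistory([true, false, true, true, true, true])) // prints "flaky"
```

For a `regression` verdict, the commit that triggered the first failing run is the natural place to start looking.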
Common errors and fixes
Error: Screenshot not captured for failed tests
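Fix: Screenshots for failed tests are captured only if the screenshot option is set in playwright.config.ts (for example screenshot: 'only-on-failure'), and they reach you only if the step that publishes test artifacts runs with condition: always(). Without that condition, the publish step is skipped when the test step fails.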
Error: Trace files are too large to download
Fix: Use trace: 'on-first-retry' instead of trace: 'on'. This captures a trace only when a failed test is retried for the first time, not for every test, so it requires retries to be enabled.
Error: Tests time out on slow pipeline agents
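Fix: Increase timeouts for CI: timeout: process.env.CI ? 60000 : 30000. Microsoft-hosted agents can be slower than local machines, especially for I/O-heavy operations. If CI is not already set in your pipeline, define it in the step's env block (Azure Pipelines itself sets TF_BUILD).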
Error: Can't reproduce pipeline failure locally
Fix: Use Docker to match the pipeline environment: docker run --rm -v $(pwd):/work -w /work mcr.microsoft.com/playwright:v1.45.0-jammy npx playwright test. Match the image tag to the Playwright version your project uses. If the pipeline runs in the same image, this gives you the same browser versions as the pipeline.
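The screenshot, trace, and timeout fixes above can be gathered into one place. The sketch below is one possible arrangement, not a required setup: `settingsFor` is a hypothetical helper, the retry count is an assumption, and the returned object mirrors only a small subset of Playwright's config options.

```typescript
// Hypothetical helper; the shape mirrors a subset of Playwright config options.
interface DebugSettings {
  timeout: number
  retries: number
  use: { trace: 'on-first-retry'; screenshot: 'only-on-failure' }
}

function settingsFor(env: Record<string, string | undefined>): DebugSettings {
  // Any non-empty CI value counts as CI, matching `process.env.CI ? ... : ...`.
  const isCI = Boolean(env['CI'])
  return {
    timeout: isCI ? 60000 : 30000, // timeout values from the fix above
    retries: isCI ? 1 : 0, // assumption: one retry so 'on-first-retry' can record a trace
    use: { trace: 'on-first-retry', screenshot: 'only-on-failure' },
  }
}

console.log(settingsFor({ CI: 'true' }).timeout) // prints 60000
```

In playwright.config.ts this could be used as `export default defineConfig(settingsFor(process.env))`. Keeping the logic in a plain function makes it easy to check what the pipeline will actually run with.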