Debug Failed Tests in Azure DevOps Pipelines

A systematic guide to debugging failed tests in Azure DevOps pipelines. Learn to diagnose environment issues, flaky tests, and authentication failures.

A pipeline failure at 2 AM blocks the morning deployment. Effective debugging is the skill that separates a QA engineer who resolves issues quickly from one who raises a ticket and waits. This guide gives you a systematic debugging process for pipeline test failures.

The debugging decision tree

Test fails in pipeline
         │
         ▼
Does it fail locally?
    │           │
   YES          NO
    │           │
    ▼           ▼
Code bug    Environment difference?
            │           │
          YES           NO
            │           │
            ▼           ▼
       Config/creds   Timing/resources
       difference     issue (flakiness)
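
The tree above can be written down as a small function, which is sometimes handy for triage notes or a bot comment. This is only a sketch: the names FailureObservation and diagnose are hypothetical, and the two booleans are answers you supply after checking by hand.

```typescript
// Hypothetical names for illustration; nothing here is an Azure DevOps or Playwright API.
type Diagnosis =
  | 'code bug'
  | 'config/credentials difference'
  | 'timing/resources issue (flakiness)'

interface FailureObservation {
  failsLocally: boolean        // Does the test also fail on your machine?
  environmentDiffers: boolean  // Is there a known local vs pipeline difference?
}

function diagnose(obs: FailureObservation): Diagnosis {
  // Left branch of the tree: a failure you can reproduce locally is a code bug.
  if (obs.failsLocally) return 'code bug'
  // Right branch: pipeline-only failures split on environment differences.
  return obs.environmentDiffers
    ? 'config/credentials difference'
    : 'timing/resources issue (flakiness)'
}

console.log(diagnose({ failsLocally: false, environmentDiffers: true }))
```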

Step 1: Read the failure message

Go to Pipelines → [Run] → [Job] → [Failed step].

Each step shows its full stdout/stderr. The failure message is usually near the bottom:

Error: expect(received).toBe(expected)
Expected: "Welcome, Alice"
Received: "Please sign in"

at LoginTest (tests/auth.spec.ts:34:5)

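This tells you the page still showed the sign-in prompt, so the login did not go through in the pipeline. If the same test passes locally, the test user's credentials probably don't work in the pipeline environment.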

Step 2: Classify the failure type

Symptom	Likely cause
`Cannot connect to...` / `ECONNREFUSED`	Wrong URL, environment down, VPN required
`401 Unauthorized` / `403 Forbidden`	Wrong credentials, expired token
`Timeout of 30000ms exceeded`	Slow environment, element not appearing, race condition
`Element not found` / `not visible`	Selector changed, feature flag off, race condition
`Expected X, received Y`	Data state differs from expected, logic bug
`Cannot read properties of undefined`	API returned different structure than expected
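
If you triage many failures, the table above can be encoded as a first-pass classifier. The sketch below is an assumption about how you might do that: classifyFailure and the regular expressions are hypothetical, cover only the symptoms listed in the table, and will need tuning for your own log output.

```typescript
// Hypothetical helper: the patterns mirror the table and are a starting point only.
interface FailureRule {
  pattern: RegExp
  likelyCause: string
}

// Specific patterns come first; the generic "expected/received" rule is last.
const rules: FailureRule[] = [
  { pattern: /Cannot connect to|ECONNREFUSED/i, likelyCause: 'Wrong URL, environment down, VPN required' },
  { pattern: /401 Unauthorized|403 Forbidden/i, likelyCause: 'Wrong credentials, expired token' },
  { pattern: /Timeout of \d+ms exceeded/i, likelyCause: 'Slow environment, element not appearing, race condition' },
  { pattern: /Element not found|not visible/i, likelyCause: 'Selector changed, feature flag off, race condition' },
  { pattern: /Cannot read properties of undefined/i, likelyCause: 'API returned different structure than expected' },
  { pattern: /Expected[\s\S]*Received/i, likelyCause: 'Data state differs from expected, logic bug' },
]

function classifyFailure(message: string): string {
  const match = rules.find((rule) => rule.pattern.test(message))
  return match ? match.likelyCause : 'Unclassified: read the full log'
}

console.log(classifyFailure('Error: connect ECONNREFUSED 127.0.0.1:443'))
```

Treat the result as a hint for where to look first, not as a verdict.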

Step 3: Compare local vs pipeline environment

Most pipeline failures come from environment differences. Checklist:

☐ Is BASE_URL set correctly in the pipeline? (check the variable)
☐ Is the test database in the expected state?
☐ Are test user accounts created and active in staging?
☐ Is the feature being tested deployed to staging?
☐ Is there a feature flag that's off in staging but on locally?
☐ Does staging have a different config than local (e.g., different timeout)?
☐ Are SSL certificates valid on staging? (curl's --insecure flag or Playwright's ignoreHTTPSErrors option may be needed)
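
The variable-related items in this checklist can be checked automatically before the tests start. The following is a minimal sketch under stated assumptions: missingEnvVars is a hypothetical helper, the variable names are the ones used elsewhere in this guide, and the staging URL is a placeholder. It also treats a value that still looks like $(NAME) as missing, because Azure Pipelines leaves a macro as literal text when no variable with that name is defined.

```typescript
// Hypothetical helper; adjust the list of required names to your project.
type Env = Record<string, string | undefined>

function missingEnvVars(required: string[], env: Env): string[] {
  return required.filter((name) => {
    const value = (env[name] ?? '').trim()
    // Unset, blank, or an unresolved $(NAME) macro all count as missing.
    return value === '' || /^\$\(.+\)$/.test(value)
  })
}

const required = ['BASE_URL', 'TEST_EMAIL', 'TEST_PASSWORD']

// In a real check you would pass process.env; a literal keeps the sketch self-contained.
const sampleEnv: Env = {
  BASE_URL: 'https://staging.example.com', // placeholder URL
  TEST_EMAIL: '$(TEST_EMAIL)',             // macro that was never resolved
}

console.log(missingEnvVars(required, sampleEnv))
```

Keep in mind that secret pipeline variables are not exposed to scripts automatically; they have to be mapped in the step's env block, which is a common reason for a variable to be set in the UI yet missing at run time.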

Step 4: Add diagnostic logging

When the error message isn't clear, add temporary diagnostic steps:

YAML
# Add before the failing test step
- script: |
    echo "=== Environment Debug ==="
    echo "BASE_URL: $(BASE_URL)"
    echo "Node version: $(node --version)"
    echo "NPM version: $(npm --version)"
    curl -v $(BASE_URL)/health || echo "Health check failed"
  displayName: Debug environment

For Playwright, enable verbose tracing:

TYPESCRIPT
// playwright.config.ts: enable for debugging
import { defineConfig } from '@playwright/test'

export default defineConfig({
  use: {
    trace: 'on',           // Capture for every test (expensive but thorough)
    screenshot: 'on',
    video: 'on',
  },
})
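
Because capturing everything for every test is expensive, one option is to switch these settings from an environment variable so that normal runs stay cheap. This is a sketch, not a Playwright feature: artifactSettings and the DEBUG_TRACE variable are hypothetical names, while the option values themselves are standard Playwright settings.

```typescript
// Hypothetical helper: DEBUG_TRACE is an assumed variable name, not a Playwright setting.
type ArtifactSettings = {
  trace: 'on' | 'on-first-retry'
  screenshot: 'on' | 'only-on-failure'
  video: 'on' | 'retain-on-failure'
}

function artifactSettings(debug: boolean): ArtifactSettings {
  return debug
    ? { trace: 'on', screenshot: 'on', video: 'on' }
    : { trace: 'on-first-retry', screenshot: 'only-on-failure', video: 'retain-on-failure' }
}

// In playwright.config.ts this could be spread into `use`:
//   use: { ...artifactSettings(process.env.DEBUG_TRACE === '1') }
console.log(artifactSettings(true))
```

You would then set DEBUG_TRACE to 1 only on the run you are investigating.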

Download the trace artifact and open it locally:

BASH
npx playwright show-trace trace.zip

Step 5: Identify flaky tests

Flaky tests fail intermittently without code changes. Signs of flakiness:

Test fails in pipeline, passes when you re-run without code changes
Test fails on one shard but passes on others
Test fails at night (scheduled run) but passes in PR pipeline

Quarantine flaky tests immediately — they destroy trust in the suite:

TYPESCRIPT
// Quarantine while investigating: test.fixme skips the test and flags it as needing a fix
import { test } from '@playwright/test'

test.fixme('TC-204: Wishlist limit (needs investigation)', async ({ page }) => {
  // ...
})

YAML
# Pipeline: add --retries to catch flakiness
- script: npx playwright test --retries=3

Track retry statistics to identify patterns. Tests that need 3 retries every run have a systemic issue (race condition, timing dependency).
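
One way to track retry statistics is to aggregate per-test records across runs. The sketch below assumes you already export one record per test per pipeline run; the TestRunRecord shape and the retryStats function are hypothetical and not the format of any Playwright reporter.

```typescript
// Hypothetical record shape: one entry per test per pipeline run.
interface TestRunRecord {
  title: string
  retriesUsed: number // 0 means the test passed on the first attempt
}

interface RetryStats {
  runs: number
  runsNeedingRetry: number
  retryRate: number // share of runs that needed at least one retry
}

function retryStats(records: TestRunRecord[]): Map<string, RetryStats> {
  const stats = new Map<string, RetryStats>()
  for (const record of records) {
    const entry = stats.get(record.title) ?? { runs: 0, runsNeedingRetry: 0, retryRate: 0 }
    entry.runs += 1
    if (record.retriesUsed > 0) entry.runsNeedingRetry += 1
    entry.retryRate = entry.runsNeedingRetry / entry.runs
    stats.set(record.title, entry)
  }
  return stats
}

const sampleRecords: TestRunRecord[] = [
  { title: 'TC-204: Wishlist limit', retriesUsed: 3 },
  { title: 'TC-204: Wishlist limit', retriesUsed: 3 },
  { title: 'TC-101: Login', retriesUsed: 0 },
  { title: 'TC-101: Login', retriesUsed: 1 },
]

console.log(retryStats(sampleRecords))
```

A retry rate near 1 suggests a systemic issue, while a low but nonzero rate is more typical of occasional timing problems.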

Step 6: Debug authentication failures

The most common pipeline-specific failure: tests pass locally because you're already logged in; in CI, the session starts fresh.

TYPESCRIPT
// setup/auth.ts: log in once and save a reusable auth state
import { chromium } from '@playwright/test'

async function globalSetup() {
  const browser = await chromium.launch()
  const page = await browser.newPage()

  // Credentials come from pipeline variables mapped into the environment
  await page.goto(`${process.env.BASE_URL}/login`)
  await page.fill('[name="email"]', process.env.TEST_EMAIL!)
  await page.fill('[name="password"]', process.env.TEST_PASSWORD!)
  await page.click('[type="submit"]')
  await page.waitForURL('**/dashboard')

  // Save cookies and local storage so tests start already signed in
  await page.context().storageState({ path: 'auth-state.json' })
  await browser.close()
}

export default globalSetup

TYPESCRIPT
// playwright.config.ts
import { defineConfig } from '@playwright/test'

export default defineConfig({
  globalSetup: './setup/auth.ts',
  use: {
    storageState: 'auth-state.json', // Reuse in all tests
  },
})

Step 7: Use re-run diagnostics

Azure DevOps shows run history per test case:

Go to pipeline run → Tests tab
Click a failed test → History tab
See: how many times this test has failed in the last N runs

A test that fails 1/10 times is flaky. A test that fails consistently after a specific commit points to a regression introduced by that commit.
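
The distinction between a flaky test and a regression can be sketched as a rule over the run history. This assumes you have copied the outcomes from the History tab into a list ordered oldest to newest; classifyHistory and its verdict labels are hypothetical, and the two-run threshold is an arbitrary choice.

```typescript
type Outcome = 'pass' | 'fail'
type HistoryVerdict = 'stable' | 'flaky' | 'possible regression' | 'inconclusive'

// history is ordered oldest to newest
function classifyHistory(history: Outcome[]): HistoryVerdict {
  const firstFail = history.indexOf('fail')
  if (firstFail === -1) return 'stable'

  // A pass after the first failure means the failures are intermittent.
  const failsEverSince = history.slice(firstFail).every((outcome) => outcome === 'fail')
  if (!failsEverSince) return 'flaky'

  // One trailing failure is not enough evidence either way.
  const consecutiveFailures = history.length - firstFail
  return consecutiveFailures >= 2 ? 'possible regression' : 'inconclusive'
}

console.log(classifyHistory(['pass', 'pass', 'fail', 'fail', 'fail']))
```

For a possible regression, the commit that went in just before the first failing run is the place to start looking.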

Common errors and fixes

Error: Screenshot not captured for failed tests
Fix: Screenshots of failures are captured only if screenshot is set to 'only-on-failure' (or 'on') in playwright.config.ts and the step that publishes the test artifacts runs with condition: always(), so it still executes after the test step fails.

Error: Trace files are too large to download
Fix: Use trace: 'on-first-retry' instead of trace: 'on'. This records a trace only when a failed test is retried for the first time, so retries must be enabled.

Error: Tests time out on slow pipeline agents
Fix: Increase timeouts for CI: timeout: process.env.CI ? 60000 : 30000. Microsoft-hosted agents can be slower than local machines, especially for I/O-heavy operations. Azure Pipelines sets TF_BUILD rather than CI, so either define CI in the step's env block or test for TF_BUILD instead.

Error: Can't reproduce pipeline failure locally
Fix: Run the tests in the Playwright Docker image: docker run --rm -v $(pwd):/work -w /work mcr.microsoft.com/playwright:v1.45.0-jammy npx playwright test. Choose the image tag that matches the Playwright version in your package.json. If the pipeline runs in the same image, you get the same browser versions and OS libraries locally.
