
CSV and JSON in Test Automation: Data-Driven Testing Explained

Learn how to use CSV and JSON files for data-driven testing in Playwright, Jest, and other frameworks. Covers when to use each format, how to convert between them, and practical patterns for parameterised test suites.

InnovateBits · 7 min read

Data-driven testing is one of the highest-ROI practices in test automation. Instead of writing a separate test for each combination of inputs, you define the logic once and feed it a dataset — each row becomes a test case. The test suite grows in coverage without growing in code.

CSV and JSON are the two formats that underpin data-driven test suites in most projects. Understanding when to use each, and how to convert between them, makes building and maintaining these suites significantly easier.


What data-driven testing looks like

Without data-driven testing, a form validation test suite looks like this:

test('rejects email without @', async ({ page }) => {
  await page.fill('[name="email"]', 'notanemail')
  await page.click('[type="submit"]')
  await expect(page.locator('.error')).toBeVisible()
})
 
test('rejects email without domain', async ({ page }) => {
  await page.fill('[name="email"]', 'user@')
  await page.click('[type="submit"]')
  await expect(page.locator('.error')).toBeVisible()
})
 
// ... 8 more tests with identical structure

With data-driven testing, it looks like this:

const emailCases = [
  { input: 'notanemail',    valid: false, desc: 'no @ symbol' },
  { input: 'user@',         valid: false, desc: 'no domain' },
  { input: '@domain.com',   valid: false, desc: 'no local part' },
  { input: 'user@.com',     valid: false, desc: 'dot-leading domain' },
  { input: 'user@domain',   valid: false, desc: 'no TLD' },
  { input: 'user@domain.com', valid: true, desc: 'valid standard email' },
  { input: 'user+tag@domain.co.uk', valid: true, desc: 'valid with plus tag' },
]
 
for (const { input, valid, desc } of emailCases) {
  test(`email validation: ${desc}`, async ({ page }) => {
    await page.fill('[name="email"]', input)
    await page.click('[type="submit"]')
    if (valid) {
      await expect(page.locator('.error')).not.toBeVisible()
    } else {
      await expect(page.locator('.error')).toBeVisible()
    }
  })
}

One test definition, seven test cases. Add more rows to the array, get more coverage with zero additional test code.
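The same table-plus-loop pattern works below the browser level too. A self-contained unit-level sketch (the regex here is a deliberately naive stand-in for real validation, not part of the original suite):

```typescript
// One check, many cases — the loop body never changes.
// NOTE: this regex is a simplistic placeholder, not an RFC 5322 validator.
const isValidEmail = (s: string): boolean => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(s)

const emailCases = [
  { input: 'notanemail',      valid: false, desc: 'no @ symbol' },
  { input: 'user@',           valid: false, desc: 'no domain' },
  { input: 'user@domain.com', valid: true,  desc: 'valid standard email' },
]

for (const { input, valid, desc } of emailCases) {
  console.assert(isValidEmail(input) === valid, `email validation failed: ${desc}`)
}
```

Whether the assertion runs through Playwright, Jest, or a bare loop, the data table is the part that grows.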


JSON for test data: when and why

JSON is the natural choice when your test data has:

Nested structure — user objects with nested addresses, orders with line items, API request bodies with deeply structured fields.

Mixed types — some fields are strings, some are numbers, some are booleans, some are arrays. JSON preserves types natively. CSV stores everything as strings.

Direct API body mapping — if you're testing a REST API, JSON fixtures map directly to request bodies without any transformation.

For example, a permission-matrix fixture (fixtures/permission-tests.json):

[
  {
    "id": "test_001",
    "user": { "name": "Alice", "role": "admin" },
    "permissions": ["read", "write", "delete"],
    "shouldSucceed": true
  },
  {
    "id": "test_002",
    "user": { "name": "Bob", "role": "viewer" },
    "permissions": ["read"],
    "shouldSucceed": false
  }
]

A Playwright API test can then consume the fixture directly:

import testCases from './fixtures/permission-tests.json'
 
for (const tc of testCases) {
  test(`permissions: ${tc.id}`, async ({ request }) => {
    const response = await request.post('/api/documents', {
      data: { owner: tc.user, requiredPermissions: tc.permissions }
    })
    if (tc.shouldSucceed) {
      expect(response.status()).toBe(201)
    } else {
      expect(response.status()).toBe(403)
    }
  })
}

CSV for test data: when and why

CSV is the right choice when your test data is:

Flat — rows and columns with no nesting. Login credentials, product listings, user registrations.

Business-owned — product managers, business analysts, and QA leads can edit a spreadsheet and save as CSV without touching code. This democratises test case maintenance.

Large volume — performance test scenarios with thousands of rows are easier to manage as CSV than as JSON arrays.

Imported from external systems — user lists from HR systems, product catalogues from ERP systems, test cases exported from TestRail — almost all come as CSV.

A login-case fixture (tests/fixtures/login-cases.csv):

email,password,expectedRole,shouldSucceed
admin@example.com,Admin123!,admin,true
member@example.com,Member123!,member,true
inactive@example.com,Inactive123!,,false
wrongpass@example.com,wrongpassword,,false

The test loads and parses it at runtime:

import { parse } from 'csv-parse/sync'
import { readFileSync } from 'fs'
 
const csv = readFileSync('./tests/fixtures/login-cases.csv', 'utf8')
const cases = parse(csv, { columns: true, skip_empty_lines: true })
 
for (const tc of cases) {
  test(`login: ${tc.email}`, async ({ request }) => {
    const response = await request.post('/api/auth/login', {
      data: { email: tc.email, password: tc.password }
    })
    if (tc.shouldSucceed === 'true') { // CSV values load as strings
      expect(response.status()).toBe(200)
      const body = await response.json()
      expect(body.user.role).toBe(tc.expectedRole)
    } else {
      expect(response.status()).toBe(401)
    }
  })
}

Converting between CSV and JSON

The need to convert between these formats comes up constantly:

  • A business analyst provides a CSV of test cases; your test framework expects JSON
  • An API returns a JSON array; you want to open it in Excel for analysis
  • Your test data generator produces JSON; your database import tool requires CSV

The CSV ↔ JSON Converter tool handles both directions with support for comma, semicolon, tab, and pipe delimiters. For scripted conversion in CI or local dev:

// CSV → JSON (using csv-parse)
import { parse } from 'csv-parse/sync'
import { readFileSync, writeFileSync } from 'fs'
 
const csv  = readFileSync('input.csv', 'utf8')
const rows = parse(csv, { columns: true, skip_empty_lines: true })
writeFileSync('output.json', JSON.stringify(rows, null, 2))

// JSON → CSV (using csv-stringify)
import { stringify } from 'csv-stringify/sync'
import { readFileSync, writeFileSync } from 'fs'
 
const data = JSON.parse(readFileSync('input.json', 'utf8'))
const csv  = stringify(data, { header: true })
writeFileSync('output.csv', csv)

For one-off conversions without installing a package:

// Minimal JSON → CSV (no dependencies)
function jsonToCsv(rows: Record<string, unknown>[]): string {
  if (rows.length === 0) return ''
  const headers = Object.keys(rows[0])
  const escape = (v: unknown) => {
    const s = String(v ?? '')
    // Quote values containing commas, quotes, or newlines; double embedded quotes
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s
  }
  }
  return [
    headers.join(','),
    ...rows.map(r => headers.map(h => escape(r[h])).join(','))
  ].join('\n')
}
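The snippet above covers the JSON → CSV direction; a matching dependency-free sketch for CSV → JSON (plain comma-separated input only — no quoted fields or embedded newlines, so reach for csv-parse beyond that):

```typescript
// Minimal CSV → JSON (no dependencies)
// Handles plain comma-separated files only: no quoted fields,
// no embedded newlines. Use csv-parse for anything richer.
function csvToJson(csv: string): Record<string, string>[] {
  const lines = csv.trim().split('\n')
  if (lines.length < 2) return []
  const headers = lines[0].split(',')
  return lines.slice(1).map(line => {
    const values = line.split(',')
    return Object.fromEntries(headers.map((h, i) => [h, values[i] ?? '']))
  })
}
```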

Choosing a format: decision guide

Situation                              | Use
---------------------------------------|------------
API request body testing               | JSON
Form field validation (flat data)      | CSV or JSON
Business team maintains the data       | CSV
Nested objects / arrays                | JSON
Performance test scenarios             | CSV
Test data with boolean/number types    | JSON
Output from a spreadsheet tool         | CSV
Data imported directly into API calls  | JSON
More than 1,000 rows                   | CSV

Type coercion: the hidden CSV trap

CSV stores everything as strings. When you load a CSV row, every value — including true, false, 42, and null — comes back as a string. This causes subtle failures:

// Loaded from CSV — everything is a string
const row = { active: 'true', count: '42', deleted: 'false' }
 
// Bug — 'false' is a non-empty string, so it's truthy
if (row.deleted) { /* always executes */ }
 
// Correct — explicit type conversion
const active  = row.active  === 'true'
const count   = parseInt(row.count, 10)
const deleted = row.deleted === 'true'

When your CSV data drives assertions that depend on type (boolean checks, numeric comparisons), always convert explicitly. A helper function eliminates the repetition:

function coerce(value: string): string | number | boolean | null {
  if (value === '') return null
  if (value === 'true')  return true
  if (value === 'false') return false
  const num = Number(value)
  if (!isNaN(num) && value.trim() !== '') return num
  return value
}
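Applied field by field, the helper turns a whole parsed row back into typed values. A sketch (coerce is repeated from above so the snippet stands alone; coerceRow is a hypothetical name):

```typescript
function coerce(value: string): string | number | boolean | null {
  if (value === '') return null
  if (value === 'true')  return true
  if (value === 'false') return false
  const num = Number(value)
  if (!isNaN(num) && value.trim() !== '') return num
  return value
}

// Coerce every field of a row parsed with { columns: true }
function coerceRow(
  row: Record<string, string>
): Record<string, string | number | boolean | null> {
  return Object.fromEntries(Object.entries(row).map(([key, v]) => [key, coerce(v)]))
}
```

So coerceRow({ active: 'true', count: '42', note: '' }) yields { active: true, count: 42, note: null }, and assertions on the result behave the way the types suggest.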

Organising your test data files

A clean fixture directory structure prevents the data sprawl that makes large test suites hard to maintain:

tests/
  fixtures/
    auth/
      valid-credentials.csv
      invalid-credentials.csv
      permission-matrix.json
    products/
      create-valid.json
      create-invalid.json
      search-queries.csv
    api-responses/
      user-profile.json
      order-detail.json
      error-responses.json

Keep test data close to the tests that use it. If a fixture is only used by one test file, it belongs next to that file. If it's shared across multiple test files, it belongs in a shared fixtures directory.

Version your fixtures with your code. A test that depends on a fixture that isn't in the repository is a test that breaks on a fresh clone.
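With .csv and .json fixtures mixed in one tree, a small dispatch-by-extension loader keeps test files uniform. A sketch (parseFixture is a hypothetical helper; the CSV branch is deliberately minimal — swap in csv-parse for real suites):

```typescript
// Parse fixture content by file extension so tests don't care which
// format a given fixture uses. CSV branch: plain comma-separated only.
function parseFixture(filename: string, content: string): Record<string, unknown>[] {
  if (filename.endsWith('.json')) return JSON.parse(content)
  if (filename.endsWith('.csv')) {
    const [header, ...rows] = content.trim().split('\n')
    const keys = header.split(',')
    return rows.map(r => {
      const values = r.split(',')
      return Object.fromEntries(keys.map((k, i) => [k, values[i] ?? '']))
    })
  }
  throw new Error(`Unsupported fixture format: ${filename}`)
}
```

Pair it with readFileSync and every test loads data the same way, regardless of format.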
