Testing & Accessibility for Form Validation

Form validation is only trustworthy when its behavior and its accessibility are continuously verified by an automated test suite. This guide establishes a complete strategy for testing JavaScript form validation — from unit tests that mock ValidityState, through Playwright integration tests that assert aria-invalid and focus order, to automated axe-core audits — and maps every layer back to the relevant WCAG 2.2 Success Criteria so that a green build is also an accessible one.

Untested validation logic is a liability. A regex that quietly stops matching, an error message that no longer associates with its input via aria-describedby, or a focus jump that breaks after a refactor are defects that ship silently because they never throw an exception. The native Constraint Validation API (see Constraint Validation API Deep Dive) gives us a precise, observable surface: the validity flags, validationMessage, and the invalid event. The accessible-error approaches documented in Inline Error Messaging Strategies give us concrete DOM attributes to assert against. Testing is the discipline that locks both in place.

The Testing Pyramid Mapped to Validation Concerns

The classic testing pyramid — many fast unit tests, fewer integration tests, a small number of end-to-end tests — applies cleanly to form validation when each layer is assigned a specific validation concern. Unit tests verify the logic (does this validator return the right error key?), integration and end-to-end tests verify the wiring (does an invalid value actually toggle aria-invalid and move focus?), and automated accessibility audits verify the conformance (does the rendered error state satisfy WCAG). Layered on top is manual screen-reader testing, which catches the announcement-quality problems no automated tool can detect.

[Figure: Form validation testing pyramid. Four tiers, running from fast and many at the base to slow and few at the top. Base: Vitest unit tests with a mocked ValidityState (validator logic, error keys, message mapping). Second: Playwright integration / E2E tests (aria-invalid, aria-describedby, focus order; SC 3.3.1 / 3.3.3). Third: axe-core automated audits (labels, aria-*, contrast; SC 1.3.1 / 1.4.3 / 4.1.2). Separate top layer: manual screen-reader testing (announcement quality, real AT behavior).]
Each tier owns a validation concern; automated layers gate the build while manual screen-reader testing covers what tooling cannot.

Layer Responsibilities & Trade-offs

| Layer | Tool | What it verifies | Speed | Catches |
| --- | --- | --- | --- | --- |
| Unit | Vitest + mocked ValidityState | Pure validator functions, error-key mapping | Milliseconds | Logic regressions, off-by-one ranges |
| Component/integration | Vitest + Testing Library / jsdom | DOM attribute wiring, aria-describedby linkage | Tens of ms | Broken ARIA association, missing live region |
| End-to-end | Playwright (real browser) | Native validity, focus management, submission flow | Seconds | Cross-browser quirks, real focus order |
| Accessibility audit | axe-core / @axe-core/playwright | Rendered conformance (labels, contrast, roles) | Sub-second per scan | Missing labels, contrast failures |
| Manual | NVDA / VoiceOver / JAWS | Announcement clarity, reading order | Minutes (human) | Unhelpful phrasing, double-announcements |

The trade-off is coverage versus cost. Push as much as possible down to the unit layer, where a mocked ValidityState runs without a DOM and finishes in milliseconds. Reserve Playwright for behavior that genuinely depends on a real browser — native pseudo-class styling, true focus order, the :user-invalid timing — because those are the things jsdom cannot model faithfully.

Unit Testing: Mocking ValidityState in Vitest

Validator logic should be expressible as pure functions that take a value and return either null or an error key. Keeping the logic separate from the DOM makes it trivially unit-testable and reusable across native, framework, and server contexts. When you do need to exercise code that reads input.validity, mock the ValidityState object rather than constructing real DOM nodes — it is faster and lets you assert against impossible-to-reproduce flag combinations.

import { describe, it, expect } from 'vitest';

// A pure validator: no DOM, no side effects.
type ErrorKey = 'valueMissing' | 'tooShort' | 'patternMismatch' | null;

export function validateUsername(value: string): ErrorKey {
  if (value.trim() === '') return 'valueMissing';
  if (value.length < 3) return 'tooShort';
  if (!/^[a-zA-Z0-9_]+$/.test(value)) return 'patternMismatch';
  return null;
}

describe('validateUsername', () => {
  it('flags empty input as valueMissing', () => {
    expect(validateUsername('')).toBe('valueMissing');
  });

  it('flags short input as tooShort', () => {
    expect(validateUsername('ab')).toBe('tooShort');
  });

  it('flags disallowed characters as patternMismatch', () => {
    expect(validateUsername('bad name!')).toBe('patternMismatch');
  });

  it('accepts a valid username', () => {
    expect(validateUsername('valid_user_1')).toBeNull();
  });
});

When a unit under test reads the native validity object directly — for example, a function that maps ValidityState flags to user-facing messages — build a partial mock. The full ValidityState interface is enumerated in the Constraint Validation API Deep Dive; a helper that defaults every flag to false keeps tests readable.

import { describe, it, expect } from 'vitest';

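// Factory: every failure flag false unless overridden, mirroring a real ValidityState.
function mockValidity(overrides: Partial<ValidityState> = {}): ValidityState {
  const { valid, ...failures } = overrides;
  const flags = {
    badInput: false,
    customError: false,
    patternMismatch: false,
    rangeOverflow: false,
    rangeUnderflow: false,
    stepMismatch: false,
    tooLong: false,
    tooShort: false,
    typeMismatch: false,
    valueMissing: false,
    ...failures,
  };
  return {
    ...flags,
    // As in a browser, `valid` is true only when no failure flag is set;
    // an explicit `valid` override still wins.
    valid: valid ?? !Object.values(flags).some(Boolean),
  } as ValidityState;
}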

// Unit under test: maps the first failing flag to a user-facing message.
export function messageForValidity(v: ValidityState): string {
  if (v.valueMissing) return 'This field is required.';
  if (v.typeMismatch) return 'Enter a valid value.';
  if (v.tooShort) return 'Value is too short.';
  if (v.patternMismatch) return 'Value contains invalid characters.';
  return '';
}

describe('messageForValidity', () => {
  it('prioritizes valueMissing over other flags', () => {
    const v = mockValidity({ valueMissing: true, tooShort: true });
    expect(messageForValidity(v)).toBe('This field is required.');
  });

  it('returns the typeMismatch message for a malformed email', () => {
    const v = mockValidity({ typeMismatch: true });
    expect(messageForValidity(v)).toBe('Enter a valid value.');
  });
});

This pattern lets you test message-precedence logic exhaustively without ever touching a browser. Because ValidityState flags are mutually combinable in ways a real input rarely produces, the mock is the only practical way to assert your precedence ordering.
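
To make "exhaustively" concrete, the sketch below enumerates every combination of the four flags that messageForValidity inspects and compares the result with an explicit precedence table. It restates the mapper against a local Flags type so that it runs without DOM typings; PRECEDENCE, MESSAGES, messageForFlags, allCombinations, and precedenceFailures are illustrative names, not part of any library.

```typescript
// Local stand-in for the four ValidityState flags the mapper reads (assumption:
// no other flag influences the message).
type Flag = 'valueMissing' | 'typeMismatch' | 'tooShort' | 'patternMismatch';
type Flags = Record<Flag, boolean>;

// Precedence order under test, highest priority first.
const PRECEDENCE: readonly Flag[] = [
  'valueMissing',
  'typeMismatch',
  'tooShort',
  'patternMismatch',
];

const MESSAGES: Record<Flag, string> = {
  valueMissing: 'This field is required.',
  typeMismatch: 'Enter a valid value.',
  tooShort: 'Value is too short.',
  patternMismatch: 'Value contains invalid characters.',
};

// Same branching as messageForValidity, restated against the local type.
function messageForFlags(v: Flags): string {
  if (v.valueMissing) return MESSAGES.valueMissing;
  if (v.typeMismatch) return MESSAGES.typeMismatch;
  if (v.tooShort) return MESSAGES.tooShort;
  if (v.patternMismatch) return MESSAGES.patternMismatch;
  return '';
}

// Enumerate all 2^n combinations by treating each bit of `mask` as one flag.
function allCombinations(): Flags[] {
  const combos: Flags[] = [];
  for (let mask = 0; mask < 1 << PRECEDENCE.length; mask++) {
    const flags = {} as Flags;
    PRECEDENCE.forEach((flag, bit) => {
      flags[flag] = (mask & (1 << bit)) !== 0;
    });
    combos.push(flags);
  }
  return combos;
}

// Returns the combinations where the mapper disagrees with the precedence table.
function precedenceFailures(): Flags[] {
  return allCombinations().filter((flags) => {
    const first = PRECEDENCE.find((flag) => flags[flag]);
    const expected = first ? MESSAGES[first] : '';
    return messageForFlags(flags) !== expected;
  });
}
```

In a Vitest suite this reduces to a single assertion that precedenceFailures() is empty. Reordering the if statements in the mapper without updating PRECEDENCE then fails immediately, and the returned combinations show which case broke.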

Integration & End-to-End Testing with Playwright

Unit tests prove the logic is correct; Playwright proves the logic is wired to the DOM. The defects that matter at this layer are accessibility-attribute defects: an error that renders visually but never sets aria-invalid, a message that is not connected via aria-describedby, or a submission failure that leaves focus stranded instead of moving it to the first invalid field as described in Focus Management & Keyboard Navigation.

import { test, expect } from '@playwright/test';

test('invalid submission sets ARIA state and moves focus', async ({ page }) => {
  await page.goto('/signup');

  // Submit an empty required field to trigger validation.
  await page.getByRole('button', { name: 'Create Account' }).click();

  const username = page.getByLabel('Username');

  // 1. The input is marked invalid for assistive technology.
  await expect(username).toHaveAttribute('aria-invalid', 'true');

  // 2. The error message is programmatically associated.
  const describedBy = await username.getAttribute('aria-describedby');
  expect(describedBy).toBeTruthy();
  const errorRegion = page.locator(`#${describedBy!.split(' ').pop()}`);
  await expect(errorRegion).toContainText(/required/i);

  // 3. Focus has moved to the first invalid control (WCAG-friendly recovery).
  await expect(username).toBeFocused();
});
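
The focus assertion in that test presupposes application code that moves focus on a failed submit. One minimal way to write it, sketched here against a structural type so the same function accepts real inputs and plain test doubles, is shown below. ValidatableField and focusFirstInvalid are hypothetical names, and the function assumes the caller passes fields in DOM order.

```typescript
// Structural type: satisfied by HTMLInputElement and by plain test doubles.
interface ValidatableField {
  validity: { valid: boolean };
  focus(): void;
  setAttribute(name: string, value: string): void;
}

// Sets aria-invalid on every field and focuses the first invalid one.
// Returns the focused field, or null when every field is valid.
function focusFirstInvalid<T extends ValidatableField>(fields: readonly T[]): T | null {
  let firstInvalid: T | null = null;
  for (const field of fields) {
    const invalid = !field.validity.valid;
    field.setAttribute('aria-invalid', String(invalid));
    if (invalid && firstInvalid === null) firstInvalid = field;
  }
  firstInvalid?.focus();
  return firstInvalid;
}
```

Because the function depends only on validity.valid, focus, and setAttribute, the ordering rule can be unit-tested with three object literals, leaving Playwright to confirm that the real browser honors the focus call.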

Playwright also runs in real Chromium, Firefox, and WebKit, which is the only way to verify behaviors that differ across engines — :user-invalid styling, the exact moment native validation fires, and true tab order. Assert focus with toBeFocused() rather than checking document.activeElement manually; the matcher waits for the focus transition and avoids flakiness.

test('error live region announces without stealing focus', async ({ page }) => {
  await page.goto('/signup');
  const email = page.getByLabel('Email');

  await email.fill('not-an-email');
  await email.blur();

  // The live region holds the message but focus stays put after blur.
  const status = page.getByRole('status'); // aria-live="polite"
  await expect(status).toContainText(/valid email/i);
  await expect(email).not.toBeFocused();
});

For dedicated message-content assertions and a full Playwright project setup, the recipe in Testing Form Error Messages with Playwright walks through selectors, retries, and trace capture.

Automated Accessibility Auditing with axe-core

axe-core is a rules engine that walks a DOM subtree and reports WCAG violations as structured data. Unlike a snapshot test, it encodes accessibility expertise: it knows that an <input> without an associated label fails SC 1.3.1, that aria-describedby must point to an existing id, and that error text rendered below 4.5:1 contrast fails SC 1.4.3. Wiring it into the same Playwright run that exercises your validation flow means every error state is audited in the exact rendered condition a user would encounter.

import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('form in error state has no axe violations', async ({ page }) => {
  await page.goto('/signup');
  await page.getByRole('button', { name: 'Create Account' }).click();

  // Scope the scan to the form so unrelated page issues don't fail this test.
  const results = await new AxeBuilder({ page })
    .include('#signup-form')
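    // axe tags each rule with the WCAG version that introduced it, so a 2.2 AA
    // scan needs the 2.0 and 2.1 tags alongside wcag22aa.
    .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa', 'wcag22aa'])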
    .analyze();

  expect(results.violations).toEqual([]);
});

Auditing the error state specifically is the part teams forget. A pristine empty form often passes axe trivially; the violations appear once errors render — a message container with insufficient contrast, an aria-describedby that references a removed node, or a role added without an accessible name. The dedicated guide on axe-core Accessibility Testing covers the rule set in depth, including which rules map to form concerns and how to read a violation node, while Automating axe-core Form Audits in CI shows how to fail the build on any new violation.
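
The dangling aria-describedby case is easy to check directly as well, which helps when you want a targeted failure message rather than a generic audit result. The helper below is a sketch with hypothetical names (parseIdRefs, danglingIdRefs); it relies only on the fact that aria-describedby is a whitespace-separated list of ids.

```typescript
// Splits an ARIA id-reference list (aria-describedby, aria-labelledby) into ids.
// The attribute is a whitespace-separated token list, so several ids are legal.
function parseIdRefs(attr: string | null): string[] {
  return (attr ?? '').split(/\s+/).filter((id) => id.length > 0);
}

// Returns the referenced ids that do not appear in `existingIds`.
function danglingIdRefs(attr: string | null, existingIds: ReadonlySet<string>): string[] {
  return parseIdRefs(attr).filter((id) => !existingIds.has(id));
}
```

In a Playwright test the set of existing ids could be collected with something like page.locator('[id]').evaluateAll((els) => els.map((el) => el.id)), after which asserting that danglingIdRefs returns an empty array covers every id in the attribute and not only the last one.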

WCAG 2.2 Success Criteria for Form Errors

Four Success Criteria govern form-error accessibility. Knowing precisely what each requires turns vague “make it accessible” tickets into concrete, testable assertions. The compliance checklists in WCAG 2.2 Form Compliance Checklists expand each into a per-field audit.

| SC | Name | Level | Requirement | How to test it |
| --- | --- | --- | --- | --- |
| 3.3.1 | Error Identification | A | Errors are identified in text and the field in error is indicated | Assert aria-invalid="true" and a text message in the live region |
| 3.3.2 | Labels or Instructions | A | Inputs have visible labels/instructions when needed | axe label rule + assert hint text presence |
| 3.3.3 | Error Suggestion | AA | When a fix is known, suggest it in the error text | Assert message contains the corrective hint, not just “invalid” |
| 3.3.4 | Error Prevention | AA | Legal/financial/data submissions are reversible, checked, or confirmable | E2E test of a confirm step before irreversible submit |

SC 3.3.1 is the load-bearing one for validation. It demands two things: the error is identified in text (not by color or an icon alone) and the specific field is indicated. The native Constraint Validation API (see Constraint Validation API Deep Dive) supplies validationMessage as a starting text, but for SC 3.3.3 you typically replace it with a more actionable suggestion (“Enter a date after the start date” rather than “Value out of range”). SC 3.3.3 examples and patterns are detailed in WCAG 3.3.3 Error Suggestion Patterns.
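
One way to produce such suggestions, sketched under the assumption that each field's constraint values are available alongside the failing flag, is a table of message builders keyed by flag. FieldContext, SUGGESTIONS, and suggestionFor are illustrative names, and the wording is a placeholder for whatever your content guidelines call for.

```typescript
// Constraint values a message may need. In a browser they would come from the
// input's attributes (minlength, min, max); this shape is an assumption.
interface FieldContext {
  label: string; // lower-case field name, e.g. "password"
  minLength?: number;
  min?: string;
  max?: string;
  example?: string; // sample of a correctly formatted value
}

type SuggestibleFlag =
  | 'valueMissing'
  | 'tooShort'
  | 'rangeUnderflow'
  | 'rangeOverflow'
  | 'typeMismatch';

// Each builder names the field (SC 3.3.1) and states the fix (SC 3.3.3),
// falling back to a generic hint when the constraint value is unknown.
const SUGGESTIONS: Record<SuggestibleFlag, (f: FieldContext) => string> = {
  valueMissing: (f) => `Enter your ${f.label}.`,
  tooShort: (f) =>
    f.minLength !== undefined
      ? `Your ${f.label} needs at least ${f.minLength} characters.`
      : `Your ${f.label} is too short. Add more characters.`,
  rangeUnderflow: (f) =>
    f.min !== undefined
      ? `Choose a value of ${f.min} or more for your ${f.label}.`
      : `Choose a higher value for your ${f.label}.`,
  rangeOverflow: (f) =>
    f.max !== undefined
      ? `Choose a value of ${f.max} or less for your ${f.label}.`
      : `Choose a lower value for your ${f.label}.`,
  typeMismatch: (f) =>
    f.example !== undefined
      ? `Enter your ${f.label} in the format ${f.example}.`
      : `Check the format of your ${f.label}.`,
};

function suggestionFor(flag: SuggestibleFlag, field: FieldContext): string {
  return SUGGESTIONS[flag](field);
}
```

Because the builders are pure, the unit layer can assert the exact suggestion text for each flag, and the Playwright test only needs to confirm that the rendered message matches what suggestionFor returns.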

// A single assertion suite covering SC 3.3.1 and 3.3.3 for one field.
import { test, expect } from '@playwright/test';

test('email error satisfies SC 3.3.1 and 3.3.3', async ({ page }) => {
  await page.goto('/signup');
  const email = page.getByLabel('Email');
  await email.fill('bob@');
  await email.blur();

  // SC 3.3.1: field flagged + text error present.
  await expect(email).toHaveAttribute('aria-invalid', 'true');
  const msgId = (await email.getAttribute('aria-describedby'))!.split(' ').pop()!;
  const msg = page.locator(`#${msgId}`);
  await expect(msg).toBeVisible();

  // SC 3.3.3: the message suggests a correction, not just "invalid".
  await expect(msg).toContainText(/include.*@.*domain|enter a valid email/i);
});

Continuous Integration Strategy

Tests that only run locally drift out of date. The goal is a CI pipeline where unit, end-to-end, and accessibility tests all gate the merge, ordered fast-to-slow so a logic bug fails in seconds rather than after a full browser run.

# .github/workflows/test.yml
name: test
on: [push, pull_request]
jobs:
  validation-suite:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      # Fast logic layer first — fails in seconds on a regression.
      - run: npm run test:unit -- --run
      # Real-browser behavior + accessibility audits gate the merge.
      - run: npx playwright install --with-deps chromium
      - run: npm run test:e2e

Run the unit layer first so the cheap signal arrives before the expensive Playwright install. Cache Playwright browsers between runs to keep the accessibility job fast. The CI-specific concerns — baselining existing violations, failing only on new ones, and scoping scans to changed components — are covered end-to-end in Automating axe-core Form Audits in CI.
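
As a rough illustration of what "failing only on new ones" involves, the sketch below reduces axe violations to stable keys and diffs them against a stored baseline. ViolationLike models only the id and nodes[].target fields of an axe result; violationKeys, newViolationKeys, and the idea of keeping the baseline as a list of strings are assumptions for this example, not the approach of the linked guide.

```typescript
// Minimal subset of an axe-core violation: the rule id plus the failing nodes,
// each with a `target` selector path. Only these fields are used here.
interface ViolationLike {
  id: string;
  nodes: { target: unknown[] }[];
}

// One key per (rule, element) pair so a baseline can be stored as plain strings.
function violationKeys(violations: readonly ViolationLike[]): string[] {
  const keys: string[] = [];
  for (const violation of violations) {
    for (const node of violation.nodes) {
      keys.push(`${violation.id}::${JSON.stringify(node.target)}`);
    }
  }
  return keys;
}

// Keys present in the current scan but absent from the accepted baseline.
function newViolationKeys(
  current: readonly ViolationLike[],
  baseline: readonly string[],
): string[] {
  const known = new Set(baseline);
  return violationKeys(current).filter((key) => !known.has(key));
}
```

A Playwright test could then pass results.violations and a checked-in baseline to newViolationKeys and expect an empty array. Selector-based keys are fragile when markup changes, so treat this as a starting point rather than a finished policy.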

What Automation Can and Cannot Catch

Automated tooling is necessary but not sufficient. Rules-based scanning finds only part of the problem: axe-core's own README puts its automatic coverage at about 57% of WCAG issues on average, and commonly cited estimates are lower, at roughly a third to a half. The rest require human judgment. For forms, the gap is concentrated in announcement quality.

| Concern | Automatable? | Why |
| --- | --- | --- |
| Missing label / for association | Yes | Deterministic DOM rule (SC 1.3.1) |
| aria-describedby points to real id | Yes | Reference integrity is checkable |
| Color contrast of error text | Yes | Computed style comparison (SC 1.4.3) |
| Error text suggests a fix | Partly | Presence testable; quality is not |
| Screen reader announces the error usefully | No | Requires hearing the output |
| Reading order of error + field makes sense | No | Requires human comprehension |
| Whether the message wording reduces confusion | No | Subjective, context-dependent |
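
The "computed style comparison" behind the contrast row is plain arithmetic, which is why it automates so well. The sketch below applies the WCAG relative-luminance and contrast-ratio formulas to two hex colors. parseHex, relativeLuminance, and contrastRatio are illustrative names, and a real audit must also resolve alpha, gradients, and background images, which this ignores.

```typescript
// Parses "#rrggbb" (or "rrggbb") into 8-bit [r, g, b] channel values.
function parseHex(hex: string): [number, number, number] {
  const h = hex.replace(/^#/, '');
  if (!/^[0-9a-f]{6}$/i.test(h)) throw new Error(`Unsupported color: ${hex}`);
  return [
    parseInt(h.slice(0, 2), 16),
    parseInt(h.slice(2, 4), 16),
    parseInt(h.slice(4, 6), 16),
  ];
}

// Converts an 8-bit sRGB channel to its linear-light value.
function linearize(channel8: number): number {
  const c = channel8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG relative luminance: weighted sum of the linearized channels.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}
```

Pure red (#ff0000) on white comes out at roughly 4.0:1, just under the 4.5:1 threshold in SC 1.4.3. That is a common way for error text to fail while looking perfectly legible to the developer who styled it.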

Manual screen-reader passes — NVDA with Firefox, VoiceOver with Safari, JAWS with Chrome — should run before every release that touches form structure. Listen for three failure modes that automation misses: errors that announce twice (a live region plus a focus move that re-reads the field), errors announced out of context (“required” with no field name), and aria-live="assertive" regions that interrupt the user mid-typing. The decision between polite and assertive announcement timing is a UX concern covered in Inline Error Messaging Strategies.

Implementation Checklist

Frequently Asked Questions

Should I unit-test against a real DOM or a mocked ValidityState?

Mock it. A partial ValidityState object lets you assert message-precedence logic across flag combinations a real input would never produce in one state, and the tests run in milliseconds with no jsdom overhead. Reserve real-DOM and real-browser checks for Playwright, where wiring and focus order actually matter.

Does a passing axe-core scan mean my form is accessible?

No. Rules-based scanning catches roughly a third to a half of WCAG issues — labels, contrast, reference integrity. It cannot judge whether a screen reader announces the error usefully or whether the wording reduces confusion. Manual screen-reader testing remains mandatory for SC 3.3.3 quality.

Why audit the error state and not just the empty form?

Most accessibility violations only exist once errors render: a low-contrast message container, an aria-describedby pointing at a node that was removed, or a role added without an accessible name. A pristine form often passes trivially, so trigger validation first and audit the resulting state.

Which WCAG criterion governs the wording of an error message?

SC 3.3.3 (Error Suggestion, Level AA). When a correction is known, the message must suggest it — "Enter a date after the start date" rather than "Value out of range". SC 3.3.1 (Error Identification, Level A) only requires that the error is identified in text and the field indicated; 3.3.3 raises the bar to actionable guidance.

How do I keep the CI pipeline fast with browser tests?

Order jobs fast-to-slow: run Vitest unit tests first so a logic regression fails in seconds before the Playwright browser install. Cache the Playwright browser binaries between runs, scope axe scans to the form subtree, and reserve the full multi-engine matrix for the main branch rather than every pull request.
