Visual Regression

Swarm QA includes a visual regression engine that compares screenshots across runs using pixelmatch, a pixel-level image comparison library with no external dependencies.

How It Works

  1. During a scan with screenshots enabled, each visited page is captured as a PNG.
  2. When a new run targets the same URL, xyva retrieves the baseline screenshot from the previous run.
  3. Pixelmatch compares the two images pixel-by-pixel and generates a diff image highlighting changes.
  4. If the percentage of changed pixels exceeds the configured threshold, a visual regression finding is created.

Baseline (Run #14)  ->  Current (Run #15)  ->  Diff (red = changed pixels)
+--------------+      +--------------+      +--------------+
|              |      |    ##        |      |    ##        |
|  Page as     |      |  Page with   |      |  Changed     |
|  expected    |      |  layout      |      |  regions     |
|              |      |  shift       |      |  highlighted |
+--------------+      +--------------+      +--------------+
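
The comparison in steps 3 and 4 can be sketched in a few lines. This is a simplified stand-in, not pixelmatch itself: it treats a pixel as changed only when its RGBA bytes differ exactly, whereas pixelmatch uses a perceptual color distance and anti-aliasing detection. The names `diffScreenshots` and `PixelDiff` are hypothetical, and both inputs are assumed to be decoded RGBA buffers of the same size.

```typescript
// Illustrative result shape (not part of Swarm QA or pixelmatch).
interface PixelDiff {
  changedPixels: number;   // how many pixels differ
  changedRatio: number;    // changedPixels / total pixels, in the range 0..1
  diff: Uint8ClampedArray; // RGBA diff image: opaque red where pixels differ
}

// Simplified stand-in for pixelmatch: exact RGBA equality per pixel.
function diffScreenshots(
  baseline: Uint8ClampedArray,
  current: Uint8ClampedArray,
  width: number,
  height: number,
): PixelDiff {
  const expectedLength = width * height * 4;
  if (baseline.length !== expectedLength || current.length !== expectedLength) {
    throw new Error("Both images must be RGBA buffers of width * height * 4 bytes");
  }

  // Unchanged pixels stay transparent black in the diff image.
  const diff = new Uint8ClampedArray(expectedLength);
  let changedPixels = 0;

  for (let i = 0; i < expectedLength; i += 4) {
    const same =
      baseline[i] === current[i] &&
      baseline[i + 1] === current[i + 1] &&
      baseline[i + 2] === current[i + 2] &&
      baseline[i + 3] === current[i + 3];
    if (!same) {
      changedPixels++;
      diff[i] = 255;     // R
      diff[i + 1] = 0;   // G
      diff[i + 2] = 0;   // B
      diff[i + 3] = 255; // A
    }
  }

  return { changedPixels, changedRatio: changedPixels / (width * height), diff };
}
```

The returned `changedRatio` is the value that would be compared against the configured threshold in step 4.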

View Modes

The visual diff viewer (VisualDiffViewer.tsx) offers two display modes:

| Mode | Description |
| --- | --- |
| Side-by-Side | Baseline and current screenshots displayed next to each other with the diff image below. Best for spotting layout shifts. |
| Diff Only | Shows only the diff image with changed pixels highlighted in red. Best for quickly scanning many pages. |

TIP

Use side-by-side mode when investigating a specific regression. Use diff-only mode when reviewing a full scan with dozens of pages.

Threshold Configuration

The pixel difference threshold is the fraction of changed pixels (from 0 to 1) above which a finding is created. Lower values make the comparison more sensitive:

| Threshold | Sensitivity | Use case |
| --- | --- | --- |
| 0.01 | Very strict | Pixel-perfect designs, brand pages |
| 0.05 | Standard | General UI regression detection |
| 0.10 | Relaxed | Dynamic content, ads, date-dependent text |
| 0.20 | Loose | Structural changes only |

Configure the threshold in Expert Mode > Visual Regression > Threshold. The default is 0.05, meaning a finding is created when more than 5% of pixels differ.
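
To make the threshold concrete, here is a minimal sketch of the decision, assuming the threshold is compared against the ratio of changed pixels to total pixels. The function name `exceedsThreshold` is hypothetical. This ratio is distinct from pixelmatch's own `threshold` option, which controls how different two colors must be before a single pixel counts as changed.

```typescript
// Hypothetical helper: decides whether a comparison should produce a finding.
// `threshold` is the allowed fraction of changed pixels (0.05 = 5%).
function exceedsThreshold(
  changedPixels: number,
  totalPixels: number,
  threshold: number = 0.05,
): boolean {
  if (totalPixels <= 0) {
    throw new Error("totalPixels must be positive");
  }
  // "Exceeds" is read strictly: a ratio exactly at the threshold does not trigger.
  return changedPixels / totalPixels > threshold;
}
```

For a 1280 x 720 screenshot (921,600 pixels), a threshold of 0.05 corresponds to 46,080 pixels: under this sketch, 46,080 changed pixels would not create a finding, while 46,081 would.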

WARNING

Pages with dynamic content (timestamps, ads, user avatars) will trigger false positives at low thresholds. Use exclude regions or increase the threshold for these pages.
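
One common way to implement exclude regions is to paint the same rectangles a solid color in both the baseline and the current image before comparing, so those pixels can never differ. The sketch below illustrates that idea only; how Swarm QA actually represents or applies exclude regions is not described here, and `ExcludeRegion` and `maskExcludedRegions` are hypothetical names.

```typescript
// Hypothetical rectangle in pixel coordinates, measured from the top-left corner.
interface ExcludeRegion {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns a copy of an RGBA image with each excluded region painted opaque black.
// Apply it to both baseline and current so the masked pixels always match.
function maskExcludedRegions(
  image: Uint8ClampedArray,
  imageWidth: number,
  imageHeight: number,
  regions: ExcludeRegion[],
): Uint8ClampedArray {
  const masked = new Uint8ClampedArray(image); // copy; the input is left untouched
  for (const region of regions) {
    // Clamp the rectangle to the image bounds.
    const x0 = Math.max(0, region.x);
    const y0 = Math.max(0, region.y);
    const x1 = Math.min(imageWidth, region.x + region.width);
    const y1 = Math.min(imageHeight, region.y + region.height);
    for (let y = y0; y < y1; y++) {
      for (let x = x0; x < x1; x++) {
        const i = (y * imageWidth + x) * 4;
        masked[i] = 0;
        masked[i + 1] = 0;
        masked[i + 2] = 0;
        masked[i + 3] = 255;
      }
    }
  }
  return masked;
}
```

Masking trades coverage for stability: a real regression inside an excluded rectangle will go unnoticed, so regions are best kept as small as the dynamic content allows.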

Cached Diffs

Diff images are cached locally to avoid recomputation. The cache is stored alongside run data in .xyva/agent-memory/swarm-runs/. Clearing a run from history also removes its cached diffs.
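
The caching behavior can be illustrated with a small in-memory sketch. The real cache is stored on disk and its layout is not documented here; the class name `DiffCacheSketch`, the key format, and the choice to drop entries where the cleared run appears on either side of the comparison are all assumptions made for illustration.

```typescript
// Hypothetical cache entry: remembers which runs produced the cached value.
interface CachedDiffEntry<T> {
  baselineRunId: number;
  currentRunId: number;
  value: T;
}

// In-memory illustration of "compute once, reuse, and evict with the run".
class DiffCacheSketch<T> {
  private readonly entries = new Map<string, CachedDiffEntry<T>>();

  // Returns the cached diff for this comparison, computing it on first use.
  getOrCompute(
    url: string,
    baselineRunId: number,
    currentRunId: number,
    compute: () => T,
  ): T {
    const key = `${baselineRunId}:${currentRunId}:${url}`;
    const hit = this.entries.get(key);
    if (hit !== undefined) {
      return hit.value;
    }
    const value = compute();
    this.entries.set(key, { baselineRunId, currentRunId, value });
    return value;
  }

  // Mirrors "clearing a run removes its cached diffs": drop every entry
  // that involves the given run as baseline or as current (an assumption).
  removeRun(runId: number): void {
    const stale: string[] = [];
    this.entries.forEach((entry, key) => {
      if (entry.baselineRunId === runId || entry.currentRunId === runId) {
        stale.push(key);
      }
    });
    for (const key of stale) {
      this.entries.delete(key);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```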

Baseline Management

By default, the baseline for each URL is the screenshot from the most recent previous run. To manually set a baseline:

  1. Open the History tab and expand the desired run.
  2. Click Set as Baseline on a specific page screenshot.
  3. Future comparisons for that URL will use the selected screenshot.
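
The selection rule described above can be sketched as a small lookup: a manually set baseline wins, otherwise the most recent earlier run for the same URL is used. The `RunScreenshot` shape, the `resolveBaseline` function, and the idea of keeping manual choices in a map keyed by URL are hypothetical and only illustrate the rule.

```typescript
// Hypothetical record of one captured page in one run.
interface RunScreenshot {
  runId: number;
  url: string;
  path: string; // location of the PNG; the actual layout is not specified here
}

// Picks the baseline for `url` when comparing against run `currentRunId`.
function resolveBaseline(
  url: string,
  currentRunId: number,
  history: RunScreenshot[],
  manualBaselines: Map<string, RunScreenshot>,
): RunScreenshot | undefined {
  // A screenshot chosen with "Set as Baseline" takes precedence.
  const manual = manualBaselines.get(url);
  if (manual !== undefined) {
    return manual;
  }
  // Otherwise fall back to the latest earlier run that captured this URL.
  let latest: RunScreenshot | undefined;
  for (const shot of history) {
    if (shot.url !== url || shot.runId >= currentRunId) {
      continue;
    }
    if (latest === undefined || shot.runId > latest.runId) {
      latest = shot;
    }
  }
  return latest;
}
```

A result of `undefined` represents a URL with no earlier screenshot, in which case there is nothing to compare against.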