Your rules · pinned to your domain · 0.25 credits × test count

Custom Tests.

Write a flow or a rule in plain English. The LLM converts it once into something the runner can execute, and from then on every audit on that domain re-runs it automatically. Your business-critical regressions become a permanent part of the report — no Selenium, no Playwright DSL, no maintenance.

What it does (customTests.js)

Two kinds of custom tests:

Flow tests (multi-step, end-to-end)

Write a goal in English: "A user can sign up with email + password, verify via the link in the welcome email, and reach the dashboard within 60 seconds."

On first save, the LLM converts that prompt into a structured step list (fill, click, verify, goto), the same shape as auto-generated Test Flows. The structured version is cached on the custom-test doc so it is not regenerated on every audit. Editing the prompt triggers a re-conversion.
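The caching behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in customTests.js: the field names (`prompt`, `compiledFrom`, `steps`), the step fields, and the `convert` callback standing in for the LLM call are all assumptions, and the conversion is kept synchronous for brevity.

```javascript
// The four step kinds named in the text above.
const STEP_KINDS = ["fill", "click", "verify", "goto"];

// True when the cached steps were not built from the current prompt.
// `compiledFrom` is a hypothetical field holding the prompt used at conversion time.
function needsConversion(testDoc) {
  return !Array.isArray(testDoc.steps) || testDoc.compiledFrom !== testDoc.prompt;
}

// `convert` stands in for the LLM call (an assumption); it returns an array of steps.
function ensureSteps(testDoc, convert) {
  if (!needsConversion(testDoc)) return testDoc; // cache hit: no LLM call
  const steps = convert(testDoc.prompt);
  for (const step of steps) {
    if (!STEP_KINDS.includes(step.kind)) {
      throw new Error(`unsupported step kind: ${step.kind}`);
    }
  }
  return { ...testDoc, steps, compiledFrom: testDoc.prompt };
}

// Usage with a fake converter that counts how often it is called.
let calls = 0;
const fakeConvert = () => {
  calls += 1;
  return [
    { kind: "goto", url: "/signup" },
    { kind: "fill", selector: "#email", value: "user@example.com" },
    { kind: "click", selector: "text=Sign up" },
    { kind: "verify", text: "Dashboard" },
  ];
};

let doc = { prompt: "A user can sign up and reach the dashboard." };
doc = ensureSteps(doc, fakeConvert); // first save: converts
doc = ensureSteps(doc, fakeConvert); // later audit: reuses the cached steps
doc = ensureSteps({ ...doc, prompt: doc.prompt + " Within 60 seconds." }, fakeConvert); // edited prompt: converts again
console.log(calls); // 2
```

Storing the prompt that produced the cached steps next to the steps themselves is one simple way to detect an edit; a hash of the prompt would work equally well.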

Static checks (rule-style assertions)

Write a rule: "The checkout page must not produce any console errors." or "All <img> tags must have a non-empty alt attribute."

The LLM converts the rule into an executable predicate — either against captured state (console messages, network responses, HTML) or against the live page via Playwright. The result is pass / fail with concrete evidence on failure.
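As an illustration of what such a predicate might look like, here is a sketch for the console-error rule, evaluated against captured state. The shape of the captured state (`consoleMessages` with `type` and `text`) and the result object are assumptions made for this example, not the runner's actual format.

```javascript
// Hypothetical captured state from one page load. The shape is an assumption.
const captured = {
  url: "https://acme.com/checkout",
  consoleMessages: [
    { type: "log", text: "cart loaded" },
    { type: "error", text: "TypeError: total is undefined" },
  ],
};

// Predicate for: "The checkout page must not produce any console errors."
function noConsoleErrors(state) {
  const errors = state.consoleMessages.filter((m) => m.type === "error");
  return {
    passed: errors.length === 0,
    // Evidence is only attached on failure.
    evidence: errors.length === 0 ? null : errors.map((m) => m.text).join("\n"),
  };
}

const result = noConsoleErrors(captured);
console.log(result.passed);   // false
console.log(result.evidence); // TypeError: total is undefined
```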

How it gets used

Custom tests are scoped to (ownerEmail, host): each one is pinned to a specific domain so it only runs on the right reports. When you submit a report on that host, every active custom test for that host runs and lands in report.customChecks[]. Tests can be toggled active/inactive without deleting them; the most recent N runs of each test are stored so you can see flake rates.
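The selection rule can be sketched as a simple filter. The stored-test fields (`ownerEmail`, `host`, `active`) follow the names used above, but the exact document shape is an assumption, as is the choice to match the host exactly (how subdomains or a `www.` prefix are treated is not stated here).

```javascript
// Hypothetical stored tests; only the fields needed for selection are shown.
const storedTests = [
  { id: "acme.com::cart-promo-flow", ownerEmail: "qa@acme.com", host: "acme.com", active: true },
  { id: "acme.com::cookie-banner", ownerEmail: "qa@acme.com", host: "acme.com", active: false },
  { id: "shop.acme.com::search", ownerEmail: "qa@acme.com", host: "shop.acme.com", active: true },
  { id: "acme.com::other-owner", ownerEmail: "someone@else.com", host: "acme.com", active: true },
];

// Pick the tests that should run for a report submitted on `reportUrl`.
// Assumption: the host must match exactly.
function testsForReport(tests, ownerEmail, reportUrl) {
  const host = new URL(reportUrl).hostname;
  return tests.filter((t) => t.active && t.ownerEmail === ownerEmail && t.host === host);
}

const toRun = testsForReport(storedTests, "qa@acme.com", "https://acme.com/pricing");
console.log(toRun.map((t) => t.id)); // [ 'acme.com::cart-promo-flow' ]
```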

Editing: the dashboard's Custom Tests tab (/admin/custom-tests) is a per-host CRUD view with JSON import/export. Each test has a severity (info / warn / fail) and a category (functional / a11y / perf / content / security).
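If you prepare an import file by hand, a small pre-check along these lines may catch typos before upload. The severity and category values come from the list above; the entry field names (`prompt`, `severity`, `category`) and the assumption that the export is a JSON array are guesses about the file format.

```javascript
// Allowed values taken from the text above; the field names are assumptions.
const SEVERITIES = ["info", "warn", "fail"];
const CATEGORIES = ["functional", "a11y", "perf", "content", "security"];

// Returns a list of problems; an empty list means the entry looks importable.
function validateImportedTest(entry) {
  const problems = [];
  if (typeof entry.prompt !== "string" || entry.prompt.trim() === "") {
    problems.push("prompt must be a non-empty string");
  }
  if (!SEVERITIES.includes(entry.severity)) {
    problems.push(`severity must be one of: ${SEVERITIES.join(", ")}`);
  }
  if (!CATEGORIES.includes(entry.category)) {
    problems.push(`category must be one of: ${CATEGORIES.join(", ")}`);
  }
  return problems;
}

// Parse a JSON array of entries and report the problems found in each one.
function checkImport(jsonText) {
  const entries = JSON.parse(jsonText);
  return entries.map((entry, index) => ({ index, problems: validateImportedTest(entry) }));
}

const exported = JSON.stringify([
  { prompt: "Every page must have a meta description.", severity: "warn", category: "content" },
  { prompt: "Largest network resource must be under 2 MB", severity: "critical", category: "perf" },
]);
const report = checkImport(exported);
console.log(report[0].problems.length); // 0
console.log(report[1].problems.length); // 1
```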

What kinds of tests you'd write

Critical-path regression: "Add to cart → fill shipping address → enter promo code BLACKFRIDAY → see 20% discount applied"
Compliance: "The cookie banner must be visible above the fold on first visit and dismissible without accepting"
Content rule: "Pricing card prices must match the prices listed in data-test=price attributes on the pricing API response"
Security check: "No console.log output may contain the substring 'password' or 'token'"
Performance budget: "Largest network resource must be under 2 MB"
Brand consistency: "All buttons must use the brand colour #5DFC82 for their background"
SEO: "Every page must have a meta description between 80 and 160 characters"
Form behaviour: "After clicking Submit on the contact form, the button must show 'Sending…' for at least 300ms"
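To give a feel for how one of these rules could reduce to an executable check, here is a sketch of the performance-budget example evaluated against captured network responses. The response shape (`url`, `bytes`) is hypothetical, and reading "2 MB" as 2 × 1024 × 1024 bytes is an assumption; the actual conversion may interpret it differently.

```javascript
// "2 MB" read as 2 MiB; this interpretation is an assumption.
const MAX_BYTES = 2 * 1024 * 1024;

// `responses` is a hypothetical capture: [{ url, bytes }].
function largestResourceUnderBudget(responses, maxBytes = MAX_BYTES) {
  if (responses.length === 0) return { passed: true, evidence: null };
  // Find the single largest response by byte size.
  const largest = responses.reduce((a, b) => (b.bytes > a.bytes ? b : a));
  const passed = largest.bytes < maxBytes;
  return {
    passed,
    evidence: passed ? null : `${largest.url} is ${largest.bytes} bytes (budget ${maxBytes})`,
  };
}

const budgetResult = largestResourceUnderBudget([
  { url: "https://acme.com/app.js", bytes: 480000 },
  { url: "https://acme.com/hero.jpg", bytes: 3500000 },
]);
console.log(budgetResult.passed);   // false
console.log(budgetResult.evidence); // https://acme.com/hero.jpg is 3500000 bytes (budget 2097152)
```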

Coverage

Scope

Per-host, per-account. The same test can be written once and run on every audit of acme.com.

Format

Plain English on input; the LLM converts once. Editing the English triggers re-conversion. No DSL to learn.

Types

Multi-step flows (replay) + static rules (predicate evaluation against captured state or the live page)

History

Recent runs stored per test — flake rate, last-failure timestamp, last-N statuses
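The history fields could be derived from a `recentRuns` array shaped like the one in the sample entry (`at`, `passed`, newest first). The definition of flake rate used here, the share of consecutive run pairs whose outcome changed, is an assumption for illustration; the product may compute it differently.

```javascript
// Summarise a recentRuns array that is ordered newest first.
function summariseRuns(recentRuns) {
  const statuses = recentRuns.map((r) => r.passed);
  let flips = 0;
  for (let i = 1; i < statuses.length; i += 1) {
    if (statuses[i] !== statuses[i - 1]) flips += 1;
  }
  const lastFailure = recentRuns.find((r) => !r.passed); // newest failing run
  return {
    // Assumed definition: share of consecutive run pairs whose outcome changed.
    flakeRate: statuses.length > 1 ? flips / (statuses.length - 1) : 0,
    lastFailedAt: lastFailure ? lastFailure.at : null,
    statuses,
  };
}

const summary = summariseRuns([
  { at: "2026-05-14T09:12:00Z", passed: false },
  { at: "2026-05-13T14:02:00Z", passed: false },
  { at: "2026-05-12T14:31:09Z", passed: false },
  { at: "2026-05-11T11:08:14Z", passed: true },
]);
console.log(summary.lastFailedAt);         // 2026-05-14T09:12:00Z
console.log(summary.flakeRate.toFixed(2)); // 0.33
```

With this definition, a test that broke once and stayed broken scores low, while one that alternates between pass and fail scores close to 1.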

Visual regression (pixel-diff against a baseline) is not a custom-test type today — coming in a future release.
The LLM converts your prompt once per save, and audits replay that cached conversion, so results stay stable between edits. Small wording changes shouldn't change behaviour, but if you rewrite the prompt the new conversion may pick different selectors. Pin behaviour with explicit selectors in the prompt: "click the button with text 'Continue'" is more stable than "go to the next step".
No cross-test sharing of state — each test runs in its own clean Playwright context. If "log in" is a precondition for 5 tests, each test must perform the log-in (or use the auth submit option).
Heavy custom-test suites slow audits down. Budget ~15s per static rule and ~45s per flow. With more than 20 tests per host, run them in a scheduled audit rather than an interactive one.
The flow runner halts on first failure — to see "everything that's broken," split into smaller tests rather than one big one.
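Two of the points above, the time budget and the halt-on-first-failure behaviour, can be sketched as follows. The per-test figures are the rough budgets quoted above; the `kind` value "static", the `runStep` executor, and the result fields are hypothetical names chosen for this example.

```javascript
// Rough per-test budgets from the text above, in seconds.
// "flow" matches the sample entry; "static" is an assumed kind name.
const SECONDS_PER_KIND = { static: 15, flow: 45 };

// Estimate the audit time added by a list of tests with a `kind` field.
function estimateSeconds(tests) {
  return tests.reduce((sum, t) => sum + SECONDS_PER_KIND[t.kind], 0);
}

// Run steps in order and stop at the first one that fails.
// `runStep` is a hypothetical executor returning { ok, message }.
function runFlow(steps, runStep) {
  for (let i = 0; i < steps.length; i += 1) {
    const outcome = runStep(steps[i]);
    if (!outcome.ok) {
      // Later steps are never attempted, so their state stays unknown.
      return { passed: false, failedStep: i + 1, evidence: outcome.message, stepsRun: i + 1 };
    }
  }
  return { passed: true, failedStep: null, evidence: null, stepsRun: steps.length };
}

const steps = ["add to cart", "fill shipping address", "enter promo code", "see discount"];
const fakeRunStep = (step) =>
  step === "enter promo code"
    ? { ok: false, message: "order total did not change" }
    : { ok: true, message: "" };

const flowResult = runFlow(steps, fakeRunStep);
console.log(flowResult.failedStep, flowResult.stepsRun); // 3 3

const suite = [...Array(10).fill({ kind: "static" }), ...Array(4).fill({ kind: "flow" })];
console.log(estimateSeconds(suite)); // 330
```

In the example the fourth step is never executed, which is why a single long flow reports only its first broken step.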

Sample custom test

// One entry from report.customChecks[]
{
  "id":        "acme.com::cart-promo-flow",
  "name":      "BLACKFRIDAY promo applies a 20% discount at checkout",
  "kind":      "flow",
  "category":  "functional",
  "severity":  "fail",
  "passed":    false,
  "evidence":  "step 5 (enter promo BLACKFRIDAY): code accepted, but order total\ndid not change. Expected -20% (~$24 off the $120 subtotal); observed $120.",
  "firstFailedAt": "2026-05-12T14:31:09Z",
  "recentRuns": [
    { "at": "2026-05-14T09:12:00Z", "passed": false },
    { "at": "2026-05-13T14:02:00Z", "passed": false },
    { "at": "2026-05-12T14:31:09Z", "passed": false },
    { "at": "2026-05-11T11:08:14Z", "passed": true }
  ]
}