Use cases

Where the eyes earn their keep. The through-line: anything that renders to pixels can be graded before it's shipped — by one agent, a CI pipeline, or a whole swarm.

A swarm of agents (the main event)

A fleet of agents generates UIs, charts, dashboards, or documents in parallel. Each one needs to see its output before claiming done, and the orchestrator needs every verdict in the same language to decide what advances.

  • Run eyes as a service (see scaling) and point the whole swarm at it, or embed the library in each worker. Single-shot grading (check/analyze/conform) scales horizontally with zero coordination.
  • Every worker returns the same agentsensory Report/Handoff, so a coordinator (or a brain like Verel) aggregates verdicts on one bus, with vision alongside tests, lint and types, and only work graded PASS is built upon.
  • See Swarms & scaling for topologies and the fan-out example.
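As a rough illustration of the fan-out and aggregation described above, here is a minimal sketch. The `Report` dataclass, `grade`, `fan_out`, and `advancing` are hypothetical names invented for this example. They are not the real agentsensory schema or the agentvision API, and the grader is a placeholder that never renders anything.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Report:
    """Hypothetical stand-in for a shared report shape (not the real schema)."""
    worker: str
    sense: str                      # e.g. "vision", "tests", "lint"
    verdict: str                    # "PASS" or "FAIL"
    issues: list[str] = field(default_factory=list)


def grade(worker: str, artifact: str) -> Report:
    """Placeholder grader; a real worker would render the artifact and call the eyes."""
    issues = ["text overflows its container"] if "overflow" in artifact else []
    return Report(worker, "vision", "FAIL" if issues else "PASS", issues)


def fan_out(artifacts: dict[str, str]) -> list[Report]:
    """Grade every artifact in parallel; each call is independent of the others."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(grade, w, a) for w, a in artifacts.items()]
        return [f.result() for f in futures]


def advancing(reports: list[Report]) -> list[str]:
    """A worker advances only if every report filed for it is PASS."""
    by_worker: dict[str, list[Report]] = {}
    for r in reports:
        by_worker.setdefault(r.worker, []).append(r)
    return sorted(w for w, rs in by_worker.items()
                  if all(r.verdict == "PASS" for r in rs))
```

Because every sense files the same report shape, a failing test report blocks a worker exactly as a failing vision report does. The coordinator needs no per-sense logic.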

An agent that self-corrects before claiming done

The single-agent core, and the foundation of the swarm case. The agent writes a UI, renders and looks at it, gets grounded issues, fixes them, and re-renders until the verdict is PASS — instead of confidently shipping breakage it can't perceive.

  • Drop the agent contract into the system prompt, or use the MCP tools or Claude Code Skill so the agent calls the eyes mid-task.
  • The self-correcting loop (loop / LoopSession) automates the render→see→fix→re-render cycle.
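The shape of that cycle can be sketched in a few lines. This is not the library's loop / LoopSession implementation. `self_correct`, `look`, and `fix` are hypothetical names, and the toy checker below uses string matching where a real one would render and grade pixels.

```python
from typing import Callable


def self_correct(
    source: str,
    look: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
    max_rounds: int = 5,
) -> tuple[str, str, int]:
    """Look, fix, and look again until no issues remain or the budget runs out.

    Returns (final_source, verdict, rounds_used). The returned source is always
    the one that was last looked at, so the verdict describes it.
    """
    for round_no in range(1, max_rounds + 1):
        issues = look(source)            # render + grade; empty list means PASS
        if not issues:
            return source, "PASS", round_no
        if round_no < max_rounds:
            source = fix(source, issues)  # edit using the grounded issues
    return source, "FAIL", max_rounds


# Toy stand-ins for the render/grade step and for the agent's edit.
def toy_look(src: str) -> list[str]:
    issues = []
    if "<img>" in src:
        issues.append("image has no alt text")
    if "color:#fff;background:#fff" in src:
        issues.append("white text on white background")
    return issues


def toy_fix(src: str, issues: list[str]) -> str:
    # Fixes one issue per round, so the loop needs several passes.
    if "image has no alt text" in issues:
        return src.replace("<img>", '<img alt="logo">')
    return src.replace("color:#fff;background:#fff", "color:#000;background:#fff")
```

The round budget matters: without it, an agent that cannot fix an issue would re-render forever instead of reporting FAIL.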

A visual gate in CI

Fail the build when the page actually looks wrong, not when a pixel-diff snapshot trips (those flake on fonts and timing). check is deterministic and needs no API key:

agentvision check dist/index.html --full-page --quiet   # exit 2 on FAIL

Add a vision backend and conform to gate on intent ("a checkout button is visible"), not just defects. See Workflows & agents for the GitHub Action and pre-commit hook.
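If a pipeline drives the check from a script rather than a shell step, the exit code can be mapped to a gate decision as in the sketch below. Only "exit 2 on FAIL" comes from the command above. Treating 0 as PASS and every other code as a tool error are assumptions, and `visual_gate` is a hypothetical helper, not part of agentvision.

```python
import subprocess
from typing import Callable, Sequence

FAIL_EXIT = 2  # taken from the comment on the check command above


def visual_gate(
    argv: Sequence[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Run the check command and map its exit code to 'pass', 'fail' or 'error'."""
    code = run(list(argv)).returncode
    if code == 0:
        return "pass"      # assumption: 0 means the verdict was PASS
    if code == FAIL_EXIT:
        return "fail"
    return "error"         # assumption: any other code is a tool/usage problem


if __name__ == "__main__":
    import sys
    decision = visual_gate(
        ["agentvision", "check", "dist/index.html", "--full-page", "--quiet"]
    )
    sys.exit({"pass": 0, "fail": 1, "error": 3}[decision])
```

Keeping "fail" and "error" apart means a crashed or misconfigured tool is not mistaken for a visual defect.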

Documents, decks, and PDFs

Point any command at a .pdf or an Office/OpenDocument file (.docx/.pptx/.xlsx/.odt/…) and it's rasterized per page and graded like a screenshot — so a generated report or slide deck gets the same FAIL/PASS treatment as a web page. (Office conversion is on for local use, off by default on the REST service — it's a large attack surface.)

PowerPoint decks get an offline slide inspector: agentvision check deck.pptx runs key-free and no-egress, flagging unreadable text (low or dark-on-dark contrast on the rendered pixels), clipped/truncated text, off-slide shapes, and overlapping boxes, each tagged [slide N]. Add --no-cache for a confidential deck (nothing is written to disk). See the check command.
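The page does not say which contrast metric the inspector applies. The WCAG 2.x contrast ratio is one common choice, and the sketch below uses it purely to show why dark-on-dark text can be flagged from pixel colors alone. The function names and the 4.5 threshold (the WCAG AA value for normal text) are assumptions for this example.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB color, in the range 0.0 to 1.0."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """WCAG contrast ratio between two colors; order of arguments does not matter."""
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


def is_readable(fg: tuple[int, int, int], bg: tuple[int, int, int],
                minimum: float = 4.5) -> bool:
    """True when the text/background pair meets the chosen minimum ratio."""
    return contrast_ratio(fg, bg) >= minimum
```

For example, `is_readable((20, 20, 20), (30, 30, 30))` is false, since near-black text on a near-black fill has a ratio close to 1.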

Streaming, loading, and liveness

A glance can't tell a chart that's still loading from one that's broken. watch verifies an artifact over time — frames across an interval — to confirm playback, a loading→loaded transition, or that a live dashboard actually updates. See Streaming / temporal.
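One way to picture the temporal check is as a classification over sampled frames. This is a simplified sketch, not how watch is implemented. `classify_frames`, its labels, and the `stable_tail` parameter are hypothetical, and exact hashing stands in for the tolerant image comparison a real tool would need.

```python
import hashlib


def classify_frames(frames: list[bytes], stable_tail: int = 2) -> str:
    """Classify a frame sequence as 'static', 'settled' or 'live'.

    static  - no frame differs from its predecessor
    settled - something changed, but the last `stable_tail` transitions did not
    live    - at least one of the last `stable_tail` transitions changed
    """
    if stable_tail < 1 or len(frames) < stable_tail + 1:
        raise ValueError("need at least stable_tail + 1 frames")
    digests = [hashlib.sha256(f).hexdigest() for f in frames]
    # changes[i] is True when frame i+1 differs from frame i.
    changes = [a != b for a, b in zip(digests, digests[1:])]
    if not any(changes):
        return "static"
    if not any(changes[-stable_tail:]):
        return "settled"
    return "live"
```

Which label counts as a pass depends on the expectation: a loading chart should end up "settled", a live dashboard should stay "live", and "static" on its own cannot distinguish loaded from stuck.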

Visual regression against a baseline

Capture a named baseline, then gate future renders against it with an SSIM (structural similarity) diff:

agentvision baseline dist/index.html --name home
agentvision regress dist/index.html --name home      # fails if it drifted

Useful as a cheap, key-free guardrail alongside the semantic checks.
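To give a feel for what an SSIM comparison measures, here is a simplified single-window version over flat grayscale pixel lists. Standard SSIM averages the same statistic over local windows, and the page does not say how regress computes or thresholds it. The `drifted` helper and its 0.98 threshold are hypothetical.

```python
def global_ssim(a: list[float], b: list[float], dynamic_range: float = 255.0) -> float:
    """Single-window SSIM between two equal-length grayscale pixel lists."""
    if not a or len(a) != len(b):
        raise ValueError("inputs must be non-empty and the same length")
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


def drifted(baseline: list[float], current: list[float],
            threshold: float = 0.98) -> bool:
    """Hypothetical gate: report drift when similarity falls below the threshold."""
    return global_ssim(baseline, current) < threshold
```

Unlike a raw pixel diff, a small uniform brightness shift barely moves the score, while a change in structure (an inverted gradient, a missing block) drops it sharply.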

Generative loops (image/asset generation)

When the artifact is generated (not hand-written), generate runs generate → see → grade vs intent → refine prompt → regenerate, so an image/asset pipeline converges on what you actually asked for rather than stopping at the first plausible output.
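The difference from the self-correcting loop is that the refinement acts on the prompt, not on the artifact. A minimal sketch of that idea follows. `converge` and the toy generator and grader are hypothetical and say nothing about how the generate command refines prompts internally.

```python
import re
from typing import Callable, Optional


def converge(
    prompt: str,
    generate: Callable[[str], set[str]],
    grade: Callable[[set[str]], list[str]],
    max_attempts: int = 4,
) -> tuple[Optional[set[str]], list[str]]:
    """Generate, grade against intent, refine the prompt, and try again.

    Returns (asset, prompts_tried); asset is None if no attempt met the intent.
    """
    tried: list[str] = []
    for _ in range(max_attempts):
        tried.append(prompt)
        asset = generate(prompt)
        missing = grade(asset)           # unmet parts of the intent
        if not missing:
            return asset, tried
        prompt = f"{prompt}. Must include: {', '.join(missing)}"
    return None, tried


# Toy stand-ins: the "asset" is just the set of words the prompt mentioned.
def toy_generate(prompt: str) -> set[str]:
    return set(re.findall(r"[a-z]+", prompt.lower()))


def make_grader(intent: set[str]) -> Callable[[set[str]], list[str]]:
    return lambda asset: sorted(intent - asset)
```

Feeding the grader's unmet requirements back into the prompt is what keeps the pipeline from stopping at the first plausible output.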


Ready to try these? Try it yourself · 5-minute tutorial · Real-world scenarios.