Integrating AgentVision into your workflow & your agents¶
Developers use AgentVision in two places: their dev/CI workflow (gate merges on what the UI actually looks like) and their agents (give the agent eyes so it self-corrects). Both on-ramps below.
In your CI / workflow¶
GitHub Action (installs AgentVision + Chromium and fails the build on a fail verdict):
# .github/workflows/visual.yml
jobs:
visual:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: amitpatole/agent-vision@v0.6.1
with:
source: dist/index.html
command: check # check (no key) | analyze | conform | watch
args: --full-page
# Intent gate with a vision backend:
- uses: amitpatole/agent-vision@v0.6.1
with:
command: conform
source: dist/index.html
backend: anthropic
expect: |
must: a "Checkout" button is visible
should: uses the brand's dark theme
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
pre-commit (this repo ships .pre-commit-hooks.yaml):
# .pre-commit-config.yaml
- repo: https://github.com/amitpatole/agent-vision
rev: v0.6.1
hooks:
- id: agentvision-check
args: ["dist/index.html", "--quiet"]
Any CI / Makefile / script — just shell out; non-zero exit fails the gate:
agentvision check ./dist/index.html --quiet # 0 pass/warn · 2 fail · 3 error
agentvision conform ./dist/index.html --expect 'must: shows "Total"' --quiet
agentvision watch http://localhost:3000 --allow-local --no-vision --quiet
--quiet writes only the JSON report to stdout (logs to stderr) — easy to capture and parse.
The bundled Dockerfile bakes in Chromium/Tesseract/poppler for a hermetic runner.
In your agents¶
"Provider-agnostic" means the API surface works anywhere — but an agent only benefits if it's actually told to use the loop. Pick the on-ramp that matches your agent:
| Agent / host | How | Recipe |
|---|---|---|
| Claude Code | Skill (auto-invokes before "done") | skill/SKILL.md |
| Cursor | MCP server or project rule | integrations/cursor.rules.md |
| Aider | /run or --test-cmd |
integrations/aider.md |
| Any MCP host | agentvision-mcp |
adapters.md |
| Anything else | the CLI + a system-prompt contract | integrations/agent-contract.md |
The universal contract¶
Whatever the agent, the behavior you want is:
- After producing/editing a visual artifact, run
agentvision check/analyze <artifact> --quiet(useconformto also grade intent,watchfor streaming/over-time behavior). - Treat
issuesas a required to-do list (or use--handofffornext_action/todo); fix the source. - Re-run
agentvision loopuntil the verdict ispass. - Don't claim the task done without a passing verdict.
Copy integrations/agent-contract.md into your
agent's system prompt to enforce it.
Why Claude Code is "first-class"¶
The Claude Code Skill is the one surface that makes an agent invoke the loop proactively
(its description triggers on visual work, before the agent claims completion). Other
agents get the same capability but you must wire the trigger yourself (a rule, a test
command, or the contract above). MCP is the first-class cross-host path.