AgentVision: Eyes for AI Agents 👁️
Problem: AI coding agents are blind. They write a UI, chart, SVG, PDF or image and never see the result, so they ship breakage they can't perceive. Result: AgentVision gives them eyes (render → see → report → fix), so the agent self-corrects before it claims done.
pip install "agentvision[render]"
playwright install chromium # see `agentvision doctor` if Chromium won't launch
agentvision demo # no API key required
agentvision demo renders a deliberately broken page, prints a FAIL report (overflow +
low-contrast + a 404 image, all DOM/CV-grounded, no LLM key), then loops against the fixed
version and prints "what changed: 3 issues resolved → PASS."
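The "what changed" line can be thought of as a set difference between the issues found in two runs. The sketch below is illustrative only: the `Issue` record, the `summarize_change` helper, and the sample selectors are hypothetical names, not part of AgentVision's API or report format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue record: a kind plus the element it is anchored to."""
    kind: str    # e.g. "overflow", "low-contrast", "broken-image"
    target: str  # e.g. a CSS selector or a URL


def summarize_change(before: set[Issue], after: set[Issue]) -> str:
    """Compare the issue sets of two runs and describe what changed."""
    resolved = before - after      # present before, gone now
    introduced = after - before    # new in the second run
    verdict = "PASS" if not after else "FAIL"
    parts = [f"{len(resolved)} issues resolved"]
    if introduced:
        parts.append(f"{len(introduced)} introduced")
    return f"what changed: {', '.join(parts)} → {verdict}"


# Sample data mirroring the three demo defects (selectors are made up).
broken = {
    Issue("overflow", "#hero"),
    Issue("low-contrast", ".caption"),
    Issue("broken-image", "/img/logo.png"),
}
fixed: set[Issue] = set()

print(summarize_change(broken, fixed))
```

Keying issues on a stable identity (here, kind plus target) is what makes the diff meaningful: an issue counts as resolved only if the same defect on the same element is gone.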
What it does
| Capability | What you get |
|---|---|
| See & report | A machine verdict (pass/warn/fail) plus coordinate-grounded issues: DOM geometry, computed-style WCAG contrast, OCR/typos, broken-image and console/4xx capture. |
| Match intent | Grade a render against a brief / checklist / reference. PASS means "it's what I set out to build," not just "defect-free." |
| Full-coverage vision | Large artifacts get a downscaled overview plus full-res tiles. Pixel-based and source-agnostic (HTML, image, PDF, canvas). |
| Streaming / temporal | `watch` verifies behavior over time (playback, loading, liveness), not just a single glance. |
| Eyes → brain handoff | A distilled {verdict, next_action, todo, open_questions} signal any agent/brain acts on. |
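For readers unfamiliar with the WCAG contrast check mentioned in the table, the following is a minimal sketch of the standard WCAG 2.x contrast-ratio formula applied to two sRGB colors. It illustrates the published formula only; it is not AgentVision's implementation, and the function names are made up for this example.

```python
RGB = tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    # 0.03928 is the threshold stated in the WCAG 2.x text.
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    """Relative luminance: 0.0 for black, 1.0 for white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: RGB, bg: RGB, large_text: bool = False) -> bool:
    """WCAG AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


white = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), white), 2))
print(passes_aa((118, 118, 118), white))  # #767676 on white
print(passes_aa((119, 119, 119), white))  # #777777 on white
```

The two grays differ by a single step per channel yet land on opposite sides of the 4.5:1 threshold, which is the kind of defect that is hard to judge by eye and easy to compute from styles.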
Where to go next
- Quickstart: install, system deps, first run.
- The loop: render → perceive → report → fix → diff.
- Conformance: grade against intent.
- Handoff: wire perception into your reasoning/memory.
- Streaming / temporal: `watch` over time.
- Workflows & agents: GitHub Action, pre-commit, MCP, the agent contract.
- CLI reference · Configuration · Python API
- Recipes · FAQ
Eyes & brain
AgentVision is the eyes. It pairs with Verel, the brain: an agent framework where nothing is "done" until a grader returns a verdict. The eyes perceive and grade intent; the brain decides and compounds only verified work into memory; then the eyes look again.
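The look, decide, look-again cycle can be sketched as a small loop around the handoff signal. Everything here is an assumption made for illustration: the `Handoff` dataclass only mirrors the four field names listed above, and `run_until_verified`, `stub_look`, and `stub_fix` are hypothetical stand-ins, not AgentVision or Verel APIs.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Handoff:
    """Hypothetical container mirroring the handoff signal's four fields."""
    verdict: str                                  # "pass", "warn", or "fail"
    next_action: str = ""
    todo: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def run_until_verified(
    look: Callable[[], Handoff],
    fix: Callable[[Handoff], None],
    max_rounds: int = 3,
) -> bool:
    """Work counts as done only when a look returns a passing verdict."""
    for _ in range(max_rounds):
        signal = look()
        if signal.verdict == "pass":
            return True
        fix(signal)  # the "brain" acts on the signal, then the eyes look again
    return False


# Stubs standing in for a real perception step and a real agent.
remaining = ["fix overflow", "raise contrast"]


def stub_look() -> Handoff:
    if remaining:
        return Handoff("fail", next_action=remaining[0], todo=list(remaining))
    return Handoff("pass")


def stub_fix(signal: Handoff) -> None:
    remaining.remove(signal.next_action)


print(run_until_verified(stub_look, stub_fix))
```

Capping the number of rounds keeps the loop from spinning forever when a fix never lands; in that case the function reports failure rather than claiming done.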
Install: pip install "agentvision[all]" · Source: GitHub · Package: PyPI · License: MIT.