Python API¶

Everything the CLI does is available as a library. The high-level entry points are async.

import asyncio
from agentvision import analyze, load_settings

async def main():
    report = await analyze("dist/index.html", settings=load_settings(vision_backend="local"))
    print(report.verdict, [i.message for i in report.issues])

asyncio.run(main())

Core functions¶

analyze `async` ¶

analyze(source: str, *, settings: Settings | None = None, backend: str | None = None, instructions: str | None = None, expected: str | None = None, brief: Brief | None = None, use_ocr: bool = True, source_type: str = 'auto', viewport: Viewport | None = None, full_page: bool | None = None, wait_for: str | None = None, out_dir: Path | None = None) -> Report

Full visual analysis: structural grounding + vision-backend critique.

When brief is given, the render is also graded for intent conformance — does it match what the agent set out to build — and the verdict is gated on it.

check `async` ¶

check(source: str, *, settings: Settings | None = None, brief: Brief | None = None, source_type: str = 'auto', viewport: Viewport | None = None, full_page: bool | None = None, wait_for: str | None = None, use_ocr: bool = True, out_dir: Path | None = None) -> Report

Classic checks only — no LLM, no API key, no egress (incl. OCR spell-check).

With a brief, also grades text requirements deterministically via OCR; non-text requirements are reported uncertain (the offline path cannot judge visual intent).

watch `async` ¶

watch(source: str, *, settings: Settings | None = None, backend: str | None = None, frames: int | None = None, interval_ms: int | None = None, brief: Brief | None = None, instructions: str | None = None, use_vision: bool = True, source_type: str = 'auto', out_dir: Path | None = None) -> Report

Watch source over time and report temporal behavior (playback/loading/liveness).

render `async` ¶

render(source: str, *, settings: Settings | None = None, source_type: str = 'auto', viewports: list[Viewport] | None = None, full_page: bool | None = None, wait_for: str | None = None, device_scale: float | None = None, settle_ms: int | None = None, freeze: bool | None = None, out_dir: Path | None = None) -> RenderResult

Render source and return image(s) plus trustworthy DOM/CV signals.

compute_diff ¶

compute_diff(baseline_path: str | Path, candidate_path: str | Path, out_path: str | Path | None = None) -> DiffResult

Sessions¶

LoopSession ¶

Drive the visual feedback loop for one artifact across iterations.

Agents call :meth:iterate after each fix attempt (optionally passing an updated source). The session persists state under the workspace so it can be resumed.

run `async` ¶

run(max_iter: int = 5) -> list[IterationResult]

Convenience: iterate the SAME source up to max_iter times.

Useful for demonstrating stuck-detection on an unchanged artifact. Real agents drive :meth:iterate themselves, editing the source between calls.

IterationResult ¶

Bases: BaseModel

GenerativeLoopSession ¶

Drive generate → perceive → refine until the output matches the brief.

Intent¶

Brief ¶

Bases: BaseModel

The intended product — the thought the render is graded against.

from_inputs `classmethod` ¶

from_inputs(*, text: str | None = None, expect: list[str] | None = None, reference_image: str | None = None) -> Brief

Build a brief from CLI/REST-style inputs (--brief + repeated --expect).

IntentClaim ¶

Bases: BaseModel

A single checkable visual requirement extracted from the intent.

parse `classmethod` ¶

parse(raw: str, *, default: Importance = Importance.MUST) -> IntentClaim

Parse "must: the title reads AgentVision" → claim + importance.

The report contract¶

Report ¶

Bases: BaseModel

to_handoff ¶

to_handoff()

Distill this Report into a :class:~agentvision.models.handoff.Handoff.

The afferent signal an agent's reasoning/memory layer ("the brain") acts on.

issue_signature ¶

issue_signature() -> frozenset[tuple[str, str]]

Identity of the issue set — used for loop progress/stuck detection.

Deliberately ignores bbox/severity drift; two iterations are "the same" if they flag the same (kind, message) pairs.

Issue ¶

Bases: BaseModel

Conformance ¶

Bases: BaseModel

How well the render matches the intended product (the brief / checklist).

score `property` ¶

score: float

Fraction of claims satisfied (0..1); 1.0 when there are no claims.

matches_intent ¶

matches_intent() -> bool

True only when no must requirement is violated or left uncertain.

ClaimResult ¶

Bases: BaseModel

One requirement (a piece of the thought) graded against the render.

The handoff¶

Handoff ¶

Bases: BaseModel

The afferent signal: what the eyes saw, shaped for the brain to act on.

from_report `classmethod` ¶

from_report(report: Report) -> Handoff

Distill a full Report into the actionable signal.

Backends¶

VisionBackend ¶

Bases: Protocol

complete_text `async` ¶

complete_text(system: str, user: str) -> str

Text-only completion (no image) — for checklist extraction + prompt refinement.

Backends that cannot do this (e.g. the offline local backend) return "".

AnalysisRequest ¶

Bases: BaseModel

Configuration¶

Settings ¶

Bases: BaseSettings

Runtime settings. Environment prefix: AGENTVISION_.

Provider API keys use their conventional env names (not the prefix) so they match what the provider SDKs already expect.

load_settings ¶

load_settings(**overrides) -> Settings

Build a Settings object, applying any explicit overrides last.

Python API¶

Core functions¶

analyze async ¶

check async ¶

watch async ¶

render async ¶

compute_diff ¶

Sessions¶

LoopSession ¶

run async ¶

IterationResult ¶

GenerativeLoopSession ¶

Intent¶

Brief ¶

from_inputs classmethod ¶

IntentClaim ¶

parse classmethod ¶

The report contract¶

Report ¶

to_handoff ¶

issue_signature ¶

Issue ¶

Conformance ¶

score property ¶

matches_intent ¶

ClaimResult ¶

The handoff¶

Handoff ¶

from_report classmethod ¶

Backends¶

VisionBackend ¶

complete_text async ¶

AnalysisRequest ¶

Configuration¶

Settings ¶

load_settings ¶

analyze `async` ¶

check `async` ¶

watch `async` ¶

render `async` ¶

run `async` ¶

from_inputs `classmethod` ¶

parse `classmethod` ¶

score `property` ¶

from_report `classmethod` ¶

complete_text `async` ¶