H2 — corpus-transfer experiment, model sweep¶

Generated by examples/run_h2_sweep.py. Cross-tenant re-verification rate of a live-built skill corpus, swept across models. The moat is real only where skills transfer; a one-model result isn't enough to bet on.

Sweep¶

Provider	Model	Built	Transfer rate	Decision
`ollama`	`qwen3-coder:480b`	12/12	32/36 = 89%	BUILD
`openai`	`gpt-4o-mini`	11/12	29/33 = 88%	BUILD

Per-skill transfer (by model)¶

Skill	`qwen3-coder:480b`	`gpt-4o-mini`
`count_vowels`	100%	100%
`fiscal_quarter`	100%	100%
`gcd`	100%	100%
`initials`	100%	100%
`is_palindrome`	100%	100%
`order_code`	67%	67%
`price_label`	67%	67%
`reverse_words`	100%	100%
`slugify`	100%	—
`snake_case`	100%	100%
`tax_total`	33%	33%
`word_count`	100%	100%

Interpretation¶

Universal pure functions transfer ~100% across models; tenant-specific skills transfer only where a tenant's rule matches the one the skill encoded. A model that builds fewer skills, or whose skills overfit, shifts the rate — which is exactly why the decision is swept, not taken from a single run.