H2 — corpus-transfer experiment, model sweep¶
Generated by
examples/run_h2_sweep.py. Cross-tenant re-verification rate of a live-built skill corpus, swept across models. The moat is real only where skills transfer; a one-model result isn't enough to bet on.
Sweep¶
| Provider | Model | Built | Transfer rate | Decision |
|---|---|---|---|---|
ollama |
qwen3-coder:480b |
12/12 | 32/36 = 89% | BUILD |
openai |
gpt-4o-mini |
11/12 | 29/33 = 88% | BUILD |
Per-skill transfer (by model)¶
| Skill | qwen3-coder:480b |
gpt-4o-mini |
|---|---|---|
count_vowels |
100% | 100% |
fiscal_quarter |
100% | 100% |
gcd |
100% | 100% |
initials |
100% | 100% |
is_palindrome |
100% | 100% |
order_code |
67% | 67% |
price_label |
67% | 67% |
reverse_words |
100% | 100% |
slugify |
100% | — |
snake_case |
100% | 100% |
tax_total |
33% | 33% |
word_count |
100% | 100% |
Interpretation¶
Universal pure functions transfer ~100% across models; tenant-specific skills transfer only where a tenant's rule matches the one the skill encoded. A model that builds fewer skills, or whose skills overfit, shifts the rate — which is exactly why the decision is swept, not taken from a single run.