Two passes. 49 skills. All at 100%.
Every skill in the Vibecoding pipeline was put through two rounds of systematic testing and improvement: a functional autoresearch pass (Karpathy method) and a progressive disclosure audit. Both ended at 100%.
Pass 1, functional autoresearch: 5 real-world test inputs × 5 binary evals = 25 points per skill. Each experiment applies one mutation, the change that fixes the most failures, and the loop repeats until the skill scores 100% on 3 consecutive stability runs.
This pass tests whether the skill produces correct, complete output across real scenarios: edge cases, missing files, wrong stack, direct activation, handoffs.
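The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `score_skill`, `is_stable`, and the `Eval` type are hypothetical names, not part of the pipeline, and each eval is assumed to be a function that returns True or False for one skill output.

```python
from typing import Callable, Sequence

# Hypothetical type: a binary eval inspects one skill output and passes or fails it.
Eval = Callable[[str], bool]


def score_skill(outputs: Sequence[str], evals: Sequence[Eval]) -> float:
    """Return the percentage of (output, eval) pairs that pass.

    With 5 outputs and 5 evals there are 25 points, so each point is worth 4%.
    """
    total = len(outputs) * len(evals)
    passed = sum(1 for out in outputs for check in evals if check(out))
    return 100.0 * passed / total


def is_stable(run_scores: Sequence[float], required: int = 3) -> bool:
    """Return True once the most recent `required` runs all scored 100%."""
    if len(run_scores) < required:
        return False
    return all(score == 100.0 for score in run_scores[-required:])
```

Under this sketch, a skill that fails all 5 evals on one of its 5 inputs scores 20/25 = 80%, and the loop would stop only after three 100% runs in a row.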
Pass 2, progressive disclosure audit: each skill was scored on 4 binary checks. Any skill below 4/4 received a targeted fix, either a jump directive or a structural rewrite.
This pass tests token efficiency: does Claude load only the relevant section, or does it read the whole skill every time?
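The 4/4 rule above amounts to a simple pass/fail gate. The sketch below shows one way to express it; `audit_skill` and the check names used in the example are hypothetical, since the source does not name the four checks.

```python
from typing import Mapping

REQUIRED_CHECKS = 4  # the audit scores each skill on 4 binary checks


def audit_skill(results: Mapping[str, bool]) -> tuple[int, bool]:
    """Return (passed_count, needs_fix) for one skill's audit results.

    `results` maps a check name (hypothetical here) to whether it passed.
    """
    if len(results) != REQUIRED_CHECKS:
        raise ValueError(f"expected {REQUIRED_CHECKS} checks, got {len(results)}")
    passed = sum(1 for ok in results.values() if ok)
    return passed, passed < REQUIRED_CHECKS


# Example with placeholder check names: one failed check flags the skill for a fix.
passed, needs_fix = audit_skill(
    {"check_a": True, "check_b": True, "check_c": False, "check_d": True}
)
```

Any skill for which `needs_fix` is True would be the kind that received a jump directive or a structural rewrite.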
All 49 skills at 100%. 120 total experiments. Average baseline 84.2%.
| Skill | Baseline | Final | Experiments | Key Fix |
|---|---|---|---|---|
| vibe-coding-orchestrator | 80% | 100% | 3 | Bug intercept before routing; routing announcements; step status on resume |
| vibe-coding-state | 80% | 100% | 3 | Concrete WRITE STATE format; VALIDATE error messages; RESUME missing-file handler |
| vibe-coding-ideate | 68% | 100% | 4 | Mid-session fast-track switch; progress.txt on activation; handoff announcement |
| vibe-coding-document | 84% | 100% | 4 | Resume skip-approved-docs; revision handler; PRD import gap-filling |
| vibe-coding-doc-prd | 76% | 100% | 4 | Thin data handler; existing PRD check; preview + approval |
| vibe-coding-doc-appflow | 72% | 100% | 3 | Preview + approval; missing PRD fallback; SPA handler |
| vibe-coding-doc-techstack | 72% | 100% | 2 | Preview + approval; stack-adaptive setup commands |
| vibe-coding-doc-design | 76% | 100% | 3 | Preview + approval; UI_UX_SELECTIONS lookup; validation checklist |
| vibe-coding-doc-backend | 72% | 100% | 3 | Preview + approval; ORM adapter; API style adapter |
| vibe-coding-doc-frontend | 72% | 100% | 2 | Preview + approval; Vite + Next.js Pages Router structure variants |
| vibe-coding-doc-implplan | 72% | 100% | 3 | Preview + approval; no-auth handler; Python stack variants |
| vibe-coding-doc-claudemd | 56% | 100% | 2 | Specific gotchas derivation; Python stack adapter for state/testing |
| vibe-coding-doc-review | 84% | 100% | 3 | globals.css path discovery; requirements.txt fallback; missing-file handler |
| vibe-coding-build | 96% | 100% | 1 | Explicit error routing to vibe-coding-build-fix on any compile failure |
| vibe-coding-css-setup | 96% | 100% | 1 | Non-Tailwind handler; Tailwind v4 setup with @import and @theme block |
| vibe-coding-design-templates | 88% | 100% | 3 | Type detection scoring; Vite variant; read APP_FLOW.md screen inventory |
| vibe-coding-design-guard | 64% | 100% | 3 | MUST FIX vs SHOULD FIX severity; Svelte + Vue check sections added |
| vibe-coding-ui-ux | 88% | 100% | 3 | Style-audience mismatch warning; skip path defaults; custom color expansion |
| vibe-coding-ui-review | 92% | 100% | 2 | DESIGN_SYSTEM.md missing handler; framework-aware fix suggestions |
| vibe-coding-code-review | 96% | 100% | 1 | Language adaptations for Python, Go, Ruby, PHP added per check |
| vibe-coding-debug | 96% | 100% | 1 | Vague bug handler — clarifying questions before reproducing |
| vibe-coding-impact-analysis | 92% | 100% | 2 | Python/Go/Ruby/PHP import pattern detection; deletion always HIGH risk |
| vibe-coding-reverse-engineer | 92% | 100% | 1 | Minimum codebase check — warn if fewer than 5 code files |
| vibe-coding-re-scan | 80% | 100% | 1 | Save to progress.txt before displaying scan output |
| vibe-coding-re-analyze | 100% | 100% | — | Already perfect at baseline |
| vibe-coding-re-generate | 96% | 100% | 1 | Wave 1 partial failure handler |
| vibe-coding-ship | 88% | 100% | 2 | Platform deploy commands; Python pre-flight; Railway/Fly.io/Render support |
| vibe-coding-explore | 100% | 100% | — | Already perfect at baseline |
| vibe-coding-recall | 100% | 100% | — | Already perfect at baseline |
| vibe-coding-security-review | 100% | 100% | — | Already perfect at baseline |
| vibe-coding-tdd | 92% | 100% | 1 | RSpec + Rust #[test] templates added; direct-activation handler |
| vibe-coding-build-fix | 76% | 100% | 1 | Go + Rust error sections added; Python module resolution section |
| vibe-coding-api-connect | 80% | 100% | 1 | Python httpx + requests client; Python OAuth2; Flask webhook handler |
| vibe-coding-cli-runner | 68% | 100% | 1 | Python CLI: venv, Alembic, Django, uvicorn/gunicorn, pytest, Makefile |
| vibe-coding-mcp-setup | 84% | 100% | 1 | JSON merge guidance; settings.json create-if-missing step; security checklist |
| vibe-coding-local-runner | 100% | 100% | — | Go, Rails, Rust, PHP sections added in rewrite pass |
| vibe-coding-deploy-vercel | 80% | 100% | 1 | Python on Vercel limitation note + serverless function config |
| vibe-coding-deploy-netlify | 80% | 100% | 1 | Python persistent server limitation — redirect to Railway/Render/Fly.io |
| vibe-coding-deploy-digitalocean | 100% | 100% | — | Already perfect at baseline |
| vibe-coding-review-react | 100% | 100% | — | React 19 patterns added in rewrite pass |
| vibe-coding-react-native | 100% | 100% | — | Expo New Architecture check added in rewrite pass |
| vibe-coding-web-design-guidelines | 100% | 100% | — | Already perfect at baseline |
| vibe-coding-self-improve | 40% | 100% | 1 | Fixed handoff claims; progress.txt format; orchestrator registration; results file created |
| vibe-coding-db | 80% | 100% | 1 | Non-canonical PHASE: DB_SETUP replaced with BUILD phase append format |
| vibe-coding-db-sqlite | 80% | 100% | 1 | progress.txt update section added with BUILD phase append format |
| vibe-coding-db-bettersqlite | 80% | 100% | 1 | progress.txt update section added with BUILD phase append format |
| vibe-coding-db-postgres | 80% | 100% | 1 | progress.txt update section added with connection_pooling flag |
| vibe-coding-db-duckdb | 80% | 100% | 1 | progress.txt update section added with primary_db field |
| vibe-coding-db-convex | 80% | 100% | 1 | progress.txt update section added with auth_configured and file_storage_configured flags |
The full pipeline is free and open source on GitHub.