ShipVitals Open-Source Benchmark Results

Current result

20 repositories audited

0 / 0 P0 and P1 findings

74-89 evidence-capped scores

The repositories are mature public projects, so zero release blockers is plausible. The useful signal is that documentation examples and vendored files no longer become false P0 findings, while missing UI proof still limits confidence.

Coverage

Category	Count	Observed cap
CLI, API, library	11	Independent review
Landing or content	2	Visual and independent proof
SaaS or Next.js	2	Visual and independent proof
Marketplace or extension	3	Visual and independent proof
Agent or automation	2	Independent review

What the scores mean

A score of 74 does not mean a UI project is poor. It means static files and deterministic commands cannot establish visual behavior. A score of 89 does not mean a library is nearly perfect. It means the available checks passed while independent review remains absent.
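One way to read these numbers is as a raw check score clipped by the strongest evidence available. The TypeScript sketch below illustrates that reading only. The cap values 74 and 89 are an assumption: they are the endpoints of the reported range, paired here with the two observed caps from the Coverage table. The names `MissingEvidence`, `CAPS`, and `cappedScore` are hypothetical and do not describe ShipVitals' actual scoring code.

```typescript
// Hypothetical sketch: the real scoring rules are not stated on this page.
type MissingEvidence =
  | "none"
  | "independent-review"
  | "visual-and-independent-proof";

// Assumed cap values, inferred from the reported 74-89 range.
const CAPS: Record<MissingEvidence, number> = {
  "none": 100,
  "independent-review": 89,
  "visual-and-independent-proof": 74,
};

// The reported score is the raw score clipped to the cap for the missing evidence.
function cappedScore(rawScore: number, missing: MissingEvidence): number {
  return Math.min(rawScore, CAPS[missing]);
}

console.log(cappedScore(96, "independent-review"));           // 89
console.log(cappedScore(96, "visual-and-independent-proof")); // 74
console.log(cappedScore(70, "independent-review"));           // 70
```

Under this reading, a cap only ever lowers a score: a project that scores below its cap on the available checks keeps its raw score.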

ShipVitals records these boundaries in each report. Projects are listed for calibration only and are not affiliated with or endorsed by ShipVitals.

Reproduce the suite

npm run benchmark:real

The manifest stores repository URLs, categories, expected commands, and audit context. Each result directory contains report.json, run.json, and a Markdown summary. Inspect the full benchmark table.
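As a rough illustration of how the manifest could be cross-checked against the Coverage table, the TypeScript sketch below models one manifest entry and tallies entries per category. The field names (`url`, `category`, `commands`, `auditContext`) are assumptions based on the description above, not the actual manifest schema, and the sample entries are placeholders rather than benchmark repositories.

```typescript
// Hypothetical manifest entry: field names are assumed, not taken from the real schema.
interface ManifestEntry {
  url: string;          // repository URL
  category: string;     // e.g. "CLI, API, library"
  commands: string[];   // expected commands
  auditContext: string; // audit context
}

// Count entries per category so the totals can be compared with the Coverage table.
function countByCategory(entries: ManifestEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of entries) {
    counts[entry.category] = (counts[entry.category] || 0) + 1;
  }
  return counts;
}

// Placeholder entries for illustration only.
const sampleManifest: ManifestEntry[] = [
  { url: "https://example.com/org/cli-tool", category: "CLI, API, library", commands: ["npm test"], auditContext: "placeholder" },
  { url: "https://example.com/org/sdk", category: "CLI, API, library", commands: ["npm test"], auditContext: "placeholder" },
  { url: "https://example.com/org/site", category: "Landing or content", commands: ["npm run build"], auditContext: "placeholder" },
];

console.log(countByCategory(sampleManifest));
```

Run against the real manifest, the per-category totals should match the Count column and sum to the 20 audited repositories.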

Results you can inspect, not a victory chart.
