VTKL Quality Dashboard

👥 Human Review — Reaction-Based Accuracy

The team reacts to Warren's outputs in #warren-review with ✅ (accurate), ⚠️ (partial), or ❌ (inaccurate). This is the only eval: AI judging AI is dead. A Tier A function graduates once it reaches 95%+ accuracy across 20+ reviews.
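The graduation rule above can be sketched in a few lines of Python. This is only an illustration, not the dashboard's actual code: the names `VerdictCounts`, `accuracy`, and `graduates` are hypothetical, and the dashboard does not say how a ⚠️ (partial) verdict is weighted, so `partial_weight` is an assumption (default 0.0, meaning a partial counts as not accurate). Only the two thresholds come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the dashboard text: 95%+ accuracy over 20+ reviews.
GRADUATION_ACCURACY = 0.95
GRADUATION_MIN_REVIEWS = 20


@dataclass
class VerdictCounts:
    """Reaction tallies for one function (hypothetical structure)."""
    accurate: int = 0    # ✅
    partial: int = 0     # ⚠️
    inaccurate: int = 0  # ❌

    @property
    def reviews(self) -> int:
        return self.accurate + self.partial + self.inaccurate


def accuracy(counts: VerdictCounts, partial_weight: float = 0.0) -> Optional[float]:
    """Share of verdicts counted as accurate, or None with no reviews.

    partial_weight is an assumption: the source does not state how
    partial verdicts contribute to the accuracy figure.
    """
    if counts.reviews == 0:
        return None
    score = counts.accurate + partial_weight * counts.partial
    return score / counts.reviews


def graduates(counts: VerdictCounts, tier: str, partial_weight: float = 0.0) -> bool:
    """True when a Tier A function meets both graduation thresholds."""
    acc = accuracy(counts, partial_weight)
    return (
        tier == "A"
        and acc is not None
        and counts.reviews >= GRADUATION_MIN_REVIEWS
        and acc >= GRADUATION_ACCURACY
    )
```

For example, `graduates(VerdictCounts(accurate=39, partial=1), "A")` is true (39/40 = 97.5% over 40 reviews), while ten ✅ verdicts with nothing else do not graduate because the review count is below 20.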

Total Outputs

—

posted to #warren-review

Reviewed

—

with ≥1 human verdict

Pending

—

awaiting review

Overall Accuracy

—

weighted across reviewed outputs

Per-Function Accuracy

ID	Function	Tier	Reviews	✅	⚠️	❌	Accuracy	Graduation

Reviewer Activity

Reviewer	Total Verdicts	✅ Accurate	⚠️ Partial	❌ Inaccurate

Recent Outputs (last 15)

Date	Function	Tier	Status	Verdicts

Graduation Progress (Tier A → 95% over 20+ reviews)

🛡️ Deterministic Quality Gate

Regex-based checks. Zero LLM cost. Catches obvious violations before output ships. This is the only automated gate that stays — no AI judgment involved.
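A deterministic gate of this kind might look like the sketch below. The rules in `EXAMPLE_CHECKS` are invented for illustration (the dashboard does not list the real SOP checks), and `Check`, `run_gate`, and `pass_rate` are hypothetical names. The point is only that each check is a compiled regular expression that is either required or forbidden, so the result is repeatable and involves no model call.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Check:
    name: str
    pattern: "re.Pattern[str]"
    must_match: bool  # True: pattern is required; False: pattern is forbidden


# Hypothetical rules, not taken from the dashboard's actual SOPs.
EXAMPLE_CHECKS: List[Check] = [
    Check("no_placeholder_text",
          re.compile(r"\b(TODO|TBD|lorem ipsum)\b", re.IGNORECASE),
          must_match=False),
    Check("has_summary_line",
          re.compile(r"^Summary:", re.MULTILINE),
          must_match=True),
]


def run_gate(text: str, checks: List[Check]) -> Dict[str, bool]:
    """Run every check against the text; map check name to pass/fail."""
    results = {}
    for check in checks:
        found = check.pattern.search(text) is not None
        # A check passes when the pattern's presence matches the expectation.
        results[check.name] = (found == check.must_match)
    return results


def pass_rate(results: Dict[str, bool]) -> Optional[float]:
    """Fraction of checks that passed, or None if no checks ran."""
    if not results:
        return None
    return sum(results.values()) / len(results)
```

Under these assumed rules, `run_gate("Summary: all good", EXAMPLE_CHECKS)` passes both checks, while a draft containing `TODO` and no `Summary:` line fails both.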

SOPs Wired

—

Total Checks Run

—

Pass Rate

—

Per-SOP Results

SOP	Checks	Pass	Fail	Rate

⚙️ Eval System Status

Active Eval Pipeline

Human Review

Channel: #warren-review

QA Manager: Dukane

Verdict method: ✅ ⚠️ ❌ emoji

Review digest cron: Daily 10 AM PT

Quality Gate

Active

Type: Deterministic regex

LLM cost: $0

SOPs wired: —

Killed Systems

Disabled

Shadow review: OFF

Correlation engine: OFF

Self-improvement: OFF

Aria shadow review: OFF

Reason: AI judging AI amplifies shared faults

Data Sources

Human reviews JSON: —

Dashboard data JSON: —

Dashboard export cron: Daily 12 PM PT

Review collector: collect-review-reactions.py
