RelayForge™ · DAWES

DAWES Methodology

Document in Progress

The full DAWES methodology document covers question authorship and domain calibration, scoring rubric and threshold derivation, evaluator prompting and judge selection, run replication protocol, and failure categorization taxonomy. It will be published here when complete.

Domain-specific question authorship and calibration · Drafting
Scoring rubric (0–100) and Tier 1 pass threshold (≥70) · Published in benchmark
LLM judge selection and prompting methodology · Published in benchmark
Cross-run reproducibility and variance analysis · In progress
Failure categorization taxonomy · Drafting
View Live Benchmark Results → · Research Sources

© 2026 RelayForge, Inc. · Anacortes, WA