API Reference
Run a dataset against a candidate policy, inspect per-sample results, and diff two runs side by side.
/v1/orgs/{org_id}/benchmarks
List Benchmarks
Start Benchmark
/v1/orgs/{org_id}/benchmarks/compare
Compare Benchmarks
/v1/orgs/{org_id}/benchmarks/{run_id}
Get Benchmark
/v1/orgs/{org_id}/benchmarks/{run_id}/results
Get Benchmark Results
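The path templates above take an `org_id` and, for single-run endpoints, a `run_id`. As a minimal sketch of how a client might fill them in, the helper below builds full URLs from the four templates. The host `https://api.example.com`, the dictionary keys, and the function name `benchmark_url` are hypothetical and not part of this reference. HTTP methods are not listed above, so the sketch only builds URLs and sends no requests.

```python
from urllib.parse import quote

# Path templates copied verbatim from the reference above.
# The dictionary keys are illustrative labels, not official operation IDs.
BENCHMARK_PATHS = {
    "collection": "/v1/orgs/{org_id}/benchmarks",          # List Benchmarks, Start Benchmark
    "compare": "/v1/orgs/{org_id}/benchmarks/compare",     # Compare Benchmarks
    "run": "/v1/orgs/{org_id}/benchmarks/{run_id}",        # Get Benchmark
    "results": "/v1/orgs/{org_id}/benchmarks/{run_id}/results",  # Get Benchmark Results
}

# Hypothetical host: substitute the real API base URL.
BASE_URL = "https://api.example.com"


def benchmark_url(name: str, **params: str) -> str:
    """Fill a path template, percent-encoding each path parameter.

    Raises KeyError if the template name is unknown or a required
    path parameter (org_id, run_id) is missing.
    """
    template = BENCHMARK_PATHS[name]
    # safe="" also encodes "/", so a parameter value cannot add path segments.
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return BASE_URL + template.format(**encoded)


if __name__ == "__main__":
    print(benchmark_url("collection", org_id="acme"))
    print(benchmark_url("results", org_id="acme", run_id="run_123"))
```

Encoding each parameter with `safe=""` keeps a value such as `a/b` inside a single path segment, so it cannot be mistaken for a different endpoint.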