Runtime API

POST /v1/evaluate/input

Evaluate user input before calling the LLM

Evaluate input text before it reaches the LLM. This is the most common endpoint integrators call. Returns an ``EvaluationResponse`` with one of ``ALLOW``, ``BLOCK``, or ``TRANSFORM`` plus per-detector results. Uses the policy resolved for the scope (tenant/app/agent/env) of the request.

Runtime token
scope: evaluate + bundle fetch
Subject to per-plan eval quota
operation_id: runtime.evaluateInput

Authentication

Requires a runtime token, sent in the Authorization header as a Bearer token. Create one via POST /v1/orgs/{org_id}/tokens/runtime. Each token is scoped to one project + environment.

SDK install

pip install znyx-sdk
npm install @znyx/sdk

Request body (required)

Field	Type	Required	Description
request_id	string	required	Unique identifier for this request
tenant_id	string	required	Tenant identifier
app_id	string	required	Application identifier
agent_id	string	optional	Agent identifier
env	string	optional	Environment (prod, staging, dev)
text	string	required	Text to evaluate
metadata	object | null	optional	Optional metadata
trace_id	string | null	optional	Distributed trace ID for correlation
session_id	string | null	optional	Session/conversation ID for grouping
span_id	string | null	optional	Span ID within a trace

Responses

Status	Description
200	Successful Response
422	Validation Error

Response schema

Field	Type	Required	Description
request_id	string	required	-
decision	Decision	required	-
risk_score	integer	required	Risk score from 0-100
policy_version	string	required	-
rule_hits	array	optional	-
sanitized_text	string | null	optional	Sanitized text if REDACT/TRANSFORM
sanitized_tool_args	object | null	optional	Sanitized tool args (for tool evaluation)
user_message	string | null	optional	Safe message to show end-user when blocked
developer_message	string | null	optional	Developer-facing explanation
latency_ms	integer | null	optional	Total evaluation latency in milliseconds
trace_id	string | null	optional	Trace ID for distributed tracing correlation
session_id	string | null	optional	Session/conversation ID echoed from request
span_id	string | null	optional	Span ID within a trace echoed from request
detector_results	array	optional	Per-detector timing breakdown
quality	QualityReport | null	optional	Response quality scores (output context only)
field_errors	array	optional	Field-level errors from structured output validation
remediation	RemediationResult | null	optional	Remediation action applied after detector decision
pending_review_id	string | null	optional	Human review queue ID if ask_human remediation was triggered

Errors & what triggers them

Code	Trigger	Fix
401	Missing or invalid X-API-Key / Authorization header.	Check the token is still active — rotated tokens return 401 after the grace period ends.
403	Token does not have the `evaluate` scope.	Use a runtime token (POST /v1/orgs/{org_id}/tokens/runtime).
422	Request body failed Pydantic validation (missing tenant_id, bad context, etc.).	—
429	Monthly evaluation quota hit for your plan.	Upgrade via POST /v1/billing/checkout, or wait for the next monthly reset.
500	Detector crashed or resolver timed out. Typically transient.	Retry with backoff. If it persists, check Traces for the request_id.
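
The error table suggests retrying only the transient 500 case with backoff. A minimal sketch of that policy follows, assuming a caller-supplied send function that returns an HTTP status and a parsed body. The names call_with_backoff and RETRYABLE, the attempt count, and the delay values are illustrative choices, not part of the API.

```python
import time
from typing import Callable, Tuple

# Per the error table, only 500 is described as transient.
# 401, 403, 422 and 429 need a fix on the caller's side, so they are not retried.
RETRYABLE = {500}


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out.

    The delay doubles after each failed attempt (exponential backoff).
    The default attempt count and delay are assumptions, not documented values.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper easy to exercise in tests without real waiting.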

Notes & examples

When to use this

Call /v1/evaluate/input before sending the user message to your LLM. It returns an ALLOW, BLOCK, or TRANSFORM decision plus per-detector results, and takes ~10-40 ms depending on which detectors are enabled.

ALLOW: proceed to the LLM with the original text.
BLOCK: do not call the LLM. Return response.user_message (or your own static message) to the user.
TRANSFORM: use response.sanitized_text (PII redacted, harmful fragments removed) as the LLM input.
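
The three decisions above can be sketched as a small helper that works on the parsed JSON response. The function names and the fallback message are hypothetical, and treating an unrecognised decision or a missing sanitized_text as "do not call the LLM" is an assumption made here to fail closed.

```python
from typing import Optional


def resolve_llm_input(original_text: str, response: dict) -> Optional[str]:
    """Return the text to send to the LLM, or None if the LLM call must be skipped."""
    decision = response["decision"]
    if decision == "ALLOW":
        return original_text
    if decision == "TRANSFORM":
        # Assumption: if sanitized_text is missing, return None and skip the LLM.
        return response.get("sanitized_text")
    # BLOCK, and any decision this sketch does not recognise, skips the LLM.
    return None


def blocked_message(
    response: dict,
    fallback: str = "Sorry, I can't help with that request.",
) -> str:
    """Pick the message shown to the end user when the LLM call is skipped."""
    return response.get("user_message") or fallback
```

A caller would check the result of resolve_llm_input for None and, in that case, show blocked_message(response) instead of calling the LLM.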

Common pitfalls

The tenant_id / app_id / agent_id / env 4-tuple controls which policy is resolved. For most single-tenant customers, set tenant_id to the org UUID and app_id to the project UUID. Use the string "default" in the tenant position only if you're running the open-source runtime without a control plane.
request_id should be unique per request. Reusing IDs makes trace correlation useless on the Traces page.
metadata is free-form — stash user_id / session_id / conversation_id here. They flow through to traces and webhooks.
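
One way to avoid the request_id pitfall is to generate the ID at the point where the payload is built. The sketch below assumes a UUID4 string is an acceptable request_id (the reference only asks for uniqueness) and that optional fields may simply be omitted. build_evaluate_payload is a hypothetical helper, not an SDK function.

```python
import uuid
from typing import Optional


def build_evaluate_payload(
    tenant_id: str,
    app_id: str,
    text: str,
    agent_id: Optional[str] = None,
    env: Optional[str] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Build a request body with a fresh request_id on every call."""
    payload = {
        # Assumption: any unique string works; uuid4 is one convenient source.
        "request_id": str(uuid.uuid4()),
        "tenant_id": tenant_id,
        "app_id": app_id,
        "text": text,
    }
    # Only include optional fields the caller actually set.
    optional = {"agent_id": agent_id, "env": env, "metadata": metadata}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, build_evaluate_payload(org_uuid, project_uuid, user_text, env="prod") follows the single-tenant convention described above, with org_uuid and project_uuid standing in for your own identifiers.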

Tracking latency

Every response includes latency_ms. Also watch detector_results[].latency_ms — if one detector dominates, disable it in policy rather than tuning timeouts.
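
To spot a dominating detector, a small helper over the parsed response can report the slowest entry and its share of the total. This is a sketch: slowest_detector is a hypothetical name, and the detector names in the usage note are made up for illustration.

```python
from typing import Optional, Tuple


def slowest_detector(response: dict) -> Optional[Tuple[str, int, Optional[float]]]:
    """Return (detector_name, latency_ms, share_of_total) for the slowest detector.

    Returns None when no detector reports a latency. share_of_total is None
    when the top-level latency_ms is missing or zero.
    """
    results = response.get("detector_results") or []
    timed = [r for r in results if r.get("latency_ms") is not None]
    if not timed:
        return None
    worst = max(timed, key=lambda r: r["latency_ms"])
    total = response.get("latency_ms")
    share = worst["latency_ms"] / total if total else None
    return worst["detector_name"], worst["latency_ms"], share
```

With a response whose latency_ms is 40 and whose detectors report 5 ms ("pii") and 30 ms ("injection"), the helper returns ("injection", 30, 0.75).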

Related endpoints

POST /v1/evaluate/output: same shape, but applied to the LLM response.
POST /v1/evaluate/stream: for streaming responses (tokens evaluated as they arrive).
GET /v1/bundles/latest: cache the resolved policy locally and avoid a DB hit per call.

Request

curl -X POST 'https://api.znyx.ai/v1/evaluate/input' \
  -H "Authorization: Bearer $ZNYX_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
  "request_id": "string",
  "tenant_id": "string",
  "app_id": "string",
  "agent_id": "default",
  "env": "prod",
  "text": "string",
  "metadata": null,
  "trace_id": null,
  "session_id": null,
  "span_id": null
}'
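
The same request can be sketched in Python using only the standard library, which avoids assuming anything about the SDK's interface. The URL and headers mirror the curl call above. build_request and evaluate_input are hypothetical helper names, and the 5-second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://api.znyx.ai/v1/evaluate/input"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def evaluate_input(token: str, payload: dict, timeout: float = 5.0) -> dict:
    """Send the request and return the parsed JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so the
    status codes in the error table surface as exceptions here.
    """
    with urllib.request.urlopen(build_request(token, payload), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body in tests without network access.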

Response

application/json

Successful Response

{
  "request_id": "string",
  "decision": "ALLOW",
  "risk_score": 0,
  "policy_version": "string",
  "rule_hits": [
    {
      "rule_id": "string",
      "severity": "low",
      "message": "string"
    }
  ],
  "sanitized_text": null,
  "sanitized_tool_args": null,
  "user_message": null,
  "developer_message": null,
  "latency_ms": null,
  "trace_id": null,
  "session_id": null,
  "span_id": null,
  "detector_results": [
    {
      "detector_name": "string",
      "decision": null,
      "risk_score": 0,
      "latency_ms": 0,
      "rule_hits": [
        {
          "rule_id": "string",
          "severity": "low",
          "message": "string"
        }
      ],
      "transformed": false
    }
  ],
  "quality": null,
  "field_errors": [
    {
      "path": "string",
      "message": "string",
      "expected": null,
      "actual": null
    }
  ],
  "remediation": null,
  "pending_review_id": null
}
