Runtime API
POST /v1/evaluate/stream
Evaluate streaming LLM output as it arrives
Evaluate text chunks via Server-Sent Events. Accepts a list of text chunks and streams back SSE events as each window is evaluated. Events:
- `chunk`: forwarded text chunk
- `guardrail`: window evaluation result (non-blocking)
- `block`: window triggered a BLOCK decision
- `done`: final summary with aggregate metrics
Authentication
Create a runtime token via POST /v1/orgs/{org_id}/tokens/runtime. Each token is scoped to one project and one environment. Send it in the X-API-Key or Authorization header.
SDK install
pip install znyx-sdk
npm install @znyx/sdk
Header parameters
| Name | Type | Required | Description |
|---|---|---|---|
| X-API-Key | string \| null | optional | — |
| authorization | string \| null | optional | — |
Request body (required)
| Field | Type | Required | Description |
|---|---|---|---|
| request_id | string | optional | — |
| tenant_id | string | optional | — |
| app_id | string | optional | — |
| context | string | optional | input or output |
| chunks | string[] | required | Text chunks to evaluate in order |
| policy | object \| null | optional | Inline policy. When omitted, the engine resolves the policy by tenant_id / app_id. |
| window_size | integer | optional | — |
| overlap | integer | optional | — |
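
A body matching the table above can be assembled with the standard library alone. The sketch below is illustrative only: `build_stream_request` is a hypothetical helper, not part of the znyx SDK, and its client-side checks are assumptions that mirror the table (only `chunks` is required, and `context` is `input` or `output`).

```python
import json
from typing import Any, Optional

ALLOWED_CONTEXTS = {"input", "output"}  # values listed in the request body table


def build_stream_request(
    chunks: list[str],
    *,
    context: Optional[str] = None,
    request_id: Optional[str] = None,
    tenant_id: Optional[str] = None,
    app_id: Optional[str] = None,
    policy: Optional[dict[str, Any]] = None,
    window_size: Optional[int] = None,
    overlap: Optional[int] = None,
) -> str:
    """Return a JSON body for POST /v1/evaluate/stream (hypothetical helper)."""
    # The endpoint expects decoded text, so reject token IDs early.
    if not all(isinstance(c, str) for c in chunks):
        raise TypeError("chunks must be decoded strings, not token IDs")
    if context is not None and context not in ALLOWED_CONTEXTS:
        raise ValueError(f"context must be one of {sorted(ALLOWED_CONTEXTS)}")

    body: dict[str, Any] = {"chunks": chunks}
    optional_fields = {
        "context": context,
        "request_id": request_id,
        "tenant_id": tenant_id,
        "app_id": app_id,
        "policy": policy,
        "window_size": window_size,
        "overlap": overlap,
    }
    # Omit unset optional fields so the server applies its own defaults.
    body.update({k: v for k, v in optional_fields.items() if v is not None})
    return json.dumps(body)
```

Leaving unset fields out of the body, rather than sending explicit nulls, keeps the request minimal; whether the server treats the two cases identically is not stated in this reference.
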
Responses
| Status | Description |
|---|---|
| 200 | Successful Response |
| 422 | Validation Error |
Response schema
Errors & what triggers them
| Code | Trigger | Fix |
|---|---|---|
| 401 | Missing or invalid X-API-Key / Authorization header. | Check the token is still active — rotated tokens return 401 after the grace period ends. |
| 403 | Token does not have the `evaluate` scope. | Use a runtime token (POST /v1/orgs/{org_id}/tokens/runtime). |
| 422 | Request body failed Pydantic validation (missing chunks, bad context, etc.). | — |
| 429 | Monthly evaluation quota hit for your plan. | Upgrade via POST /v1/billing/checkout, or wait for the next monthly reset. |
| 500 | Detector crashed or resolver timed out. Typically transient. | Retry with backoff. If it persists, check Traces for the request_id. |
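
The "retry with backoff" advice for 500 responses can be sketched as follows. This is a minimal illustration under stated assumptions: `call_with_backoff` and the `send` callable (which performs one request and returns its HTTP status code) are hypothetical names, and the attempt count and delays are arbitrary starting points rather than documented values. Only 500 is retried; 401, 403, 422 and 429 need a fix on the caller's side, so they are returned immediately.

```python
import random
import time
from typing import Callable

RETRYABLE = {500}  # the errors table describes 500 as typically transient


def call_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE:
            break
        # The base delay doubles on every retry; jitter adds up to 100% on top
        # so that many clients do not retry in lockstep.
        sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random()))
        status = send()
    return status
```

Passing `sleep` as a parameter keeps the helper easy to exercise in tests without real waiting.
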
Notes & examples
When to use this
Use the streaming endpoint when you want to block a response while it is still being generated — not after the whole response is already in the user's hands. Typical cases:
- Token-streaming chat UIs (OpenAI / Anthropic style).
- Long-form generation where waiting for `/evaluate/output` after the full response would mean the user has already seen unsafe text.
- Multi-agent pipelines where a tool-call argument needs to be screened before the next tool fires.
Sliding window model
The request is a list of chunks (one per stream event). Internally, the engine concatenates them into a rolling buffer and evaluates every time the buffer reaches window_size characters, with overlap characters carried forward so phrases spanning two chunks still match.
The defaults (window_size=200, overlap=40, both in characters) are tuned for Latin-script chat use. Increase them if detectors produce false positives at chunk boundaries.
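
The windowing described above can be approximated in a few lines. This is a sketch of the described behaviour, not the engine's actual code: `iter_windows` is a hypothetical name, and flushing a trailing partial window at the end of the stream is an assumption this reference does not confirm.

```python
from typing import Iterable, Iterator


def iter_windows(
    chunks: Iterable[str], window_size: int = 200, overlap: int = 40
) -> Iterator[str]:
    """Yield the text windows a rolling buffer would hand to the detectors."""
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be non-negative and smaller than window_size")
    buffer = ""
    fresh = 0  # characters in the buffer not yet covered by a yielded window
    for chunk in chunks:
        buffer += chunk
        fresh += len(chunk)
        while len(buffer) >= window_size:
            yield buffer[:window_size]
            # Carry the last `overlap` characters forward so a phrase that
            # straddles the boundary appears whole in the next window.
            buffer = buffer[window_size - overlap:]
            fresh = len(buffer) - overlap
    if fresh > 0:
        # Assumed behaviour: evaluate whatever is left when the stream ends.
        yield buffer
```

With `window_size=4` and `overlap=1`, the chunks `"abcdef"` and `"ghij"` produce the windows `abcd`, `defg`, `ghij`: each window starts with the final character of the previous one.
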
Server-Sent Events
The response is text/event-stream with four event types:
event: chunk
data: {"text": "Hello, let me help..."}
event: guardrail
data: {"window_index": 0, "decision": "ALLOW", "risk_score": 12}
event: block
data: {"detector": "pii", "window_index": 3, "risk_score": 92, "message": "Email redacted"}
event: done
data: {"total_windows": 5, "blocked": true, "aggregate_decision": "BLOCK"}
Your client should treat `block` as terminal: stop forwarding chunks to the user the moment you see one.
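
A client loop that honours this rule might look like the sketch below. It assumes standard SSE framing (a blank line ends each event) and the payload fields shown in the sample events; `parse_sse` and `forward_until_block` are hypothetical names, not SDK functions. The parser takes any iterable of text lines, so it can be fed from whichever HTTP client you use.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, data) pairs from text/event-stream lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:  # lenient: flush a final event that lacks a trailing blank line
        yield event, json.loads("\n".join(data))


def forward_until_block(lines: Iterable[str]) -> tuple[list[str], Optional[dict]]:
    """Collect chunk text for the user, stopping at the first block event."""
    shown: list[str] = []
    for event, data in parse_sse(lines):
        if event == "block":
            return shown, data  # terminal: nothing after this is forwarded
        if event == "chunk":
            shown.append(data["text"])
    return shown, None
```

Returning the `block` payload alongside the text already shown lets the caller decide how to explain the interruption to the user.
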
Common pitfalls
- `chunks` takes strings, not raw token IDs. Decode upstream from whatever your LLM client yields.
- If you don't set a `policy` inline, the engine resolves by `tenant_id` / `app_id`, the same as `/evaluate/output`. Cache the bundle locally and pass `policy` directly in latency-sensitive deployments.
- This endpoint lives on the runtime, not the control plane. Point your client at your runtime's hostname, not `api.znyx.ai`.
Related
- `POST /v1/evaluate/output`: non-streaming equivalent, simpler to wire up.
- `POST /v1/evaluate/input`: screen user input before the LLM sees it.
Request
curl -X POST 'https://YOUR_RUNTIME_HOST/v1/evaluate/stream' \
-H "Authorization: Bearer $ZNYX_TOKEN" \
-H 'Content-Type: application/json' \
-d '{
"request_id": "stream-0",
"tenant_id": "default",
"app_id": "default",
"context": "output",
"chunks": [
"string"
],
"policy": null,
"window_size": 200,
"overlap": 40
}'
Response
Successful Response
null
Schema: any