The streaming chat API is two endpoints: a POST that kicks off an agent run and returns a stream id, and a GET that connects to a Server-Sent Events stream pushed by that run. This split keeps the writer and reader independent — you can hand the stream id to a different process, retry the GET if the connection drops, or start the stream from one device and render it on another.
Endpoints
/ai/agent/streamJWT/ai/stream/:streamIdJWTRequest body
| Field | Type | Description |
|---|---|---|
| task* | string | The user's prompt or question. |
| agentRole | string | Agent to use — |
| agentInstructions | string | System-level instructions prepended to the prompt. Equivalent to a system message. |
| conversationHistory | array | Array of |
| conversationId | string | Persistent conversation id. The server loads the last 20 messages from |
| context | object | Free-form context object. Reserved keys: |
| settings | object | Pass-through settings for the underlying model — |
| memoryTtl | number | TTL in seconds for any memory-store calls the agent makes during this run. |
| clientMcpTools | array | Tools the client exposes (not server-side). The agent calls back to the client via streamed |
| availableTools | array | Whitelist of server-side tool names the agent may call. If omitted, all registered tools are available. |
| files | array | Documents to extract into context — |
| images | array | Inline images for vision-capable models — |
Two-step flow
// Step 1: kick off the run
const { streamId } = await fetch('/ai/agent/stream', {
method: 'POST',
headers: {
orgid: 'my-org',
Authorization: `Bearer ${jwt}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
task: 'Summarise the latest TechCrunch article on the link.',
agentRole: 'chat',
conversationId: 'conv-123',
context: { aiMode: 'auto' },
files: [{ path: 'uploads/report.pdf', name: 'report.pdf' }],
}),
}).then(r => r.json());
// Step 2: open the SSE stream
const es = new EventSource(`/ai/stream/${streamId}?orgid=my-org`, {
withCredentials: true,
});
es.addEventListener('data', (ev) => {
const chunk = JSON.parse(ev.data);
// chunk shape varies; see "Chunk types" below
});
es.addEventListener('end', () => es.close());
es.addEventListener('error', (ev) => {
console.error(ev);
es.close();
});
Browser EventSource cannot send custom headers. Pass orgid as a query param, and include the JWT in a cookie set during sign-in (the JWT guard accepts cookie auth for SSE GETs). Server-side clients can use a fetch streaming reader with full headers — see the curl/Python examples below.
curl
# Step 1
curl https://appengine.appmint.io/ai/agent/stream \
-H "orgid: my-org" \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"task":"Hello","agentRole":"chat","conversationId":"c1"}'
# {"streamId":"abc123"}
# Step 2
curl -N https://appengine.appmint.io/ai/stream/abc123 \
-H "orgid: my-org" \
-H "Authorization: Bearer $JWT"
# event: data
# data: {"type":"text","text":"Hi"}
# event: data
# data: {"type":"text","text":" there"}
# event: end
# data: {}
Python
import json, requests, sseclient
base = 'https://appengine.appmint.io'
headers = {
'orgid': 'my-org',
'Authorization': f'Bearer {jwt}',
'Content-Type': 'application/json',
}
# Step 1
r = requests.post(
f'{base}/ai/agent/stream',
headers=headers,
data=json.dumps({
'task': 'List the steps to deploy a static site to Vercel.',
'agentRole': 'chat',
'conversationId': 'demo-1',
}),
)
stream_id = r.json()['streamId']
# Step 2
r = requests.get(f'{base}/ai/stream/{stream_id}', headers=headers, stream=True)
client = sseclient.SSEClient(r)
for ev in client.events():
if ev.event == 'end':
break
if ev.event == 'data':
chunk = json.loads(ev.data)
print(chunk)
Chunk types
The agent emits a small set of data chunks during a run. Common shapes:
// Streaming text token
{ "type": "text", "text": "..." }
// Tool call request (when the agent decides to call a tool)
{ "type": "tool_use", "id": "...", "name": "server.memory.retrieve", "input": {...} }
// Tool call result (after the tool ran)
{ "type": "tool_result", "id": "...", "output": {...} }
// Document processing started
{ "type": "doc", "stage": "extract_started", "fileName": "report.pdf" }
// Image generation result (when the chat agent generates an image)
{ "type": "image", "url": "...", "prompt": "..." }
// Final usage summary, before the end event
{ "type": "usage", "model": "claude-...", "inputTokens": 412, "outputTokens": 128 }
Render text chunks as they arrive; render tool_use and tool_result as activity indicators ("Looking up...", "Got 42 results"); save usage to your local accounting if you do per-user billing.
Server-loaded history
When you pass conversationId, the server loads the last 20 messages from chat_message for that conversation and uses them as conversationHistory. That means a single-message client doesn't need to track history at all — just send the new turn with the same conversationId each time.
The persistence happens transparently: the controller writes the user's request and (after the stream completes) the assistant's full response to chat_message, both keyed to conversationId.
// Turn 1
await fetch('/ai/agent/stream', {
method: 'POST',
headers: { /* ... */ },
body: JSON.stringify({
task: 'My name is Alex.',
conversationId: 'conv-alex-1',
}),
});
// Turn 2 — server already knows Alex from the persisted turn 1
await fetch('/ai/agent/stream', {
method: 'POST',
headers: { /* ... */ },
body: JSON.stringify({
task: 'What is my name?',
conversationId: 'conv-alex-1',
}),
});
Model selection
Models route automatically based on the agent's config and the aiMode selected. For explicit override, pass settings.model in the request body — the model id is forwarded to the underlying provider client.
Errors
- 402 Payment Required with
code: INSUFFICIENT_BALANCE— org has no active subscription and insufficient credit. Body includesrequired,available, andbillingUrl. - 400 — missing
taskor invalidagentRole. - 404 on the GET —
streamIddoesn't exist (expired or never created). Streams are kept for 5 minutes after creation.
Stream sessions are single-consumer. If you open /ai/stream/:streamId twice, the second connection won't receive duplicated events — the stream handler is bound to the first reader.