Documentation

Chat agent streaming

The two-step SSE pattern — POST returns a streamId, GET streams tokens.

The streaming chat API is two endpoints: a POST that kicks off an agent run and returns a stream id, and a GET that connects to a Server-Sent Events stream pushed by that run. This split keeps the writer and reader independent — you can hand the stream id to a different process, retry the GET if the connection drops, or start the stream from one device and render it on another.

Endpoints

| Method | Path | Auth |
| --- | --- | --- |
| POST | /ai/agent/stream | JWT |
| GET | /ai/stream/:streamId | JWT |

Request body

| Field | Type | Description |
| --- | --- | --- |
| `task`* | string | The user's prompt or question. |
| `agentRole` | string | Agent to use: `chat` (default), `ai-agent`, or any registered custom agent id. |
| `agentInstructions` | string | System-level instructions prepended to the prompt. Equivalent to a system message. |
| `conversationHistory` | array | Array of `{ role, content }` messages from prior turns. Server-side history is preferred (see `conversationId`), but the client can pass this as a fallback. |
| `conversationId` | string | Persistent conversation id. The server loads the last 20 messages from the `chat_message` collection for this id and uses them as history. |
| `context` | object | Free-form context object. Reserved keys: `aiMode` (one of `developer`, `data-analyst`, `graphic-artist`, `content-writer`) and `systemPrompt` (overrides `agentInstructions`). |
| `settings` | object | Pass-through settings for the underlying model: `temperature`, `maxTokens`, `model`, etc. |
| `memoryTtl` | number | TTL in seconds for any memory-store calls the agent makes during this run. |
| `clientMcpTools` | array | Tools the client exposes (not server-side). The agent calls back to the client via streamed `tool_use` events; the client executes and sends results back. See Function calling. |
| `availableTools` | array | Whitelist of server-side tool names the agent may call. If omitted, all registered tools are available. |
| `files` | array | Documents to extract into context: `[{ path, name, url?, size? }]`. See Document processing. |
| `images` | array | Inline images for vision-capable models: `[{ url \| base64, mediaType }]`. |
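Putting the table together, a request body that exercises most fields might look like this (every concrete value below is an illustrative placeholder, not a documented example):

```javascript
// Illustrative POST /ai/agent/stream body covering most documented fields.
// All values here are placeholders.
const body = {
  task: 'Compare the two attached reports.',       // required
  agentRole: 'chat',                               // default agent
  agentInstructions: 'Answer concisely.',          // system message
  conversationId: 'conv-42',                       // server loads history for this id
  context: { aiMode: 'data-analyst' },             // documented reserved key
  settings: { temperature: 0.3, maxTokens: 1024 }, // pass-through model settings
  memoryTtl: 3600,                                 // memory-store TTL, in seconds
  availableTools: ['server.memory.retrieve'],      // whitelist of server-side tools
  files: [
    { path: 'uploads/q1.pdf', name: 'q1.pdf' },
    { path: 'uploads/q2.pdf', name: 'q2.pdf' },
  ],
};
```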

Two-step flow

// Step 1: kick off the run
const { streamId } = await fetch('/ai/agent/stream', {
  method: 'POST',
  headers: {
    orgid: 'my-org',
    Authorization: `Bearer ${jwt}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    task: 'Summarise the latest TechCrunch article at the link.',
    agentRole: 'chat',
    conversationId: 'conv-123',
    context: { aiMode: 'developer' },
    files: [{ path: 'uploads/report.pdf', name: 'report.pdf' }],
  }),
}).then(r => r.json());

// Step 2: open the SSE stream
const es = new EventSource(`/ai/stream/${streamId}?orgid=my-org`, {
  withCredentials: true,
});

es.addEventListener('data', (ev) => {
  const chunk = JSON.parse(ev.data);
  // chunk shape varies; see "Chunk types" below
});

es.addEventListener('end', () => es.close());
es.addEventListener('error', (ev) => {
  console.error(ev);
  es.close();
});

EventSource and headers

Browser EventSource cannot send custom headers. Pass orgid as a query param, and include the JWT in a cookie set during sign-in (the JWT guard accepts cookie auth for SSE GETs). Server-side clients can use a fetch streaming reader with full headers — see the curl/Python examples below.
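A server-side reader built on Node 18+ `fetch` could look like this. The SSE parser is generic (and assumes single-line `data:` fields, which matches the stream shown in the curl example); the endpoint and header names follow the docs above:

```javascript
// Split buffered SSE text into complete events (separated by a blank line)
// and return them plus the trailing partial event to keep buffering.
// Single-line data: fields are assumed.
function parseSSE(buffer) {
  const events = [];
  const parts = buffer.split('\n\n');
  const rest = parts.pop(); // possibly incomplete event
  for (const part of parts) {
    const ev = { event: 'message', data: '' };
    for (const line of part.split('\n')) {
      if (line.startsWith('event:')) ev.event = line.slice(6).trim();
      else if (line.startsWith('data:')) ev.data += line.slice(5).trim();
    }
    events.push(ev);
  }
  return { events, rest };
}

// Connect to the GET endpoint with full headers and log each data chunk.
async function readStream(base, streamId, jwt) {
  const res = await fetch(`${base}/ai/stream/${streamId}`, {
    headers: { orgid: 'my-org', Authorization: `Bearer ${jwt}` },
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { events, rest } = parseSSE(buffer);
    buffer = rest;
    for (const ev of events) {
      if (ev.event === 'end') return;
      if (ev.event === 'data') console.log(JSON.parse(ev.data));
    }
  }
}
```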

curl

# Step 1
curl https://appengine.appmint.io/ai/agent/stream \
  -H "orgid: my-org" \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"task":"Hello","agentRole":"chat","conversationId":"c1"}'
# {"streamId":"abc123"}

# Step 2
curl -N https://appengine.appmint.io/ai/stream/abc123 \
  -H "orgid: my-org" \
  -H "Authorization: Bearer $JWT"
# event: data
# data: {"type":"text","text":"Hi"}
# event: data
# data: {"type":"text","text":" there"}
# event: end
# data: {}

Python

import json, requests, sseclient

base = 'https://appengine.appmint.io'
headers = {
  'orgid': 'my-org',
  'Authorization': f'Bearer {jwt}',
  'Content-Type': 'application/json',
}

# Step 1
r = requests.post(
  f'{base}/ai/agent/stream',
  headers=headers,
  data=json.dumps({
    'task': 'List the steps to deploy a static site to Vercel.',
    'agentRole': 'chat',
    'conversationId': 'demo-1',
  }),
)
stream_id = r.json()['streamId']

# Step 2
r = requests.get(f'{base}/ai/stream/{stream_id}', headers=headers, stream=True)
client = sseclient.SSEClient(r)
for ev in client.events():
  if ev.event == 'end':
    break
  if ev.event == 'data':
    chunk = json.loads(ev.data)
    print(chunk)

Chunk types

The agent emits a small set of data chunks during a run. Common shapes:

// Streaming text token
{ "type": "text", "text": "..." }

// Tool call request (when the agent decides to call a tool)
{ "type": "tool_use", "id": "...", "name": "server.memory.retrieve", "input": {...} }

// Tool call result (after the tool ran)
{ "type": "tool_result", "id": "...", "output": {...} }

// Document processing started
{ "type": "doc", "stage": "extract_started", "fileName": "report.pdf" }

// Image generation result (when the chat agent generates an image)
{ "type": "image", "url": "...", "prompt": "..." }

// Final usage summary, before the end event
{ "type": "usage", "model": "claude-...", "inputTokens": 412, "outputTokens": 128 }

Render text chunks as they arrive; render tool_use and tool_result as activity indicators ("Looking up...", "Got 42 results"); save usage to your local accounting if you do per-user billing.
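A minimal client-side reducer over these chunk shapes might look like this (the chunk types come from the list above; the state layout itself is just one reasonable choice):

```javascript
// Reduce streamed chunks into render state: accumulate text, surface
// tool/doc activity, collect generated images, and keep the usage summary.
function reduceChunk(state, chunk) {
  switch (chunk.type) {
    case 'text':
      return { ...state, text: state.text + chunk.text };
    case 'tool_use':
      return { ...state, activity: `Calling ${chunk.name}...` };
    case 'tool_result':
      return { ...state, activity: null };
    case 'doc':
      return { ...state, activity: `Processing ${chunk.fileName}...` };
    case 'image':
      return { ...state, images: [...state.images, chunk.url] };
    case 'usage':
      return { ...state, usage: chunk }; // hand off to billing/accounting
    default:
      return state; // unknown chunk types are ignored, not errors
  }
}

const initialState = { text: '', activity: null, images: [], usage: null };
```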

Server-loaded history

When you pass conversationId, the server loads the last 20 messages from chat_message for that conversation and uses them as conversationHistory. That means a single-message client doesn't need to track history at all — just send the new turn with the same conversationId each time.

The persistence happens transparently: the controller writes the user's request and (after the stream completes) the assistant's full response to chat_message, both keyed to conversationId.

// Turn 1
await fetch('/ai/agent/stream', {
  method: 'POST',
  headers: { /* ... */ },
  body: JSON.stringify({
    task: 'My name is Alex.',
    conversationId: 'conv-alex-1',
  }),
});

// Turn 2 — server already knows Alex from the persisted turn 1
await fetch('/ai/agent/stream', {
  method: 'POST',
  headers: { /* ... */ },
  body: JSON.stringify({
    task: 'What is my name?',
    conversationId: 'conv-alex-1',
  }),
});

Model selection

Models route automatically based on the agent's config and the aiMode selected. For explicit override, pass settings.model in the request body — the model id is forwarded to the underlying provider client.
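For example, an explicit override might be passed like this (the model id is a deliberate placeholder, not a documented value):

```javascript
// Request body with an explicit model override. Other settings keys
// (temperature, maxTokens) pass through to the provider the same way.
const body = {
  task: 'Explain CORS in two sentences.',
  agentRole: 'chat',
  settings: {
    model: 'your-model-id', // placeholder; use your provider's model id
    temperature: 0.2,
    maxTokens: 512,
  },
};
```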

Errors

  • 402 Payment Required with code: INSUFFICIENT_BALANCE — org has no active subscription and insufficient credit. Body includes required, available, and billingUrl.
  • 400 — missing task or invalid agentRole.
  • 404 on the GET — streamId doesn't exist (expired or never created). Streams are kept for 5 minutes after creation.
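Client handling of these statuses could look like the sketch below. The 402 body fields (`code`, `required`, `available`, `billingUrl`) are taken from the list above; the wrapper itself is an illustration:

```javascript
// Start a run and translate documented error statuses into readable failures.
async function startRun(base, body, jwt) {
  const res = await fetch(`${base}/ai/agent/stream`, {
    method: 'POST',
    headers: {
      orgid: 'my-org',
      Authorization: `Bearer ${jwt}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  if (res.ok) return res.json(); // { streamId }
  const err = await res.json().catch(() => ({}));
  throw new Error(describeError(res.status, err));
}

// Pure helper so the status mapping is easy to test in isolation.
function describeError(status, body) {
  if (status === 402 && body.code === 'INSUFFICIENT_BALANCE') {
    return `Insufficient balance: need ${body.required}, have ${body.available}. Top up at ${body.billingUrl}`;
  }
  if (status === 400) return 'Bad request: missing task or invalid agentRole';
  if (status === 404) return 'Stream not found (expired or never created)';
  return `Unexpected status ${status}`;
}
```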

Stream sessions are single-consumer. If you open /ai/stream/:streamId twice, the second connection receives no events; the stream handler is bound to the first reader.
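Because streams survive for 5 minutes and a dropped reader releases the binding, a client can retry the GET after a disconnect. A generic backoff helper for that retry loop might look like this (the retry policy is an assumption, not part of the API):

```javascript
// Exponential backoff schedule for retrying the SSE GET after a dropped
// connection. Streams are kept for 5 minutes, so cap the per-attempt wait.
function backoffDelays(attempts, baseMs = 500, capMs = 10_000) {
  const delays = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, capMs));
  }
  return delays;
}

// connect() should resolve when the stream ends normally and reject on drop.
async function withRetry(connect, attempts = 5) {
  let lastErr;
  for (const delay of [0, ...backoffDelays(attempts - 1)]) {
    if (delay) await new Promise((r) => setTimeout(r, delay));
    try {
      return await connect();
    } catch (err) {
      lastErr = err;
    }
  }
  throw lastErr;
}
```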