Documentation

AI overview

The AI agent framework — chat, function calling, document processing, MCP tools, and autonomous workflows.

The AI module is AppEngine's LLM orchestration layer. It wraps the underlying providers (Anthropic Claude, OpenAI, DeepSeek), exposes them through a streaming agent API, plugs in function calling via an MCP-style tool registry, processes uploaded documents into context, and runs autonomous multi-step workflows through the VibeAgent. Everything is metered through the Usage module so AI cost is attributed back to the org.

What's in the module

Each surface, its path, and what it's for:

  • Chat agents — /ai/chat, /ai/agent/chat — Single-turn or simple multi-turn responses
  • Streaming — /ai/agent/stream + /ai/stream/:streamId — Token-by-token streaming via SSE
  • MCP tools — /ai/mcp/tools, /ai/mcp/execute — Server-side tools the agent can invoke (memory, search, data access)
  • Document processing — (in-agent, file refs in body.files) — PDF / DOCX / PPTX text extraction for RAG
  • Image / video gen — /ai/generate/image, /ai/generate/video — Provider-routed media generation
  • Vibe Agent — /ai/vibe-agent/* — Standalone agent surface with its own model registry, used by Vibe Studio and external integrators

Agent types

AppEngine ships three core agent classes selected at request time via the agentRole or aiMode field:

  • ChatAgent (agentRole: 'chat' or 'auto') — the default. Handles conversation, MCP tool use, document context. Mode-aware: developer, data-analyst, graphic-artist, content-writer adjust the system prompt.
  • AIAgentAgent (agentRole: 'ai-agent') — the registered runtime for orgs running their own per-tenant agents (each configured by a record in the agent collection).
  • VibeAgent — autonomous planner/executor. Plans steps, calls tools, retries failures. Runs in its own controller under /ai/vibe-agent/*.

Pick chat for "user types, AI responds." Pick vibe-agent for "user describes a goal, AI does the multi-step work."
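A minimal sketch of that routing decision from the client side. The agentRole values and the /ai/agent/chat path come from the sections above; the helper name, the request shape, and the concrete /ai/vibe-agent/run sub-path are assumptions for illustration (the doc only specifies /ai/vibe-agent/*):

```typescript
// Hypothetical helper: pick the agent surface from the doc's agentRole values.
type AgentRole = "chat" | "auto" | "ai-agent" | "vibe-agent";

interface AgentRequest {
  endpoint: string;
  body: { agentRole?: AgentRole; prompt: string };
}

function buildAgentRequest(prompt: string, role: AgentRole): AgentRequest {
  // VibeAgent runs in its own controller under /ai/vibe-agent/*;
  // the "run" segment here is an assumed example sub-path.
  if (role === "vibe-agent") {
    return { endpoint: "/ai/vibe-agent/run", body: { prompt } };
  }
  // ChatAgent ('chat'/'auto') and AIAgentAgent ('ai-agent') are selected
  // per request via the agentRole field on the chat surface.
  return { endpoint: "/ai/agent/chat", body: { agentRole: role, prompt } };
}
```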

Streaming pattern

Streaming responses are two-step:

  1. POST /ai/agent/stream

     Send the prompt + context; receive { streamId }. The server starts processing in the background.

  2. GET /ai/stream/:streamId

     Open an SSE connection. The server pushes data events as tokens arrive, then an end event when complete.

This split lets the client choose where to render the stream (a different process, a different page) and avoids the long-poll problem of holding a single HTTP request open across an unpredictable LLM response. See Chat agent streaming for full code.
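The two-step flow can be sketched as a client, assuming standard SSE framing and the data/end event names from the steps above. The request body shape and the callback are assumptions; parseSseChunk is a deliberately minimal frame parser:

```typescript
// Minimal SSE frame parser: frames are separated by a blank line and
// carry "event:" and "data:" fields.
interface SseEvent { event: string; data: string }

function parseSseChunk(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of chunk.split("\n\n")) {
    let event = "message"; // SSE default event name
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Illustrative two-step flow (not executed here): step 1 obtains a
// streamId, step 2 reads the SSE stream until the end event.
async function streamCompletion(prompt: string, onToken: (t: string) => void): Promise<void> {
  const res = await fetch("/ai/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }), // body shape is an assumption
  });
  const { streamId } = (await res.json()) as { streamId: string };

  const stream = await fetch(`/ai/stream/${streamId}`);
  const reader = stream.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    for (const ev of parseSseChunk(decoder.decode(value))) {
      if (ev.event === "end") return;      // server signals completion
      if (ev.event === "data") onToken(ev.data); // token chunk
    }
  }
}
```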

What's in each page

Auth

All /ai/* endpoints require JWT + orgid. The MCP execute endpoint additionally enforces principal scoping (it injects the caller's orgId and userId into tool args, regardless of what the agent sent). The VibeAgent controller uses bearer auth — see Vibe Agent.
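The principal-scoping rule on /ai/mcp/execute can be sketched as a single overwrite step. The behavior (injecting the caller's orgId and userId regardless of what the agent sent) is from the text above; the function and type names are illustrative:

```typescript
// Sketch: server-side scoping of MCP tool args to the authenticated caller.
interface Principal { orgId: string; userId: string }

function scopeToolArgs(
  agentArgs: Record<string, unknown>,
  caller: Principal,
): Record<string, unknown> {
  // Whatever orgId/userId the agent supplied is discarded and replaced
  // with the JWT caller's identity, so tools can't cross tenants.
  return { ...agentArgs, orgId: caller.orgId, userId: caller.userId };
}
```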

Balance check

Before any AI stream starts, the AI controller calls usageService.checkBalance(orgId, MIN_ESTIMATED_COST). If the org has neither an active subscription nor sufficient balance, the request returns HTTP 402 with INSUFFICIENT_BALANCE — see Cost and token control.
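A sketch of that gate's decision logic. The 402 status and INSUFFICIENT_BALANCE code come from the text above; the MIN_ESTIMATED_COST value, its unit, and the org billing shape are assumptions:

```typescript
// Hypothetical pre-stream balance gate mirroring checkBalance's rule:
// an active subscription OR sufficient balance lets the request through.
const MIN_ESTIMATED_COST = 0.01; // assumed value and unit

interface OrgBilling { hasActiveSubscription: boolean; balance: number }

function balanceGate(org: OrgBilling): { status: number; code?: string } {
  if (org.hasActiveSubscription || org.balance >= MIN_ESTIMATED_COST) {
    return { status: 200 }; // proceed with the stream
  }
  return { status: 402, code: "INSUFFICIENT_BALANCE" };
}
```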