The AI module is AppEngine's LLM orchestration layer. It wraps the underlying providers (Anthropic Claude, OpenAI, DeepSeek), exposes them through a streaming agent API, plugs in function calling via an MCP-style tool registry, processes uploaded documents into context, and runs autonomous multi-step workflows through the VibeAgent. Everything is metered through the Usage module so AI cost is attributed back to the org.
## What's in the module
| Surface | Path | Use for |
|---|---|---|
| Chat agents | /ai/chat, /ai/agent/chat | Single-turn or simple multi-turn responses |
| Streaming | /ai/agent/stream + /ai/stream/:streamId | Token-by-token streaming via SSE |
| MCP tools | /ai/mcp/tools, /ai/mcp/execute | Server-side tools the agent can invoke (memory, search, data access) |
| Document processing | (in-agent, file refs in body.files) | PDF / DOCX / PPTX text extraction for RAG |
| Image / video gen | /ai/generate/image, /ai/generate/video | Provider-routed media generation |
| Vibe Agent | /ai/vibe-agent/* | Standalone agent surface with its own model registry, used by Vibe Studio and external integrators |
## Agent types
AppEngine ships three core agent classes, selected at request time via the `agentRole` or `aiMode` field:
- ChatAgent (`agentRole: 'chat'` or `'auto'`) — the default. Handles conversation, MCP tool use, and document context. Mode-aware: `developer`, `data-analyst`, `graphic-artist`, and `content-writer` adjust the system prompt.
- AIAgentAgent (`agentRole: 'ai-agent'`) — the registered runtime for orgs running their own per-tenant agents (configured by a record in the `agent` collection).
- VibeAgent — autonomous planner/executor. Plans steps, calls tools, retries failures. Runs in its own controller under `/ai/vibe-agent/*`.
Pick chat for "user types, AI responds." Pick vibe-agent for "user describes a goal, AI does the multi-step work."
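The request-time selection between the first two agent classes can be sketched as a simple dispatch on `agentRole` (VibeAgent is excluded because it lives behind its own controller). The class names come from the list above; the dispatch function itself is illustrative, not AppEngine source:

```typescript
// Hypothetical sketch of agent selection by the `agentRole` request field.
type AgentRole = 'chat' | 'auto' | 'ai-agent' | undefined;

function selectAgentClass(agentRole: AgentRole): 'ChatAgent' | 'AIAgentAgent' {
  switch (agentRole) {
    case 'ai-agent':
      // Per-tenant runtime, configured by a record in the `agent` collection
      return 'AIAgentAgent';
    case 'chat':
    case 'auto':
    default:
      // Default conversational agent (MCP tools, document context)
      return 'ChatAgent';
  }
}
```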
## Streaming pattern
Streaming responses are two-step:
1. `POST /ai/agent/stream` — send the prompt + context; receive `{ streamId }`. The server starts processing in the background.
2. `GET /ai/stream/:streamId` — open an SSE connection. The server pushes `data` events as tokens arrive, then `end` when complete.
This split lets the client choose where to render the stream (a different process, a different page) and avoids the long-poll problem of holding a single HTTP request open across an unpredictable LLM response. See Chat agent streaming for full code.
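A minimal client-side sketch of the two-step flow: the endpoint paths and the `data`/`end` event names come from this page, while the request body shape (`prompt`), headers, and helper names are assumptions.

```typescript
interface SseEvent { event: string; data: string }

// Split a raw text/event-stream chunk into (event, data) pairs.
function parseSseChunk(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of chunk.split('\n\n')) {
    let event = 'message';
    const data: string[] = [];
    for (const line of frame.split('\n')) {
      if (line.startsWith('event:')) event = line.slice(6).trim();
      else if (line.startsWith('data:')) data.push(line.slice(5).replace(/^ /, ''));
    }
    if (data.length) events.push({ event, data: data.join('\n') });
  }
  return events;
}

async function streamChat(baseUrl: string, token: string, prompt: string): Promise<void> {
  // Step 1: POST the prompt; the server replies immediately with { streamId }
  // and keeps generating in the background.
  const res = await fetch(`${baseUrl}/ai/agent/stream`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${token}` },
    body: JSON.stringify({ prompt }),
  });
  const { streamId } = await res.json();

  // Step 2: open the SSE connection and consume tokens until `end`.
  const sse = await fetch(`${baseUrl}/ai/stream/${streamId}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  const reader = sse.body!.pipeThrough(new TextDecoderStream()).getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    for (const ev of parseSseChunk(value)) {
      if (ev.event === 'end') return;
      if (ev.event === 'data') process.stdout.write(ev.data);
    }
  }
}
```

Note that `parseSseChunk` assumes each read ends on a frame boundary; a production client should buffer partial frames across chunks.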
## What's in each page
- Chat agent streaming — the SSE pattern with code in JS, curl, and Python.
- Function calling and tools — define tools, register them, agent invokes during a turn.
- Document processing — upload PDFs/DOCX/PPTX, extract text, RAG into prompts.
- Prompt engineering — system prompts, context construction, few-shot examples within AppEngine.
- Vibe Agent — autonomous planner with retry and tool orchestration.
- Cost and token control — UsageModule integration, AiChargeService, per-org limits, cost attribution.
## Auth
All `/ai/*` endpoints require JWT + orgId. The MCP execute endpoint additionally enforces principal scoping: it injects the caller's orgId and userId into tool args, regardless of what the agent sent. The VibeAgent controller uses bearer auth — see Vibe Agent.
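The principal-scoping behavior on MCP execute can be illustrated as a merge where the caller's authenticated identity always overwrites whatever the agent supplied; the function name and shapes here are hypothetical.

```typescript
// Illustrative sketch of principal scoping on /ai/mcp/execute.
interface Principal { orgId: string; userId: string }

function scopeToolArgs(
  args: Record<string, unknown>,
  caller: Principal,
): Record<string, unknown> {
  // Caller identity always wins: a tool never runs against another tenant's
  // data, even if the model put a different orgId into the arguments.
  return { ...args, orgId: caller.orgId, userId: caller.userId };
}
```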
## Balance check
Before any AI stream starts, the AI controller calls `usageService.checkBalance(orgId, MIN_ESTIMATED_COST)`. If the org has neither an active subscription nor sufficient balance, the request returns HTTP 402 with `INSUFFICIENT_BALANCE` — see Cost and token control.
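The guard's logic can be sketched as follows. Only `usageService.checkBalance`, the 402 status, and the `INSUFFICIENT_BALANCE` code come from this page; the result shape and the `MIN_ESTIMATED_COST` value are assumptions.

```typescript
// Sketch of the pre-stream balance guard, assuming a check-result shape.
interface BalanceCheck { hasActiveSubscription: boolean; balance: number }

const MIN_ESTIMATED_COST = 0.01; // illustrative placeholder value

function guardBalance(check: BalanceCheck): { status: number; code?: string } {
  // Pass if the org has an active subscription OR enough prepaid balance.
  if (check.hasActiveSubscription || check.balance >= MIN_ESTIMATED_COST) {
    return { status: 200 };
  }
  return { status: 402, code: 'INSUFFICIENT_BALANCE' };
}
```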