Every API call AppEngine receives passes through the Usage module. It sits as middleware on the request pipeline, attributes the call to the calling org, looks up the per-endpoint cost, and decrements the org's balance (or accumulates billable usage on metered plans). Routes without a catalogue cost pass through uncharged. The module also handles AI token charges, which work differently because cost is computed per call from prompt and completion tokens rather than from a flat per-endpoint price.
Architecture
Four components collaborate:
- UsageMiddleware: globally registered NestJS middleware. Runs on every HTTP request, captures the endpoint and org, and queues a usage entry.
- UsageService: accumulates entries, decrements balances, and exposes read endpoints for reports.
- AiChargeService: special path for AI provider calls. Computes cost per provider/model/token-count and charges against the same balance.
- ServicePricingService: exposes the pricing catalogue (per-call costs, per-service tiers, per-AI-model rates).
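To make the middleware's role concrete, here is a minimal, framework-free sketch of the capture-and-queue step. The `UsageEntry` shape, the `createUsageMiddleware` factory, and the in-memory queue are assumptions for illustration; the real middleware is a NestJS class whose internals are not shown on this page. The `orgid` header matches the one used in the request examples here.

```typescript
// Hypothetical shape of a queued usage entry (not the platform's actual type).
interface UsageEntry {
  orgId: string;
  endpoint: string;
  receivedAt: number;
}

// Minimal request shape; a real Express/NestJS request carries far more.
interface MinimalRequest {
  path: string;
  headers: Record<string, string | undefined>;
}

// Returns an Express-style middleware that records one entry per request
// carrying an org header, and always hands control to the next handler.
function createUsageMiddleware(queue: UsageEntry[]) {
  return (req: MinimalRequest, _res: unknown, next: () => void): void => {
    const orgId = req.headers['orgid'];
    if (orgId) {
      queue.push({ orgId, endpoint: req.path, receivedAt: Date.now() });
    }
    next();
  };
}
```

Queuing instead of charging inline keeps the request path cheap; the accumulation and balance update can then happen in a separate step.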
Endpoint metering
- /usage/endpoints (JWT)
- /usage/endpoints/:endpoint (JWT)
The endpoint catalogue lists which routes are metered and their cost in tokens. By default, write-heavy and external-call-heavy routes are metered (/storefront/order/process, /upstream/call/...); pure reads of the data layer are not. Endpoints can be added to or removed from the catalogue at runtime, which is useful for promotional periods where you waive metering on a specific route.
// Charge 5 tokens per call to a specific endpoint.
await fetch('/api/usage/endpoints', {
  method: 'POST',
  headers: {
    orgid: ORG_ID,
    Authorization: `Bearer ${jwt}`,
    'Content-Type': 'application/json', // needed so the JSON body is parsed
  },
  body: JSON.stringify({
    endpoint: '/automation/flows/execute',
    cost: 5,
  }),
});
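The catalogue behaves like a route-to-cost lookup that can change at runtime. The following is a minimal in-memory sketch, assuming a hypothetical `EndpointCatalogue` class (the platform's real storage and method names are not shown here), in which routes absent from the catalogue cost nothing:

```typescript
// Hypothetical in-memory stand-in for the endpoint catalogue.
class EndpointCatalogue {
  private costs = new Map<string, number>();

  // Add a route to the catalogue or update its cost (in tokens).
  setCost(endpoint: string, cost: number): void {
    if (cost < 0) throw new RangeError('cost must be non-negative');
    this.costs.set(endpoint, cost);
  }

  // Remove a route, e.g. to waive metering during a promotion.
  // Returns true when the route was present.
  remove(endpoint: string): boolean {
    return this.costs.delete(endpoint);
  }

  // Routes that are not in the catalogue are unmetered, so they cost 0.
  costFor(endpoint: string): number {
    return this.costs.get(endpoint) ?? 0;
  }
}

const catalogue = new EndpointCatalogue();
catalogue.setCost('/automation/flows/execute', 5);
```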
Usage stats and history
- /usage/stats (JWT)
- /usage/balance (JWT)
- /usage/:orgId/current/:type? (JWT)
- /usage/history/:type? (JWT)
stats returns aggregate usage in a date range. balance returns current balance, spend rate, plan, and active state, which is what the org-management UI shows on the billing tab. current and history segment by usage type (api-calls, ai-tokens, storage, bandwidth).
const balance = await fetch('/api/usage/balance', {
  headers: { orgid: ORG_ID, Authorization: `Bearer ${jwt}` },
}).then(r => r.json());
// {
//   balance: 12500,   // tokens remaining
//   spendRate: 1.0,   // multiplier on the catalogue rates (plan tier)
//   plan: 'pro',
//   active: true,
//   company: '...'
// }
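Because `spendRate` multiplies the catalogue rate, the balance response is enough to estimate how many more calls to a given endpoint an org can afford. A small sketch follows; the `callsRemaining` helper is hypothetical, and since this page does not say whether fractional charges are rounded, the sketch does not round per call.

```typescript
// Subset of the balance response used for the estimate.
interface BalanceSnapshot {
  balance: number;   // tokens remaining
  spendRate: number; // plan-tier multiplier on catalogue rates
}

// Estimates how many calls at `catalogueCost` tokens the balance covers.
function callsRemaining(snapshot: BalanceSnapshot, catalogueCost: number): number {
  const effectiveCost = catalogueCost * snapshot.spendRate;
  if (effectiveCost <= 0) {
    return Infinity; // an unmetered route never drains the balance
  }
  return Math.floor(snapshot.balance / effectiveCost);
}
```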
AI provider keys and charges
- /usage/ai/provider-key (JWT)
- /usage/ai/models (JWT)
- /usage/ai/charge (JWT)
The AI module calls these to:
- Get the provider API key for the org's selected vendor (/usage/ai/provider-key?provider=openai). The platform centralises keys server-side; orgs don't bring their own.
- Look up supported models and per-token pricing (/usage/ai/models).
- Charge usage after a successful call (/usage/ai/charge).
// After an OpenAI call returns, report the token counts for charging.
await fetch('/api/usage/ai/charge', {
  method: 'POST',
  headers: {
    orgid: ORG_ID,
    Authorization: `Bearer ${jwt}`,
    'Content-Type': 'application/json', // needed so the JSON body is parsed
  },
  body: JSON.stringify({
    provider: 'openai',
    model: 'gpt-5',
    promptTokens: 1200,
    completionTokens: 800,
    metadata: { feature: 'lead-summary' },
  }),
});
The AiChargeService looks up the per-million-token rate, applies the org's spendRate (a plan-tier multiplier — pro plans get a smaller multiplier than free), and decrements the balance. For image models (gpt-image-1, dall-e-3), pass imageCount and imageQuality instead of token counts.
If the org's balance is too low, the charge call returns success: false and the AI module surfaces a "balance too low" error rather than failing silently.
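The charge arithmetic described above can be sketched as follows. This is an illustration under assumptions, not the AiChargeService implementation: it assumes separate per-million rates for prompt and completion tokens, expressed in platform tokens, and the names `ModelRate`, `ChargeResult`, and `chargeAiUsage` are hypothetical. Only the `success` flag is taken from this page; the other result fields are added for clarity.

```typescript
// Hypothetical per-million-token rates for one model, in platform tokens.
interface ModelRate {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface ChargeResult {
  success: boolean;
  cost: number;    // tokens the call costs after applying spendRate
  balance: number; // balance after the charge attempt
}

function chargeAiUsage(
  balance: number,
  spendRate: number,
  rate: ModelRate,
  promptTokens: number,
  completionTokens: number,
): ChargeResult {
  // Rates are quoted per million model tokens, so scale down by 1e6.
  const base =
    (promptTokens * rate.inputPerMillion +
      completionTokens * rate.outputPerMillion) / 1e6;
  const cost = base * spendRate;

  // A balance that cannot cover the cost is reported, not silently ignored.
  if (cost > balance) {
    return { success: false, cost, balance };
  }
  return { success: true, cost, balance: balance - cost };
}
```

With illustrative rates of 1000 and 4000 tokens per million, the 1200/800 call from the example above would cost about 4.4 tokens at a `spendRate` of 1.0.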
Plans and pricing
Plans live in src/usage/pricing-plan.ts. Each plan defines:
- name (free, basic, pro, team, enterprise)
- monthlyTokens: tokens granted at start of cycle
- monthlyPrice: what the org pays
- spendRate: multiplier on usage charges (lower = better deal)
- features: feature flags the plan unlocks
- limits: sites, dev environments, custom domains, team members, etc.
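As a rough picture of what such a definition might look like, here is a type sketch inferred from the field list above. The field types, the `examplePlan` values, and the `effectiveCost` helper are all assumptions; the real definitions live in src/usage/pricing-plan.ts and may differ.

```typescript
// Hypothetical shape inferred from the documented field names.
type PlanName = 'free' | 'basic' | 'pro' | 'team' | 'enterprise';

interface PricingPlan {
  name: PlanName;
  monthlyTokens: number;          // tokens granted at start of cycle
  monthlyPrice: number;           // what the org pays
  spendRate: number;              // multiplier on usage charges
  features: string[];             // feature flags the plan unlocks
  limits: Record<string, number>; // e.g. sites, team members
}

// Illustrative values only, not the platform's actual pricing.
const examplePlan: PricingPlan = {
  name: 'pro',
  monthlyTokens: 50000,
  monthlyPrice: 49,
  spendRate: 0.8,
  features: ['custom-domains'],
  limits: { sites: 10, teamMembers: 25 },
};

// Tokens charged for one call with the given catalogue cost.
function effectiveCost(plan: PricingPlan, catalogueCost: number): number {
  return catalogueCost * plan.spendRate;
}
```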
The org management module exposes plans to customers through its subscription endpoints.
Gifting credit
- /usage/gift (JWT)
Used by sales/support to add credits to an org's balance: promotional grants, refunds, support gestures. The action writes a transaction record (audit trail) and credits the balance immediately. Permission: RootAdmin only.
Service pricing (dynamic)
- /org-management/services/pricing/:serviceName (JWT)
For shareable services where pricing varies by usage tier or plan (e.g. premium AI models, dedicated support, white-glove onboarding), the ServicePricingService exposes the live rate. Org-management's purchase flow reads it before charging.
Cost attribution
Every metered call writes a usage record carrying:
- orgId
- endpoint (or provider/model for AI)
- cost (in tokens)
- metadata (caller user ID, feature flag, request ID)
To attribute by feature, pass a metadata.feature string at the call site. Reports group by it, so you can compare what "automation" costs this org versus "AI agent".
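The per-feature grouping can be reproduced client-side from raw usage records. The sketch below assumes a hypothetical `UsageRecord` type built from the fields listed above and buckets records without a feature tag under `untagged`; that bucket name is an assumption, not platform behaviour.

```typescript
// Hypothetical record type based on the documented fields.
interface UsageRecord {
  orgId: string;
  endpoint: string;
  cost: number; // in tokens
  metadata?: { feature?: string };
}

// Sums cost per metadata.feature; records without a tag share one bucket.
function costByFeature(records: UsageRecord[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    const feature = record.metadata?.feature ?? 'untagged';
    totals[feature] = (totals[feature] ?? 0) + record.cost;
  }
  return totals;
}
```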
Exports
For finance reconciliation, run:
const exportData = await fetch(
  '/api/usage/stats?startDate=2026-04-01&endDate=2026-04-30',
  { headers: { orgid: ORG_ID, Authorization: `Bearer ${jwt}` } },
).then(r => r.json());
The result aggregates per-endpoint and per-AI-model cost. Pipe it to your accounting system; the platform doesn't push exports automatically.
Throttling
When an org goes over the monthly allowance:
- Soft mode (default): metered calls still succeed but accrue overage charges billable at the end of cycle.
- Hard mode: metered calls return 402 Payment Required until the org tops up.
Mode is configured per plan; enterprise plans typically use soft, free plans use hard.
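The difference between the two modes can be expressed as a small decision function. This is a sketch under assumptions: `decideMeteredCall` and `MeterDecision` are hypothetical names, the 200 status stands in for whatever success status the route normally returns, and the overage is measured in tokens beyond the remaining allowance.

```typescript
type ThrottleMode = 'soft' | 'hard';

interface MeterDecision {
  allowed: boolean;
  status: number;  // 402 when blocked; 200 is a placeholder for success
  overage: number; // tokens accrued beyond the allowance (soft mode only)
}

function decideMeteredCall(
  mode: ThrottleMode,
  allowanceLeft: number,
  cost: number,
): MeterDecision {
  // Within the allowance, both modes behave the same.
  if (cost <= allowanceLeft) {
    return { allowed: true, status: 200, overage: 0 };
  }
  // Hard mode blocks the call until the org tops up.
  if (mode === 'hard') {
    return { allowed: false, status: 402, overage: 0 };
  }
  // Soft mode lets the call through and records the uncovered part.
  const covered = Math.max(allowanceLeft, 0);
  return { allowed: true, status: 200, overage: cost - covered };
}
```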
What is not metered
- Public storefront browse (GET /storefront/products, etc.): the org isn't paying for shoppers reading the catalogue.
- Authentication endpoints: sign-in is free.
- Health and monitoring (/monitoring/health).
- Webhook receipt: inbound vendor webhooks don't count.
The opt-out logic is in the middleware; if you add a high-volume read endpoint, decide whether to meter it explicitly via setEndpointCost.
The middleware is registered globally (APP_INTERCEPTOR-style), so there is no way for a controller to opt out via a decorator. If you genuinely need an unmeterable internal endpoint, prefix the route with one of the documented exclusion patterns; check usage.middleware.ts for the current list.
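A prefix-based exclusion check might look like the sketch below. The prefixes shown are placeholders chosen for illustration, and `isExcludedFromMetering` is a hypothetical helper; the authoritative pattern list and matching logic are in usage.middleware.ts.

```typescript
// Placeholder prefixes for illustration only; see usage.middleware.ts
// for the patterns the platform actually excludes.
const EXAMPLE_EXCLUDED_PREFIXES: readonly string[] = ['/monitoring', '/internal'];

// True when the path equals a prefix or sits beneath it as a sub-route.
// Matching on whole segments keeps '/monitoringx' from being excluded.
function isExcludedFromMetering(
  path: string,
  prefixes: readonly string[] = EXAMPLE_EXCLUDED_PREFIXES,
): boolean {
  return prefixes.some(
    prefix => path === prefix || path.startsWith(prefix + '/'),
  );
}
```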