Usage and pricing

Per-org request metering, AI token charges, service-level pricing, and usage exports for billing.

Every API call AppEngine receives passes through the Usage module, though not every route is charged. The module sits as middleware on the request pipeline, attributes the call to the calling org, looks up the per-endpoint cost, and decrements the org's balance (or accumulates billable usage on metered plans). It also handles AI token charges, which work differently because cost is computed per call from prompt and completion tokens rather than from a flat per-endpoint price.

Architecture

Four services collaborate:

  • UsageMiddleware — globally registered NestJS middleware. Runs on every HTTP request, captures the endpoint and org, and queues a usage entry.
  • UsageService — accumulates entries, decrements balances, and exposes read endpoints for reports.
  • AiChargeService — special path for AI provider calls. Computes cost per provider/model/token-count and charges against the same balance.
  • ServicePricingService — exposes the pricing catalogue (per-call costs, per-service tiers, per-AI-model rates).
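
As a rough illustration of how these pieces fit together, the sketch below models the attribute, look up, decrement flow in memory. It is not AppEngine's actual code: the `OrgAccount` and `UsageEntry` shapes, the `meterCall` helper, and the catalogue contents are assumptions made for the example.

```typescript
// Hypothetical in-memory model of the metering flow, not AppEngine's actual code.
interface OrgAccount {
  balance: number;   // tokens remaining
  spendRate: number; // plan-tier multiplier on catalogue rates
}

interface UsageEntry {
  orgId: string;
  endpoint: string;
  cost: number; // tokens charged for this call
}

// Metered endpoints and their base cost in tokens (illustrative values).
const catalogue: Record<string, number> = {
  '/automation/flows/execute': 5,
};

// Attribute the call to an org, look up the endpoint cost, decrement the balance.
// Returns the usage entry, or null when there is nothing to charge.
function meterCall(
  accounts: Record<string, OrgAccount>,
  orgId: string,
  endpoint: string,
): UsageEntry | null {
  const baseCost = catalogue[endpoint];
  const account = accounts[orgId];
  if (baseCost === undefined || account === undefined) {
    return null; // unmetered route or unknown org
  }
  const cost = baseCost * account.spendRate;
  account.balance -= cost;
  return { orgId, endpoint, cost };
}
```

In the real module the entry is queued by UsageMiddleware and the balance change is applied by UsageService; the sketch collapses both steps into one function for readability.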

Endpoint metering

GET /usage/endpoints (JWT)
POST /usage/endpoints (JWT)
DELETE /usage/endpoints/:endpoint (JWT)

The endpoint catalogue lists which routes are metered and their cost in tokens. By default, write-heavy and external-call-heavy routes are metered (/storefront/order/process, /upstream/call/...); pure reads of the data layer are not. Add or remove endpoints from the catalogue at runtime — useful for promotional periods where you waive metering on a specific route.

// charge 5 tokens per call to a specific endpoint
await fetch('/api/usage/endpoints', {
  method: 'POST',
  headers: {
    orgid: ORG_ID,
    Authorization: `Bearer ${jwt}`,
    'Content-Type': 'application/json', // the body is JSON, so declare it
  },
  body: JSON.stringify({
    endpoint: '/automation/flows/execute',
    cost: 5,
  }),
});
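
Waiving metering for a promotional period goes through the DELETE route. The sketch below only builds the request so it can be inspected before sending. The `buildWaiveMeteringRequest` helper and the `RequestSpec` type are hypothetical, and URL-encoding the route into the `:endpoint` path segment is an assumption about how the server parses it, so check the controller before relying on it.

```typescript
// Describes a fetch call without performing it.
interface RequestSpec {
  url: string;
  init: { method: string; headers: Record<string, string> };
}

// Builds the request for DELETE /usage/endpoints/:endpoint.
// Assumption: the metered route is URL-encoded into a single path segment.
function buildWaiveMeteringRequest(
  endpoint: string,
  orgId: string,
  jwt: string,
): RequestSpec {
  return {
    url: '/api/usage/endpoints/' + encodeURIComponent(endpoint),
    init: {
      method: 'DELETE',
      headers: { orgid: orgId, Authorization: `Bearer ${jwt}` },
    },
  };
}
```

To send it, destructure the result and pass it to fetch: `const { url, init } = buildWaiveMeteringRequest('/automation/flows/execute', ORG_ID, jwt); await fetch(url, init);`.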

Usage stats and history

GET /usage/stats (JWT)
GET /usage/balance (JWT)
GET /usage/:orgId/current/:type? (JWT)
GET /usage/history/:type? (JWT)

stats returns aggregate usage in a date range. balance returns current balance, spend rate, plan, and active state — what the org-management UI shows on the billing tab. current and history segment by usage type (api-calls, ai-tokens, storage, bandwidth).

const balance = await fetch('/api/usage/balance', {
  headers: { orgid: ORG_ID, Authorization: `Bearer ${jwt}` },
}).then(r => r.json());

// {
//   balance: 12500,        // tokens remaining
//   spendRate: 1.0,         // multiplier on the catalogue rates (plan tier)
//   plan: 'pro',
//   active: true,
//   company: '...'
// }
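
Because spendRate multiplies the catalogue rates, the balance response is enough to estimate how many more calls to a metered endpoint an org can afford. A minimal sketch, assuming the response shape shown above; the `callsRemaining` helper is hypothetical and ignores overage rules.

```typescript
// Shape of the /usage/balance response as shown in the example above.
interface BalanceResponse {
  balance: number;   // tokens remaining
  spendRate: number; // multiplier on the catalogue rates
  plan: string;
  active: boolean;
  company: string;
}

// Estimates how many more calls to an endpoint the balance covers.
// catalogueCost is the endpoint's base cost in tokens, before spendRate.
function callsRemaining(b: BalanceResponse, catalogueCost: number): number {
  const effectiveCost = catalogueCost * b.spendRate;
  if (effectiveCost <= 0) {
    return Infinity; // a free route never exhausts the balance
  }
  if (b.balance <= 0) {
    return 0;
  }
  return Math.floor(b.balance / effectiveCost);
}
```

With the sample response above (balance 12500, spendRate 1.0), an endpoint priced at 5 tokens can be called 2500 more times.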

AI provider keys and charges

GET /usage/ai/provider-key (JWT)
GET /usage/ai/models (JWT)
POST /usage/ai/charge (JWT)

The AI module calls these to:

  1. Get the provider API key for the org's selected vendor (/usage/ai/provider-key?provider=openai). The platform centralises keys server-side; orgs don't bring their own.
  2. Look up supported models and per-token pricing (/usage/ai/models).
  3. Charge usage after a successful call (/usage/ai/charge).

// after an OpenAI call returns
await fetch('/api/usage/ai/charge', {
  method: 'POST',
  headers: {
    orgid: ORG_ID,
    Authorization: `Bearer ${jwt}`,
    'Content-Type': 'application/json', // the body is JSON, so declare it
  },
  body: JSON.stringify({
    provider: 'openai',
    model: 'gpt-5',
    promptTokens: 1200,
    completionTokens: 800,
    metadata: { feature: 'lead-summary' },
  }),
});

The AiChargeService looks up the per-million-token rate, applies the org's spendRate (a plan-tier multiplier — pro plans get a smaller multiplier than free), and decrements the balance. For image models (gpt-image-1, dall-e-3), pass imageCount and imageQuality instead of token counts.
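
The arithmetic behind a token charge can be sketched as follows. This is an illustration, not the AiChargeService implementation: the `ModelRate` shape, the split into separate prompt and completion rates, and the absence of rounding are all assumptions.

```typescript
// Hypothetical per-model rate card, in balance tokens per one million model tokens.
interface ModelRate {
  promptPerMillion: number;
  completionPerMillion: number;
}

// Cost of one AI call: per-million rates applied to the token counts,
// then scaled by the org's spendRate (plan-tier multiplier).
function aiChargeCost(
  rate: ModelRate,
  promptTokens: number,
  completionTokens: number,
  spendRate: number,
): number {
  const baseCost =
    (promptTokens * rate.promptPerMillion +
      completionTokens * rate.completionPerMillion) /
    1000000;
  return baseCost * spendRate;
}
```

For the request above (1200 prompt tokens, 800 completion tokens) and made-up rates of 2500 and 10000 per million, the base cost is 3 + 8 = 11 tokens, and a spendRate of 0.5 halves it to 5.5.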

If the org's balance is too low, the charge call returns success: false and the AI module surfaces a "balance too low" error rather than failing silently.

Plans and pricing

Plans live in src/usage/pricing-plan.ts. Each plan defines:

  • name (free, basic, pro, team, enterprise)
  • monthlyTokens — tokens granted at start of cycle
  • monthlyPrice — what the org pays
  • spendRate — multiplier on usage charges (lower = better deal)
  • features — feature flags the plan unlocks
  • limits — sites, dev environments, custom domains, team members, etc.

The org management module exposes plans to customers through its subscription endpoints.
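
A plan definition along these lines might look like the sketch below. The field names come from the list above; the field types, the sample values, and the `catalogueCapacity` helper are assumptions, not the contents of pricing-plan.ts.

```typescript
// Hypothetical shape of a plan entry; only the field names come from the docs.
interface PricingPlan {
  name: 'free' | 'basic' | 'pro' | 'team' | 'enterprise';
  monthlyTokens: number;          // tokens granted at start of cycle
  monthlyPrice: number;           // what the org pays
  spendRate: number;              // multiplier on usage charges (lower = better deal)
  features: string[];             // feature flags the plan unlocks
  limits: Record<string, number>; // sites, dev environments, team members, etc.
}

// Illustrative values only.
const examplePlan: PricingPlan = {
  name: 'pro',
  monthlyTokens: 10000,
  monthlyPrice: 49,
  spendRate: 0.5,
  features: ['custom-domains'],
  limits: { sites: 10, teamMembers: 25 },
};

// How much catalogue-priced usage the monthly grant covers.
// A lower spendRate stretches the same grant further.
function catalogueCapacity(plan: PricingPlan): number {
  return plan.monthlyTokens / plan.spendRate;
}
```

With these sample numbers, 10000 granted tokens at a spendRate of 0.5 cover 20000 tokens of usage at catalogue prices.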

Gifting credit

POST /usage/gift (JWT)

Used by sales/support to add credits to an org's balance — promotional grants, refunds, support gestures. The action writes a transaction record (audit trail) and credits the balance immediately. Permission: RootAdmin only.
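
The two side effects described here, an audit record plus an immediate credit, can be modelled in a few lines. This is an in-memory sketch under stated assumptions: the `GiftTransaction` fields, the `Actor` type, and the `giftCredit` function are hypothetical and do not describe the real endpoint's payload or storage.

```typescript
// Hypothetical audit record for a gifted credit.
interface GiftTransaction {
  orgId: string;
  amount: number;    // tokens credited
  reason: string;    // e.g. "refund", "promotional grant"
  grantedBy: string; // ID of the admin who made the grant
}

interface Actor {
  id: string;
  role: string;
}

// Writes the audit record, credits the balance, and returns the new balance.
// Rejects anyone who is not a RootAdmin.
function giftCredit(
  ledger: GiftTransaction[],
  balances: Record<string, number>,
  actor: Actor,
  orgId: string,
  amount: number,
  reason: string,
): number {
  if (actor.role !== 'RootAdmin') {
    throw new Error('Only RootAdmin may gift credit');
  }
  if (!(amount > 0)) {
    throw new Error('Gift amount must be positive');
  }
  ledger.push({ orgId, amount, reason, grantedBy: actor.id });
  const updated = (balances[orgId] || 0) + amount;
  balances[orgId] = updated;
  return updated;
}
```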

Service pricing (dynamic)

GET /org-management/services/pricing/:serviceName (JWT)

For shareable services where pricing varies by usage tier or plan (e.g. premium AI models, dedicated support, white-glove onboarding), the ServicePricingService exposes the live rate. Org-management's purchase flow reads it before charging.

Cost attribution

Every metered call writes a usage record carrying:

  • orgId
  • endpoint (or provider/model for AI)
  • cost (in tokens)
  • metadata (caller user ID, feature flag, request ID)

To attribute by feature, pass a metadata.feature string at the call site. Reports group by this value, so you can compare, for example, what "automation" costs an org versus "AI agent".
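
Grouping by feature amounts to summing cost per metadata.feature value. A minimal sketch, assuming the record fields listed above; the exact metadata key names, the `costByFeature` helper, and the "unattributed" bucket are illustrative choices rather than the platform's report format.

```typescript
// Usage record with the fields listed above; metadata keys are assumed names.
interface UsageRecord {
  orgId: string;
  endpoint: string;
  cost: number; // in tokens
  metadata: { feature?: string; userId?: string; requestId?: string };
}

// Sums token cost per feature. Records without a feature tag are
// collected under "unattributed" so nothing is silently dropped.
function costByFeature(records: UsageRecord[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    const feature = record.metadata.feature || 'unattributed';
    totals[feature] = (totals[feature] || 0) + record.cost;
  }
  return totals;
}
```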

Exports

For finance reconciliation, run:

const exportData = await fetch(
  '/api/usage/stats?startDate=2026-04-01&endDate=2026-04-30',
  { headers: { orgid: ORG_ID, Authorization: `Bearer ${jwt}` } },
).then(r => r.json());

The result aggregates per-endpoint and per-AI-model cost. Pipe it to your accounting system; the platform doesn't push exports automatically.

Throttling

When an org goes over the monthly allowance:

  • Soft mode (default): metered calls still succeed but accrue overage charges billable at the end of cycle.
  • Hard mode: metered calls return 402 Payment Required until the org tops up.

Mode is configured per plan; enterprise plans typically use soft, free plans use hard.
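
The difference between the two modes can be sketched as a single decision function. The `decideThrottle` helper and its types are hypothetical, and the boundary rule (a call that only partly fits in the remaining allowance counts as over) is an assumption rather than documented behaviour.

```typescript
type ThrottleMode = 'soft' | 'hard';

interface ThrottleDecision {
  allowed: boolean;
  status: number;        // HTTP status the caller would see
  overageTokens: number; // tokens accrued as overage for this call
}

// Decides what happens to a metered call given the tokens left in the
// monthly allowance and the cost of the call.
function decideThrottle(
  mode: ThrottleMode,
  remainingAllowance: number,
  cost: number,
): ThrottleDecision {
  if (cost <= remainingAllowance) {
    return { allowed: true, status: 200, overageTokens: 0 };
  }
  if (mode === 'hard') {
    // Hard mode: reject with 402 Payment Required until the org tops up.
    return { allowed: false, status: 402, overageTokens: 0 };
  }
  // Soft mode: the call succeeds and the shortfall is billed at end of cycle.
  const overageTokens = cost - Math.max(remainingAllowance, 0);
  return { allowed: true, status: 200, overageTokens };
}
```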

What is not metered

  • Public storefront browse (GET /storefront/products, etc.) — the org isn't paying for shoppers reading the catalogue.
  • Authentication endpoints — sign-in is free.
  • Health and monitoring (/monitoring/health).
  • Webhook receipt — inbound vendor webhooks don't count.

The opt-out logic is in the middleware; if you add a high-volume read endpoint, decide whether to meter it explicitly via setEndpointCost.

The middleware is registered globally, in the same way an APP_INTERCEPTOR provider would be, so there is no way for a controller to opt out via a decorator. If you genuinely need an unmetered internal endpoint, prefix the route with one of the documented exclusion patterns; check usage.middleware.ts for the current list.
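
Prefix-based exclusion is straightforward to picture. The sketch below is an assumption about the general technique, not a copy of usage.middleware.ts: the prefixes are placeholders loosely based on the categories listed above, and `isExcludedFromMetering` is a hypothetical name.

```typescript
// Placeholder prefixes for illustration; the real list lives in usage.middleware.ts.
const EXCLUDED_PREFIXES: string[] = ['/monitoring/', '/auth/', '/webhooks/'];

// True when the request path starts with any excluded prefix.
function isExcludedFromMetering(path: string): boolean {
  return EXCLUDED_PREFIXES.some(function (prefix) {
    return path.indexOf(prefix) === 0;
  });
}
```

A match must start at the first character of the path, so a route such as /internal/monitoring/health would not be excluded by the /monitoring/ prefix.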