Appmint - AI That Builds & Runs Your Entire Business

Recording happens at the gateway level, not on Twilio's side. AppEngine receives the μ-law stream over the Media Stream WebSocket and buffers both legs (caller and AI) until the call ends. The same gateway also collects the AI's spoken transcript and every tool call it made. Together these form the "call journey".

What gets captured

Stream	Captured as
Caller audio	`callerAudioChunks: Buffer[]` (μ-law 8 kHz, base64-decoded)
AI audio	`aiAudioChunks: Buffer[]` (μ-law 8 kHz, base64-decoded)
Caller transcript	OpenAI `conversation.item.input_audio_transcription.completed`
AI transcript	OpenAI `response.output_audio_transcript.done`
Tool calls	Each `function_call_arguments.done` plus the resolved result

When the WebSocket closes, the gateway either calls saveCallRecording (audio + journey) or saveCallJourney (just journey, no audio).
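The buffering described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the class and method names (`CallAudioCapture`, `addCallerPayload`, `addAiDelta`, `finish`) are hypothetical, and it assumes both legs arrive as base64-encoded μ-law at 8 kHz. Only `callerAudioChunks` and `aiAudioChunks` come from the table.

```typescript
// Hypothetical capture helper; names other than the two chunk arrays are assumptions.
class CallAudioCapture {
  readonly callerAudioChunks: Buffer[] = [];
  readonly aiAudioChunks: Buffer[] = [];

  // Caller leg: Twilio Media Streams "media" messages carry base64 audio in media.payload.
  addCallerPayload(base64Payload: string): void {
    this.callerAudioChunks.push(Buffer.from(base64Payload, 'base64'));
  }

  // AI leg: assumed to arrive as base64 μ-law audio deltas from the model session.
  addAiDelta(base64Delta: string): void {
    this.aiAudioChunks.push(Buffer.from(base64Delta, 'base64'));
  }

  // Called when the WebSocket closes: caller leg first, then the AI leg.
  finish(): Buffer {
    return Buffer.concat([...this.callerAudioChunks, ...this.aiAudioChunks]);
  }
}
```

Keeping the chunks as separate `Buffer` arrays until the end avoids repeated reallocation during the call; a single `Buffer.concat` runs once at close.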

When recording is enabled

Recording fires when either is true:

The IVR routing for this number has menuSettings.recording.enabled: true
The assistant config has data.config.recordCalls: true

The chosen value is captured at start time so toggling mid-call has no effect.
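A small sketch of that rule, assuming the two flags live at the paths named above. The interfaces, `shouldRecord`, and `CallSession` are hypothetical names used only for illustration.

```typescript
// Hypothetical shapes; only the property paths come from the text above.
interface IvrRouting {
  menuSettings?: { recording?: { enabled?: boolean } };
}
interface AssistantConfig {
  data?: { config?: { recordCalls?: boolean } };
}

// Recording is on when either flag is explicitly true.
function shouldRecord(ivr?: IvrRouting, assistant?: AssistantConfig): boolean {
  return (
    ivr?.menuSettings?.recording?.enabled === true ||
    assistant?.data?.config?.recordCalls === true
  );
}

// The decision is snapshotted once at call start, so later config edits are ignored.
class CallSession {
  readonly recordingEnabled: boolean;

  constructor(ivr?: IvrRouting, assistant?: AssistantConfig) {
    this.recordingEnabled = shouldRecord(ivr, assistant);
  }
}
```

Storing the boolean rather than a reference to the config object is what makes a mid-call toggle a no-op in this sketch.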

Output format

createWavBuffer() builds a minimal 44-byte WAV header in front of the raw μ-law buffer:

Field	Value
Sample rate	8000 Hz
Channels	1 (mono)
Bits/sample	8
Audio format code	7 (μ-law)
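
For illustration, a 44-byte header with these field values could be written as below. This is a sketch of the standard RIFF/WAVE layout, not the source of createWavBuffer itself; the function name `createMuLawWavSketch` is hypothetical.

```typescript
// Hypothetical re-creation of a minimal μ-law WAV writer (RIFF + 16-byte fmt chunk + data).
function createMuLawWavSketch(muLawData: Buffer, sampleRate = 8000, channels = 1): Buffer {
  const bitsPerSample = 8;
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;

  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + muLawData.length, 4); // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                   // fmt chunk size
  header.writeUInt16LE(7, 20);                    // audio format code 7 = μ-law
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(muLawData.length, 40);     // payload size in bytes

  return Buffer.concat([header, muLawData]);
}
```

With 8-bit mono samples at 8000 Hz, the byte rate equals the sample rate, so one second of audio is 8000 bytes of payload.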

The result is a single mono file that contains the caller stream followed by the AI stream (concatenated, not mixed), so the two legs play back one after the other. This is intentionally simple: most downstream tools (transcription services, players) read μ-law WAV without complaint. If you need stereo with the caller and AI on separate tracks, decode both buffers to PCM and interleave them before calling createWavBuffer.
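
One way the stereo variant might look, as a sketch only: decode each μ-law byte to 16-bit PCM with the standard G.711 expansion, then interleave caller samples on the left channel and AI samples on the right. The function names are hypothetical, and the sketch assumes the two buffers start at the same instant, which the gateway does not guarantee; the shorter leg is padded with silence.

```typescript
// Standard G.711 μ-law to 16-bit linear PCM expansion.
function muLawToPcm16(muLawByte: number): number {
  const u = ~muLawByte & 0xff;            // μ-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Interleave two mono μ-law buffers into stereo 16-bit little-endian PCM (L = caller, R = AI).
function interleaveStereoPcm16(caller: Buffer, ai: Buffer): Buffer {
  const frames = Math.max(caller.length, ai.length);
  const out = Buffer.alloc(frames * 4);   // 2 channels x 2 bytes per frame
  for (let i = 0; i < frames; i++) {
    const left = i < caller.length ? muLawToPcm16(caller[i]) : 0; // pad with silence
    const right = i < ai.length ? muLawToPcm16(ai[i]) : 0;
    out.writeInt16LE(left, i * 4);
    out.writeInt16LE(right, i * 4 + 2);
  }
  return out;
}
```

Note that this output is 16-bit PCM, so its WAV header would need 2 channels, 16 bits per sample, and format code 1 rather than the mono μ-law values in the table above. Whether createWavBuffer accepts those parameters is not stated here.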

Storage

The WAV is stored via the configured file provider:

const fileProvider = this.repositoryService.getFileProvider();
await fileProvider.put(orgId, `callrecordings/${callSid}.wav`, wavBuffer, false,
  { contentType: 'audio/wav' }, true);

The default provider is S3 (configured in Upstream for the org). The path returned by put() is saved on the call_log record as recordingUrl.

The `call_log` record

Each call writes one call_log:

{
  name: callSid,
  callSid,
  sessionId: ivrContext?.sessionId || `session-${callSid}`,
  type: 'ai-voice',
  assistantId,
  from: ivrContext?.caller || '',
  to: ivrContext?.calledNumber || '',
  direction: ivrContext?.isOutbound ? 'outbound' : 'inbound',
  duration,                    // seconds
  startTime, endTime,          // ISO timestamps
  recordingUrl,                // when audio captured
  recordingDuration,
  hasRecording: true,
  status: 'completed',
  aiTranscript: [{ role: 'ai' | 'caller', text, timestamp }, ...],
  toolsUsed:    [{ tool, params, result, timestamp }, ...],
}

Read it via the data layer just like any other collection:

GET /repository/get/call_log/{id} (JWT)

curl "https://appengine.appmint.io/data/call_log?from=eq:%2B15551234567&limit=20" \
  -H "orgid: my-org" -H "Authorization: Bearer <jwt>"
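
The same query from TypeScript might look like the sketch below. The helper names are hypothetical; the host, path, and headers are taken from the curl example. `URLSearchParams` percent-encodes the leading `+` of an E.164 number, which otherwise tends to be decoded as a space on the server.

```typescript
// Hypothetical helper: builds the filtered call_log query URL.
function buildCallLogQueryUrl(baseUrl: string, from: string, limit: number): string {
  const params = new URLSearchParams({ from: `eq:${from}`, limit: String(limit) });
  return `${baseUrl}/data/call_log?${params.toString()}`;
}

// Hypothetical helper: fetches call logs for one caller number.
async function fetchCallLogs(orgId: string, jwt: string, from: string): Promise<unknown> {
  const url = buildCallLogQueryUrl('https://appengine.appmint.io', from, 20);
  const res = await fetch(url, {
    headers: { orgid: orgId, Authorization: `Bearer ${jwt}` },
  });
  if (!res.ok) {
    throw new Error(`call_log query failed with status ${res.status}`);
  }
  return res.json(); // response shape is not documented here, so it stays unknown
}
```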

Transcription source of truth

Transcription is inline — produced by the OpenAI Realtime session itself, not by a separate post-call STT pass. That means:

No separate transcription job runs
Latency: transcript items arrive during the call, not after
Coverage: only utterances OpenAI heard (server VAD trims silence)
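
As a rough illustration of how inline transcript items and tool calls could accumulate during the call, the sketch below folds Realtime events into the `aiTranscript` and `toolsUsed` arrays. The `CallJourney` class is hypothetical, and the event field names (`transcript`, `name`, `arguments`, `call_id`) are assumptions about the event payloads that should be checked against the Realtime API reference.

```typescript
type TranscriptEntry = { role: 'ai' | 'caller'; text: string; timestamp: string };
type ToolEntry = { tool: string; params: unknown; result?: unknown; timestamp: string };

// Assumed subset of the Realtime event payload used by this sketch.
interface RealtimeEvent {
  type: string;
  transcript?: string;
  name?: string;
  arguments?: string;
  call_id?: string;
}

class CallJourney {
  readonly aiTranscript: TranscriptEntry[] = [];
  readonly toolsUsed: ToolEntry[] = [];
  private readonly pendingTools = new Map<string, ToolEntry>();

  handle(event: RealtimeEvent, now: Date = new Date()): void {
    const timestamp = now.toISOString();
    switch (event.type) {
      case 'conversation.item.input_audio_transcription.completed':
        this.aiTranscript.push({ role: 'caller', text: event.transcript ?? '', timestamp });
        break;
      case 'response.output_audio_transcript.done':
        this.aiTranscript.push({ role: 'ai', text: event.transcript ?? '', timestamp });
        break;
      case 'response.function_call_arguments.done': {
        const entry: ToolEntry = {
          tool: event.name ?? 'unknown',
          params: JSON.parse(event.arguments || '{}'),
          timestamp,
        };
        this.toolsUsed.push(entry);
        if (event.call_id) this.pendingTools.set(event.call_id, entry);
        break;
      }
    }
  }

  // Attach the resolved result once the tool has actually run.
  resolveTool(callId: string, result: unknown): void {
    const entry = this.pendingTools.get(callId);
    if (entry) {
      entry.result = result;
      this.pendingTools.delete(callId);
    }
  }
}
```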

If you need a higher-fidelity transcript later (e.g., for compliance), run the saved WAV through any STT provider; the recording is preserved as μ-law so quality is the same as what the model heard.

Twilio-side recording (the alternative)

Some IVR actions (ivr_record, ivr_voicemail, ivr_transfer with record: true) use Twilio's native recording instead. When the recording is ready, Twilio posts its status (including the recording URL) to AppEngine via:

POST /connect/webhook/twilio/recording-status (no auth)

Or, for transcription:

POST /connect/webhook/twilio/transcription-status (no auth)

These webhooks live on the connect module — see crm/communications.service.ts and phone.controller.ts for the wiring. Recordings produced this way are linked to the same call_log (or to a dedicated voicemail record for ivr_voicemail) and stored in S3 too.
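
For orientation, a handler for the recording-status webhook would typically start by parsing Twilio's form-encoded callback body. The sketch below is not the code in phone.controller.ts; `parseRecordingStatus` and the `RecordingStatusEvent` shape are hypothetical, while `CallSid`, `RecordingSid`, `RecordingUrl`, `RecordingStatus`, and `RecordingDuration` are parameters Twilio sends with recording status callbacks.

```typescript
// Hypothetical normalized shape for a Twilio recording status callback.
interface RecordingStatusEvent {
  callSid: string;
  recordingSid: string;
  recordingUrl: string;
  status: string;
  durationSeconds: number;
}

// Twilio posts application/x-www-form-urlencoded bodies; URLSearchParams decodes them.
function parseRecordingStatus(formBody: string): RecordingStatusEvent {
  const params = new URLSearchParams(formBody);
  return {
    callSid: params.get('CallSid') ?? '',
    recordingSid: params.get('RecordingSid') ?? '',
    recordingUrl: params.get('RecordingUrl') ?? '',
    status: params.get('RecordingStatus') ?? '',
    durationSeconds: Number(params.get('RecordingDuration') ?? 0),
  };
}
```

The `callSid` is the natural key for linking the Twilio recording back to the matching call_log record.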

Inline gateway recording is available on every AI call, subject to the recording flags described earlier. Twilio-side recording is only available when the call passes through a Twilio recording verb (<Record>, <Dial record="...">). For the AI assistant path the inline route is the only one that fires, because Twilio is just streaming raw audio.

Journey-only saves

For calls where recording is disabled but the assistant still ran tools or generated transcript text, the gateway calls saveCallJourney instead. The same call_log shape is written without recordingUrl — useful for high-volume orgs that want analytics without the storage cost of audio.
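
The difference between the two save paths can be sketched as one record builder with an optional recording. The names `JourneySummary`, `SavedRecording`, and `buildCallLog` are hypothetical, and setting `hasRecording: false` on journey-only records is an assumption; the text above only states that recordingUrl is absent.

```typescript
// Hypothetical inputs; field names mirror the call_log shape shown earlier.
interface JourneySummary {
  callSid: string;
  assistantId: string;
  duration: number; // seconds
  aiTranscript: unknown[];
  toolsUsed: unknown[];
}
interface SavedRecording {
  recordingUrl: string;
  recordingDuration: number;
}

function buildCallLog(journey: JourneySummary, recording?: SavedRecording) {
  const base = {
    name: journey.callSid,
    callSid: journey.callSid,
    type: 'ai-voice' as const,
    assistantId: journey.assistantId,
    duration: journey.duration,
    status: 'completed' as const,
    aiTranscript: journey.aiTranscript,
    toolsUsed: journey.toolsUsed,
  };
  // With audio: attach the recording fields. Journey-only: omit them entirely.
  return recording
    ? { ...base, ...recording, hasRecording: true }
    : { ...base, hasRecording: false }; // assumption: flag is false when no audio was saved
}
```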

Call recording and transcription