# Voice Task 📞

Conducts an AI-powered voice conversation over a phone call.

- **Node type:** `voiceTask`
- **Category:** AI
- **Actor:** `voiceTask` (2 threads)
## Description
The Voice Task initiates or receives a phone call and runs an AI voice conversation. It combines:
- Text-to-Speech (TTS) — converts AI responses to audio
- Speech-to-Text (STT) — transcribes what the caller says
- LLM — the AI brain that decides what to say
- DTMF detection — touch-tone keypad input support
The Voice Task is designed for automated phone agents, IVR replacements, appointment reminders, outbound campaign calls, and voice-based intake flows.
## Properties

| Property | Type | Required | Description |
|---|---|---|---|
| `aiProviderConnectionId` | string | Yes | AI provider for conversation intelligence |
| `smsConnectionId` | string | Yes | SMS/voice provider integration (e.g., Twilio) |
| `toPhoneNumber` | text | Yes | Phone number to call in E.164 format. Supports `{varName}` |
| `voiceProvider` | select | No | Voice infrastructure provider (default: configured integration) |
| `ttsProvider` | select | No | TTS engine: `ELEVENLABS`, `GOOGLE`, `AWS_POLLY`, `OPENAI` |
| `ttsVoice` | text | No | Voice ID or name for the TTS engine |
| `language` | text | No | BCP-47 language code for STT/TTS (e.g., `en-US`, `es-ES`) |
| `transcriptionProvider` | select | No | STT engine: `DEEPGRAM`, `GOOGLE`, `WHISPER` |
| `speechModel` | text | No | Specific speech model ID |
| `interruptible` | checkbox | No | Allow the caller to interrupt the AI while it is speaking (default: false) |
| `interruptSensitivity` | select | No | `low`, `medium`, `high` — how easily speech interrupts TTS |
| `dtmfDetection` | checkbox | No | Enable DTMF (keypad press) detection |
| `systemPrompt` | textarea | No | System context for the AI voice agent |
| `agentName` | text | No | Name the AI agent introduces itself as |
| `agentRole` | text | No | Role description given to the AI as context |
| `greeting` | text | No | First thing the AI says when the call connects |
## Inputs

Example input values (workflow variables in `{braces}` are substituted at runtime):

```text
toPhoneNumber: {customerPhone}
greeting: Hello! This is Alex from {companyName}. I'm calling about your appointment on {appointmentDate}.
systemPrompt: You are Alex, a friendly appointment reminder agent for {companyName}.
  The customer is {customerName}. Their appointment is on {appointmentDate} at {appointmentTime}.
  Your goal: confirm or reschedule the appointment.
agentName: Alex
agentRole: Appointment Reminder Agent
```
## Outputs

When the call completes:

| Variable | Type | Description |
|---|---|---|
| `{callTranscript}` | string | Full transcript of the voice conversation |
| `{callDuration}` | number | Call duration in seconds |
| `{callStatus}` | string | `COMPLETED`, `NO_ANSWER`, `BUSY`, `FAILED` |
| `{callOutcome}` | string | AI-determined outcome (set by the AI's final response/tool call) |
| `{callSid}` | string | Provider call SID for tracking |
The AI can set custom outcome variables by returning structured JSON in its final turn, similar to AI Task structured output.
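For illustration only, a final AI turn that sets a custom outcome might return structured JSON like the following (the field names other than `callOutcome` are hypothetical, not a fixed schema):

```json
{
  "callOutcome": "RESCHEDULED",
  "requestedDate": "2025-03-14",
  "requestedTime": "10:30"
}
```

Per the note above, the extra fields become custom outcome variables available to downstream nodes.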
## Call Flow

```text
[Voice Task node reached]
        ↓
[Engine calls phone provider → initiates call to {toPhoneNumber}]
        ↓
[Call connects → TTS plays greeting]
        ↓
[STT transcribes caller speech → sent to LLM]
        ↓
[LLM generates response → TTS plays response]
        ↓
[Conversation continues until AI ends call or caller hangs up]
        ↓
[Call ends → transcript and outcome stored]
        ↓
[Workflow continues with {callTranscript}, {callStatus}, {callOutcome}]
```
## Connections

| Connection | Description |
|---|---|
| `sequenceFlow` (incoming) | Arrives from the previous node |
| `successFlow` | Call completed (regardless of call outcome) |
| `errorFlow` | Call could not be initiated, or a provider error occurred |
| `timeoutFlow` | Call timed out |
## DTMF Support

When `dtmfDetection` is enabled, the caller can press keypad keys during the call. The AI is informed of DTMF input and can branch the conversation accordingly.
Example prompt handling DTMF:

```text
systemPrompt: If the caller presses 1, confirm the appointment. If they press 2, start the reschedule flow. If they press 0, transfer to a human agent.
```
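A minimal properties fragment pairing that prompt with DTMF detection might look like this (other required properties omitted for brevity):

```json
{
  "dtmfDetection": true,
  "systemPrompt": "If the caller presses 1, confirm the appointment. If they press 2, start the reschedule flow. If they press 0, transfer to a human agent."
}
```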
## Supported Providers
| Type | Providers |
|---|---|
| Voice/Telephony | Twilio, Vonage |
| Text-to-Speech | ElevenLabs, Google Cloud TTS, Amazon Polly, OpenAI TTS |
| Speech-to-Text | Deepgram, Google Cloud STT, OpenAI Whisper |
| AI Brain | OpenAI, Anthropic, Google Gemini (via AI Provider integration) |
Mix and match providers — e.g., use Twilio for telephony + Deepgram for transcription + GPT-4o for AI.
## Example: Appointment Reminder

```json
{
  "nodeId": "voice-reminder-1",
  "name": "Call Patient for Appointment Reminder",
  "nodeType": "voiceTask",
  "properties": {
    "aiProviderConnectionId": "int_openai",
    "smsConnectionId": "int_twilio",
    "toPhoneNumber": "{patientPhone}",
    "greeting": "Hello, may I speak with {patientName}? This is a reminder call from {clinicName}.",
    "systemPrompt": "You are a polite appointment reminder agent for {clinicName}. The patient's name is {patientName}. Their appointment is with Dr. {doctorName} on {appointmentDate} at {appointmentTime}. Confirm the appointment and offer to reschedule if needed. Be brief and professional.",
    "agentName": "Scheduling Assistant",
    "language": "en-US",
    "ttsProvider": "ELEVENLABS",
    "ttsVoice": "rachel",
    "transcriptionProvider": "DEEPGRAM",
    "interruptible": true,
    "dtmfDetection": true
  },
  "timeout": {
    "duration": 5,
    "durationUom": "MINUTES",
    "action": "FAIL"
  }
}
```
## Best Practices

- Always set a timeout — calls can last indefinitely without one
- Enable `interruptible` for a natural conversation feel
- Test in the DEVELOPMENT environment with your own phone number before going live
- Keep `systemPrompt` focused on one goal — multi-purpose voice agents confuse callers
- Capture `{callTranscript}` for compliance and quality review
- Connect `errorFlow` to handle failed calls (busy, no answer) with a follow-up SMS
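As a sketch of the last point, the `errorFlow` could route to a follow-up SMS node. The `smsTask` node type and its properties below are assumptions for illustration, not a documented schema — adapt them to your platform's actual SMS node:

```json
{
  "nodeId": "sms-followup-1",
  "name": "Send Follow-up SMS",
  "nodeType": "smsTask",
  "properties": {
    "smsConnectionId": "int_twilio",
    "toPhoneNumber": "{patientPhone}",
    "message": "We tried to reach you about your appointment on {appointmentDate}. Please call us back to confirm or reschedule."
  }
}
```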