Conversation Orchestrator: Streaming Bidirectional Audio

Note: The "Connect Virtual Agent" component has additional usage costs. For detailed information on its pricing, please contact your Customer Success Manager.

Talkdesk Conversation Orchestrator enables real-time audio streaming between Talkdesk and third-party solutions.

It lets you share the audio of an inbound call with an external system, such as a third-party Virtual Agent (VA), so that you can “Bring Your Own Bot” (BYOB) into Talkdesk. The Connect Virtual Agent Studio component streams the audio bidirectionally. When you choose the “External Voice Stream” option on the component, the call audio is streamed to a third-party WebSocket, together with orchestration messages and contextual information about the voice interaction.

Several types of events occur during the stream's life cycle. These events are represented as WebSocket messages:

  • Connected: The first message sent, once a WebSocket connection is established.
  • Start: The message containing necessary metadata about the conversation stream that is sent immediately after the “Connected” message. It is only sent once, at the beginning of the conversation stream.
  • Media: The message that encapsulates the raw audio data.
  • Stop: The message sent when the conversation stream is stopped or the call has ended.
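
As a rough illustration of the life cycle above, the following Python sketch dispatches incoming messages by event type. It assumes each message arrives as a JSON text frame with an "event" field, and that the Start message carries a "streamSid" and a "start" object holding "customParameters", as in the TwiML <Stream> format. The StreamSession class and its attribute names are hypothetical and are not part of Talkdesk.

```python
import json


class StreamSession:
    """Tracks the state of one conversation stream (illustrative only)."""

    def __init__(self):
        self.connected = False
        self.stream_sid = None
        self.custom_parameters = {}
        self.media_count = 0
        self.stopped = False

    def handle(self, raw_message):
        """Update the session from one JSON text frame and return its event name."""
        message = json.loads(raw_message)
        event = message.get("event")
        if event == "connected":
            self.connected = True
        elif event == "start":
            # Assumed layout: stream metadata lives under the "start" key.
            start = message.get("start", {})
            self.stream_sid = message.get("streamSid") or start.get("streamSid")
            self.custom_parameters = start.get("customParameters", {})
        elif event == "media":
            # A real handler would forward the audio payload to the bot here.
            self.media_count += 1
        elif event == "stop":
            self.stopped = True
        return event
```

In a real service, handle() would be called for every text frame received on the WebSocket, whichever WebSocket library is used.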

When using a bidirectional audio stream, you can send the following WebSocket messages from your system to Talkdesk:

  • Media: A message in which you can encapsulate the raw audio data.
  • Mark: A message that you can send after a Media message to be notified when playback of the audio you sent has completed.
  • Stop: A message indicating that the audio stream should stop, along with the result of the operation.
  • Clear: A message that interrupts the audio of all buffered Media messages.

The following diagram shows how both unidirectional and bidirectional audio streams work over WebSocket messages:

image1.png

To understand the structure of these WebSocket messages, please review the WebSocket Messages protocols section below.

The “Connect Virtual Agent” component can be added at any step of the Studio flow and can be configured as follows.

By leveraging the “stream and hold” capability, the stream starts when the call reaches the component and stops only when you send a “Stop” message that reports success, failure, or the need to escalate the call to an agent.

image2.png

1. Add a Connect Virtual Agent component to an exit of any Studio component. In this case, we will add it to the “OK” exit of the "Initial step" component for demonstration purposes.

2. On the “Connect Virtual Agent” component, select “External Voice Stream” and configure the “Voice stream URL” to the WSS connection you want the audio to be streamed to and from. The selection of “External Voice Stream” activates the Conversation Orchestrator that starts triggering different events represented via WebSocket messages.

3. If you need the call to be escalated to a live agent, then configure the “Escalation” exit and add an Assignment and Dial component step.

4. Set up the “Connect Virtual Agent” exits, depending on the flow you would like to define. In this example, the “Execution Error” exit was also configured to the “Assignment and Dial” so, in case of an error, an agent helps the caller. 

 

WebSocket Messages Protocols 

The WebSocket message protocols are based on Talkdesk Global Communications Network (GCN) TwiML™️ Voice: <Stream>.

 

Messages sent from Talkdesk to the Partner

Connected Message

"Connected" is the first message sent once a WebSocket connection is established.

For details on the message format and parameters, please see the documentation here.

 

Start Message

The "Start" message contains important metadata about the conversation stream and is sent immediately after the "Connected" message. It is only sent once, at the start of the conversation stream.

For details on the message format and parameters, please see the documentation here.

To enrich the "Start" message produced by Talkdesk Global Communications Network (GCN), we add the following information to the "customParameters" section of the payload:

  • extra_parameters: deprecated field
  • account_id: The Talkdesk ID for the account.
  • interaction_id: The unique ID of that Talkdesk interaction.
  • stream_url: The WebSocket URL where the audio is being streamed.
  • correlation_id: The ID that identifies the call throughout its lifetime, across all interaction_id values associated with that call.
  • business_hours: Empty for unidirectional audio streams. For bidirectional streams, it carries the agent's business hours information.
  • type: The media flow, either inbound or outbound.
  • initial_timestamp: The timestamp of the moment the stream started.
  • flow_id: deprecated field

Example:

"customParameters": {
    "extra_parameters": {
        "initial_timestamp": "1668428901027",
        "flow_id": ""
    },
    "account_id": "5cee471c844dda000d67a428",
    "interaction_id": "073c9e0e1ab44c8a8085da2b08c1ecf9",
    "stream_url": "wss://my.service.com/socket/messages",
    "correlation_id": "614c537a2021746aead25356",
    "business_hours": "",
    "type": "inbound",
    "initial_timestamp": "2022-01-16T16:12:47.254Z",
    "flow_id": ""
}
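
To show how a partner service might read these fields, here is a minimal Python sketch that pulls the non-deprecated parameters out of a "Start" message. It assumes that "customParameters" sits inside a "start" object in the message, as in the TwiML <Stream> format. The StreamContext class and the context_from_start function are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class StreamContext:
    """The non-deprecated customParameters fields listed above."""
    account_id: str
    interaction_id: str
    correlation_id: str
    stream_url: str
    media_type: str       # the "type" field: "inbound" or "outbound"
    business_hours: str
    initial_timestamp: str


def context_from_start(raw_message):
    """Build a StreamContext from the JSON text of a "Start" message."""
    message = json.loads(raw_message)
    # Assumption: customParameters is nested under the "start" object.
    params = message.get("start", {}).get("customParameters", {})
    return StreamContext(
        account_id=params.get("account_id", ""),
        interaction_id=params.get("interaction_id", ""),
        correlation_id=params.get("correlation_id", ""),
        stream_url=params.get("stream_url", ""),
        media_type=params.get("type", ""),
        business_hours=params.get("business_hours", ""),
        initial_timestamp=params.get("initial_timestamp", ""),
    )
```

Missing fields default to an empty string, so the sketch tolerates a unidirectional stream in which business_hours is empty.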

 

Media Message

The "Media" message encapsulates the raw audio data.

For details on the message format and parameters please see the documentation here.

 

Mark Message

When you want to be notified that the audio you have streamed has been completed, send a “Mark” message after the “Media” message.

You will receive a “Mark” event with the matching name from Talkdesk when the audio ends or if there is no buffered audio.

If a “Clear” message was sent, you will also receive a “Mark” event when the buffer clears.

For details on the message format and parameters please see the documentation here.


Stop Message

The "Stop" message will be sent when the conversation stream is either stopped or the call has ended.

For details on the format and parameters of the message sent by Talkdesk, please see the documentation here.

 

Messages a Partner can send to Talkdesk 

Media Message

Use the “Media” message to send an audio stream from your system to Talkdesk.

The media messages will be buffered and played in the order received. To interrupt the buffered audio, you need to send a “Clear” message.

For details on the message format and parameters please see the documentation here.
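
The sketch below shows one way to split raw audio into “Media” messages so that they are buffered in order. It assumes the outbound message has the shape used by the TwiML <Stream> format: an "event" of "media", a "streamSid", and a "media" object whose "payload" is base64-encoded audio. The media_messages function and the default chunk size of 160 bytes are arbitrary choices for illustration, and the audio encoding expected by Talkdesk should be confirmed in the linked documentation.

```python
import base64
import json


def media_messages(stream_sid, audio, chunk_size=160):
    """Yield one JSON "Media" message per chunk of raw audio bytes, in order."""
    for offset in range(0, len(audio), chunk_size):
        chunk = audio[offset:offset + chunk_size]
        yield json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            # Assumption: the payload field carries base64-encoded audio.
            "media": {"payload": base64.b64encode(chunk).decode("ascii")},
        })
```

Each yielded string would be sent as one WebSocket text frame. Because the messages are played in the order received, sending them in generator order preserves the original audio sequence.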


Mark Message

When you want to be notified that the audio you have streamed has been completed, send a “Mark” message after the “Media” message.

You will receive a “Mark” event with the matching name from Talkdesk when the audio ends or if there is no buffered audio.

If a “Clear” message was sent, you will also receive a “Mark” event when the buffer clears.

For details on the message format and parameters please see the documentation here.
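
A partner service usually needs to remember which marks it has sent so that it can match them against the “Mark” events coming back. The following Python sketch does that bookkeeping. It assumes the mark name travels in a "mark" object with a "name" field, as in the TwiML <Stream> format. The MarkTracker class is a hypothetical helper, not a Talkdesk API.

```python
import json


class MarkTracker:
    """Remembers sent mark names until the matching "Mark" event arrives."""

    def __init__(self, stream_sid):
        self.stream_sid = stream_sid
        self.pending = set()

    def mark_message(self, name):
        """Return the JSON "Mark" message to send after a "Media" message."""
        self.pending.add(name)
        return json.dumps({
            "event": "mark",
            "streamSid": self.stream_sid,
            "mark": {"name": name},  # assumed field layout
        })

    def on_message(self, raw_message):
        """Return the mark name if this frame acknowledges a pending mark, else None."""
        message = json.loads(raw_message)
        if message.get("event") != "mark":
            return None
        name = message.get("mark", {}).get("name")
        if name in self.pending:
            self.pending.discard(name)
            return name
        return None
```

For example, sending mark_message("greeting-done") right after the greeting audio lets the service know when the caller has heard the whole greeting.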


Clear Message

To interrupt the audio stream, send a “Clear” message. This cancels all buffered “Media” messages that have not yet been played.

It empties the audio buffer and causes a “Mark” event to be sent back to you.

For details on the message format and parameters, please see the documentation here.
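
A typical use of “Clear” is barge-in: the caller starts talking while the bot's audio is still buffered. The Python sketch below builds and sends the message. It assumes the message only needs an "event" of "clear" and the "streamSid", as in the TwiML <Stream> format. The interrupt_playback function and its send parameter (any callable that writes one text frame to the WebSocket) are hypothetical.

```python
import json


def interrupt_playback(send, stream_sid):
    """Send a "Clear" message through `send` and return the JSON text that was sent."""
    message = json.dumps({"event": "clear", "streamSid": stream_sid})
    send(message)
    return message
```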


Stop Message

Send a “Stop” message if you want to stop the audio stream and communicate the result of the operation:

  • ok - The audio stream was successful. 
  • error - There was an error during the audio stream.
  • escalate - The audio stream should stop and the call should be escalated to a live agent.

The following table shows the parameters to send in a “Stop” message:

| Parameter | Description |
| --- | --- |
| event | The type of event. |
| stop | An object containing stop metadata and payload information. |
| stop.command | One of the "ok"/"error"/"escalate" options. Depending on the command, a different exit option is followed in the “Connect Virtual Agent” component. |
| stop.ringGroup | In case of escalation, this parameter indicates the ring group to which the call is redirected. |
| streamSid | The SID of the stream that should be stopped. |

 

Example:

{
 "event": "stop",
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
 "stop": {
   "command": "escalate",
   "ringGroup": "agents"
 }
}
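
The parameters above can be assembled with a small helper like the following Python sketch, which rejects any command other than the three listed. The stop_message function is a hypothetical name. Including "ringGroup" only for the "escalate" command is an assumption based on the parameter description, and whether the ring group is mandatory on escalation is not stated here.

```python
import json

VALID_COMMANDS = ("ok", "error", "escalate")


def stop_message(stream_sid, command, ring_group=None):
    """Return the JSON text of a "Stop" message for the given stream."""
    if command not in VALID_COMMANDS:
        raise ValueError(f"command must be one of {VALID_COMMANDS}, got {command!r}")
    stop = {"command": command}
    # Assumption: ringGroup is only meaningful when escalating to a live agent.
    if command == "escalate" and ring_group is not None:
        stop["ringGroup"] = ring_group
    return json.dumps({"event": "stop", "streamSid": stream_sid, "stop": stop})
```

Calling stop_message("MZ18ad3ab5a668481ce02b83e7395059f0", "escalate", "agents") produces a message with the same content as the example above.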