Conversation Orchestrator: Streaming Unidirectional Audio

Note: The "Stream audio unidirectional" component has additional usage costs. 

To stream the audio of voice calls from Talkdesk to a Partner, use the “Stream audio unidirectional” component. It uses a secure WebSocket (WSS) connection and can start streaming the audio at any point in the call flow.

The stream of audio from Talkdesk to a Partner does not interfere with the normal flow of the call, because the stream only mirrors the audio to your system.

The “Stream audio unidirectional” component triggers different events during the conversation stream’s lifecycle. These events are represented via the following WebSocket messages:

  • Connected: The first message sent, once a WebSocket connection is established.
  • Start: This message contains important metadata about the conversation stream and is sent immediately after the Connected message. It is only sent once at the beginning of the conversation stream.
  • Media: This message encapsulates the raw audio data.
  • Stop: Sent when the conversation stream is stopped or the call has ended.

To understand the structure of these WebSocket messages, please review the WebSocket Messages protocols section.
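
The lifecycle listed above can be checked with a small helper. The following is a minimal sketch, not part of the Talkdesk product: the function name `follows_lifecycle` is hypothetical, and it assumes the receiver has already collected the `event` value of each message in arrival order. A stream that is still in progress is treated as valid even though no Stop message has arrived yet.

```python
# Hypothetical helper: checks that a list of event names follows the
# documented lifecycle: connected, start, zero or more media, then stop.
def follows_lifecycle(events):
    # Connected must come first and Start must follow it immediately.
    if len(events) < 2 or events[0] != "connected" or events[1] != "start":
        return False
    rest = events[2:]
    # Assumption: a stream that is still running has not sent "stop" yet.
    if rest and rest[-1] == "stop":
        rest = rest[:-1]
    # Everything between Start and the optional Stop must be Media.
    return all(name == "media" for name in rest)
```

For example, `follows_lifecycle(["connected", "start", "media", "stop"])` is accepted, while a sequence that begins with `"start"` is rejected.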

Note: The component supports only unidirectional audio streaming, and it streams both channels: agent and caller. Bidirectional audio streaming is currently available through the Connect to Autopilot Voice Studio component, as explained in Conversation Orchestrator: Streaming Bidirectional Audio.

The “Stream audio unidirectional” component can be added to any step of the Studio flow, and the stream will start immediately.

 

Stream_audio_1.png

1. Add a "Stream audio unidirectional" component to the “Initial step” exit for successful outcomes (exit “OK”).

2. On the “Stream audio unidirectional” step, set the “Stream URL” to the WSS endpoint to which you want the audio to be streamed.

stream_audio_2.png

3. Set up the “Stream audio unidirectional” exits, depending on the flow you would like to define.

In this example, an “Assignment and Dial” component is connected to the “Successful” exit of the “Stream audio unidirectional” step, so that both the agent and caller audio are streamed.
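
Before entering a value in the “Stream URL” field, it can help to sanity-check it. The sketch below is illustrative only: the function name `is_valid_stream_url` is hypothetical, and it assumes that only the `wss` scheme is acceptable, since this article mentions WSS connections exclusively. It does not test whether the endpoint is reachable.

```python
from urllib.parse import urlparse


def is_valid_stream_url(url):
    """Return True if the URL uses the wss scheme and names a host."""
    parts = urlparse(url)
    # Assumption: plain "ws" or "https" URLs are not suitable Stream URLs.
    return parts.scheme == "wss" and bool(parts.hostname)
```

With this check, `wss://my.service.com/socket/messages` passes, and `ws://my.service.com/socket/messages` does not.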

 

WebSocket Messages protocols 

Each message sent is a JSON string. You can determine which type of event is occurring by using the event property of every JSON object.
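
As a minimal sketch of that dispatch step, the function below parses one text frame and returns its event type together with the parsed object. The name `classify_message` is hypothetical, and rejecting unknown event values is a design choice of this example rather than documented behavior.

```python
import json

# The four event values described in this article.
KNOWN_EVENTS = {"connected", "start", "media", "stop"}


def classify_message(raw):
    """Parse one WebSocket text frame and return (event, message)."""
    message = json.loads(raw)
    event = message.get("event")
    if event not in KNOWN_EVENTS:
        # Assumption: anything outside the documented set is treated as an error.
        raise ValueError(f"unexpected event: {event!r}")
    return event, message
```

A receiver would typically call `classify_message` on every incoming frame and route the result to a handler for that event type.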

Connected Message

"Connected" is the first message sent once a WebSocket connection is established.

| Parameter | Description |
| --- | --- |
| event | The value of connected. |
| protocol | Defines the protocol for the WebSocket connection's lifetime, e.g. "Call". |
| version | Semantic version of the protocol. |

 

Example: 

{ 
"event": "connected",  
"protocol": "Call", 
"version": "1.0.0"
}
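
Because the version field is a semantic version, a receiver may want to compare it numerically. The following sketch assumes a plain MAJOR.MINOR.PATCH string with no pre-release or build suffix, as in the example above; the function name `parse_protocol_version` is hypothetical.

```python
def parse_protocol_version(connected):
    """Return the protocol version of a parsed Connected message as a tuple."""
    if connected.get("event") != "connected":
        raise ValueError("not a connected message")
    # Assumption: the version is exactly three dot-separated integers.
    major, minor, patch = (int(part) for part in connected["version"].split("."))
    return major, minor, patch
```

Tuples compare element by element, so a check such as `parse_protocol_version(msg) >= (1, 0, 0)` works as expected.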

 

Start Message

The "Start" message contains important metadata about the conversation stream and is sent immediately after the "Connected" message. It is only sent once, at the start of the conversation stream.

| Parameter | Description |
| --- | --- |
| event | The value of start. |
| sequenceNumber | Number used to keep track of the message sending order. The first message starts with "1", and the number is incremented for each subsequent message. |
| start | An object containing Stream metadata. |
| start.streamSid | The unique identifier of the Stream. |
| start.accountSid | The Account identifier that created the Stream. |
| start.callSid | The Call identifier from where the Stream was started. |
| start.tracks | An array of values that indicates what media flows to expect in subsequent messages. Values include inbound, outbound. |
| start.customParameters | An object that represents the Custom Parameters set when defining the Stream (explained below). |
| start.mediaFormat | An object containing the format of the payload in the Media messages. |
| start.mediaFormat.encoding | The encoding of the data in the upcoming payload. The value is always audio/x-mulaw. |
| start.mediaFormat.sampleRate | The sample rate, in hertz, of the upcoming audio data. The value is always 8000. |
| start.mediaFormat.channels | The number of channels in the input audio data. The value is always 1. |
| streamSid | The unique identifier of the Stream. |

 

Example: 

{ 
"event": "start",  
"sequenceNumber": "2", 
"start": { 
  "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0", 
  "accountSid": "AC123", 
  "callSid": "CA123", 
  "tracks": [ 
    "inbound", 
    "outbound" 
  ],
  "customParameters": {
    "extra_parameters": {},
    "account_id": "5ea75fe7aa224700012eae40",
    "interaction_id": "cadc093d381648e98e520739630c47ff",
    "stream_url": "wss://my.service.com/socket/messages",
    "correlation_id": "72d64225890846c39a05616d08d5d5a1",
    "type": "inbound" 
  },
  "mediaFormat": { 
    "encoding": "audio/x-mulaw", 
    "sampleRate": 8000, 
    "channels": 1 
  } 
},
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}

To enrich the Start message produced by the Talkdesk Global Communications Network (GCN), the following information is added to the "customParameters" section of the payload:

  • extra_parameters: Deprecated field.
  • account_id: The Talkdesk ID for the account.
  • interaction_id: The unique ID of that Talkdesk interaction.
  • stream_url: The WebSocket URL where the audio is being streamed.
  • correlation_id: The ID that identifies the call throughout its lifetime, across all the interaction_id values that belong to that call.
  • type: The media flow, which is either inbound or outbound.

Example:

"customParameters": {
     "extra_parameters": {},
     "account_id": "5ea75fe7aa224700012eae40",
     "interaction_id": "cadc093d381648e98e520739630c47ff",
     "stream_url": "wss://my.service.com/socket/messages",
     "correlation_id": "72d64225890846c39a05616d08d5d5a1",
     "type": "inbound"
}
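
To show how a receiver might keep the Start metadata for later use, here is a minimal sketch that copies the documented fields into a small record. The `StreamInfo` class and `parse_start` function are hypothetical names, and the sample is a shortened copy of the example above. Custom parameters are read with `.get`, on the assumption that a receiver should tolerate a missing key.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StreamInfo:
    stream_sid: str
    call_sid: str
    tracks: List[str]
    interaction_id: Optional[str]
    correlation_id: Optional[str]
    encoding: str
    sample_rate: int
    channels: int


def parse_start(raw):
    """Build a StreamInfo from the JSON text of a Start message."""
    message = json.loads(raw)
    if message.get("event") != "start":
        raise ValueError("not a start message")
    start = message["start"]
    custom = start.get("customParameters", {})
    media_format = start["mediaFormat"]
    return StreamInfo(
        stream_sid=start["streamSid"],
        call_sid=start["callSid"],
        tracks=list(start["tracks"]),
        interaction_id=custom.get("interaction_id"),
        correlation_id=custom.get("correlation_id"),
        encoding=media_format["encoding"],
        sample_rate=int(media_format["sampleRate"]),
        channels=int(media_format["channels"]),
    )


# Shortened copy of the Start example from this article.
SAMPLE_START = """
{"event": "start", "sequenceNumber": "2",
 "start": {"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
           "accountSid": "AC123", "callSid": "CA123",
           "tracks": ["inbound", "outbound"],
           "customParameters": {
               "interaction_id": "cadc093d381648e98e520739630c47ff",
               "correlation_id": "72d64225890846c39a05616d08d5d5a1"},
           "mediaFormat": {"encoding": "audio/x-mulaw",
                           "sampleRate": 8000, "channels": 1}},
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"}
"""
```

Calling `parse_start(SAMPLE_START)` yields a record whose `correlation_id` can be used to group every interaction that belongs to the same call.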

Media Message

The "Media" message encapsulates the raw audio data.

| Parameter | Description |
| --- | --- |
| event | The value of media. |
| sequenceNumber | Number used to keep track of the message sending order. The first message starts with "1", and the number is incremented for each subsequent message. |
| media | An object containing media metadata and payload. |
| media.track | One of inbound or outbound. |
| media.chunk | The chunk for the message. The first message begins with "1", and the value is incremented with each subsequent message. |
| media.timestamp | Presentation timestamp, in milliseconds, from the start of the stream. |
| media.payload | Raw audio encoded in base64. |
| streamSid | The unique identifier of the Stream. |

Example:

{ 
"event": "media",
"sequenceNumber": "3", 
"media": { 
  "track": "outbound", 
  "chunk": "1", 
  "timestamp": "5",
  "payload": "no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiI2ZuyYSCwcGCA0YNamUi4eGiIyXrywVDAcGBwwVLK+XjIiGh4uUqTUYDQgGBwsSJruZjYiGh4qRokQbDgkGBgoQH9KcjomGhomPnv8eDwkGBgkOHFKfkIqGhomOm8QiEQoHBggNGTumkouHhoiNmLUpFAsHBggMFy+slYyHhoeMlawvFwwIBgcLFCm1mI2IhoeLkqY7GQ0IBgcKESLEm46JhoaKkJ9SHA4JBgYJDx7/no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiA=="
 } ,
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
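
Since the payload is base64-encoded audio/x-mulaw at 8000 Hz with one channel, each decoded byte is one audio sample. The sketch below decodes a Media message into signed 16-bit linear PCM values using the standard G.711 mu-law expansion. The function names are hypothetical, and the expansion is implemented by hand so that the example does not depend on any audio library.

```python
import base64

SAMPLE_RATE_HZ = 8000  # documented as always 8000 for this stream


def mulaw_byte_to_pcm16(value):
    """Expand one 8-bit mu-law code to a signed 16-bit linear sample (G.711)."""
    value = ~value & 0xFF  # mu-law codes are stored bit-inverted
    # Rebuild the magnitude from the 4-bit mantissa and 3-bit exponent.
    magnitude = (((value & 0x0F) << 3) + 0x84) << ((value & 0x70) >> 4)
    magnitude -= 0x84  # remove the encoder bias
    return -magnitude if value & 0x80 else magnitude


def decode_media(message):
    """Return (track, samples, duration_seconds) for one parsed Media message."""
    media = message["media"]
    encoded = base64.b64decode(media["payload"])
    samples = [mulaw_byte_to_pcm16(byte) for byte in encoded]
    # One channel and one byte per sample, so duration is bytes / sample rate.
    return media["track"], samples, len(samples) / SAMPLE_RATE_HZ
```

The returned samples can be packed into a WAV file or passed to a speech engine, and the `track` value indicates whether they belong to the inbound or outbound flow.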

 

Stop Message

The "Stop" message will be sent when the conversation stream is either stopped or the call has ended.

| Parameter | Description |
| --- | --- |
| event | The value of stop. |
| sequenceNumber | Number used to keep track of the message sending order. The first message starts with "1", and the number is incremented for each subsequent message. |
| stop | An object containing Stream metadata. |
| stop.accountSid | The Account identifier that created the Stream. |
| stop.callSid | The Call identifier that started the Stream. |
| streamSid | The unique identifier of the Stream. |

 

Example: 

{ 
"event": "stop",
"sequenceNumber": "5",
"stop": {
   "accountSid": "AC123",
   "callSid": "CA123"
 },
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0" 
}
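
The sequenceNumber field makes it possible to notice missing messages and to know when the stream has finished. The following is a minimal sketch with a hypothetical `SequenceTracker` class. It assumes that sequenceNumber arrives as a numeric string, as in the examples in this article, and it only records gaps rather than trying to recover from them.

```python
class SequenceTracker:
    """Tracks sequenceNumber values and reports when the stream stops."""

    def __init__(self):
        self.last = None
        self.gaps = []  # list of (previous, current) pairs that were not consecutive

    def observe(self, message):
        """Record one parsed Start, Media, or Stop message.

        Returns True when the message is a Stop message.
        """
        number = int(message["sequenceNumber"])
        if self.last is not None and number != self.last + 1:
            self.gaps.append((self.last, number))
        self.last = number
        return message["event"] == "stop"
```

A receiver could call `observe` for every message after Connected and close its side of the connection once the method returns True.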

For more information on Conversation Orchestrator, please consult our documentation.

 

 

 

 

 

 

 

 

 
