Voice Authentication FAQ – Knowledge Base

Generic

What is Voice Authentication?

Voice Authentication, a.k.a. Voice Biometrics, or speaker recognition, enables identity recognition based on voice characteristics.

It measures the individual’s unique characteristics when voicing any sound.

What is the difference between Active and Passive voice authentication?

Active voice authentication is a text-dependent system where the user is asked to repeat a passphrase for enrollment and validation.

Passive voice authentication is a text-independent system where the user speaks naturally while the system assesses the speaker’s voice.

To read more, please consult this article.

When should active or passive voice authentication be used?

The choice between active and passive is driven by use case and customer experience rather than by performance or security. In practice, Identity applies active voice authentication while the voice interaction is in the IVR, and passive voice authentication once the interaction has been escalated to a “live” call with an agent.
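The rule described above can be sketched in a few lines. This is only an illustration: the `Stage` enum and the `pick_authentication_mode` function are hypothetical names and are not part of any Talkdesk API.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stages of a voice interaction."""
    IVR = "ivr"
    LIVE_AGENT = "live_agent"


def pick_authentication_mode(stage: Stage) -> str:
    """Return 'active' while the caller is in the IVR, 'passive' on a live call."""
    if stage is Stage.IVR:
        return "active"
    return "passive"
```

For example, `pick_authentication_mode(Stage.LIVE_AGENT)` returns `"passive"`.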

Will I still be recognizable if I have a cold, or if I am taking some kind of medication?

Yes. Having a cold or being on medication does not change the measurable variables in your voice. Speaker recognition systems exploit information related to the shape of the full vocal tract (i.e., the physiological make-up of an individual), so the effect of a cold will be minimal.

Is it possible to fool a Voice Biometric system by mimicking a user’s voice?

When we mimic someone else’s voice, we copy high-level language-production features such as accent and mannerisms. It is easy to copy the way a person talks, but impossible to alter the way speech is physically produced, which is determined by the vocal tract.

Can a recording fool the system (aka Replay Attack)?

Using a recording device to play back another person’s voice is known as a Replay Attack.

No, the system can’t be fooled. It identifies recording devices using a number of techniques, including detecting the absence of the highest and lowest frequencies in the audio. Speaker recognition engines can detect whether these frequencies are present. Additionally, the process of replaying creates distortion in the audio, which is measurable by replay detection algorithms.

Can the system distinguish between twins?

Even though identical twins have the same genes, their vocal tracts will vary enough for the speaker recognition systems to distinguish between their voices.

What is the accuracy of voice authentication?

The accuracy of voice authentication, as part of Identity, depends on the quality of the user’s voiceprint and of the audio connection.

When these conditions do not impact the system's accuracy, Identity ensures that a “fraudster” is unable to deceive the authentication process, so the rightful owner of a voiceprint can authenticate nearly 100% of the time.

When does the user’s name show up in the Identity app and in the Identity tab in the Conversations app?

After the first validation is performed, i.e. the next time they call in after the enrollment is done.

What is “Call risk” and how is it calculated?

Please check the “How is the risk calculated?” helper in the Identity app, as shown below.

How is the “Voice Authentication” validated?

Please check the “How is authentication validated?” helper in the Identity app, as shown below.

Active voice authentication (aka "Passphrase")

How do I set up active Voice Authentication in Talkdesk Studio?

For active Voice Authentication, the admin can choose which settings to activate in both of the required Talkdesk Studio components.

For more information on Studio flow configuration, please consult this article.

What is the difference between a “Predefined” and a “Custom” passphrase?

A “Predefined” passphrase is a sentence that the system has been previously trained to detect and reason with. On the other hand, a “Custom” passphrase is a sentence that you created, so the system is not trained to detect it.

Which predefined passphrases are available?

Talkdesk Identity frequently updates the available passphrases. Please check the Enroll Voice Studio component for the current list, as shown below.

What does “no utterance” mean?

The “no utterance” information explains that the system will not perform:

An utterance check when the caller speaks the sentence.
A “second speaker” check when the user is enrolling into active Voice Authentication.

The information between parentheses is not part of the respective passphrase.

To know more about the main differences, read the article here.

If the “Custom” passphrase setting is chosen, what must be considered when choosing an effective passphrase?

User-friendliness is key, hence a passphrase should be easy for the user to repeat. However, the passphrase should have at least five non-repetitive words, including some variance in terms of “open” and “closed” vowels.
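The word-count part of this guidance can be checked automatically. The sketch below assumes that “at least five non-repetitive words” means at least five distinct words, and that the passphrase is written in plain ASCII English. The function names are hypothetical, and the check does not cover vowel variance, which needs phonetic analysis.

```python
import re

# Assumption: the guidance above is read as "at least five distinct words".
MIN_DISTINCT_WORDS = 5


def distinct_words(passphrase: str) -> set[str]:
    """Lower-case the passphrase and return its distinct words.

    Only ASCII letters and apostrophes are treated as word characters.
    """
    return set(re.findall(r"[a-z']+", passphrase.lower()))


def has_enough_distinct_words(passphrase: str) -> bool:
    """True when the passphrase contains at least MIN_DISTINCT_WORDS different words."""
    return len(distinct_words(passphrase)) >= MIN_DISTINCT_WORDS
```

A phrase such as “my voice is my password” has five words but only four distinct ones, so it fails this check.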

How many passphrase repetitions are required to enroll a caller?

For a successful enrollment, the number of passphrase repetitions varies between two and five, depending on factors such as audio quality, passphrase wording, and background noise.

Is it possible for customers already running active voice authentication with other vendors to migrate into our offer?

Yes, the system is capable of repurposing an existing client’s audio corpus (recordings) to recreate a client voice biometrics model. This service carries a cost and needs to be scoped by Professional Services.

What if the predefined passphrase is changed after users have already enrolled with another passphrase?

When a predefined passphrase is used, Identity stores which predefined passphrase each user enrolled with.

This means that, even if the passphrase is changed, users who were already enrolled with the previous predefined passphrase will still have to speak the “old” passphrase.
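The behavior described above can be pictured as a per-user lookup. This is a minimal sketch under assumed names: `Enrollment` and `passphrase_to_request` are hypothetical and do not reflect how Identity stores data internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Enrollment:
    """Hypothetical record kept for a user enrolled with a predefined passphrase."""
    user_id: str
    predefined_passphrase: str  # the passphrase spoken at enrollment time


def passphrase_to_request(enrollment: Optional[Enrollment], configured_passphrase: str) -> str:
    """Pick the passphrase a caller should be asked to speak.

    Enrolled users keep the passphrase they enrolled with, even if the
    configured passphrase has changed since; new users get the configured one.
    """
    if enrollment is not None:
        return enrollment.predefined_passphrase
    return configured_passphrase
```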


Passive voice authentication (aka "On Call")

In which part of the interaction can passive voice authentication be done?

Passive voice authentication can only be performed:

In a voice interaction.
Once the call is escalated, and after it is picked up by a live agent.
During a live call between a user and an agent.

When does the enrollment of passive voice authentication happen?

The system will automatically try to enroll the user and create a passive voice authentication voiceprint at the beginning of the live call, when (or immediately after) consent is granted.

The system needs roughly 30 seconds of active speech from the user in order to create the voiceprint.
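To illustrate the speech requirement above, the sketch below adds up the caller's active speech segments and compares the total against a 30-second target. The names are hypothetical, and the exact threshold is an assumption, since the text only says “roughly 30 seconds”.

```python
# Assumption: "roughly 30 seconds" is modeled as a fixed 30.0-second target.
REQUIRED_SPEECH_SECONDS = 30.0


def has_enough_speech(segment_seconds: list[float],
                      required: float = REQUIRED_SPEECH_SECONDS) -> bool:
    """True when the caller's active speech segments add up to the required total.

    Each entry is the duration, in seconds, of one stretch of caller speech;
    silence and agent speech are assumed to be excluded already.
    """
    return sum(segment_seconds) >= required
```

For instance, segments of 12.0, 10.5, and 8.0 seconds total 30.5 seconds and pass the check, while 10.0 and 5.0 seconds do not.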

To know more about passive voice authentication, click here.

When does validation of passive voice authentication happen?

The system will try to validate the user’s identity by performing passive voice authentication at the start of a live call, or immediately after the enrollment for passive voice authentication is finished.

The agent is then able to perform passive voice authentication on-demand by pushing the “Reauthenticate” button on the Identity tab in the Conversations app.

To leverage passive voice authentication, do we need to manually set up the streaming of the audio of the call?

No. To perform passive voice authentication, all that is needed is to set up the Authenticate voice component in Talkdesk Studio.

I only want to do passive voice authentication, but users keep being asked to enroll in the IVR after a live call is finished.

Because the same consent covers both active and passive voice authentication, the system cannot pass the user through the Enroll Voice component without prompting once the user has granted consent.

To prevent this from happening, please remove the Enroll Voice Studio component from the respective flow.

Published October 12, 2022 15:14 • Last Updated October 01, 2025 16:39

