A Definitive Guide to Voice User Interface Design (VUI)

The Voice User Interface or VUI has broken the silent interaction between machines and humans.

When we stop for a moment and think about how popular voice-controlled devices and virtual assistants have become, it makes sense to look deeper into VUI design and learn more.

Siri, which came first, the chicken or the egg?

Siri: No one knows this, but it was a tie.

A recent Google article titled “How voice assistance is reshaping consumer behavior” makes the sharp rise of smart speakers and VUIs even more obvious. According to the article, 41% of people who own a voice-controlled speaker say it feels like talking to another person.

That makes one wonder, is VUI reshaping the human-device relationship?

The answer is entirely at your discretion, but before rushing to one, let’s go through what a voice user interface is and how to create a VUI design.

What is Voice User Interface (VUI)? 

Voice User Interfaces (VUIs) make it possible for users to interact with a device or an app through voice commands. With the increased use of digital devices, screen fatigue has become a widely experienced issue, which has given the development and use of voice user interfaces all the more of an advantage. VUIs provide complete, hands-free control over devices and apps without having to look at the screen. The world’s leading companies, including all of the “Big Five” tech firms (Google, Amazon, Microsoft, Facebook, and Apple), have developed or are developing their own voice-enabled AI assistants and voice-controlled devices.

The most well-known voice user interface examples include Apple’s Siri, Google Assistant, and Amazon’s Alexa. Not only AI assistants but also smart devices with VUIs are taking over the market, such as Amazon Echo, Apple HomePod, and Google Home. 

Be it an AI assistant, a voice-enabled mobile app, or a voice-controlled device like smart speakers, voice interface and interactions have become incredibly common. 

I almost hear you asking, “how common exactly?”

We will talk more about the VUI’s popularity, but you can take a guess from the stats below.

According to a report, 1 in 4 adults in the US owns a smart speaker today, while one-third of the US population uses voice search features.

You need to understand what a voice interface is and how it works to be able to create a voice user interface design that doesn’t frustrate users and provides a smooth user experience. Now that you have the definition of a voice user interface, let’s dive into the next question: “How does a voice interface work?”

Buckle up; this will be a tiny pocket encyclopedia, only in fewer and simpler words. 

How Does a Voice Interface Work?

A voice UI is the result of combining several Artificial Intelligence (AI) technologies, including Speech Synthesis, Automatic Speech Recognition, and Named Entity Recognition. Voice UIs can be added to devices or built into applications.

The backend infrastructure and the VUI’s AI-backed speech components are often hosted in a private or public cloud, where the VUI processes the user’s speech. The AI technology interprets the user’s intent and returns a response to the device.

Those are the basics of a voice UI design. Most companies add a Graphical User Interface (GUI) and sound effects to their VUIs to provide the best user experience. Visuals and sound effects make it easier for the user to know when the device is listening, processing speech, or responding.
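To make the flow above a bit more concrete, here is a minimal Python sketch of one voice request passing through the stages: speech recognition, intent understanding, and a spoken reply, with status cues along the way. Every function name here is a stand-in invented for illustration; a real VUI would call cloud speech and language services instead of these simple stubs.

```python
from typing import List, Tuple


def recognize_speech(audio: bytes) -> str:
    """Stand-in for Automatic Speech Recognition (audio -> text)."""
    # Assumption: we pretend the audio is already text, to keep the sketch runnable.
    return audio.decode("utf-8")


def understand(text: str) -> str:
    """Stand-in for intent detection (text -> intent name)."""
    lowered = text.lower()
    if "light" in lowered and "off" in lowered:
        return "lights_off"
    return "unknown"


def respond(intent: str) -> str:
    """Pick the reply that speech synthesis would read aloud."""
    if intent == "lights_off":
        return "Okay, turning off the lights."
    return "Sorry, I didn't catch that."


def handle_voice_request(audio: bytes) -> Tuple[str, List[str]]:
    """Run one request and record the cues a GUI or sound effect could signal."""
    cues = ["listening"]
    text = recognize_speech(audio)   # 1. speech -> text
    cues.append("processing")
    intent = understand(text)        # 2. text -> intent
    reply = respond(intent)          # 3. intent -> reply
    cues.append("responding")
    return reply, cues
```

Calling handle_voice_request(b"Turn off the lights") returns the confirmation sentence together with the three cues in order, which is the kind of feedback visuals and sound effects are meant to surface.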

VUI Device Types

Today, a wide range of devices can include a VUI, such as:

  • Smartphones
  • Wearables like smart wristwatches
  • Desktop computers, laptops
  • Sound systems, smart TVs
  • Smart speakers
  • Internet of Things (IoT) devices such as locks, thermostats, and lights

Voice User Interface – Advantages and Disadvantages

Advantages of VUI

  • Faster than typing: Dictating is faster than typing text messages, making it more convenient for users.
  • Ease of use: Not all people can get along well with technological devices. But any user can use voice to request a task from VUI devices or AI assistants.
  • Hands-free: In some cases, such as driving, cooking, or when you are away from your device, speaking is much more practical than typing or tapping.
  • Eyes-free: VUI provides an eyes-free user experience. In cases like driving, you can focus on the road rather than the device. It is practical for screen fatigue issues as well.

Disadvantages of VUI

  • Privacy concerns: Potential privacy violations by a VUI concern some users.
  • Misinterpretation & lack of accuracy: Voice recognition software still has its flaws. The software often cannot fully understand and interpret the context of language, which causes errors and misinterpretation. Voice dictation may lead to mistyping since VUIs may not always differentiate homophones, such as “there” and “their.”
  • Public spaces: It can be hard to give voice commands to devices and AI assistants in public spaces because of privacy and noise issues.

Why Voice Interface Design Is The Next Big Thing

User interfaces or UI are the bridge that makes interaction possible between machines and humans.

One particular type of UI, the voice user interface, has exploded in popularity in recent years and, for many users, surpassed typing.

71% of users prefer voice search to typing their queries. Not only that, but the usage of voice-controlled smart speakers is also constantly increasing. More than half of the smart speaker owners in the US use their devices daily.

This expanding popularity of voice interfaces particularly interests the UX and UI designers. After all, the ultimate goal of both is to answer users’ needs and make the experience easy-flowing for them. 

In order to create a successful voice interface design, designers and developers need to understand the intricacies of human communication. Consumers expect a certain level of capability, a conversational tone, and fewer misinterpretations from AI assistants and smart devices as they interact daily. 

So, here comes the big question: how do you design a VUI that will bring value to users’ lives?

How to Design a Voice User Interface

At its heart, designing a voice interface is not so different from designing a GUI or any other UX project. We can break down the VUI design process into simpler steps.

Step 1: Conduct User Research

Start by using customer journey mapping to understand how the user persona and the assistant persona interact at various engagement stages.

Focus on observing and understanding the needs, motivations, and behaviors of the user. Include voice as a channel in your customer journey map to identify how and where voice can be used as an interaction method.

If the customer journey map has yet to be created, the designer should highlight where voice interactions could be implemented in the user flow as opportunities. If a customer journey map already exists, the designer should check whether voice interactions can improve the user flow.

Designers should focus on solving users’ problems.

For example, if your customer support team keeps getting the same question, that may be an opportunity to build that conversation into the voice app.

You might like: User Persona Examples for SaaS Products.

Understand the Device Persona

Apart from understanding the user persona, identify the capabilities and character of the device (e.g., Alexa).

Step 2: Conduct a VUI Competitor Analysis

Designers should conduct a VUI competitor analysis to see how competitors are implementing voice interactions.

Find out the use case for their app and the voice commands used in it, and check the reviews to see what their users think.

Step 3: Define Requirements

Define users’ pain points, needs, and requirements.

Other than conducting user research and competitor analysis, you can try interviewing and user testing. Capture different scenarios before turning them into conversation flows. Use flow maps to write down user requirements with user stories. Then, design dialog flows for each of them.

Next, go on to prototyping VUI conversations with dialog flows showing the interaction between the device and the user. 

How to Prototype VUI Conversations with Dialog Flows

Key points for creating successful VUI dialog flows:

  1. Keep the interaction conversational and simple,
  2. Create a strong error strategy,
  3. Confirm when a task is completed,
  4. Have an extra layer of strong security.
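As a rough illustration of points 2 and 3, the Python sketch below simulates a single "set a timer" task with a simple error strategy (reprompt with an example, then give up politely) and an explicit confirmation once the task is done. The task, the prompts, and the MAX_REPROMPTS limit are all assumptions made up for this example, and the security layer from point 4 is not covered here.

```python
from typing import List, Optional

MAX_REPROMPTS = 2  # hypothetical limit before the assistant stops asking


def timer_dialog(recognized_inputs: List[Optional[str]]) -> List[str]:
    """Return everything the assistant says while trying to set a timer.

    Each item is what speech recognition produced for one user turn;
    None means the assistant could not understand the user.
    """
    spoken = ["For how long should I set the timer?"]
    failures = 0
    for heard in recognized_inputs:
        if heard is None:
            failures += 1
            if failures > MAX_REPROMPTS:
                # Error strategy, last step: exit gracefully instead of looping.
                spoken.append("Sorry, I'm having trouble. Please try again later.")
                return spoken
            # Error strategy, first step: reprompt and show what works.
            spoken.append("Sorry, I didn't get that. You can say, for example, 'ten minutes'.")
            continue
        # Confirmation: tell the user the task is completed.
        spoken.append(f"Done. Your timer is set for {heard}.")
        return spoken
    return spoken
```

For instance, timer_dialog([None, "ten minutes"]) produces the opening question, one reprompt, and then the confirmation.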

VUI designers need to create dialog flows and the entire conversation between the system and the users. Dialog flows should successfully guide users. A dialog flow script is a deliverable that consists of:

  • Keywords that initiate the interaction, aka voice triggers like “Hello, Alexa.”
  • Branches that show where the conversation could lead to
  • Sample dialogs for the users and the AI assistant
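One possible way to capture these three parts in a machine-readable form is sketched below in Python. The skill name "Pizza Helper", the node names, and the dictionary layout are all hypothetical; prototyping tools use their own formats, so treat this only as a way to see how trigger, branches, and sample dialogs fit together.

```python
from typing import Dict, List, Tuple

dialog_flow: Dict = {
    "trigger": "Alexa, open Pizza Helper",  # hypothetical voice trigger
    "start": "ask_size",
    "nodes": {
        "ask_size": {
            "assistant": "What size pizza would you like?",
            # Branches: where each user reply leads next.
            "branches": {"small": "confirm", "large": "confirm"},
            "fallback": "ask_size",  # unrecognized reply: ask again
        },
        "confirm": {
            "assistant": "Got it. Your pizza is on its way.",
            "branches": {},  # end of the conversation
        },
    },
}


def walk(flow: Dict, user_replies: List[str]) -> List[Tuple[str, str]]:
    """Follow the branches and return the sample dialog as (speaker, line) pairs."""
    transcript: List[Tuple[str, str]] = []
    node_name = flow["start"]
    replies = iter(user_replies)
    while True:
        node = flow["nodes"][node_name]
        transcript.append(("assistant", node["assistant"]))
        if not node["branches"]:
            return transcript
        reply = next(replies, None)
        if reply is None:  # the user stopped talking
            return transcript
        transcript.append(("user", reply))
        node_name = node["branches"].get(reply.lower(), node["fallback"])
```

Walking the flow with a few scripted replies, such as walk(dialog_flow, ["huge", "small"]), is also a cheap way to read a conversation back before anyone builds it.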

A dialog flow is like a prototype, and it is a script that covers the back-and-forth conversation. Fortunately, there are prototyping apps that simplify the creation of dialog flows. 

Some of the apps for prototyping VUIs are listed below.

  • Voiceflow: Collaboration tool to design, prototype, and build voice apps for Google Assistant and Amazon Alexa.
  • Dialogflow: Google-owned platform for designing conversational user interfaces and integrating them into web apps, mobile apps, bots, and devices.
  • Speechly: Spoken language understanding solution to build voice user interfaces.

Amazon has its own Alexa Skill Builder to help designers create new Alexa Skills.

Learn more about product design tools: 16 Product Design Tools You Need in Your Arsenal.

Step 4: Testing

Testing the dialog flows between the system and the user is like a role play.

One person plays the device, and the other plays the user to see if the conversation flows successfully. 

Step 5: Understand the Anatomy of a Voice Command

When designing a VUI, designers constantly need to think about the possible interaction scenarios and each objective (i.e., what is the user trying to achieve in this scenario?)

So, when a user gives a voice command, it consists of three factors at its core: intent, utterance, and slot.

Intent

The intent is the objective of the user’s voice command. An intent can belong to either a low-utility or a high-utility interaction.

A high-utility interaction refers to very specific tasks, such as asking for a rock song to be played on Spotify or for the living room lights to be turned off.

A low-utility interaction, on the other hand, is about performing vaguer, harder-to-decipher tasks. For example, if a user asks for more information about a topic, the voice UI needs to check whether the topic is within its service scope and then ask more questions to understand the request and respond better.
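The low-utility case can be sketched in a few lines of Python: check the service scope first, then ask a follow-up question when the request is still too vague. The SERVICE_SCOPE set and the wording of the replies are assumptions for illustration only.

```python
from typing import Optional

SERVICE_SCOPE = {"weather", "traffic"}  # hypothetical topics this assistant covers


def answer_info_request(topic: str, detail: Optional[str] = None) -> str:
    """Handle a vague "tell me more about X" style request."""
    if topic not in SERVICE_SCOPE:
        # Outside the service scope: say so instead of guessing.
        return f"Sorry, I can't help with {topic} yet."
    if detail is None:
        # In scope but too vague: ask a clarifying question.
        return f"Sure. What would you like to know about the {topic}?"
    return f"Here is the {topic} information for {detail}."
```

A high-utility request would skip the clarifying branch entirely, because the user has already said exactly what they want.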

Utterance

The utterance is the way a user phrases or utters the voice command to trigger the task.

Some phrases for requests can be simple and easy to understand, like “Play me music on Spotify,” but voice UX designers need to consider other variations as well. For example, instead of saying “play…” a user can say “I want to hear music…” or “Could you play…”

The more variations designers consider, the more easily and accurately the AI will understand the request and respond with the right action.
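Here is a small Python sketch of that idea: several phrasings of the same request all map to one intent. Real voice platforms typically rely on trained language understanding models rather than hand-written patterns, so the regular expressions and the intent name "play_music" below are simplifying assumptions.

```python
import re

# Utterance variations that should all trigger the same (hypothetical) intent.
PLAY_MUSIC_PATTERNS = [
    r"^play\b",
    r"^i want to hear\b",
    r"^could you play\b",
]


def match_intent(utterance: str) -> str:
    """Return the intent name for an utterance, or "unknown" if nothing matches."""
    text = utterance.strip().lower()
    for pattern in PLAY_MUSIC_PATTERNS:
        if re.search(pattern, text):
            return "play_music"
    return "unknown"
```

Each variation a designer writes down becomes one more pattern (or one more training example) that the assistant can recognize.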

Slot

Slots are the required or optional variables that are requested from the user to fulfill the task.

For example, if a user requests “play me calming music,” the variable here is “calming.” Since the AI can also perform the request without the variable, this slot is optional. By contrast, if a user wants to book a restaurant reservation, the hour is a slot, and it is required.
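The difference between required and optional slots can be sketched in Python as follows. The Slot class, the intent names, and the prompt text are hypothetical; the point is only that a missing required slot triggers a follow-up question, while a missing optional slot does not.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    name: str
    required: bool
    prompt: str = ""  # question to ask when a required slot is missing


# Hypothetical slot definitions for the two examples in the text.
INTENT_SLOTS: Dict[str, List[Slot]] = {
    "play_music": [Slot("mood", required=False)],
    "book_restaurant": [
        Slot("time", required=True, prompt="What time would you like the reservation?")
    ],
}


def next_prompt(intent: str, filled: Dict[str, str]) -> Optional[str]:
    """Return the question for the first missing required slot, or None if ready."""
    for slot in INTENT_SLOTS[intent]:
        if slot.required and slot.name not in filled:
            return slot.prompt
    return None
```

With these definitions, next_prompt("play_music", {}) returns None because the mood is optional, while next_prompt("book_restaurant", {}) returns the question about the reservation time.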

Voice User Interface Examples

We have gone over what a voice user interface is and how you can create a voice user interface design. Let’s see some of the top examples of VUIs. Of course, the most used and popular ones are Siri, Alexa, Cortana, and Google Assistant. Which one do you think is the most competent voice assistant in the market?

Siri

Siri is Apple’s voice assistant that comes with Apple’s operating systems such as iOS, iPadOS, watchOS, macOS, and tvOS.

It was first released on October 4, 2011, and has been active ever since.

Alexa

Released in November 2014, Amazon’s Alexa was first used in Amazon’s Echo smart speakers.

It has now made its way onto most smart device operating systems, such as Android, iOS, and Fire OS.

Cortana

Microsoft’s voice assistant Cortana helps you be more productive by using the Bing search engine to do tasks like setting reminders and answering questions for you.

Google Assistant

Available on smart devices and home systems, Google Assistant is a virtual-assistant VUI designed and developed by Google.

According to research carried out by Loup Ventures, Google Assistant is the most competent one among these voice assistants. 

Conclusion

The key takeaways of this post are:

  • A Voice User Interface (VUI), or speech interface, is an interface that users operate through voice interaction.
  • It is different from a tangible user interface, which requires interactions with physical gestures, such as tapping or swiping.
  • To achieve a successful voice user interface design, designers need to thoroughly research and observe the user persona and the device persona, and create easy-flowing dialog flows.

Frequently Asked Questions


What is the goal of a Voice User Interface (VUI)?

A Voice User Interface is designed to make interacting with a device feel like a conversation and to help users easily complete tasks or search for information without using their hands or eyes.


What do people ask their Voice User Interface?

Voice User Interfaces such as Alexa, Siri, and Google Assistant can perform numerous tasks, so what people ask their VUI-based virtual assistants can vary from daily tasks to business-related search queries.


When was Voice User Interface created?

The first ever Voice User Interface was developed in a collaboration of Nuance and SpeechWorks through Interactive Voice Response (IVR) systems in 1984.

Mert Aktas

Mert is the Marketing Manager of UserGuiding, a code-free product walkthrough software that 2000+ companies trust in their user onboarding.