Conversation Design: Bridging Social Gaps With Voice Technology

Kezia Kong

29/12/21

5 mins read

Design trends

In a world where our devices are becoming increasingly anthropomorphic, artificial intelligence, voice assistants and smart homes are fast emerging as hot topics.

As of 2020, there were 4.2 billion voice assistants in use in devices across the world. By 2024, this number is projected to reach 8.4 billion units. Within each assistant is a carefully mapped-out structure and journey that replicates everyday human conversation, and conversation design is what governs it. As designers, we often perform a balancing act on the edge of the uncanny valley, which raises a question: how can we make technology acceptably “human”?

In this article, we will consider some of the potential conversation gaps, specifically in the Asian context. This digital divide can be viewed through individual, cultural, societal and generational lenses. As a UX practitioner and an advocate for voice-first experiences, I believe conversation design might be able to build some bridges between them.

Expectations vs Reality

‘Mum! The toaster’s dumb.’
‘It won’t answer me when I talk to it.’

This quote comes from a real-life case study, one that is playing out in millions of homes today. What we, the Millennial digital adoptees, experience is dimensions apart from what our Generation Z digital-native counterparts experience. Our expectations of, relationship with and receptiveness towards voice, chatbots and conversation-first interfaces differ vastly.

[Image: Blog_Conversation_Design_1] This illustration has been designed using resources from Freepik.com

The key to the difference lies in each generation's childhood experiences with technology. The following behaviours are a result of how readily available technology was during their digital upbringing.

Millennials, born in the 1980s and 1990s, were raised with technology as part of their everyday lives, with the caveat that all their activities were mediated by a screen. Although many would consider them digital natives, they fall short when it comes to voice experiences. Essentially, they were not born into technology, but left their analogue upbringing and grew into the digital world. They are equipped to switch between analogue and digital states flexibly and at will.

Generation Z denotes the post-millennial generation born from the 2000s onwards. These centennials were raised on iPads, iPhones, YouTube and the like. Simply put, you’ll never see a happy baby at a table without one. They grew up in the comfort of homes that gained Siri in 2011, Amazon Echo in 2014 and Google Home in 2016. Their friendly voice-teachers sang them lullabies, played games with them, taught them algebra, and occasionally demonstrated what a cow sounds like. They expect the sandwich maker to have the same understanding as Siri, and they think that the fridge should be able to do simple maths. This is their standard: ‘it should just work’.

At present, commercial smart home devices at best fulfil the millennial’s need for a glorified speaker, while the centennial’s expectations of immediacy, intelligence and understanding remain more myth than reality.

Let’s take a step back. Weren’t we all into this whole smart home thing?

Tip: Adopt a voice-first approach when designing new applications. When designing, consider this: what would your app be saying if there were no screens? This school of thought drives greater intentionality in our experiences. Although technology often fails to keep pace with expectations, designers should aim to future-proof their products for ease of eventual voice adoption.

The Elephant in the Room

[Image: Blog_Conversation_Design_2] This illustration has been designed using resources from Freepik.com

To all those front runners in smart home adoption, where are your devices now? Are they, as I mentioned earlier, glorified speakers, little more than ‘a glorified Spotify player’? I’m guilty of that. Have we really built smart homes, or just a show-and-tell piece that might not recognise your voice when that friend you’d like to impress comes over?

This raises a question: ‘Do we really want smarter homes?’ Here are some factors to consider from a value-based acceptance model, courtesy of Thulin & Henricson (2019).

Perceived Privacy Risk
Perceived Intrusiveness
Perceived Usefulness
Perceived Enjoyment
Perceived Value

Two camps tend to form out of these factors. On one hand, those who perceive the device to be intrusive and an invasion of privacy are staunchly against voice adoption, overlooking its potential value. Conversely, those who find the devices useful and enjoyable conclude that value trumps privacy concerns, and openly welcome more of such connected technology.

Along with these factors come the innate human habits of taking the path of least resistance, resisting change and staying within our comfort zones. Our connected smart home devices, though full of possibilities, have become neglected white elephants.

Which camp are we in and how might we be able to overcome inertia?

Tip: Push for purpose over novelty. Instead of focusing on creating smart products, we need to shift our mindset to where our products can be smart. Voice assistants are not always the solution. When more voice-enabled devices offer smart solutions that truly add value, it will not be long before adoption becomes widespread.

Conversation in Southeast Asia

[Image: Blog_Conversation_Design_3] This illustration has been designed using resources from Freepik.com

Household penetration rates in Southeast Asia are at present far below those of our Asian neighbours, where India and China lead. To consider some numbers from Statista, the global smart-home market is projected to reach US$77,280 million this year. The Asian market accounts for roughly a third of that at US$26,895 million, only US$3 million shy of the leading American market. The global household penetration rate currently stands at 10.6%, and Asia is holding its own at 9.2%. Southeast Asia, however, shows a significantly weaker rate of 3.4%.
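For readers who want to check the arithmetic, here is a minimal sketch in Python that works out Asia's share of the global market and how far each region's penetration rate sits from the global figure. It uses only the numbers quoted above; the variable and function names are my own and purely illustrative.

```python
# Market sizes quoted in the article, in US$ millions.
GLOBAL_MARKET = 77_280
ASIA_MARKET = 26_895

# Household penetration rates quoted in the article, in percent.
penetration = {"Global": 10.6, "Asia": 9.2, "Southeast Asia": 3.4}


def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return part / whole * 100


# Asia's slice of the global smart-home market.
asia_share = share(ASIA_MARKET, GLOBAL_MARKET)

# Percentage-point gap between the global rate and each region's rate.
gap = {
    region: round(penetration["Global"] - rate, 1)
    for region, rate in penetration.items()
}

print(f"Asia's share of the global market: {asia_share:.1f}%")
for region, points in gap.items():
    print(f"{region}: {points} percentage points below the global rate")
```

The share works out to a little over a third, and the gap for Southeast Asia is several times larger than the gap for Asia as a whole, which is the contrast the paragraph draws.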

It is apparent that we should weigh the context of culture. A good conversation happens when the user feels heard and understood. When interacting with voice assistants, both factors come into play. To be heard, one has to speak. To be understood, one has to accept the risk that the assistant might fail. Both of these happen in front of others, and if they go wrong, one might be embarrassed. In conservative markets, where deeply ingrained cultural nuances and social etiquette are the rules of life, it is no wonder that resistance to adoption is high.

The silver lining is that the future is bright. Although Southeast Asia is forecast to see a conservative increase of 2.9% over the next 5 years, the greater Asian market has a forecasted growth of 11.1%, running ahead of the 10.8% forecast globally. As iProspect’s white paper suggested, the prevalent thought that voice makes one look ‘cool’ drives the adoption rate in those dynamic markets. This presents an opportunity for a shift in market focus for the smart home category.

Asian Language Support (As of Nov 2020)
Apple’s Siri: Japanese, Korean, Mandarin, Cantonese, Malay, Thai
Google Assistant: Japanese, Korean, Hindi, Vietnamese, Indonesian
Alexa: Japanese

There are more than 4.3 billion speakers of nearly 2,300 different Asian languages. With Western assistants offering limited Asian language support, consumers turn to homegrown options: in China, Xiaomi’s Xiao Ai and Alibaba’s Tmall Genie; in Korea, Naver’s Clova and Kakao’s Mini, among others. The voice revolution in Asia is picking up speed fast, and language support is key for market penetration.

Are our devices ready for our future selves?

Tip: Upskill fast and learn the trade. As the market shifts to the east, designers need to prepare themselves for the demand. There are cultural nuances in languages that can only be picked up by native speakers, so the field needs you. Voice is not difficult; it’s just a different mindset.

Are We All Futurists?

[Image: Blog_Conversation_Design_4] This illustration has been designed using resources from Freepik.com

We have spoken about uneven generational expectations, the aftermath of voice assistants and our context in Asia, but what is the future for voice?

Here’s a thought starter: COVID-19 has accelerated voice-technology adoption and increased usage in many homes. Voice assistants allow for communication and transactions without touch, making them a much-needed solution under the existing restrictions. At the same time, with users stuck at home with nowhere to run, the voice gaming industry has also been booming.

As our younger generation matures and hungers for fresh digital experiences, the older generation relies on education to assimilate into the new. Voice, if used well, could potentially become a bridge as the common language between them.

The value of a conversation is immense. From our first few words to our final breath, we speak and communicate endlessly throughout our lifetimes. Our very words hold the weight and power to speak worlds into existence.

As designers, we get to craft and create these experiences, bringing numerous dimensions to life. Our structures and our flows outline the blueprints to the world of voice that awaits exploration by the coming generation. We have a legacy to build and a new normal to create so that ‘it just works’ will become our baseline.

Will we be willing to start the conversation to create this glorious voice-driven future?
