Why isn’t voice commerce taking off?
Photo: Getty Images/basketman23

eMarketer last week lowered its outlook for smart speaker buyers (consumers making a purchase via a smart speaker) and smart speaker users (consumers using smart speakers for any purpose).

In a statement, eMarketer said it believes smart speaker usage still faces hurdles over payment security and privacy. 

Hub Entertainment Research’s recently released “The Case for Voice Control” study, based on a survey of 2,512 U.S. consumers, likewise found 59 percent of those who regularly use a smart speaker have concerns about privacy. Those concerns include the threats of unwanted listening (91 percent) and data being unknowingly collected (90 percent).

The heightened privacy concerns follow numerous stories of digital assistants eavesdropping in homes. European Union privacy watchdogs indicated last month that they’re working on ways to police the reach of digital assistants into private conversations.

eMarketer also points to the absence of screens on most smart speakers as an inhibitor. Although manufacturers are releasing smart speakers with screens, many users haven’t felt the need to upgrade.

“There’s a good deal of friction in the voice-based buying process because people can’t see what they’ll actually be purchasing unless they have a screen on their smart speaker,” said eMarketer principal analyst Victoria Petrock. “So, most of the purchases made today are reorders and things that don’t need to be inspected.”

On the positive side, eMarketer upped the estimates for the percentage of users listening to audio (81.1 percent) or making inquiries (77.8 percent) on smart speakers. Confirming other reports, however, eMarketer found consumers aren’t using the devices for advanced commands that might incorporate research and other shopping activities.

“Though there are thousands of smart speaker apps that do everything from let you order takeout to find recipes or play games, many consumers don’t realize that they need to take extra and more specific steps to utilize all capabilities,” said Ms. Petrock. “Instead, they stick with direct commands to play music, ask about the weather or ask questions, because those are basic to the device.”

DISCUSSION QUESTIONS: What do you see holding back the overall adoption of smart speakers as well as their use as a purchasing tool? Do you see friction in the shopping experience with smart speakers as a short or long term challenge?

30 Comments
Paula Rosenblum
Noble Member
4 years ago

I think there’s a price comparison issue, too. Especially with Amazon. How do I know I’m not getting gouged this time? Alexa will not be impartial.

As for the privacy issue, this is not going away. In fact, it’s going to escalate. The presumption that Millennials don’t care about privacy was always wrong…there was no data to prove it at all, at least not when we studied it.

Trust has to be earned. That hasn’t remotely happened yet.

Jeff Weidauer
Member
4 years ago

Privacy is a huge concern with smart speakers, and the lack of a screen puts a damper on shopping use. Perhaps the greatest hurdle is its stationary location. Smart speakers are essentially talking lava lamps, better suited for entertainment than productivity. Voice commerce will grow, but that growth will be through mobile devices, not a countertop appliance.

Oliver Guy
Member
4 years ago

I feel right now that faith in the result is a key barrier. Asking a voice assistant to turn on the lights and having it mishear you is one thing, but if you ask it to add a given item to your shopping basket and it gets it wrong, that could be costly as well as inconvenient.

Another area is ease of integration with multiple store platforms. Amazon might want to keep you inside their ecosystem – so shopping at Kroger, Tesco or Albert Heijn may not be so easy.

It is possible that adding screens to smart speakers – like an Amazon Echo Show or a Google Hub – could well become a conduit to helping address both of these challenges if screens are leveraged in the right way.

Adrian Weidmann
Member
4 years ago

Trust, accuracy, and privacy seem to be the top three reasons. The frustration everyone has had with robo-operators all claiming “we value our customers” would be a contributing factor as to why shoppers don’t trust smart speakers. They’re certainly being used for mundane tasks like music requests and the weather, but the systems haven’t yet gained our trust when it comes to spending our money.

Bob Phibbs
Trusted Member
4 years ago

This is totally consistent with the survey I did with Netsuite Oracle last year, “HEY ALEXA, 95% OF CONSUMERS DON’T WANT TO TALK TO A ROBOT WHEN SHOPPING.” This is yet another way we’re being sold a bill of goods that we’ll all be doing something no one really sees a need for currently, nor enjoys doing, without being able to see or touch before buying.

Richard Hernandez
Active Member
4 years ago

The privacy issue has been and always will be a concern, and this will be a sticking point for a long time. I don’t see wider acceptance of the technology until this is resolved.

Suresh Chaganti
Member
4 years ago

Voice-only is an inferior experience. Voice assistants can in theory be personal assistants, answering questions like “tell me what brand I ordered last month.” But they are quite far from that. Privacy concerns, an inability to process complex commands, and an inability to have contextual understanding limit the e-commerce uptake.

Ben Ball
Member
Reply to  Suresh Chaganti
4 years ago

I think the perceived shortcomings of voice assistants regarding their serving as viable personal shopping assistants is due to user education more than actual device capability. Our household is perhaps a level up in user savvy when it comes to voice assistants — primarily due to my wife — but our Echo Show and multiple linked devices serve exactly the purpose you describe, Suresh. And being able to link locations so that any item entered shows up on the shopping list whenever and wherever she accesses the list on her phone is a real benefit for her.

Suresh Chaganti
Member
Reply to  Ben Ball
4 years ago

I agree. I have had only one Echo – speaker only, and didn’t go beyond using it as a novelty. An Echo with a screen and multiple linked devices could have been a better experience. The use cases probably need to be campaigned more. It is still in the very early adopter phase I think.

Jeffrey McNulty
4 years ago

The privacy issue is moving to the forefront of many consumers’ minds. With the ubiquity of emerging stories about the utter lack of security protocols, many customers are becoming concerned about their safety, data, and digital footprint.

As with most technology, the manufacturers focus on sales, user interface, and scale as opposed to strong security measures. I am looking forward to seeing technology that meshes security with performance.

Dave Bruno
Active Member
4 years ago

Privacy is without question an issue, but the number of smart speakers sold tells me that privacy is not the biggest inhibitor to adoption. In my opinion it’s a skill issue. The average consumer just doesn’t have the ability to remember all the skills it takes to do anything beyond the basics. Finding and installing the skills that add value is problem one, and remembering the skills – and the specific syntax required to execute them – is problem two. Until Alexa makes it easier for us to put her to work, she will be relegated to life as a music concierge, a list maker and a reminder.

Ken Lonyai
Member
4 years ago

Smart speakers and voice commerce are two different entities. Voice commerce can exist in a mobile app, say, as part of a grocery shopping list. In many ways, concluding that voice commerce is problematic is to miss that it hasn’t been deployed in the most favorable manner. Better UX design, such as applying voice in multimodal scenarios like devices with screens and bidirectional audio, will help, as will application design/policy that considers and values user privacy.

Ryan Mathews
Trusted Member
4 years ago

Privacy is obviously the primary concern, especially in light of news stories of “audited” conversations, hacks, broadcasts to other users, etc. I’m actually bullish on the long-term prospects for all forms of voice-activated technologies, because I believe that — for most people — the easier the interface the higher the rate of adoption. So I see the friction as a short-term problem, although one that is very, very serious and one — unless it is fixed — that could set the voice-activation industry back 10 to 15 years.

Ben Ball
Member
Reply to  Ryan Mathews
4 years ago

Glad to hear you are “bullish on … voice-activated technologies” Ryan. I guess I am too, since I keep trying to master voice-to-text, easily the most diabolical digital technology since auto-correct!

Mark Price
Member
4 years ago

The slow adoption of smart speakers for purchasing is consistent with patterns of technology adoption historically. Usually what you see is that early adopters start to use the technology first, because they enjoy using new technology and are willing to put up with early-stage problems. The next group of customers, often called early majority, are waiting for the early adopters to signal that the new technology is ready for prime time. That has not yet happened.

As additional capabilities get rolled into the main smart speaker apps, you should expect more widespread adoption. Issues of privacy will always be a concern, but over time consumers become more comfortable with technology and their privacy concerns diminish.

Gene Detroyer
Noble Member
4 years ago

We have traded off privacy on our computers, emails, what we search, our phones, our locations, our texts and we think nothing of it. We haven’t stopped using these tools, we use them more and the same will happen with voice-activated devices. Privacy will be something to talk about, but nothing to take action to secure.

Regarding some of the hurdles to shopping via voice, my colleagues have nicely highlighted the biggest one — seeing the products. Voice works if you want Kellogg’s Corn Flakes. It doesn’t work so well if you are buying almost anything where you want to compare products.

But worry not, soon you will say “Alexa, I want to shop.” Then your TV screen will pop on, and you will navigate by voice seeing exactly what you want to see as if it were your phone or your browser.

Ananda Chakravarty
Active Member
Reply to  Gene Detroyer
4 years ago

Gene, you hit on the key point. Voice enablement will be the accessory while the visual screen will be the dominant tech — that’s where I feel things will be heading with voice tech. It’s the easier path for consumers from ecommerce.

Cynthia Holcomb
Member
4 years ago

Listening devices dressed up as smart speakers. Likely I know too much. It is rather disconcerting to have a personal discussion manifest itself into an advertisement directly targeting my specific private conversation embedded in a national news website. More than once! Common sense seems to point out either advertisers are magic or maybe one of the many devices in our homes is listening. I am not a fear monger! Only connecting the dots of evidence through personal experience.

As far as shopping goes, where do we start? Could Alexa, a product of the world’s largest retailer with a foothold in almost every industry one can think of, possibly have biases towards certain outcomes? I agree if shopping for anything beyond canned goods and toilet paper is to be relevant, mobile has the least amount of friction. That said, established mindsets in the field of computer science and new cutting edge tech appear to be leaning toward AI-enabled machine learning which solves for mathematical patterns to apply to humans rather than solving for human intelligence.

David Naumann
Active Member
4 years ago

Voice shopping has many challenges for broad appeal and it may never be a pervasive shopping method unless it is enhanced. A combination of voice and display is much more compelling, especially if it provides an opportunity to verify purchase accuracy and compare prices. The future may enable voice commands to replace keyboards, and essentially voice speakers will be merged with smart TVs or computers.

Doug Garnett
Active Member
4 years ago

What’s most surprising is that eMarketer is surprised. When did marketers lose their respect for the consumer pocketbook?

The risk of misordering is perceived to be huge — hence all the comments about needing a screen because a screen will confirm that you were correctly heard.

Consumers worked hard to get that money INTO their pocketbooks, they aren’t going to trust it going out through voice ordering — most likely ever.

As a former tech guy, I’m really shocked at how little the tech industry has learned about people. We are only talking about this because it was hyped by the tech sector and bought into by the wrong consultants.

Retailers should ignore this all for quite a while — it’s going to be a long time before any of this changes.

Ken Morris
Trusted Member
4 years ago

I believe the biggest issue is privacy. People don’t feel they are secure if something in their home facilitates the invasion of their privacy. You wouldn’t allow people to physically enter your home, so why would you allow them to virtually do the same? From a shopping experience perspective frictionless commerce is the goal, but few of us trust that the nuance required when making a purchase can be accomplished without a visual element. We will get over this eventually, but for non-commodity purchases this will take some time.

Ananda Chakravarty
Active Member
4 years ago

Experts on this thread have already touched on trust issues — but it’s more than that. I like Suresh’s response that voice-only is a lesser experience. A big part of shopping is compare and contrast — I see Product 1 next to Product 2, examine the details, maybe touch and feel them and perhaps even try the products on. Voice commerce leaves little ability for me to shop and compare. Even comparing price is challenging, and the consumer still wants to make their decisions based on fixed data rather than hunting down every last detail for a hundred different products.

Even with video or images, the experience is limited to what’s presented before me — as a consumer I know that. The option to select is taken away, and hence the interest in using the tech. Not just privacy, but also freedom to choose vanishes with voice commerce. Voice commerce is not for shopping but for buying and replenishment.

I just can’t see my Alexa Show rolling out a list of 30 different brands of cookies, their price points, their special offers, followed by ingredients — and see myself making a solid decision afterwards. Even with AI and personalized data, it will be a challenge to narrow it down — and a slight miss of forgetting my favorite Oreos would turn me off as a consumer. Add to that the fact that in-home voice products are not as portable or personally relied upon as smartphones and the voice commerce option becomes even less attractive.

Jasmine Glasheen
Member
4 years ago

The way I see it, there are two big obstacles to smart speaker adoption: trust and accuracy. We already talked about trust, but smart speakers recently (unbelievably) declined in accuracy, with Siri’s accuracy rate coming in at around 53 percent.

If you have to do the search multiple times, it’s not convenient. And until smart speakers are consistently accurate, they aren’t worth the investment, the risk, or the time.

Ralph Jacobson
Member
4 years ago

Most of this slowdown of growth has to do with capability awareness. As the article states, most people don’t realize all the apps that are available. But also, as far as the commerce trend goes, I see security and even shopping journey obstacles involved. Shoppers do not feel in control of the experience with these speakers. I think they need to see their journey.

James Tenser
Active Member
4 years ago

Seems like we are talking about two realms of trust here:

The first and most cited is the creepiness of an always-on, in-home listening device that may or may not be capturing personal details of our lives — call it privacy-trust.

The second is the absence of confirming feedback within the purchasing process itself, especially with respect to order accuracy, pricing, and product selection — call it purchase-trust.

Proponents of voice commerce are so busy defending their solutions from worries about privacy-trust that they seem to have largely looked past the issues of purchase-trust. They need to get both right. I’m not holding my breath.

Camille P. Schuster, PhD.
Member
4 years ago

Privacy is an issue of course, especially with the news reports of voice devices listening in and having conversations reviewed. There is also the issue of having devices understand people’s accents. When that is an issue, getting the correct orders recorded is problematic.

Then there is the issue of not seeing the item being ordered. If I have to go to another device to see the item, I might as well order it from that device. For standard items that come in one package with one size or format, voice ordering may be fine if my request can be correctly transmitted. However, there is much to be done before voice ordering will be commonplace.

Kenneth Leung
Active Member
4 years ago

I think the issue is that the UI in voice for new purchase via smart speaker just isn’t there yet and you are tied to a single vendor (in the case of Alexa, Amazon). Imagine being in a household and having the speaker announce throughout the house that you are buying cereal and confirming all the brand, price, shipping details etc, via voice. We have already heard stories of children treating Alexa as the new Santa. 🙂

I can see doing limited assortment reordering or maybe adding items into the shopping cart/reminder list. “Alexa reorder kitty litter” is a lot easier than “Alexa order 2 cans organic garbanzo beans 16 oz. from Whole Foods.”

Lisa Goller
Trusted Member
4 years ago

Early adopters aside, evolving from e-commerce to voice commerce represents a big leap in consumer behavior that will take place over the medium to long term.

Despite the convenience, ease and time savings offered by voice technology, wary consumers distrust these shopping devices, which they view as stylish music speakers at best or surveillance tools at worst.

As a result, most voice technology users currently decline to experiment by using the power of their voice to discover products, find stores and make purchases to minimize risk related to their household’s privacy.

My overall impression is that voice tech is currently too one-dimensional in its focus on audio, just as e-commerce is intensely visual. Since seeing is believing, over the short- to medium-term, consumers will stick to traditional e-commerce or shop in store for richer experiences that engage more of the senses to build consumer confidence and brand trust — even if it costs consumers more of their time.

Shep Hyken
Trusted Member
4 years ago

The easy way to use a smart speaker for ordering is to order something you’ve ordered before. Once you take the time to order something, the next time is typically easier. The difficulty comes when choices have to be made. That said, the technology is getting better. We’re far from a “tipping point” where ordering via the smart speaker is as easy as going on Open Table to make a restaurant reservation. Think about how long that took to become a “mainstream” way of booking a reservation. Or, how long it took for buying an airline ticket online to become the norm.

JON STINE
4 years ago

Let’s start by properly framing the issue. AI-enabled voice assistance today and tomorrow goes well beyond smart speakers; the smart phone (with a screen, folks) will be the primary vehicle for voice assistance. You’ll use voice assistance in your car, you’ll use it to enter data into enterprise applications (see Salesforce’s good work.) We cannot and must not equate voice assistance with a piece of hardware.

Second, let’s recognize that voice is in its earliest of days, days similar to the Netscape-IE browser war days. Earliest of days in development, in capabilities, in usage. It wasn’t so long ago that 95% of America wasn’t about to shop on the internet for many of the same reasons mentioned here — fear of privacy violations, fear of data theft. Perhaps others also remember the dismissive comments applied to internet-based retailing — “hell, it’s no more than a store’s worth of revenue,” or “it’s only for the nerds — women will never use it.” (Yes, I actually once conducted a study for Intel as to the potential of US women using the internet to shop. Fortunately, I concluded “yes.”)

Third, as Ms. Petrock reported a few weeks earlier — not noted above — voice assistance has crossed the chasm in both availability and adoption. In the States, it’s reached early majority-level use. Simple use, for sure — but regular, active use nonetheless.

Fourth, we must try — please, try — to understand what voice can and can’t do. The studies show that we can speak 3X faster than we can type. And that we can read 2X faster than we can listen. Which suggests a microphone, speaker, and screen. The voice developer community — people who are making money by making money for others — has already moved to the necessity of voice-visual (multi-modal) communication for commerce. The best ones roll their eyes at these types of discussions.

Fifth, let me echo the words of Paula and others. There is a significant issue of consumer trust. Much as there was with the internet. Much as there is whenever new consumer technologies emerge. It’s trust in privacy, in data usage (voice is a biometric and a diagnostic), in how to use it.

Sixth, let’s read the eMarketer report — and not just the headline. The headline suggests that the annual increase in smart speaker purchasers is disappointing … because it didn’t reach the level forecast by eMarketer. Hmmm. Actually, the number of purchasers by smart speaker increased 18%. Yes, eighteen percent. So — which is more disappointing: the forecast or the reality?

Voice needs time. And voice needs standards, guidelines, the kind of governance that made the internet the world’s greatest value creator. (There are none at present.)

This is a transformative technology. Coming at us slowly and inexorably, as the tide. Given that, we have the opportunity to make voice worthy of enterprise and consumer trust.

Contact me if you’d like to help.

BrainTrust

"Trust has to be earned. That hasn’t remotely happened yet."

Paula Rosenblum

Co-founder, RSR Research


"There is the issue of not seeing the item being ordered. If I have to go to another device to see the item, I might as well order it from that device."

Camille P. Schuster, PhD.

President, Global Collaborations, Inc.


"The slow adoption of smart speakers for purchasing is consistent with patterns of technology adoption historically."

Mark Price

Adjunct Professor of AI and Analytics, University of St. Thomas