The availability of voice search and voice-first technology is growing rapidly. Getting people to make it a part of their daily lives, however, is a completely different matter.
Television search ready for voice
Using voice commands to search for something to watch seems like a slam-dunk for the technology. Over half of U.S. consumers now use search functions provided by their pay TV set-top box or other TV-connected device. They use search even though they must navigate clumsy on-screen keyboards with the arrow keys on a TV remote. Simply saying the name of the show into the remote should dramatically improve the search experience.
More consumers have voice search capabilities available to them than ever before. In Q1 2016, 17% of U.S. consumers said they had access to video voice search services. The number increased to 21% in Q1 2017. Moreover, voice recognition accuracy is pretty good in systems commonly available to consumers.
Voice search is making slow progress
While it is good news that more people have access to voice search, usage growth is quite another matter. In Q1 2016, 51% of those with access used voice search. One year later, the number had fallen to 45%. What’s more, the number of people using voice video search daily fell slightly over the same period, from 24.6% to 24.4%.
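To see how the access and usage figures interact, the two percentages can be multiplied to estimate what share of all U.S. consumers actually used video voice search in each period. The short Python sketch below does that arithmetic. It assumes the access and usage percentages come from the same survey base, which the figures quoted here do not confirm, and the function name `overall_reach` is purely illustrative.

```python
def overall_reach(access_share: float, usage_among_access: float) -> float:
    """Estimate the share of all consumers who use voice search.

    Assumption: both inputs are fractions from the same survey population.
    """
    return access_share * usage_among_access


# Figures quoted in the article: access share, then usage among those with access.
q1_2016 = overall_reach(0.17, 0.51)
q1_2017 = overall_reach(0.21, 0.45)

print(f"Q1 2016: {q1_2016:.2%} of consumers used video voice search")
print(f"Q1 2017: {q1_2017:.2%} of consumers used video voice search")
```

Under that assumption, the overall share of consumers using voice search moves only from roughly 8.7% to roughly 9.5%, because the four-point gain in access is largely offset by the six-point drop in usage among those who have it.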
Normally with a rapidly advancing and maturing technology, we would expect to see adoption and usage increase over time. That is not happening with video voice search. Is it possible that voice-first devices coupled with AI assistants can help consumers find what they want to watch?
Voice-first devices not ready to help
According to VoiceLabs, the outlook for Amazon Echo and Google Home is very promising. It says that 24.5 million voice-first devices will ship this year, bringing the total number shipped to 33 million.^ It also says that Alexa, the AI assistant for Amazon products, has become extraordinarily popular with third parties since Amazon opened up the technology to them last year. The number of Alexa skills available has grown from about 1,000 in June 2016 to 7,000 in January 2017.
However, most of these skills are never used by consumers. Just 31% of the Alexa skills have more than one consumer review. Moreover, when a consumer downloads a voice application to their Echo or Google Home, there is only a 3% chance that it will still be in use by the second week.
Though only a small proportion of the Amazon Alexa skills are movie and TV related, it could be that they are no more successful than the rest. My experience with the Dish Alexa skill lends credence to the VoiceLabs data. I installed the skill, tried to use it for a day or two, and then never returned.
What’s missing from voice video search
Simply put, AI assistants and voice search systems are terrible at figuring out the context of a voice request. Alexa skills cope with this by forcing consumers to speak in very specific ways. For example, “Alexa, tune to HGTV” worked with the Dish skill, while “Alexa, tune to Home and Garden TV” didn’t. Forcing users to learn how to talk to a skill is certain to fail, and it is likely why consumers continue to use so few voice apps after they try them.
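The HGTV example suggests how brittle exact-phrase matching is. As a rough illustration only, the Python sketch below shows one common workaround: normalizing the spoken phrase and looking it up in a table of aliases. Nothing here reflects how the Dish skill or Alexa is actually built; the alias table, the `normalize` and `resolve_channel` functions, and the "tune to" pattern are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical alias table: several spoken forms map to one channel name.
CHANNEL_ALIASES = {
    "hgtv": "HGTV",
    "home and garden tv": "HGTV",
    "home and garden television": "HGTV",
}


def normalize(phrase: str) -> str:
    """Lowercase, spell out '&', drop punctuation, and collapse whitespace."""
    text = phrase.lower().replace("&", " and ")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def resolve_channel(utterance: str) -> Optional[str]:
    """Return the channel for a 'tune to ...' request, or None if unknown."""
    match = re.match(r"^(?:alexa )?tune to (.+)$", normalize(utterance))
    if match is None:
        return None
    return CHANNEL_ALIASES.get(match.group(1))


print(resolve_channel("Alexa, tune to HGTV"))
print(resolve_channel("Alexa, tune to Home and Garden TV"))
print(resolve_channel("Alexa, tune to that home renovation channel"))
```

With this table, both phrasings from the example resolve to the same channel, but the third request still fails. An alias list only covers the wordings someone thought to add; it does not understand what the viewer means.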
Voice search systems are only useful when a consumer knows exactly what they want to watch. Most often, however, consumers start in one place (say, Tom Hanks movies) and end up somewhere else (Meg Ryan in Kate and Leopold).
That is why voice search and discovery is struggling to take off. These systems need to understand context, and that is a very hard thing to do.
If you would like to learn more about this subject, join the free webinar “How voice search and discovery can boost revenue and customer satisfaction” on Thursday, August 17th at 8 AM Pacific, 11 AM Eastern, 4 PM BST (or you can listen to the recording).
Why it matters
Video search should be a perfect application for voice technology, as it saves the consumer time and allows them to remain relaxed.
Though many have access to voice, few use it on a daily basis.
Voice-first solutions like Amazon Echo and Google Home perform little better for video search and discovery.
Voice search technologies are limited in their usefulness because they do not understand context.
^A voice-first device uses voice as its primary input and output mechanism.