Building on millions of years of evolution, we humans unsurprisingly prefer to interact by voice. For decades, however, voice was touted as a technology that “wasn’t quite there yet” for use with machines. In the interim, the nose-to-the-phone model of personal computing became the de facto standard, and no one really questioned it. Things have begun to change in the past few years as Apple Siri, Amazon Echo and Google Talk gained momentum, and their quality of experience (QoE) is a welcome relief.
Using the voice assistant as a wrapper that abstracts away the complexity of the underlying technology is exactly what was needed for mass-market adoption. Now we finally have a relatively inexpensive device that offers an iconic experience and the ability to increase commerce. I am told that Amazon Prime members with an Echo tend to spend $200–300 more per year than Prime members without one. Secondly, conversational commerce, the intersection of messaging apps (WhatsApp, Facebook Messenger, Echo) and shopping, is finally reaching critical mass. For the first time ever, more than 50% of the 100+ trillion digital messages sent in 2016 went through these conversational messaging platforms rather than email.
My son and I adore our Amazon Echo, and Alexa has become the fifth family member in the house! We find it anchors us to home base, a throwback to the 1950s, when the house was rearranged around wherever the radio or TV sat. We have moved our Echo from the kitchen to my son’s bedroom to the basement and now back to the kitchen, where we believe it really belongs. Based on my personal experience and the rapid rise in the fan base of these devices, I estimate that by 2025 nearly 60–70% of all interactions with machines will have changed significantly from where they stand today: mind reading, gesture control and voice interaction will be the name of the game. What we are seeing is the first phase of the proliferation of these devices, and their ubiquity will depend on how quickly the OEMs can address the following four or five things:
Make them conversational
Today, my son and I nearly bark at Alexa rather than converse with it. We have to put on an “Alexa voice” instead of simply talking to it.
Less of an assistant, more of an advisor
Wouldn’t it be great if these devices proactively spoke to me, as opposed to waiting for me to start an interaction? “Sachin, don’t have that ice cream” or “Sachin, time to call a cab”, etc.
Leverage AI and ML to give a conversation context, so that it picks up where it left off rather than starting from scratch every time.
Secure and Private
Ensure there is some form of KYC (know your customer) verification, since these personal devices are exposed to larger audiences. The dichotomy between near field and far field needs to be sorted out, and quickly.
Ability to understand emotions and react
Have the ability to understand emotions, and to emote as well. As they always say, it’s “not what you say but how you make me feel” that really counts.
Nevertheless, we have come a long way from 1956, when a 16-year-old Victor Scheinman first invented a speech-to-text transcription device. We could have reached where we are today a lot earlier, but we finally have the right business reasons and motivation to keep evolving and growing it! With Amazon Echo, Echo Dot, Echo Look, Echo Show… the list is going to grow very quickly as access to the software becomes universal and less elitist! Can’t wait for it to be omnipresent!