Alexa, Amazon’s smart assistant, enters a new era with GenAI

Five years of Alexa, and the numbers confirm how firmly Amazon’s voice assistant has taken hold. To celebrate the anniversary, Amazon Italy hosted a press event at the Amazon Black Friday Fun House, a temporary shop opened in the heart of Milan for the November promotions period. Alexa arrived in Italy in the autumn of 2018, four years after its launch in the United States, and has won a place in the homes of Italians, who needed some time to become attached to the smart speaker. According to Amazon Italy, the proof lies in the more than 28 billion interactions recorded over the last five years, more than a third of which (11 billion) occurred in the last twelve months.

The rise of Alexa

Two factors lie behind the growth. The first is people’s greater openness towards an object that proves useful in everyday life through its many assistant functions: it provides information and answers to questions, entertains by playing music, gives the day’s weather forecast, and lets users manage the smart home (from the TV to the thermostat, via lights, washing machine and cameras) simply and immediately by voice. The second is the growing number of devices Amazon sells with the voice assistant built in: since the first Echo, the list has grown to include Echo Show, Echo Auto, Echo Buds, Echo Pop, Echo Dot and Echo Studio. On top of this base comes a huge number of third-party products whose makers have signed agreements with Amazon for Alexa integration.

‘Over the past few years, we have continued to enrich the range of Alexa-integrated devices to meet all of our customers’ needs and to offer ever new ways to interact with our voice assistant, thus making their days easier, more organised and more fun,’ explained Giacomo Costantini, Business Development Manager of Amazon Alexa Italy.


Talking to Alexa once seemed a step forward in the human-machine relationship that remains dear to every technology company. That progress looked dated, however, after the arrival of ChatGPT, Bard and Bing AI: generative AI software trained on billions of data points, with its ability to respond accurately to user requests, suddenly made voice assistants look old. Amazon has not been idle, though, and has invested a lot of money and time in developing Let’s Chat, a ‘human-capable assistant’, as the company’s vice-president David Limp called it.

Although Alexa has always used artificial intelligence to recognise questions and give relevant answers, the leap forward we will see in the coming months comes from the use of Large Language Models, such as the Alexa Teacher Model, which is used to teach Alexa to give more accurate answers and to turn it into the ideal companion for the home.

Alexa Let’s Chat, the evolution of the voice assistant

Amazon wants Alexa to understand context better, to sustain a single conversation made up of many questions and answers instead of limiting the interaction to isolated exchanges, and to resume the conversation without users having to call its name each time. To that end, the company has developed and optimised its language model around five fixed points. First, there must be no delayed responses during a dialogue. Second, the interaction in the home must be personalised for each member of the family, providing information relevant to the needs of each of them. Third, it must go beyond data: the assistant must be able to make jokes and express opinions. Fourth, as part of the family, it must safeguard the privacy of each of its members. Fifth, Alexa must always know how to do the right thing, because no mistakes are allowed.


Amazon showed a glimpse of the new Alexa’s potential with a couple of conversations led by Limp, about football and the menu for a dinner with friends. Limited as it was, the demonstration was useful for getting a sense of the progress and, more importantly, of how much the new course will shape Alexa, laying the groundwork for a broader and more detailed role in the dynamics of the home. Let’s Chat will be available in the US by the end of the year and will later reach Europe, on all Echo devices, old and new.

Of course, compared with generative AI chatbots, Amazon is starting from different and more complex terrain, because Alexa does everything by voice and does not involve written requests, so its effectiveness depends first and foremost on recognising its interlocutors’ speech, with its many linguistic and acoustic nuances. At the same time, because of the type of dialogue it establishes with people, Alexa has less time than software such as ChatGPT to understand a request and provide a response: since it mimics conversation between people, there cannot be gaps of several seconds between a request and a reply. In short, the difficulties are many, as are the obstacles to be overcome. Still, Amazon has demonstrated repeatedly that it has the strength, resources, and capacity to succeed in its aims. We will see if it continues to surprise us with Alexa as well.

Alessio Caprodossi is a technology, sports, and lifestyle journalist. He navigates between three areas of expertise, telling stories, experiences, and innovations to understand how the world is shifting. You can follow him on Twitter (@alecap23) and Instagram (Alessio Caprodossi) to report projects and initiatives on startups, sustainability, digital nomads, and web3.