Bugs Bunny Brought To Life Through 5G and Microsoft’s AI
February 12, 2021
Augmented reality (AR) may have a future in the retail stores of tomorrow. Unlike virtual reality (VR), which replaces the user's surroundings, AR overlays digital content on a person’s view of the real world, which can make interacting with virtual content feel more immersive and realistic. AT&T’s store in Dallas, Texas has incorporated this technology to provide a unique experience to its walk-in customers.
The store has a large display where a life-size, high-definition Bugs Bunny greets you by name and tells you he needs your help to find several golden carrots hidden throughout the store. The AR system makes use of several different types of technology to make the Looney Tunes character seem real.
These include 5G, augmented reality, artificial intelligence and a Custom Neural Voice created with Microsoft Azure AI technology, which together allow Bugs to follow your directions. Speech, an Azure cognitive service, is responsible for the lifelike speech interaction.
“One of the things we hear from our customers is they like the idea of communicating with their customers through speech,” said Eric Boyd, corporate vice president for Azure AI Platform at Microsoft. “Speech has been very robotic over the years. Neural voice is a big leap forward to make it sound really natural.”
Bugs and the other Looney Tunes characters aren’t confined to a single display; customers can also pick up a smartphone or display and interact with them. Though the responses are limited, Bugs and co. are able to understand names and sentences with the help of Microsoft’s AI. Reportedly, the technology may eventually be used to bring storybook characters to life.
The voice actor behind Bugs’s voice recorded about 2,000 phrases and lines in a studio. This data is what allows the Looney Tunes character to interact with customers in real time, in a way that accurately reflects Bugs Bunny’s personality and all his inflections.
“The real technology breakthrough is the efficient use of deep learning to process the text to make sure the prosody and pronunciation is accurate,” said Xuedong Huang, a Microsoft technical fellow and the chief technology officer of Azure AI Cognitive Services.
He added, “The prosody is what the tone and duration of each phoneme should be. We combine those in a seamless way so they can reproduce the voice that sounds like the original person.”
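For readers curious what controlling "tone and duration" looks like in practice, here is a minimal sketch in Python that wraps a line of text in standard SSML (Speech Synthesis Markup Language) prosody markup. It is only an illustration, not how AT&T or Microsoft built the Bugs Bunny experience: the function name prosody_ssml and the voice name are hypothetical, and SSML adjusts pitch and speaking rate for a whole phrase, whereas the neural model described above predicts prosody for each phoneme.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody_ssml(text, pitch="+0%", rate="medium",
                 voice="hypothetical-custom-voice"):
    """Return an SSML document that asks a synthesizer to speak `text`
    with the given pitch (tone) and rate (speed, which affects duration).

    The voice name is a placeholder, not a real deployed voice.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>"
        # quoteattr adds the surrounding quotes and escapes the value
        f"<prosody pitch={quoteattr(pitch)} rate={quoteattr(rate)}>"
        f"{escape(text)}"  # escape &, < and > in the spoken text
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    # A slightly higher-pitched, faster delivery of a familiar line.
    print(prosody_ssml("What's up, Doc?", pitch="+10%", rate="fast"))
```

The resulting string could be passed to any speech synthesizer that accepts SSML; which pitch and rate values a given service supports is something to check in that service's documentation.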
Custom Neural Voice can also be used to create a custom voice font that doesn’t mimic an existing person. Instead, Microsoft can create composite voices that bring together different backgrounds. If the technology makes its way into mainstream products, it could deliver real-world benefits, for example in education.