In May, as The Verge reported, Google showcased Duplex, a new capability of Google Assistant which journalists have called “jaw-dropping,” mostly due to its highly sophisticated and human-like AI. The pre-recorded showcase of Google’s newest AI technology left a lot to be desired, but the human-sounding virtual assistant remained a subject of debate in technological circles.
Recently, as Droid Life reported, the tech giant invited members of the press for a demonstration, giving media outlets and the public alike a closer look at Duplex.
Duplex, according to Ars Technica, sounds “deceivingly similar” to a human, and even comes with artificial speech disfluencies like “um” and “uh.” Voice-activated through the Google Assistant, Duplex completes tasks on its own.
For the demonstration, Google had reporters act like restaurant employees, managing reservations and taking phone calls from customers. Except, in this case, it was Duplex acting on behalf of the customer, establishing the details of the reservation. With the “casual disinterest of a real person,” as Ars Technica put it, Duplex patiently waited, negotiated, and even offered compromises, such as an acceptable time range for the reservation.
Duplex, according to reports, sounds stunning, human-like, and real. Its speech flows as a real person’s would, with the occasional pause and stammer. In short, it reacts to the person on the other end of the line just as a real human being would.
A generation better than the current Google Assistant voice, Duplex would put on a new personality for every call, coming across as female, male, a young person, or an older person, switching between virtual personalities with ease.
“Hi, I’m calling to make a reservation. I’m Google’s automated booking service, so I’ll record the call. Can I book a reservation for…” is what a phone call made by Duplex sounds like, according to reports.
Although the AI gives out information, the user has to authorize which information the software is allowed to share. Designed to make gentle corrections, such as steering the conversation back to the actual topic, Google’s Duplex is focused on what it has to do, and it does it almost suspiciously well.
Although impressive, the model that generates these voices is, according to Ars Technica, a far cry from Google’s WaveNet, an AI system capable of replicating human sounds like breathing. For now, this sneak peek at Duplex is just that: a sneak peek. The company is “quite a long way from launch.”
“This is super-early technology, somewhere between technology demo and product. We’re talking about this way earlier than we typically talk about products,” Google’s Nick Fox told Ars Technica.