Google Assistant actions can now continuously listen for specific words
Google today detailed new tools for partners developing on Google Assistant, its voice platform used by over 500 million people monthly in 30 languages across 90 countries. Actions Builder, a web-based integrated development environment (IDE), provides a graphical interface for visualizing conversation flows, debugging, and orchestrating training data. Continuous Match Mode lets Google Assistant respond immediately to a user’s speech by recognizing developer-specified words and phrases. And support for AMP content on smart displays like the Nest Hub Max speeds up web browsing.
Google also revealed that Duplex, its AI chat agent that can arrange appointments over the phone, has been used to update over half a million business listings in Google Search and Google Maps to date. Back in March, CEO Sundar Pichai said Google would use Duplex “where possible” to contact restaurants and businesses so it can accurately reflect hours, pick-up, and delivery information during the pandemic. The company subsequently expanded Duplex in a limited capacity to the U.K., Australia, Canada, and Spain, adding Spanish-language support in the latter market.
“What’s at the heart of [the Assistant’s] growth is the simple insight that people want a more natural way to get what they need,” Google Assistant director of product management Payam Shodjai wrote in a blog post. “That is why we have invested heavily in making sure Google Assistant works seamlessly across devices and services and offers quick and accurate help. Over the last few months, we have seen people’s needs shifting, and this is reflected in how Google Assistant is being used and the role that it can play to help navigate these changes.”
Media, Continuous Match Mode, and AMP
With Home Storage and Continuous Match Mode, Google aims to spur the development of more contextually aware apps for Google Assistant. Home Storage provides a database for devices connected to a home graph — for example, a wireless network — that allows developers to save progress for individual users, like a score in a puzzle game. As for Continuous Match Mode, which will roll out over the next few months, it lets Assistant recognize specific words or sets of words that developers define.
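The idea behind Home Storage can be sketched as a key-value store scoped to a home graph rather than to a single device. This is a minimal illustrative Python simulation, not Google's actual API; the class and method names are assumptions for demonstration only:

```python
class HomeStorage:
    """Illustrative per-home key-value store: every device on the same
    home graph reads and writes the same bucket of data."""

    def __init__(self):
        self._homes = {}  # home_id -> {key: value}

    def save(self, home_id, key, value):
        self._homes.setdefault(home_id, {})[key] = value

    def load(self, home_id, key, default=None):
        return self._homes.get(home_id, {}).get(key, default)


# Two devices in the same home share saved progress,
# e.g. a puzzle-game score keyed by player name.
store = HomeStorage()
store.save("home-graph-123", "score:alice", 42)      # saved from the kitchen display
print(store.load("home-graph-123", "score:alice"))   # read back from a speaker in the same home -> 42
```

The point of the design is that progress follows the household, not the device: any Assistant device on the same home graph sees the same saved state.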
A bit more on Continuous Match Mode: Before Google Assistant begins listening for responses, it will announce the mic will remain enabled so users don’t have to employ additional prompts. According to a Google spokesperson, recording can continue for a maximum of 180 seconds — developers set the duration based on their requirements — but users can exit the mode by saying “cancel,” “exit,” “quit,” “stop” or “pause.” Perhaps more importantly, Continuous Match Mode respects account-level privacy settings and doesn’t treat voice data any differently.
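The behavior described above can be simulated as a loop over a speech transcript: match developer-defined phrases immediately, stop when an exit word is heard or the configured window (up to 180 seconds) elapses. The exit phrases and the 180-second cap come from the article; everything else in this sketch is an illustrative stand-in, not the Assistant API:

```python
EXIT_PHRASES = {"cancel", "exit", "quit", "stop", "pause"}
MAX_SESSION_SECONDS = 180  # the maximum duration a developer can configure, per Google


def continuous_match(transcript_stream, target_phrases, max_seconds=MAX_SESSION_SECONDS):
    """Simulate Continuous Match Mode: scan a stream of (timestamp, phrase)
    pairs and react immediately to developer-defined phrases, with no
    re-prompting between utterances. Illustrative only."""
    matches = []
    for timestamp, phrase in transcript_stream:
        if timestamp > max_seconds:      # mic closes after the configured window
            break
        phrase = phrase.lower().strip()
        if phrase in EXIT_PHRASES:       # user can leave the mode at any time
            break
        if phrase in target_phrases:
            matches.append((timestamp, phrase))
    return matches


# A guessing game listening for color names:
stream = [(1.0, "red"), (2.5, "umm"), (4.0, "blue"), (5.0, "stop"), (6.0, "green")]
print(continuous_match(stream, {"red", "blue", "green"}))
# -> [(1.0, 'red'), (4.0, 'blue')]  ("stop" exits before "green" is heard)
```

Note how unmatched speech ("umm") is simply ignored rather than triggering a reprompt, which is what lets the interaction feel continuous.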
On the media front, Google Assistant’s updated Media APIs support longer-form sessions and enable users to resume playback of content across devices. (For example, you’re able to start video, music, and podcasts from a specific moment or pick up where you left off during a previous session.) And later this summer, Google Assistant-powered smart displays will gain support for the AMP (Accelerated Mobile Pages) framework, beginning with news articles from specific partners before expanding to other web content categories.
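Cross-device resume of the kind the updated Media APIs enable boils down to storing playback position per user and content item rather than per device. The following is a minimal sketch of that pattern under assumed names, not Google's Media API surface:

```python
class PlaybackProgress:
    """Illustrative cross-device resume: progress is keyed by (user, content),
    not by device, so any device can pick up where another left off."""

    def __init__(self):
        self._positions = {}  # (user_id, content_id) -> seconds

    def pause(self, user_id, content_id, position_seconds):
        self._positions[(user_id, content_id)] = position_seconds

    def resume(self, user_id, content_id):
        # Start from the beginning if the user has no saved position.
        return self._positions.get((user_id, content_id), 0)


progress = PlaybackProgress()
progress.pause("user-1", "podcast-ep-7", 1325)     # paused on a smart speaker
print(progress.resume("user-1", "podcast-ep-7"))   # resumed on a smart display -> 1325
```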
AMP is an open source framework designed to speed up mobile web pages — Google asserts it can cut load times to less than one second by balancing the likelihood of a user clicking a result with device and network constraints. Hundreds of thousands of web domains across billions of pages use it (including VentureBeat), and Shodjai believes it will enable new, faster-loading web experiences on smart displays. “[We] want to bring the depth of great web content together with the simple and robust AMP,” he wrote.
Actions Builder and Actions SDK
Actions Builder is intended to eliminate the need for developers to hop between the Actions Console and Dialogflow, Google’s natural language understanding (NLU) platform, to build voice apps (which Google calls “actions”) for Google Assistant. As alluded to earlier, it lets developers manage NLU training data and provides advanced debugging tools, with native Actions Console integration so actions can be built, tested, launched, and analyzed in one place.
Complementing Actions Builder is an updated Actions SDK that delivers file-based representations of actions and the ability to use a local IDE. Now, developers can author NLU and conversation schemas locally and bulk import or export training data to improve conversation quality, or use a command-line interface to build and manage actions with existing source control and continuous integration tools.
Both Actions Builder and the Actions SDK benefit from a new conversation model and improvements to the Google Assistant runtime engine. For instance, intents and scenes let developers define training data and behavior for specific conversational contexts, with scenes serving as building blocks to delineate active intents, error handling, prompt-based responses, and more. Scenes also separate conversational flow definitions from fulfillment logic so they remain reusable across conversations, with transitions indicating when one conversational context switches to another.
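The intents-and-scenes model described above can be sketched as follows: a scene is a conversational context that owns its active intents, prompts, error handling, and transitions, while fulfillment logic lives elsewhere. This is an illustrative Python model of that separation, with hypothetical names rather than the Actions SDK's actual schema:

```python
class Scene:
    """Illustrative scene: a conversational context that declares which
    intents are active and where each one transitions, keeping the flow
    definition separate from fulfillment logic."""

    def __init__(self, name, prompt, transitions, fallback_prompt):
        self.name = name
        self.prompt = prompt
        self.transitions = transitions          # intent name -> next scene name
        self.fallback_prompt = fallback_prompt  # simple per-scene error handling

    def handle(self, intent):
        """Return (response, next_scene_name) for a matched intent."""
        if intent in self.transitions:
            return f"OK: {intent}", self.transitions[intent]
        return self.fallback_prompt, self.name  # unmatched input stays in the scene


# Two reusable scenes: transitions are data, so the flow can be reused
# across conversations without touching fulfillment code.
order = Scene("order", "What size?", {"choose_size": "confirm"}, "Sorry, which size?")
confirm = Scene("confirm", "Confirm the order?", {"yes": "done"}, "Please say yes or no.")
scenes = {"order": order, "confirm": confirm}

_, nxt = scenes["order"].handle("choose_size")
print(nxt)  # -> confirm (a transition to the next conversational context)
_, nxt = scenes["order"].handle("mumble")
print(nxt)  # -> order (error handled inside the scene itself)
```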
On the subject of the runtime engine, Google says it now provides faster responses and a smoother overall experience. It is also “smarter” in that actions understand users better with the same amount of training data. “[It is now] easier to design and build conversations and users will get faster and more accurate responses. We’re very excited about this suite of products which replaces Dialogflow as the preferred way to develop conversational actions on Google Assistant,” Shodjai said.