Conversational Assistants’ Next Step: Communities
Conversational assistants (sometimes called conversational agents or virtual assistants) are digital participants in the ongoing conversation between users and computers. Conversational assistants are now found everywhere: in smart speakers, mobile devices, laptops, desktop PCs, automobiles, smart devices, and kiosks. Using voice, text, and/or graphics, users of assistants such as Apple’s Siri, Amazon Alexa, and Google Assistant can ask questions, place orders, conduct transactions, perform searches, or handle other routine tasks.
Conversational assistants are said to “interoperate” if they can invoke one another and share data. And the need for that will soon be greater than ever.
And more conversational assistants are coming. Just as there are tools to generate websites, there will be tools to generate conversational assistants. Generative artificial intelligence, for example, will be able to create new and innovative conversational assistants.
Some conversational assistants will be general (e.g., Amazon Alexa), and some will be small and targeted to specific activities involving a single device, such as a TV or refrigerator. Conversational assistants can be public or private, be free of charge or require a premium, and have different political and personal biases.
Communities of Conversational Assistants
Conversational assistants are said to belong to a community when they make up a set that users can easily access and interact with. Some communities will be walled gardens: closed ecosystems that tightly control the software and hardware to provide higher levels of security and consistent user experiences, but lock users into using only assistants within the community. Other communities will remain open.
Here are examples of four such communities:
Amazon’s Voice Interoperability Initiative (VII)
Amazon Alexa uses a special hardware device so users can speak without being close to the microphone, and VII is a pioneering enhancement that enables users to access two assistants from a single device. With VII, users can toggle between two conversational assistants. VII devices determine which assistant responds to a user request and manage the presentation of results to the user—for example, halting the playing of a song so the user can hear the results from an Alexa request. VII has been implemented on a variety of devices, including the LifePod Smart Speaker, LG TVs, and Samsung refrigerators. While VII enables users to switch between assistants, the assistants cannot interoperate. Users may need to reenter data that they have seen, spoken, or heard in interactions with previous assistants.
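The arbitration behavior described above—deciding which assistant answers and pausing media so the user can hear the reply—can be sketched as follows. This is a hypothetical illustration, not Amazon's implementation; all class and wake-word names are invented.

```python
# Hypothetical sketch of multi-assistant arbitration on a shared device.
# All names and behaviors here are invented for illustration.

class Device:
    def __init__(self, assistants):
        # Map each wake word to that assistant's handler function.
        self.assistants = assistants
        self.playing = None  # media currently playing, if any

    def hear(self, utterance):
        # Route the request to whichever assistant's wake word was spoken.
        for wake_word, handler in self.assistants.items():
            if utterance.lower().startswith(wake_word):
                request = utterance[len(wake_word):].strip(" ,")
                if self.playing:  # pause media so the reply is audible
                    paused, self.playing = self.playing, None
                    return f"[paused {paused}] " + handler(request)
                return handler(request)
        return "[no assistant recognized]"

device = Device({
    "alexa": lambda req: f"Alexa answers: {req}",
    "hey tv": lambda req: f"TV assistant answers: {req}",
})
device.playing = "a song"
print(device.hear("Alexa, what's the weather?"))
```

Note that the two handlers share nothing: as in VII, the device only switches between assistants; it does not pass data from one to the other.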
Estonia’s Bürokratt Network
The government of Estonia is implementing a community of “Bürokratts,” conversational assistants based on software algorithms that are autonomous, capable of learning, and able to perform tasks traditionally performed by humans. A Bürokratt assistant enables citizens to access public services through text and, soon, voice-based interactions. Estonians can use Bürokratt to apply for family benefits, file taxes, renew passports, and, eventually, even apply for bank loans. The Bürokratt network maintains a registry of available government agencies that provide information or services and directs user requests to the right Bürokratt. Users may eventually access any of dozens of Bürokratts. A Bürokratt will neither store centralized data nor share data with other Bürokratts.
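The registry-based routing described above can be sketched as a simple lookup that directs each citizen request to the responsible agency's assistant. The agency names and service keywords below are invented for illustration; this is not Bürokratt's actual registry design.

```python
# Hypothetical sketch of a Bürokratt-style registry that routes citizen
# requests to the responsible agency's assistant. All entries are invented.

REGISTRY = {
    "family benefits": "Social Insurance Board assistant",
    "file taxes": "Tax and Customs Board assistant",
    "renew passport": "Police and Border Guard assistant",
}

def route(request):
    """Direct a citizen's request to the matching agency assistant."""
    for service, assistant in REGISTRY.items():
        if service in request.lower():
            return assistant
    return "registry: no matching assistant found"

print(route("I would like to file taxes for 2023"))
```

Because each agency's assistant handles its own requests, no central store of user data is needed, which matches the decentralized design described above.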
Conversational Assistants Based on Generative AI
Generative AI, whose most famous incarnation is OpenAI’s ChatGPT, can create data in the form of images, audio, and text by training on immense datasets; recently, OpenAI announced ChatGPT plugins for conversational assistants. By using ChatGPT plugins as a search mechanism, users can potentially locate and interact with hundreds of conversational assistants. A recent OpenAI video hints at how assistants might interoperate: OpenTable is accessed to present recipes to the user, Wolfram calculates the calories associated with each recipe, and then Instacart orders ingredients.
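The OpenTable–Wolfram–Instacart flow just described can be sketched as a pipeline of plugins, each consuming the previous one's output. The stub functions and data below are invented to illustrate the chaining idea only; they are not the real ChatGPT plugin API or the actual services' interfaces.

```python
# Hypothetical sketch of chaining three plugins, in the spirit of the
# OpenAI demo described above. All functions and values are invented stubs.

def opentable_plugin(query):
    # Pretend to fetch candidate recipes for the user's query.
    return ["lentil soup", "greek salad"]

def wolfram_plugin(recipes):
    # Pretend to compute calories per recipe (invented numbers).
    return {recipe: 300 + 50 * i for i, recipe in enumerate(recipes)}

def instacart_plugin(recipe):
    # Pretend to order the ingredients for the chosen recipe.
    return f"ordered ingredients for {recipe}"

recipes = opentable_plugin("healthy dinner")
calories = wolfram_plugin(recipes)
cheapest = min(calories, key=calories.get)  # pick the lowest-calorie recipe
print(instacart_plugin(cheapest))
```

The key point is that the output of each assistant becomes the input of the next, so the user never reenters data between steps.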
OVON’s Protocols for Interoperable Conversational Agents
At a recent online seminar, the Open Voice Network (OVON) discussed and demonstrated some protocols that enable conversational agents to invoke one another and share data. The protocols will enable conversational assistants to interoperate by delegating, channeling, and mediating, as discussed in the January-February 2023 issue.
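Delegation, the first of the patterns named above, can be sketched as one assistant packaging a request, plus the data the user has already provided, and handing it to another assistant. The message fields below are invented for illustration and are not the actual OVON wire format.

```python
# Minimal sketch of delegation between two assistants, in the spirit of the
# OVON protocols described above. The message format is invented.
import json

def make_delegation(sender, receiver, user_request, shared_data):
    # Package a request so one assistant can hand it to another,
    # along with the data the user has already provided.
    return json.dumps({
        "type": "delegate",
        "from": sender,
        "to": receiver,
        "request": user_request,
        "context": shared_data,  # avoids the user reentering data
    })

def handle_delegation(message):
    # The receiving assistant unpacks the request and its shared context.
    msg = json.loads(message)
    if msg["type"] == "delegate":
        return f'{msg["to"]} handling "{msg["request"]}" with context {msg["context"]}'
    return "unsupported message"

wire = make_delegation("travel-bot", "hotel-bot",
                       "book a room", {"city": "Tallinn", "nights": 2})
print(handle_delegation(wire))
```

Because the context travels with the request, the second assistant can continue the conversation without asking the user to repeat anything, which is exactly the interoperability gap noted in the VII discussion above.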
Some communities of conversational assistants will consist of only a handful of members; others will be potentially unlimited in size. Some communities will enable users only to switch between members; others will also enable members to interoperate, that is, to share processes and data.
I personally like the openness of the OVON approach, as it provides interoperable protocols that can be used by any community of conversational assistants.
James A. Larson, Ph.D., is a senior scientist with the Open Voice Network. He can be reached at firstname.lastname@example.org.