Articles for James A. Larson

Biographical Information

James A. Larson

program co-chair, SpeechTEK 2021

503-645-3598

James A. Larson, Ph.D., is co-chair of the W3C Voice Browser Working Group and chair of the authoring subgroup of the W3C Multimodal Interaction Working Group. He writes the Forward Thinking column for Speech Technology magazine, and teaches courses in user interface design and implementation at Portland State University and Oregon Health and Sciences University. He is Vice President of Larson Technical Services, a speech application consulting and training company.

Articles by James A. Larson

The Prompt Box: From Humble Beginnings to AI Portal

19 Mar 2025

The evolution of the prompt box, from plain text input to sophisticated natural language understanding, has transformed user interaction.

ChatGPT: A Generative AI Revolution

04 Mar 2025

The AI landscape was altered forever, and OpenAI led the charge.

Shopping Made Easy: Features of a Modern Conversational Assistant

03 Dec 2024

An imaginary interaction shows the possibilities. Plus, speech technologies deserve their own Nobels.

Searchable CAs Will Make Conversational AI More Convenient

28 Oct 2024

As conversational assistants proliferate, they'll become harder to discover.

From Labyrinth to Lifeline: How Mobile Apps Are Revolutionizing Government Services

28 Aug 2024

Apps are fundamentally changing the way people access services like license renewals and tax filing, and speech technologies are playing a central role.

Pitfalls Facing Conversational Assistants: Hallucinations and Deepfakes

09 May 2024

Bad data (and actors) can have consequences.

Generative AI Is the Swiss Army Knife for Today’s Conversational Assistants

19 Mar 2024

Helpful assistants can become even more so thanks to genAI.

Large Language Models Will Transform Conversational Assistants

14 Feb 2024

Thanks to generative AI, conversational assistants will become everyday helpers.

Conversational Assistants and Privacy

21 Sep 2023

Guiding principles (and potential issues) for our AI-powered future.

Conversational Assistants’ Next Step: Communities

07 Jun 2023

As the apps become more ubiquitous, they'll need to interact.

When Conversational Assistants Meet: Delegating, Mediating, and Channeling

28 Feb 2023

How can voice assistants be made to cooperate with each other?

Conversational Agents Move Toward Interoperability

25 Jul 2022

On the path from garden walls to interoperable open agents.

Move over GUI, Hello SVUI

28 Jun 2022

Your Voice Tells All

11 Mar 2022

Speech app developers are finding ways to interpret your feelings and conditions.

An Open Letter to Voice Agent Platform Developers

23 Nov 2021

You're at the vanguard of a burgeoning movement.

Open Voice Network Hosts Interoperable Conversational Agent Workshop

25 Jun 2021

The workshop explored approaches for implementing interoperability of voice applications within and across platforms and produced a manifesto for the tone and general direction for future voice processing.

3 Trends That Will Shape IVA Development

23 Jun 2021

Voice assistants are poised to become more specialized and less platform-specific

Businesses’ Voice Assistants Are Coming to a Smart Speaker Near You

01 Nov 2020

Soon it will be de rigueur for every business to have its own IVA.

As Voice Assistants Multiply, When Will We Get a Registry?

01 May 2020

VARs can help tame the Wild West of voice assistants

Technology Is Making Dialogues More Life-Like, Conversational Interaction Speakers Note

17 Feb 2020

Advanced speech synthesis, analytics, and dialogue design are improving customer and company interactions, conference speakers agree.

Beware the Limitations of Machine Learning

29 Jan 2020

Systems that use only machine training will struggle with more complex requests

Q&A: Scott Hoglund on Conversational Banking

04 Dec 2019

Learn how conversational technologies are changing the way we bank, and how Discover is embracing this trend to better support companies and employees.

Q&A: David Morand on Integrating a Contextual AI Assistant with VoiceXML

27 Nov 2019

If you need a dialogue engine that will allow you to develop next-gen conversational IVR applications while being compatible with the VoiceXML standard, this is a must read.

Q&A: Sam Ringer Says the Revolution Is Coming — the Medium-Term Future of AI and ML

12 Nov 2019

Current AI and machine learning (ML) technologies are starting to change the way we build and innovate. However, the power of our current ML technologies is not fixed. Sam Ringer will explore where ML is at the moment.

Q&A: Greg Stack on Blazing a Trail to Successful AI Migration

05 Nov 2019

Greg Stack, Vice President, Speech-Soft Solutions, will be presenting at SpeechTEK 2020. Conference co-chair, Jim Larson, caught up with him to talk about "Blazing a Trail to Successful AI Migration."

Q&A: Dr. Nava Shaked on Evaluation, Testing Methodology & Best Practices for Speech-Based Interaction Systems

08 Oct 2019

Get a sneak peek into Dr. Nava Shaked's SpeechTEK workshop in this Q&A. Learn everything you need to know about evaluation, testing, and best practices for speech-based interactions.

Q&A: Dr. Michael McTear on Building a Conversational Chatbot for Google Assistant Using Dialogflow

01 Oct 2019

Get a sneak peek of Dr. Michael McTear's SpeechTEK workshop and learn how to build a conversational chatbot for Google Assistant using Dialogflow. Like what you see? Head over to speechtek.com.

Q&A: Bruce Balentine on the Basics of Conversational Chatbots

26 Sep 2019

Conversational chatbots may be the future (or even the present) but Bruce Balentine is taking developers back to basics in his workshop at SpeechTEK in April. Learn how to build chatbots that get results.

Q&A: David Attwater on the Ins and Outs of Conversation Design

24 Sep 2019

Everything you need to know about conversational design. Jim Larson talked to David Attwater, Senior Scientist, Enterprise Integration Group about his upcoming workshop, AI, and how human is too human?

Q&A: Deborah Dahl on Natural Language Understanding

18 Sep 2019

Jim Larson talked to Dr. Deborah Dahl, Principal, Conversational Technologies about the increasing importance and capabilities of natural language processing, speech recognition, and

Voice-First User Interfaces Speak to the Omnichannel Future

15 Jul 2019

Screen- and voice-oriented devices are becoming one and the same

Machine Learning and Innovative Speech Applications Dominate Conversational Interaction Conference

20 Mar 2019

The 2019 Conversational Interaction Conference emphasized innovative applications using the latest speech technologies, alternative architectures to ease the creation of conversational apps, and best practices for designing speech dialogs.

Is App Development Moving Toward User Interface Management Systems?

26 Nov 2018

App developers' jobs might be getting less complicated, again

8 Ways Advances in Speech Recognition Will Affect Our Lives

30 Jul 2018

From karaoke assistance to talking robots, here's a roundup of speech industry developments you'll be hearing about soon

Intelligent Agents Are Poised to Change the Conversation

10 Nov 2017

These apps can already understand what you say; soon they'll understand who you are and how you're feeling

What Will Future User Interfaces Look Like?

24 Apr 2017

As we enter the 'immersive age,' we need to prepare for a whole new way of interacting

Interactive Text Response: A Smart Update of IVR

10 Nov 2016

ITR systems let callers blend texting and talking

In a Mobile World, Voice and Graphical User Interfaces Need to Blend

14 Jun 2016

Developers have to design both visual and voice experiences for today's devices

Many Gadgets, One Interface

09 Nov 2015

In the smart home of the (near) future, a unified user interface—letting you touch, type, or speak—will control all of your systems and devices

Role-Play Apps Star in New Technologies

08 May 2015

It's time to leverage developers' know-how to improve lives

With Product Companion Apps, the Mobile Future Looks Even Brighter

10 Nov 2014

Up-and-coming tools add new accessibility to product information.

Voice XML Brings IVR into the Future

30 Apr 2014

Continuing demand gives these applications an impetus for improvement.

Eyes-Busy, Hands-Busy Computing

15 Nov 2013

What we all have in common with an HVAC repairman.

Assessing Speech-Enabled Help Apps

01 May 2013

Users seeking voice-enabled service and support have several choices.

A New Age for Computer Interactions

10 Nov 2012

Advances emphasize the role of the human mind.

Job Descriptions for Personal Assistants

10 Jul 2012

Radar O'Reilly is a tough act to follow.

Balancing Content and Control

01 Mar 2012

When consumers use multiple devices, give help where it's needed.

IVR Has a Black Eye

01 Nov 2011

Stop using frugal policies and start satisfying customers.

Multimodal User Interfaces Supplanting Voice-Only Apps

01 Jul 2011

Trend driven by statistical language models, speaker verification, video clips, and multilingual apps

Combining IVR and Smartphones

01 Mar 2011

Take advantage of today's visual displays and develop consumer product apps

Speech in My Pocket

01 Nov 2010

With the smartphone's growing popularity comes a slew of new speech apps.

Adding a Voice to Tweets

01 Jul 2010

Social media can incorporate voice to better engage users.

Grammatically Speaking

05 Mar 2010

Setting the standard for determining what a caller wants

Legal Issues Surrounding Speech

01 Nov 2009

Who will get the ball rolling on creating the rules?

Beyond Speech Recognition

27 Aug 2009

User property extraction systems pull more information.

On-Demand Services and Mashups

01 Jun 2009

Both allow for greater use of voice on mobile devices

Avoid the "Old-Folks' Home" with a Speech-Enabled House

02 Apr 2009

Technology exists to help the elderly live on their own.

Tomorrow's Technologies, Today's Applications

01 Nov 2008

We need to look beyond basic processing to fuller understanding.

Who Will Win the GUI-VUI Race?

22 Aug 2008

Multimodal applications present a new type of user interface.

The Evolution of IVR Systems

01 Jun 2008

The next phases of IVR development center on multimodality and faster transactions.

The Expanding Speech Tech Universe

01 Apr 2008

New stars present opportunities for using speech technologies.

Escaping from Directed Dialogues

25 Jan 2008

Creating and implementing automatic classifiers is not as easy as it sounds.

Extending VoiceXML's Impact to New Markets

01 Oct 2007

Technology is available to handle many of the more mundane translation tasks

SISR: The Standard Semantic Interpretation Language

09 Jul 2007

Speech vendors should support SISR for easier porting of grammars and applications between platforms

Multimodal Mobile Apps Will Thrive

01 May 2007

Electronic companions are the wave of the future in consumer electronics

Policies and Technologies for Improving the Customer Experience

01 Mar 2007

The customer experience isn't about completion rates and ROI; it's about achieving an intended task easily, efficiently, and even enjoyably

Toward Natural Language Processing

09 Nov 2006

When many people hear the words, "natural language," they immediately think of Star Trek's android, Data, who speaks and understands everyday English. Some software vendors claim anything beyond discrete speech recognition (in which users must pause between speaking each word) as "natural."

Speaking and Listening to the World Wide Web

12 Sep 2006

Developing and sharing content is a growing activity on the Internet. In addition to passively observing Internet content, users are actively adding to it by uploading their pictures to flickr.com, and sharing their thoughts in blogs and wikis. Readers rate books on amazon.com, and teens post real and fantasy personas on the extremely popular myspace.com, hoping to attract the attention of other teens.

Multimodal Applications' Architectures

03 Jul 2006

Rolling Out High-Value Applications Services in Record Time

13 Jun 2006

Case study by PIKA Technologies

NuVoxx focuses on developing and marketing advanced IVR (Interactive Voice Response) services, which it provides on a hosted model on a pay-per-use basis. The company also builds productivity solutions for call centers, and recently developed and deployed an advanced call recording and monitoring solution for NuComm International, its sister company and the largest privately-held Canadian provider of customer relationship and contact center services.

Mom and Pop Shops Gain a Voice

08 May 2006

Three trends that enable IVR systems for mom and pop shops without costing an arm and a leg include prepackaged applications, free starter VoiceXML platforms, and reusable dialog components.

Improve the User Experience

01 Mar 2006

Graphical user interfaces have been the standard user interface for computer users for over 20 years. It's time to up-level the user's computing experience by voice-enabling applications.

Metalanguages and AJAX

01 Jan 2006

Because many telephones do not have hardware that supports speech processing, a voice server is placed in the network to act as a client on behalf of telephones.

VoiceXML on Steroids

07 Nov 2005

Researchers and practitioners are extending VoiceXML using various techniques to provide new functionality. These include the RDC library tags, xHMI meta language, and a prototype implementation of VoiceXML which supports dictation speech recognition. RDC Tag Library – Developers frequently use Struts or other application frameworks to generate HTML. The goal of the Reusable Dialog Component (RDC) project is to provide a similar framework for VoiceXML. Like Struts, RDC has a tag library that hides the…

Ten Criteria for Measuring Effective Voice User Interfaces

07 Nov 2005

A Toolkit of Metrics for Evaluating VUIs. Investors use standard metrics such as stock price and projected revenue per share to choose investment opportunities. Likewise, consumers use standard metrics such as floor space, number of bedrooms, or number of bathrooms when purchasing houses. This paper presents a toolkit containing some specific metrics for evaluating voice user interfaces (VUIs). The speech industry should use criteria from this toolkit to: Judge the most efficient of several VUIs for…

Consistent User Input Options and Instructions Across Multiple User Interfaces

20 Jun 2005

Many Web developers use XML to represent data and a transformation language, such as XSLT, ASP, or ColdFusion, to transform the XML data to an HTML user interface. Developers change the values of the XML data without having to manually recode the HTML user interface.

Regaining Consumer IVR Confidence: Companies Draw on Different Tools to Save IVR/Speech Interaction

01 May 2005

Michael Chavez, vice president of client services at ClickFox, explains that "by leveraging a strategic combination of customer behavior intelligence, customer service interviews and surveys, organizations can reduce customer frustration with IVR systems, which will result in drastic savings, while also improving customer satisfaction."

Synthetic Interviews: Beyond History Calls

26 Apr 2005

Matt Nickerson describes how mobile phones enable callers to speak and listen to virtual agents. Using the same device to speak with family, friends and business associates, callers speak with software agents that enable synthetic interviews with individuals in photographs of historical events in a museum. This represents a new way of interacting with objects that are usually only viewed.

Ten Guidelines for Designing a Successful Voice User Interface

06 Jan 2005

At SpeechTEK 2004, a group of leading VUI designers attended the Voice User Interface (VUI) workshop directed by Dr. James A. Larson. Taking the lead for an article on the best practices in VUI, Dr. Larson collected and coordinated this team of VUI specialists to compile the Ten Guidelines for Designing a Successful Voice User Interface. Speech Technology Magazine would like to thank the authors for their contributions to this article and Dr. Larson for…

Balancing Customer Service Support Options

23 Nov 2004

Customers often ask questions about products, services, delivery dates and account information, as well as offer suggestions and complaints. If customers do not receive satisfactory answers to their questions, they become disillusioned with the company and take their business elsewhere. In short, good customer service support is the key to repeat business. …

MRCP Enables New Speech Applications

23 Nov 2004

Have you ever wished you could change your VoiceXML platform to use a speech synthesizer or speech recognizer from a different vendor? Have you ever wanted to move your speech synthesizer or speech recognizer to a different server? The Internet Engineering Task Force is proposing a new standard that will provide this flexibility.

What's New with VoiceXML 2.0?

11 Sep 2004

Beyond VoiceXML 2.0

08 Jul 2004

The W3C Voice Browser Working Group reviewed more than 700 requests for change to VoiceXML 1.0. After careful deliberation, many of these were adopted, resulting in VoiceXML 2.0 which became a recommendation in March 2004. The VBWG is now working on two efforts to make VoiceXML even better.

Profiles for Speech Application Users

24 May 2004

A user profile contains information that describes how to personalize a speech user interface to meet the needs for a specific user.

VoiceXML 2.0: A Real Standard for a Real World

08 Mar 2004

New technology will change the way people interact with computers. PCs enabled users to use a keyboard and screen rather than review printed reports. The Xerox Star and Apple Macintosh introduced Graphical User Interfaces (GUIs) which made the mouse and other pointing devices popular. Now, we are on the verge of a revolution in technology that makes computing portable. Separating user interface devices from the computing device will dramatically change how people interact with computers.

Help Users Speak

09 Jan 2004

A speech application does not work if users do not speak when prompted with a question.

EMMA: W3C’s Extended Multimodal Annotation Markup Language

25 Nov 2003

Recently, the W3C Multimodal Working Group published a first working draft of EMMA, the Extended MultiModal Annotation markup language (www.w3.org/TR/emma/). EMMA’s intended use is to represent the semantics of information entered via various input modalities and the resulting integrated information.

InkML and Speech

25 Aug 2003

Have you ever been in a place where speaking to a VoiceXML application on a cell phone is impractical? As you know, not all locations or situations are suitable for using speech-enabled handheld devices.

Controlled Languages and Speech Prompts

30 Jun 2003

"Welcome to the Ajax speech application. It is possible to perform any of several actions at any time during this application, including the common actions of asking for help, transferring the call to an operator, stopping the application and returning to the main menu." This sample prompt is too long and confusing for most users. However, by using a controlled language, you can improve the prompt for your customers.

State of Speech Standards

30 Jun 2003

Speech standards include terminology, languages and protocols specified by committees of speech experts for widespread use in the speech industry. Speech standards have both advantages and disadvantages. Advantages include the following: developers can create applications using the standard languages that are portable across a variety of platforms; products from different vendors are able to interact with each other; and a community of experts evolves around the standard and is available to develop products and services based on the standard.

Speech Wars—Round Two

05 May 2003

The Voice Browser Working Group has finished the technical work on the three major languages in the W3C Speech Interface Framework—VoiceXML 2.0, the Speech Recognition Grammar Specification, and the Speech Synthesis Markup Language. These languages will soon become “check-off items” in a list of features provided by the leading speech platforms.

The W3C Speech Interface Framework

06 Mar 2003

In the past three years, the World Wide Web Consortium Voice Browser Working Group has produced several reports that define languages in the W3C Speech Interface Framework. Developers use the W3C Speech Interface Framework languages to create speech applications.

Uniform Basic Function Commands

14 Jan 2003

Information processing and the Internet are merging with the telecommunications industry to develop mobile devices using interactive services with global access. Speech recognition offers the most natural way for consumers to use new communication devices and services.

Developing Verbal, Visual, and Multimodal User Interfaces for the Same Application

21 Nov 2002

PC users access the World Wide Web using a graphical user interface (GUI) that is commonly specified with HTML. Telephone and cell phone users access the Web using a verbal user interface (VUI) often specified with VoiceXML.

The What, Why and How of Usability Testing

10 Sep 2002

As the cartoon illustrates, users become frustrated when speech applications don’t work. Testing minimizes this frustration by detecting and resolving many speech application problems before they cause user frustration.

Telephony Enable Your Web Site

11 Jul 2002

VoiceXML has revolutionized the development of telephony applications, in that telephone users can call Web sites and converse with VoiceXML applications. The system uses a TTS synthesizer or prerecorded voice and users can respond by speaking answers to the questions. Currently VoiceXML is weak in telephony controls. About all you can do in VoiceXML is and , which are powerful enough for many applications, but not powerful enough for many others, such as event notification and conferencing. However, with the coming of new call control capabilities, many of these restrictions will be overcome.

VoiceXML and SALT

21 May 2002

VoiceXML and SALT are both markup languages that describe a speech interface. However, they work in very different ways, largely due to two reasons: (i) they have different goals, (ii) they have different Web heritages.

Students Develop Innovative Prototype Speech Applications

21 May 2002

Student teams are very creative when asked to design and implement speech applications of their choice. Here are some of the prototype speech applications recently implemented by students at Georgia Institute of Technology, Washington State University, Portland State University and Oregon Health and Sciences University.

Building User Interfaces for Multiple Devices

29 Mar 2002

Users can choose from among several devices to access the World Wide Web. These devices include PCs with a visual Web browser, such as Netscape's Navigator or Microsoft's Internet Explorer, to interpret HTML files downloaded from a Web server and executed on the PC; telephones and cell phones with a verbal browser to interpret VoiceXML files downloaded from a Web server and executed on a voice server; and WAP phones with a visual browser to interpret WML files downloaded from a server and executed on the WAP phone.

IBM’s New Toolkit Connects with a Family of Developers

31 Jan 1997

Speaking to your computer, and having it do what you say, has been a goal of IBM’s speech recognition research team for over 25 years.