January 1, 2006
By James A. Larson, program co-chair, SpeechTEK 2021
Forward Thinking

Metalanguages and AJAX

The client-server model enables developers to partition computation between servers and clients. For traditional Web applications, the client may range in power from a low-end terminal to a powerful PC. Because many telephones do not have hardware that supports speech processing, a voice server is placed in the network to act as a client on behalf of telephones. In order for Web applications to work with all types of clients, developers minimize the processing performed on clients by downloading Web documents from the server for interpretation on the client. Documents may be:

Static documents: Developers create static Web documents, which are stored on the server.
Dynamic documents: Developers use CGI languages (ASP, ColdFusion, or XSLT) and metalanguages (Struts¹ or xHMi²) to generate documents dynamically. Because CGI languages and metalanguages are executed on the server, we refer to the dynamic generation of documents as server-side computing.

In both cases, the client performs little processing beyond interpreting the documents.
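As a rough illustration of server-side computing, the sketch below builds a small VoiceXML document on the server from a list of options. It is written in Python rather than in any of the languages named above, and the function name, form id, and option values are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def render_menu_document(options):
    """Build a VoiceXML document on the server that reads each option aloud."""
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "options"})  # hypothetical form id
    block = ET.SubElement(form, "block")
    for option in options:
        # One <prompt> per option; the client only has to interpret the result.
        prompt = ET.SubElement(block, "prompt")
        prompt.text = option
    return ET.tostring(vxml, encoding="unicode")


# Hypothetical data; a real application would read it from a database.
document = render_menu_document(["checking", "savings"])
print(document)
```

All of the work happens before the document leaves the server, so every change to the list of options requires generating and downloading a new document.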

AJAX (Asynchronous JavaScript plus XML) is an alternative model to server-side computing. In the AJAX model, a block of data is downloaded from the server to the client, where an AJAX script uses the data to carry out several interactions with the user without additional data exchanges between the client and server. Because the AJAX script is processed on the client, AJAX is an example of client-side computing.

VoiceXML 2.1 supports a type of AJAX without special scripting by using two new elements:

<data> element to fetch data
<foreach> element to iterate through items in the data, enabling the user to interact with each data item without waiting for lengthy downloads from the server

In his article about implementing pick lists, Matt Oshry explains how to use the two elements to download a list of options from the server to the client and present each option one at a time to the user. When the user hears the desired option, the user selects the option by speaking "tell me more."
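The pick-list pattern can be sketched as a small simulation: fetch the list once, then step through it locally until the user selects an option. This is a Python stand-in, not VoiceXML and not the code from the article mentioned above; the FakeServer class, the run_pick_list function, and the sample options are hypothetical names introduced only for illustration.

```python
class FakeServer:
    """Stand-in for a Web server; counts how many requests it receives."""

    def __init__(self, options):
        self._options = list(options)
        self.requests = 0

    def fetch_options(self):
        self.requests += 1
        return list(self._options)


def run_pick_list(server, user_replies):
    """Fetch the options once, then present them one at a time.

    user_replies holds what the caller says after hearing each option;
    "tell me more" selects the option that was just presented.
    """
    options = server.fetch_options()  # single download, the role of the data fetch
    replies = iter(user_replies)
    for option in options:  # local iteration, no further server requests
        reply = next(replies, "")
        if reply == "tell me more":
            return option
    return None


server = FakeServer(["weather", "sports", "traffic"])
selected = run_pick_list(server, ["", "tell me more"])
print(selected, server.requests)
```

However many options the caller listens to, the server is contacted only once, which is the saving the client-side model is after.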

VoiceXML 3.0 will use the DFP (Data Flow Presentation) framework, which continues the trend started by the <data> element in VoiceXML 2.1. The VoiceXML 3.0 version of <data> will retrieve a block of data with which the user can interact without multiple time-consuming downloads. The fetched data may be returned to the database after the VoiceXML 3.0 application has updated the data.

Developers continue to debate the relative merits of server-side and client-side computing.

Factors favoring server-side computing include:

Metalanguages and CGI applications leverage the powerful processing capability available on servers to generate complex Web documents for interpretation by the client.
Metalanguages may enable higher developer productivity.
Metalanguages are portable across the platforms (e.g., VoiceXML and SALT) that the metalanguage vendor supports.

Factors favoring client-side computing include:

Client-side computing supports frequent interaction between the application and the user with minimal downloading of documents and files from the server.
Client-side computing avoids proprietary metalanguages, which may not be portable across platforms from multiple vendors.

VoiceXML 3.0 permits both server-side and client-side processing. Developers can use metalanguages and CGI scripts with VoiceXML 3.0 for server-side processing. VoiceXML 3.0's <script> and <data> elements are processed on the client side. Developers can even use both: a metalanguage on the server side to generate the <script> and <data> elements for processing on the client side. Developers have the means, and the responsibility, to develop efficient speech applications using the proper combination of server-side and client-side processing.
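The combined approach might look something like the following sketch, in which server-side code emits a document whose <data> and <script> elements are then processed on the client. Because VoiceXML 3.0 syntax is not described here, the sketch assumes VoiceXML 2.1 markup; the function name, the variable name listing, the example URL, and the ECMAScript fragment are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# ECMAScript for the client to run against the fetched data; illustrative only.
EXTRACT_SCRIPT = (
    'var nodes = listing.documentElement.getElementsByTagName("option");'
)


def render_client_side_document(data_url):
    """Generate, on the server, a document that defers the data work to the client."""
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "picklist"})  # hypothetical form id
    # The client fetches the data block itself when it interprets <data>.
    ET.SubElement(form, "data", {"name": "listing", "src": data_url})
    block = ET.SubElement(form, "block")
    script = ET.SubElement(block, "script")
    script.text = EXTRACT_SCRIPT
    return ET.tostring(vxml, encoding="unicode")


# Hypothetical URL for the data source.
hybrid_document = render_client_side_document("http://example.com/options.xml")
print(hybrid_document)
```

The server's job shrinks to producing a short skeleton, while the fetching and stepping through the data are left to the client.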

Dr. James A. Larson is manager, advanced human input/output, Intel, and author of the home study course VoiceXMLGuide, http://www.vxmlguide.com. He can be reached at jim@larson-tech.com.
