Speech Technology Magazine

 

ASSISTIVE TECHNOLOGY: Speech on the Web

By Markku Hakkinen - Posted Apr 1, 1998

As computer-based information and the devices used to access and interact with computers have proliferated worldwide, the definition of a computer user has grown to encompass a broadening set of individuals. Because most computer systems are built on visual and motor interfaces, users who are blind, have low vision, or have perceptual and learning styles that differ from traditional models are frequently denied access to computer-based systems without the support of assistive technology.

Traditionally, assistive technology has meant adapting existing visually oriented systems with alternatives such as text-to-speech. Assistive technology is a small step toward the goal of universal design and the concept of "readily accessible to and usable by" embodied in the Americans with Disabilities Act.

The original intent of the World Wide Web was to provide universal access to information using a standard content definition language (HTML), with platform-independent visual presentation. As the web evolved, the original concept was extended by a variety of interests to compete more directly with the interface capabilities and styles of personal computer applications.

The need to consider accessibility for users with disabilities continues to be an afterthought rather than a guiding principle in the rapid evolution of the Internet.

Accessible design guidelines for web sites are being developed. Indeed, some show good design practices. Yet well-designed sites are not truly accessible unless the user has a browser that enables accessible navigation and display. Moreover, with the advent of Java applets, any web site can now contain sophisticated interactive applications that are inaccessible.

For a visually impaired user, on-screen displays must be translated into either spoken or Braille presentations. TTS synthesizers, coupled with specialized access software called screen readers, serve as the primary method for this.

With graphical user interfaces such as Microsoft Windows, the reliance on visual intuitiveness poses a significant challenge to screen reader software. Scrollable windows, iconic toolbars, and drag-and-drop control mechanisms all represent major hurdles for the visually impaired user.

Although application software can be designed to facilitate access by screen readers, and control mechanisms can be implemented using both keyboard and mouse commands, such design is inconsistently applied. A sighted user is clearly at an advantage, with the benefit of a variety of visual cues such as position, wording, shape, imagery, motion and color. A visually impaired user must understand the application based on far fewer cues.

To make the web accessible, work is needed on both the content offering and client browser sides. Development of HTML design guidelines is underway by a number of groups. Given the global nature of the web and its authors, the dissemination and adoption of these guidelines pose a formidable challenge. It is our expectation, however, that developers of web offerings will incorporate features that promote accessibility.

Although web content can be made "accessible," it remains for the web browser to convey that content effectively to the user. This must be coupled with control and navigation mechanisms that are easily learned and operated. These requirements are essential if access to the Internet and the web is to move beyond the experienced computer user.

An Accessible Browser

Current browser designs are largely similar in appearance, with interaction controlled by both graphic manipulation and keyboard functions. Navigation of the displayed web pages is controlled in two primary ways: via hyperlinks within the displayed pages, and by browser controls, such as commands to move backwards through a list of recently accessed pages.

Browser usability (and market differentiation) is enhanced through the addition of features like bookmarks, save-to-local-file and mail functions.

The underlying basis of the browser is the recognition and processing of document structure. By the nature of the information contained in HTML, the browser can choose appropriate display attributes, display text, trigger the display of multimedia elements, and then act upon user selected navigation or control requests.

The issue of accessibility to the virtual campus arose as a significant design challenge during a project to develop an Internet-based distance learning program for Thomas Edison State College in Trenton, New Jersey.

Although HTML accessibility guidelines were examined and further expanded upon for the college web site, server-side design alone could not ensure that students with disabilities would have easy and effective access.

In considering accessible browser design, we began by defining a modular architecture that would support a variety of access methods for both input and display. The fundamental design focused on alternative formats such as text-to-speech, large print or refreshable Braille. The architecture also envisioned speech recognition, keyboard-mouse alternatives, and non-keyboard input devices.

The pwWebSpeak project began with no assumptions or existing code base for the browser. In fact, early prototypes of the browser had no visual interface, with all presentation coming directly from the TTS synthesizer. A fundamental concern in the development process was to create a browser that would run on an economical hardware platform, and support a user's existing investment in assistive devices.

For first-time users, the minimum hardware investment would be a Microsoft Windows 3.1 PC with a sound card and modem. A video display would be required during setup, but would not be necessary for operational use by a visually impaired user.

The core of the pwWebSpeak browser is the HTML processor, which parses a web page into internal structures used to control navigation and display. A major part of the processor is a rule base that determines how output should be formatted for display. With text-to-speech output devices, speech parameters can be altered so that different voices are assigned to different structural elements in the document.

For example, when used with a synthesizer such as DECtalk, pwWebSpeak can present heading text in a voice different from that used for the body text.
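As a rough illustration of how such a rule base might assign voices to structural elements, consider the following Python sketch. It is not the pwWebSpeak implementation: the `Voice` type, the voice names, the speaking rates and the `voice_for` function are all hypothetical, chosen only to show the idea of looking up speech parameters by element type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Speech parameters for one kind of content (hypothetical fields)."""
    name: str
    rate_wpm: int


# Used for any element that has no rule of its own.
DEFAULT_VOICE = Voice("body", 180)

# Hypothetical rule base: HTML element name -> voice to speak it with.
VOICE_RULES = {
    "h1": Voice("heading", 160),
    "h2": Voice("heading", 160),
    "a": Voice("link", 180),
}


def voice_for(tag: str) -> Voice:
    """Return the voice assigned to an element, falling back to the body voice."""
    return VOICE_RULES.get(tag.lower(), DEFAULT_VOICE)
```

With this table, a heading element is spoken with the "heading" voice while ordinary paragraph text falls through to the default body voice. Supporting a different synthesizer or user preference would mean swapping the table, not the lookup.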

The output of the HTML processor is placed into several internal structures. One structure is used to contain a list of all document elements while another contains a list of all links in the current document. Additional structures contain processed text ready for direct output to the selected display devices. This is an early form of the document object model.
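The two lists described above can be sketched with Python's standard `html.parser` module. This is a minimal sketch under our own assumptions, not the pwWebSpeak code: the class name `PageStructure`, its attribute names and the small `VOID_TAGS` set are hypothetical, and real-world HTML would need far more tolerant handling.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (partial list, for illustration).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class PageStructure(HTMLParser):
    """Collects a flat element list and a separate link list from one page."""

    def __init__(self):
        super().__init__()
        self.elements = []     # (tag, text) pairs in document order
        self.links = []        # (href, link text) pairs in document order
        self._open_tags = []   # stack of currently open tag names
        self._href = None      # href of the <a> being read, if any
        self._link_text = []   # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._open_tags.append(tag)
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._link_text = []

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._link_text).strip()))
            self._href = None
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        tag = self._open_tags[-1] if self._open_tags else "text"
        self.elements.append((tag, text))
        if self._href is not None:
            self._link_text.append(data)


page = PageStructure()
page.feed('<h1>News</h1><p>Read the <a href="/a.html">full story</a> now.</p>')
```

After the call to `feed`, `page.elements` holds the text of the page tagged by its enclosing element, and `page.links` holds the single link. A browser built this way can read the whole document from the first list or offer the second as a selection list.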

Output can be directed to any of the supported devices. The default output device is a large-print display window showing high-contrast yellow text on a blue background. Output can also be directed to software-based text-to-speech synthesizers.

Control mechanisms had to be simplified so that users could rapidly navigate both the browser's command structure and the accessed documents. A user can hear or read the available command functions and select one by pressing an activate key when the desired option is heard.

This basic interface structure permits control by keyboard, speech recognition or other devices. For data entry fields such as URL entry or web forms, the user can use the full keyboard, select pre-defined responses for automatic entry into a field, or select letters from an alphabetic selection list. Command input, whether directed into the browser from the keyboard, speech recognizer or other input device, is routed through a common command processor.
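The idea of routing every input device through one command processor can be sketched as follows. This is an illustrative sketch only: the `CommandProcessor` class, the command names, the key bindings and the spoken phrases are all hypothetical and are not taken from pwWebSpeak.

```python
from typing import Callable, Dict, Optional


class CommandProcessor:
    """Single dispatch point shared by every input device."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}

    def register(self, command: str, handler: Callable[[], str]) -> None:
        self._handlers[command] = handler

    def dispatch(self, command: str) -> str:
        handler = self._handlers.get(command)
        if handler is None:
            return f"Unknown command: {command}"
        return handler()


# Each input device only translates its own events into command names.
KEY_MAP = {"F6": "next_link", "ESC": "stop_speech"}              # hypothetical keys
PHRASE_MAP = {"next link": "next_link", "quiet": "stop_speech"}  # hypothetical phrases


def from_keyboard(processor: CommandProcessor, key: str) -> Optional[str]:
    """Translate a key press into a command; ignore unmapped keys."""
    command = KEY_MAP.get(key)
    return processor.dispatch(command) if command else None


def from_speech(processor: CommandProcessor, phrase: str) -> Optional[str]:
    """Translate a recognized phrase into a command; ignore unknown phrases."""
    command = PHRASE_MAP.get(phrase.lower().strip())
    return processor.dispatch(command) if command else None


processor = CommandProcessor()
processor.register("next_link", lambda: "Moving to next link")
processor.register("stop_speech", lambda: "Speech stopped")
```

Because the keyboard and the speech recognizer both end in the same `dispatch` call, a new input device needs only its own translation table. The browser's behavior for each command is defined once.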

The user can choose to listen to an entire web document, or browse the document structure. Current links are available from a selection list. Ongoing speech can be interrupted or skipped with a single keystroke.

The brief history of accessibility design has shown us that considering the needs of those with disabilities can often have broader societal benefits.

For example, closed captioning is now available with almost all new televisions, and is being used by more than just the deaf or hard of hearing community, including those learning English as a second language.

By making the web accessible to those who could not otherwise experience it with existing technologies, we are making the web usable in ways not originally envisioned, such as direct Web access by telephone.

To ensure that accessible browsers such as pwWebSpeak succeed, web content developers must become aware of emerging design guidelines and adapt their sites to conform to them.

Ultimately, the greatest challenge is that content developers, service providers and browser developers must work with the accessibility community to ensure that the web remains true to its foundation of universal information access.

Markku Hakkinen is senior vice-president and the co-founder of The Productivity Works, an Internet software company. He can be reached at The Productivity Works, 7 Belmont Circle, Trenton, N.J. 08618 or at http://www.prodworks.com.


Assistive Technology Round Up

Earobics from Cognitive Concepts Inc.

The Earobics Auditory Development and Phonics CD-ROM from Cognitive Concepts teaches a full range of listening and auditory skills that are critical for learning how to speak, read and spell.

Earobics is a learning tool for all children, including children with special learning needs such as speech/language delays, attention deficits, dyslexia and language-based learning disabilities, hearing impairment and those learning English as a second language.

Earobics is priced at $59 and Earobics Pro is $149. Both use computer training techniques, including acoustic enhancements to make the important parts of speech more easily heard, systematic control of important learning variables and adaptive training.

For more information, contact Cognitive Concepts, Inc., PO Box 1363, Evanston, Ill. 60204-1363 or call 888-328-8199.

Audio Website

OnTarget Marketing recently released TALKS(tm) (Total Access Linking Kiosk Systems), to enable visually impaired individuals to "surf the web."

The OnTarget Marketing web site, www.ontargetmkt.com, has been programmed with TALKS to provide audio instructions to a visually impaired person, taking them through every section of the site and reading back all of the information on each screen.

All navigation can be done by placing the mouse over a particular "voice prompting" icon that tells the user the topic of the section.

"The objective of TALKS is to provide everyone with the ability to obtain information about any company from the web," said Gary Crunk, chief technical officer at OnTarget Marketing.

For more information, contact Brian Johnson, Vice President, Client Services, at 602 667-0775.

Aurora for Windows

Aurora Systems, Inc., recently released Aurora 2.0 for Windows 3.1 and 95, a family of products to provide easy conversational speech and writing assistance for people with learning disabilities.

With the Aurora RealVoice speech synthesis software, Aurora provides a quick, easy way to carry on a spoken conversation using almost any computer with a sound card. It can also make any Windows 3.1 or 95 word processor into a talking word processor with extensive spelling assistance for use by people who need help writing.

For more information, contact Aurora Systems at 888 290-1133 or on the web at www.djtech.com/aurora.

Accessibility API from JAVA

Sun Microsystems has unveiled the Java accessibility API, and several companies have already taken advantage of the Java attributes to improve computer interfaces for the visually impaired, the dyslexic and people with other disabilities.

Java allows developers to provide visually impaired people with detailed information about what's on a computer screen via technologies such as screen readers, speech recognition and Braille terminals.

Assistive Technology from IBM

IBM has a long history of providing assistive technology products. For example, the SpeechViewer was first introduced 16 years ago at the IBM Scientific Center in Paris and was then known as the Paris Deaf Children's Speech Program, before the PC became a familiar tool.

SpeechViewer provides an external means of monitoring speech production by acoustically analyzing speech input for such parameters as pitch, loudness, voicing, phoneme accuracy and inflection. It then generates dynamic real-time graphics that represent variations in these speech parameters - in effect, a form of biofeedback.

Today, the SpeechViewer III is used in schools, clinics and private practice to improve the speech of persons with many disorders such as oral-facial anomalies, neurological disorders and vocal problems.

Pitney Bowes Copiers

Pitney Bowes Office Systems recently released its first Universal Access Copier System to meet the needs of people with physical disabilities. The system incorporates advanced speech recognition technology, an extra large touch screen interface and Braille labeling on the control panel.

"The Universal Access Copier System will be especially valuable to schools, libraries, municipal buildings and in the offices of disabled workers," said Dennis Roney, president of Pitney Bowes Office Systems. "To provide easy access, the copier can be controlled in a variety of ways. Voice activation, touch screen, and keyboard interfaces allow users to choose how they operate the system."

For more information, visit Pitney Bowes on the web at http://pitneybowes.com.

ALVA Releases outSPOKEN for Windows

ALVA, a maker of software for blind and visually impaired computer users, recently announced the release of outSPOKEN 1.21 for Windows. This latest version adds increased compatibility and performance enhancements for use with a variety of popular programs.

Recognized for its ease-of-use, outSPOKEN utilizes the numeric keypad for actuating all commands. One set of simple reading and navigation commands works universally.

"Enthusiasm for version 1.21 from our product testers surpassed all our expectations," said Larry Lake, director of marketing at ALVA Access Group.

For more information, contact ALVA Access Group at www.aagi.com or at 510 923 6280.
