Articles by Walter Rolandi
Can We Talk?
Conversational interfaces still can't match human-level dialogues.
The Persona Craze Nears an End
An over-the-top persona could push users over the edge.
Do Cultural Differences Make a Difference?
A common experience unites all telephone users around the world.
Is Your VUI Out of Tune?
Testing prompts for functional effectiveness is a fundamental tuning activity.
Aligning Customer and Company Goals Through VUI
Reducing the cost of customer service should come second to keeping the customer happy.
The Pains of Main Are Plainly VUI's Bane
Automated systems are becoming more prevalent, and the debate between directed dialogues and natural language interfaces is heating up.
The gethuman Factor
Much of the tone of SpeechTEK 2006, held in New York this summer, was set by its opening keynote address. In the presentation, Paul English, founder of gethuman.com, outlined some of the desirable characteristics of a "gethuman standard" for self-service systems.
The More Things Change...
Recently, I had a customer service problem that obliged me to call customer service. I heard the company had recently implemented a speech recognition self-service system and I was curious to see how converting to speech would improve its self-service process. I was shocked when my call was answered with the following: "Thank you for calling the Acme Company. Please pay careful attention because our menu options have recently changed."
The Value of Speech Analytics
In this special section of Speech Technology Magazine, you will find an overview of the major applications for speech analytics within the enterprise environment provided by Datamonitor, followed by a concise discussion of the role of speech analytics in quality monitoring by SER Solutions.
Predictability and Prompt Variations
Detecting Emotion: Prevention Is Better Than Cure
I am not sure how much progress has been made in detecting all possible emotional states in users, but detecting anger can be relatively easy.
A Fool's Revenge
The Alpha Bail
Speech Recognition and Telegraphic Speech
What Is Telegraphic Speech? Telegraphic speech is typically observed in language-learning toddlers and people who are re-learning to speak after having suffered some neurological trauma such as a stroke. It is characterized by minimalistic utterances which often are no more than noun-verb combinations. For instance, a baby might say, "give juice," as opposed to a more grammatically complete and socially appropriate utterance, such as, "Can you give me some juice, please?"
Some Things Are Better Left Unsaid
Bad User! It is not at all uncommon to be scolded by an application in the Dual Tone Multi-frequency (DTMF) world. To wit: System: To do [this], press one. To do [that], press two. User: (presses DTMF three) System: That is an incorrect response! And while the tendency to scold is less prevalent in the speech world, it has not entirely disappeared. Consider the following:
The Cumulative Effect of Recognition Failures
"Adapt or perish, now as ever, is nature's inexorable imperative." -H. G. Wells. The Good News: Everyone in the industry is well aware that current speech technologies are undeniably impressive. Speech recognition accuracy rates have been very high for some time and there have been dramatic improvements in ASR robustness (the ability to recognize utterances under unfavorable conditions) during the last few years. The news might be exclusively good if only a high speech recognition accuracy rate was
Speech Recognition in Education: Unexploited Opportunities
Approximately 98 Percent-plus Accuracy? Most everybody in the speech industry has heard vendor claims of 95-98 percent-plus speech recognition accuracy. These claims, if slightly qualified, are undeniably true. In fact, using a good quality microphone in a quiet test environment, I have repeatedly obtained 100 percent speech recognition accuracy with several of the major ASR engines.
The Legal Threat to the Effective VUI
Nobody likes lawyers - as they say - at least not unless or until they find themselves in need of one. Lawyers bear the brunt of numerous jokes and insults and their professional class is consistently judged to be among the least esteemed in our culture.
The Impotence of Being Earnest
Famous Last Words: Have you ever dialed a company, had your call answered and then heard something like this? "Thank you for calling ACME Corporation. Your call is very important to us." How about this? "In order to ensure the most efficient resolution to your problem, you can always visit us at our easy-to-use Web site 24 hours a day at double-u double-u double-u dot ACME dot com."
Failure to Test Detestable
Dr. Walter Rolandi, founder and owner of The Voice User Interface Company, reasons that "validating a call flow is not a particularly expensive or time-consuming endeavor," therefore, failing to do so is a sign of reckless behavior.
Anyone who has ever taken a philosophy course has probably heard of the medieval theologian, William of Occam, a Franciscan monk who led a troubled life. Mostly because his teachings seemed to aid and abet some theological enemies of the papacy, Occam frequently found himself at odds with the powers that be.
What's Natural about Natural Language Processing?
Oh No! I inwardly wince every time a client announces that he wants me to design a natural language voice user interface. What follows is often an awkward series of questions that is intended to find out just what the client means by natural language. The answers clients provide can represent a range of possibilities that span, on a scale of complexity, from a basic verbal command and control system all the way up to an unbounded conversational dialog with a machine possessing the verbal skills of William F. Buckley, Jr.
Frequency of Use and Design
Casting Users in Parts as Parts
Thinking of users as parts is actually a natural and understandable inclination: When we think about systems, we cannot help but think systematically.
The Common Causes of VUI Infirmities
While most of us know the various things we should and should not do to maintain a healthy lifestyle, relatively few of us consistently comply. Such is also the case in the Voice User Interface (VUI) design world. Best design practices, for the most part, are publicly available and widely known. Yet, and perhaps for the same reason that some people think they are above the rules of diet and exercise, many Interactive Voice Response (IVR) designers seem to see themselves as immune to the illnesses that invariably plague poorly designed voice applications.
Threats to Objectivity in Usability Testing
Most speech industry people concede the value of usability testing. It is widely appreciated how usability testing, particularly early on in the dialog design stage, can reduce usability problems, costs and headaches further down the road. The idea, of course, is to get an objective, unbiased assessment of a design before committing all of its particulars to code. This sounds simple enough. But obtaining objectivity is not always as simple as it seems and if the usability test plan or procedure is fundamentally biased, why should we bother to test at all?
What We Need Is A Killer App
When You Don't Know When You Don't Know
During a break at a recent speech technology conference, a group of attendees were discussing the importance of learnability in their application designs. One participant advocated a particular method for classifying and dealing with recognition results as helpful. The scheme divided user utterances into three basic categories: high confidence matches; low-to-medium confidence matches; and no-match or out-of-grammar (OOG) utterances.
What is Usability Testing?
Looking around the industry, it is apparent that "usability testing" means a number of different things to a number of different people. While there are consistencies in methods and techniques among many speech industry usability analysts, there is no obvious consensus as to the purpose of usability testing or on any particular way to conduct usability tests.
Repeat or Not Repeat
What is the proper role that repetition should play in a voice user interface? This question frequently arises when designing a VUI, particularly if the VUI is intended to simulate "natural speech" or "conversational dialog". The common assumption is that repetition is bad because it doesn't sound natural and it occurs only infrequently in human-to-human conversation.
What can a persona do for you? This question was the primary focus of several presentations at SpeechTEK 2002. The topic of persona, by itself, seems to elicit strong opinions from a number of speech industry personalities. Interestingly enough, a particular participant in one presentation allowed that he wasn't really sure what a persona was. He went on to ask, rhetorically, "What does persona actually mean?"
Do Your Users Feel Silly?
Years ago, I was telling a friend about my long-standing interest in the animal language debate. I had studied bee signaling systems, bird songs and a number of attempts to establish various forms of verbal behavior in chimpanzees. I told my friend that some of the communicative abilities of several species are truly amazing but that drawing anthropomorphic conclusions about the abilities would be a mistake. He nodded, chuckled and proceeded to describe a transaction he had witnessed years earlier involving his college roommate.
Is It Stupid to Be Clever?
Grammar writers generally try to anticipate a number of ways users will respond when prompted to speak. Many designers believe that by expanding their grammars to permit highly variable user input, they will create a natural, easy-to-use voice user interface. This is a belief that is strongly held by some in the voice application development community. And applications developed by true believers can sport some truly huge grammars. I have seen, for example, a yes/no grammar containing thousands of acceptable utterances!
Do People Want to Talk to Computers?
Do people really want to talk to computers? Let's explore the question with a thought experiment that allows us to define talking to computers in a sophisticated, unrestricted sense. Ask yourself this: If C-3PO, the fussy, fretful, Golden-Droid of Star Wars fame, were science fact instead of science fiction, how deeply would I want to talk to him?
All Too Human Factor Determining the Speech Growth-Market
There seems to be a growing awareness among speech industry players that making money in speech is more a function of good VUI design practices than the mere exercise of ever more innovative and impressive technologies. This was evident at the Telephony Voice User Interface Conference this year in a number of ways.
Will Unified Messaging be the Beachhead Opportunity for Conversational Voice User Interfaces?
That speech technologies represent a market poised for tremendous growth is scarcely subject to debate. The precise form that the emerging market will take is still, however, somewhat unknown. Some believe that speech application users will demand essentially unrestricted conversational user interfaces. But is this belief supported by the facts?
Building the Interface of the Future
As the worldwide speech marketing sales executive for IBM Speech Systems, Anne-Marie Derouault's responsibilities include directing all worldwide marketing and sales efforts for IBM's speech recognition business, including the ViaVoice family of products. She has been a key player in the speech recognition industry for over 15 years, long enough to regard the current industry buzzwords "Natural Language" with a sense of déjà vu.
The Alpha Bail
A Little Bit of Energy Can Make a Big Difference: Usually, speech recognition is the preferred modality in telephony applications that require non-numeric input. Imagine asking users to type in something like the name of a movie or a restaurant or a street name using a telephone keypad. That would be a cruel usability joke. When entering information that cannot be otherwise conveyed using telephone keypad numbers, speech recognition, as a rule, provides a far superior