Speech Technology Magazine

 

Threats to Objectivity in Usability Testing

By Walter Rolandi - Posted Aug 25, 2003
WHY TEST AT ALL?
Most speech industry people concede the value of usability testing. It is widely appreciated that usability testing, particularly early in the dialog design stage, can reduce usability problems, costs and headaches further down the road. The idea, of course, is to get an objective, unbiased assessment of a design before committing all of its particulars to code. This sounds simple enough. But obtaining objectivity is not always as simple as it seems, and if the usability test plan or procedure is fundamentally biased, why should we bother to test at all?

THREE THREATS
Usability testing is a form of behavioral research: A test situation is designed and subjects are asked to perform specific tasks. The behavior of the subjects during and after the performance of these experimental tasks is what interests the usability analyst. A good usability assessment will express findings in terms of observed and quantified behaviors. But quantified observations cannot ensure objectivity in a usability testing procedure. Greater objectivity is achieved only by carefully controlling factors that can introduce experimental bias.
There are three major factors that can introduce experimental bias and threaten objective testing. They are experimental design, subject (user) expectations and experimenter expectations.

EXPERIMENTAL DESIGN
The most elementary forms of bias stem from a flawed experimental design and can take many shapes. One example of a design flaw is giving subjects inappropriate instructions for the experimental tasks they will be asked to perform. Instructions can be inappropriate at either of two extremes. On one end, instructions can be so specific, exact and informative that only the most feebleminded user will fail to complete the task. On the other end, instructions can be so nebulous or confusing that only the most brilliant can comply.
The worst possible design flaws would be intentional: someone creates a test with an explicit, preconceived outcome in mind in order to “prove” a point. Intentional or deceptive bias of this kind almost never occurs. Much more common is the unintentional introduction of experimental design flaws that undermine the integrity of test findings.

SUBJECT EXPECTATIONS
Care must be taken in experimental design and execution to control subject expectations and bias. Subject expectations threaten objectivity whenever they bias a particular experimental outcome. For example, if subjects in a speech recognition application test overwhelmingly believe they can speak to the machine in the same way they speak to a human, they may be more likely to become annoyed in the event of a recognition failure.
Another problem can arise because subjects often believe that they should say something positive about their user experience, regardless of how unpleasant it may actually have been. This is a common occurrence in which subjects tend to be polite or constructive instead of analytically objective.

EXPERIMENTER EXPECTATIONS
Experimenters need to take special care not to bias test outcomes. Unfortunately, it is exceptionally easy to introduce bias in the most subtle and inadvertent ways. To find an example, we can revisit the matter of experimental task instructions. Let’s say that the task instructions are in fact appropriate but that they are explained to subjects by two different experimenters. One experimenter is friendly, talkative and takes great pains to ensure that the subject understands what is expected of him. The other experimenter, however, is curt, distant and appears unconcerned about the subject’s comprehension. More than likely, subjects informed by the first experimenter will perform more effectively than those who were informed by the second.
The effects of an experimenter’s expectations can be far more subtle. Let’s say that all subjects get their task instructions from a single individual, but that particular experimenter inadvertently signals greater warmth and confidence toward female subjects. Something as seemingly inconsequential as a nod and a smile has the potential to dramatically bias the outcome of a test.

EXPERIMENTAL CONTROL
Test bias can take thousands of forms, and the matter is discussed here in only the most superficial terms. The important thing to remember is that we, as usability test designers, need to recognize the ways these factors threaten the objectivity of our tests and take all reasonable precautions to eliminate these influences.
Dr. Walter Rolandi is the founder and owner of The Voice User Interface Company in Columbia, S.C. Dr. Rolandi provides consultative services in the design, development and evaluation of telephony-based voice user interfaces (VUI) and evaluates ASR, TTS and conversational dialog technologies. He can be reached at wrolandi@wrolandi.com.