In recent years, speech recognition technology has seen such marked improvements in accuracy, specialized vocabulary development and CPU processing speed that it has become a factor in improving productivity in many vertical markets. Thousands of doctors, lawyers and financial analysts have discovered that speech recognition can be a reliable and quick means to get from spoken word to written output.
Market analysts say the medical industry is the leading vertical market for speech recognition, and several reasons have been cited to explain why. Medical care is increasingly driven by costs, and medical transcription is a big budget item for major hospitals and individual doctors. The health care industry needs to keep transcription costs as low as possible while continuing to deliver reports quickly and accurately, and those reports depend on highly specialized vocabularies. Health care professionals are also accustomed to dictation as a means of communicating their ideas. These factors all point to the value of speech recognition, and they help explain the rapid adoption of this technology in the medical field.
SPECIALIZED VOCABULARIES
Every profession has its own vocabulary. In radiology, corporate law, sports medicine or auto insurance, certain words are repeated constantly inside the profession and almost never heard outside it. To be effective in vertical markets, speech recognition software must handle the vocabulary specific to each profession and subject matter. Software must "understand" whether the topic is business, law or medicine and be designed to handle words that are unique to that vertical market.
Serving professionals in a vertical market therefore requires producing a specialized vocabulary. Typically this involves three components: an acoustic model, which describes the sounds spoken; a vocabulary of words for the specific subject matter, with spellings and pronunciations; and a language model consisting of statistical information about the usage of each word alone and with other words. In speech recognition circles the term "vocabulary" commonly refers to all three components combined, although the words "context" or "topic" are sometimes used instead.
Most continuous speech recognition software includes a large, general-purpose vocabulary. This is fine for general or business correspondence, but most professional users find they need additional proper names and words specific to their subject matter. A specialized vocabulary must meet its users' full terminology requirements. For example, a cardiologist must be able to dictate reports that include anatomical and procedural words that are unique to cardiology. But much of what cardiologists say in discharge summaries and other reports has little to do with their specialty, so a cardiology vocabulary also needs to include a full internal medicine vocabulary.
Compiling specialized vocabularies requires skill and care.
Teams of linguists and transcriptionists build some vocabularies and often work with groups of programmers and data processors to build others. To create a specialized vocabulary that is effective throughout a wide geographical area, a team of experienced vocabulary developers needs to collect millions of words of data from thousands of representative letters and reports dictated by hundreds of individuals at several locations. The data then needs to be organized with appropriate content. The word list is compiled and initial pronunciations for the words are entered. Finally, word usage statistics are computed from the data to produce a vocabulary for use with a product specialized for a vertical market.
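The last two steps, compiling the word list and computing word usage statistics, can be illustrated with a minimal sketch. It is not how any vendor mentioned here builds its products: the function names and the sample sentences are invented for illustration, pronunciations and the acoustic model are left out entirely, and the statistics are limited to simple counts of each word alone (unigrams) and of adjacent word pairs (bigrams).

```python
import re
from collections import Counter


def build_vocabulary(reports):
    """Return (word_list, unigram_counts, bigram_counts) for dictated text.

    Hypothetical helper: a real vocabulary would also carry pronunciations.
    """
    unigrams = Counter()
    bigrams = Counter()
    for text in reports:
        # Crude tokenizer: lowercase runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        unigrams.update(words)                  # usage of each word alone
        bigrams.update(zip(words, words[1:]))   # usage with the next word
    return sorted(unigrams), unigrams, bigrams


def bigram_probability(unigrams, bigrams, first, second):
    """Rough relative-frequency estimate of P(second | first).

    Uses count(first, second) / count(first). Words that end a report have
    no successor, so this slightly underestimates for such words.
    """
    if unigrams[first] == 0:
        return 0.0
    return bigrams[(first, second)] / unigrams[first]


# Hypothetical sample text, invented for illustration only.
sample_reports = [
    "The patient denies chest pain.",
    "The patient reports chest pain on exertion.",
]

word_list, unigrams, bigrams = build_vocabulary(sample_reports)
print(word_list)
print(bigram_probability(unigrams, bigrams, "chest", "pain"))
```

With statistics of this kind drawn from millions of words rather than two sentences, a recognizer can prefer word sequences that are common in the specialty over similar-sounding sequences that are not.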
A large, cost-conscious market is required to support such development, and the health care arena fits the bill. The health care industry is large enough to attract major corporations with an interest in speech, as demonstrated by the recent decision by Royal Philips Electronics to purchase MedQuist Inc., a provider of outsourced medical record transcription services with sales of approximately $330 million. Philips made the investment to lead the transformation of MedQuist Inc. into a technology-based provider of medical document services over broadband networks. MedQuist Inc. plans to aggressively roll out Philips' speech and other technologies with the goal of enabling significant productivity improvements in the conversion of dictated medical records into written text.
Large hospitals also may find speech recognition technology a solution to their data management challenges. For example, Dictaphone, a Lernout and Hauspie company, has developed a solution called Enterprise Express, which offers flexible network support for multisite facilities, clinics and transcription services. It can move voice and data over the network and the Internet in new ways, and it provides an environment in which voice, text and data are managed on an enterprisewide level as part of a complete patient information management system.
BILLABLE HOURS
While the health care market offers computer technology in general, and speech in particular, unique opportunities to provide high-value products, vendors must also be able to demonstrate to the purchaser a strong case for increased productivity. If a doctor can handle more cases, naturally enough, he or she makes more money. But is that really the case in the legal arena, where attorneys have long been paid for billable hours?
While the medical arena has been at the forefront of almost all new developments in technology, including speech, the legal community has generally been far more conservative. Some observers have suggested that the concept of the billable hour offers lawyers a strong disincentive to adopt labor-saving devices. Many law firms rely solely on the billable hour to charge clients, a time-honored tradition that impacts direct labor costs. But when attorneys use computers and efficiency increases, the time invested in a case decreases, and so does the number of billable hours. Originally the billable hour was designed to allocate resources efficiently, since the time spent on the case was reflected in the bill. But such billing is based on time rather than on efficiency and quality, and as a result the legal profession has been far more reluctant than the medical profession to adopt new technology. Some law firms have been known to hold back on automation, and even to use manual research methods intentionally to increase revenues.
While there is no question that speech recognition can eventually become a part of the everyday life of any professional who dictates, it is also evident that when one profession has a legacy of technological innovation and another has a strong disincentive to change, a wide gap in adoption rates is to be expected.
Marcus Osborne is a researcher and contributing writer for Speech Technology Magazine.