
Jeanne Gokcen, President, FutureCom Technologies

Q FutureCom Technologies is a relative newcomer to the market. Tell us a little about FutureCom and what strengths your company brings.
A FutureCom Technologies is a speech and signal processing technology development and services company. A privately-held company incorporated in 1994 and the only Certified Woman-Owned Business Enterprise in the area of speech technology, we build software and offer services that span various elements of voice processing and speech recognition and synthesis. We offer our own speech recognition, text-to-speech and speaker verification engines, and all of these speech technology features and more have been integrated into our CommUnify Integrated Communications Platform (ICP). In addition, we have been providing voice prompt recording, digitizing and editing services since 1997.
FutureCom Technologies' strength is the expertise and experience of our principals and staff. While FutureCom Technologies' products may be relatively new, our staff members have been involved in this type of technology and application development for over twenty years. Sedat Gokcen, FutureCom CTO, worked at AT&T/Lucent Bell Laboratories in the speech technology development group for over 15 years and Roy Grubbe, FutureCom Director of Technical Operations, worked there for over twenty years. Both of these principals were part of the original ten-member team that developed the Conversant® System IVR platform. They also were the key developers of speech recognition, text-to-speech synthesis, and fax for Bell Laboratories.
Each has patents in the area of speech and signal processing technology. Mr. Gokcen holds the patents for "barge-in" and for "Flexword" (sub-word model based) speech recognition. Mr. Grubbe holds patents for speaker verification and "barge-in." Mr. Gokcen and Mr. Grubbe provided key technology and prototypes enabling the development of the first successful IVR speech recognition application at AT&T, automating the zero "plus" calls (collect, calling card, third person, etc.).
This application continues to be the single largest successful working application, handling up to 4 billion calls per year.

Q Tell us about FutureCom's CommUnify Integrated Communications Platform.
A The CommUnify Integrated Communications Platform (ICP), launched in 2002, is a software-based, speech technology-integrated IVR platform. This platform has an IP-based, client/server architecture. It offers full telephony control such as DTMF and call progress detection. Other signal processing functions include echo cancellation, TDD and fax. All of these functions are provided in software, so the only hardware dependency is the telephony board that serves as the data source. Should the data come through a source other than telephony hardware, e.g., the internet (VoIP), our system can still provide all of the telephony controls and signal processing functions.
Our speech recognition engine, CommUniHear, is a speaker-independent, grammar-based system. It offers unique features such as multi-lingual recognition on a single engine, simultaneous use of whole word and subword models for improved recognition accuracy, and input media-independence (i.e., microphone, telephone, speaker phone and cell phone input can be managed on one system). Our text-to-speech engine, CommUniSay, utilizes the popular professional voices that are standard in IVR systems, enabling a smooth transition between recorded prompts and synthesized portions of an application, resulting in a better customer experience with great naturalness and improved intelligibility.
CommUnify ICP is a ready-to-run, all-in-one solution with the lowest entry price per port in the market, yet it offers linear, high scalability. FutureCom's ROI and quality are unbeatable in the industry. Our savings come largely from the hardware-independent architecture. We use readily-available hardware that is powerful yet inexpensive.

Q Would you describe your voice prompt recording and other services?
A We have been providing full-service voice prompt recording for IVR applications since 1997. We have developed our own digitizing and editing tool, Visual Speech™, that streamlines the process and improves the voice output quality. We offer translation, culturalization, recording, digitizing, editing, formatting and prompt archiving services with a quick turnaround time.
With Visual Speech, we can edit out undesirable noises (e.g., breaths, mouth noises) and long pauses not only at the ends of recorded phrases, but within the phrases as well. The endpoints are tapered and have consistent silence pads so that there is a smooth transition between concatenated phrases, with no audible clicks or noticeable pauses. We can automatically match the output volume of phrases so that a consistent volume level is ensured throughout an application, even if the phrases were recorded at different times. Some IVR systems have gone through multiple versions, with variable characteristics that result in varying voice quality output. We can provide filters to match the quality of new recordings to old recordings from earlier versions of systems. Output format for any IVR system can be provided.
A unique service that we offer is the ability to speed up recorded prompts, typically by 15% to 20%, with no loss in intelligibility. For many customers, this option can result in significant savings in network line and 800 service charges without having to re-record any prompts.
One very exciting offering we have, as mentioned previously, is that our TTS engine utilizes the standard "golden" voice talents used in many IVR systems, so we can provide a TTS voice that matches the recorded voice prompts. Not having stark differences between the recorded and synthesized voices used in a single application is such an improvement for customer experience and company image!
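The endpoint tapering, silence padding and volume matching described above can be illustrated with a minimal sketch. This is not how Visual Speech works internally, which the interview does not describe. The function names (`rms`, `match_volume`, `taper`, `concatenate`), the use of RMS as the loudness measure, the linear fade, and the default lengths (which assume 8 kHz telephony audio) are all assumptions chosen for illustration. Samples are plain floats in the range -1.0 to 1.0.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_volume(samples, target_rms):
    """Scale a phrase so its RMS level equals target_rms (assumed loudness measure)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # pure silence cannot be scaled to a target level
    gain = target_rms / current
    # Clamp to the valid range so a large gain cannot exceed full scale.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def taper(samples, fade_len):
    """Apply a linear fade-in and fade-out over fade_len samples to avoid clicks."""
    out = list(samples)
    n = len(out)
    fade_len = min(fade_len, n // 2)
    for i in range(fade_len):
        w = i / fade_len  # weight ramps from 0.0 up toward 1.0
        out[i] *= w
        out[n - 1 - i] *= w
    return out


def concatenate(phrases, target_rms=0.1, fade_len=80, pad_len=160):
    """Level-match, taper, and join phrases with a fixed silence pad between them.

    Defaults are hypothetical: 80 samples is 10 ms and 160 samples is 20 ms at 8 kHz.
    """
    joined = []
    for k, phrase in enumerate(phrases):
        if k > 0:
            joined.extend([0.0] * pad_len)  # consistent silence pad
        joined.extend(taper(match_volume(phrase, target_rms), fade_len))
    return joined
```

Calling `concatenate([greeting, menu])` with two phrases recorded at different levels would yield a single sample list in which both phrases sit at the same RMS level and each begins and ends at zero amplitude. A production tool would likely use a perceptual loudness measure rather than plain RMS.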
In 2001, Avaya sanctioned FutureCom Technologies as the provider of speech services for itself and its Conversant® System customers. We have a base of 100 customers that utilize our speech services, including AT&T, Qwest, Convergys and Discover Financial Services.

Q How should an enterprise evaluate its needs for using speech technology, and are there any vertical markets for which there is a compelling need to implement this technology?
A Speech technology has evolved to a level where we feel that appropriately-used, speech technology-enabled applications are a must today for almost every enterprise, so the question should no longer be "if" but rather "when." If companies have already justified using an IVR system, then integrating speech technology should be a relatively incremental cost, provided that the speech technology is used to its strengths. The key phrase is "appropriately-used," and this issue will be addressed more specifically elsewhere in this Q&A session. Speech is clearly the natural and more desirable communication interface. Any IVR system with a DTMF-based interface can and should utilize speech recognition and synthesis. A well-designed application that utilizes a high quality speech technology system will achieve higher customer satisfaction, fewer abandoned calls, better use of CSRs' time, and significant savings.

Q What are the important issues an enterprise should consider when deploying speech?
A As we mentioned above, we feel that speech technology is advantageous for most companies. This feeling seems to be increasingly pervasive not only among individuals in the industry, but users as well. However, some issues remain that affect deployment of speech applications, and customers need to be provided with information that helps them to make the right decisions when it comes to making the choice to utilize an application with speech technology:
1. The industry is presenting some conflicting and confusing information to customers about speech technology; customers are left to discern the reality of the capabilities and best uses of the technology from the commercial "hype."
2. Customers are still tending to look at the speech technologies as an option to add later. They are still clinging to their legacy IVR system, not realizing the significant expense they are incurring by not looking at a total solution involving DTMF detection and speech recognition, and perhaps speech synthesis.
3. Customers are trying to add too many features as a "bolt-on." This approach increases the complexity and the expense of the solution, while at the same time reducing the reliability of the system.
Our primary suggestion for addressing these issues would be to keep it simple!
·Consider current needs and anticipate future needs and choose a system/company that can gracefully and cost effectively manage the upgrades, changes and expansion.
·Understand the true capabilities of today's speech technology and what the actual steps and costs involved are to achieve a successful deployment.
·Utilize a high quality speech technology system.
·Look for a system that has features that are well integrated into the system.
·Look for a software solution with minimum hardware support for the technology.
·Applying the technology today is still something of an art that requires in-depth understanding to match the technology to your needs, so choose a company for having the right people to accomplish your goals, not for its size.
·Start with an application that is simple and technology appropriate to ensure that the implementation cost and the support following implementation will be most cost effective and not more expensive than anticipated.
·Invest in a well-designed application, testing the application with a small user group and working out any issues before full deployment. This also serves to bring the application in-house to evaluate and understand how to utilize it further.
·Look for total integrated solutions so as not to get caught in an "integration no man's land," where there is no one who can take responsibility for all of the parts.
·Additionally, customers must clearly understand the separation of technology from programming environments. For example, when it comes to the new programming environments being promoted, such as VoiceXML, customers too often have the mistaken impression that the programming environment is equivalent to the speech technology, that VoiceXML itself has the capability of accomplishing speech recognition. Also, customers are led to believe that this environment allows standard programming across platforms, when in fact each platform has its own caveats that require special programming to enable the application to work.

Q Provide us with your thoughts on the various standards that are being implemented and discussed.
A We support any standard that is being requested by our customers. Having said that, within the last couple of decades many so-called "standards" have been introduced, with much effort put forth to establish them, but they never actually reached the level of a true standard. This is mostly because the standards were premature, or were based on "fad" features that were not proven or established and lost their popularity after some time, or were pushed by technologists and never embraced by customers. We look forward to emerging standards that will succeed as a result of greater "customer pull," as opposed to "technologist push."

Q How do you provide confidence to your customers and partners that FutureCom has the ability to support their technology needs?
A Our strength is our people and the technology they create. They have been leaders and contributors in the industry for over 20 years, including as part of larger, influential companies in the area of speech technology. FutureCom is pleased to have them as part of our team, to present their continuing work and to utilize their experience and expertise in both technology and application development. One unique aspect of our approach is that we offer customers direct access to this expertise, rather than making them wade through multiple layers. Our staff has the experience of developing state-of-the-art, carrier-class technology and speech technology applications. We are absolutely confident in our ability to provide our customers and partners with unparalleled technology and technical support.

Q Where do you expect FutureCom Technologies to be in three to five years?
A We expect FutureCom to continue to grow in a strategic manner and become more of a presence in the industry. With our continued value, we expect to be the technology provider of choice for customers and technology partners. We will continue to expand our CommUnify ICP capabilities to move toward our goal of developing a system that will "talk to everything…listen to everything." With our next generation speech recognition engine under development, we expect to revolutionize speech recognition technology in a manner that is very different from that which exists today. With this new approach, we expect that a number of the barriers to the broad use of speech technology that exist today will be removed, enabling speech technology usage to reach new heights.

Q What are some issues that will impact the deployment of speech applications over the next few years and how should the industry address those issues?
A Certainly the stabilization and acceptance of a true standard such as VoiceXML or SALT will impact speech application deployments. Having an agreed-upon standard programming environment for speech technology is a worthy and desirable goal. However, because these standards are still proposals and still evolving, the industry should view them as they truly are: proposed standards, not industry standards. Over the past 20 years, a number of standards have been proposed and strongly advocated for, but none of them has ever progressed to become a true standard. Therefore, the industry should be cautious in embracing a standard before it becomes one. Experience tells us that such premature acceptance can be costly in a number of respects if applications have to be re-engineered later because the proposed standard fizzles out.
A major issue is "technology appropriateness." Especially with the enthusiasm about "natural language" (i.e., grammar-based) recognition, companies need to understand the true capabilities of the current speech recognition technology, and it is the responsibility of the industry to provide accurate information. There is a clear ROI justification for incorporating speech technology, but we see that the expectations and technology capabilities being put forth by some speech technology companies are frequently pushed to a level of inappropriateness. While there are some excellent applications that are currently in use, too often customers are being led to believe that, at an attractive price, speech technology will work in their application almost as a "plug and play" solution. If used toward its strengths, indeed speech technology can be a ready solution.
However, when the capabilities of the technology are pushed to and even beyond its true limits, as seems to be the case with some "natural language" recognition applications, the application becomes custom-level work, and successful applications typically must be additionally "tuned." The additional cost and time necessary for this tuning are sometimes not initially obvious to customers, leaving companies disappointed and/or disillusioned with the technology because of reduced perceived capabilities, unexpected costs and increased time-to-delivery. This approach to selling speech technology takes the industry a step backward in terms of companies' willingness to put speech technology in place, and we are left doing damage control and again trying to convince customers of the worthiness of the technology.


Dr. Gokcen can be contacted at 614-478-1978, Say "Jeanne", or by email at jmg@futurecti.com, or at http://www.futurecti.com.
