Speech Translation Requires an Interface Revolution
The interface demands of speech translation are wildly different from those of any previous software. Customers want a versatile interface design that has never existed before, but one that is fast becoming the expectation. In short, speech translation requires a revolution in interface conceptualization, planning, and execution.
The key to understanding this change is a review of the state of normal business communications. When we communicate, we use the telephone, web conferencing (dozens of options with web audio or dial-in), private branch exchange (PBX), Skype, voice over internet protocol (VoIP), and session initiation protocol (SIP). Next, consider the range of scenarios in which normal communication takes place: face-to-face, one-on-one phone calls, large conferences, smaller meetings, roundtables, etc. To these variables, add the need to communicate across various devices.
The result is that over many decades we have created numerous ways to communicate in the same language; to use these communication options, the participants agree in advance on how to communicate, then they meet at that space to talk or type.
One Single Solution
Now, suddenly, comes voice translation. Are we going to do with voice translation what we did with communication methodologies—create dozens of applications, one for each scenario? That is not what customers are demanding. Customers want a single voice translation solution that covers every possible communication methodology—methodologies that have taken decades to create and grow.
Customers want one versatile interface covering every pain point and language situation. They want multiple experience choices per scenario: Not one solution per scenario, but several from which they can pick and choose the most appropriate for their audience and budget. This requirement for a mass of options per scenario is overwhelming and new.
Conferences and Meeting Interfaces
As speech translation explodes into the conference hall, event organizers are demanding many more options to receive and share language translation: interpreter voices arriving via internet to smartphones or spread via private Wi-Fi and access points; subtitles displayed on screens, smartphones, or monitors scattered around the room. Each option needs a different interface functionality.
How the voice to be translated is sent to the cloud for translation is yet another interface. The content and display of voice translation that organizers want range from transcription (captions) to automatic speech-to-speech-and-subtitle translation to live remote interpretation. Voice translations must be derivable from the voice of the speaker, from a parrot (who repeats the speaker’s words more clearly), or from an interpreter. A client’s potential need for dozens of interpreters must be accommodated, as well as dozens of translation languages.
And organizers are adding yet another new request: interactivity across languages. Attendees must be able to interact with event speakers, moderators, registration, and staff in their native tongue. Again, the request is not for one single way to interact, but many: automatically translated speech-to-subtitles, translated speech-to-speech, private translated text chat exchanges, and so on.
Today’s enterprises want one single speech translation solution that can function across communication methodologies. Adding to the pressure are demands for highly accurate automatic translation.
Customer Service Interfaces
Customer service is also being shaped by these new demands. The service industry now requires multi-language interfaces covering speech-to-speech and speech-to-chat as well as multimedia features such as video, presentations, screen sharing, and others. Some companies are making their APIs available for incorporating into customer service apps and bots. Using these APIs and displaying these new translations will affect the interface designer not only with regard to the space required to display translations but also in handling translated audio output via web, PBX, and digital audio.
Meeting the translation usability demands for enterprise customers will require a multitude of interface layouts and configurations. What companies are asking for is a developer’s nightmare: one application that encompasses every communication methodology a company possesses, capable of translating every written language and spoken dialect known to man.
Oh, and they need it now. Somehow, developers will have to deliver.
Sue Reager is president of Translate Your World, developers of software for across-language speech communication. She can be reached at firstname.lastname@example.org.