Speech Technology Magazine

Banking on Multimodality

For financial transactions, gains are seen but challenges remain.
By Matt Yuschik - Posted Nov 10, 2012

The smartphone is evolving at a rapid pace. The iPhone 5 and the Galaxy S3 are pushing the limits of phone-based technology with new features, such as near field communication (NFC), speech recognition, and voice authentication, and with more appealing user interfaces that offer Swype input, higher resolution, richer colors, and larger screens. Some capabilities require accessing the native code of the device, while others leverage hybrid solutions with parts of the application accessing cloud-based capabilities. Apple's iOS and Google's Android, the platforms with dominant market share, take advantage of these capabilities to attract new smartphone customers.

Needs assessments of mobile services show strong interest in entertainment, location-based searches, news and sports updates, social networking (including email), and transaction-based services, such as mobile banking. Some of these services are already on the Web, with a graphical user interface (GUI) defined for smartphone access. Others leverage a voice user interface (VUI), for example, for call routing in call centers. So, how should a GUI and a VUI be combined to create a superior multimodal user interface (MMUI) for specific transactions? And what is the advantage of doing so? A smartphone is a technology-rich device that shows how such a UI is rendered, and a reference application, such as mobile banking, helps identify the multimodal strengths and challenges.

Multimodal mobile banking satisfies on-the-go customers who want to use their phones to complete banking transactions anytime, anywhere. Value drivers include customer satisfaction, retention, Net Promoter Score, cross-sell opportunities, and faster time to market. Toolkits enable rapid prototyping to address flexible transaction flow, presentation, user testing, ease of use, and error handling. (See a toolkit compendium on the AVIOS.org site, compiled by James Larson.) The Nuance Nina framework provides a starter kit for developing iOS or Android mobile applications, and includes a multimodal mobile banking example. Applications built with browser-based technologies (HTML5, jQuery Mobile, JavaScript) can use Adobe's PhoneGap or Voxeo's Tropo to access native features for multimodal applications.

A major MMUI challenge is the expectation of consistency with prior GUI or VUI application interfaces, which are often constrained by the platform that supports them, as when banking capabilities are rendered on an ATM, smartphone, tablet, or the Web. This issue is amplified when designs are narrowly focused. Service packages should be viewed on a continuum. Devices or modalities should seem like different "skins" for different yet consistent sets of extensible features. Consistency facilitates learning transfer, so interacting with an application on one device makes it easier to interact with a similar application on another device, rather than the user having to learn a different UI paradigm.

Designing an MMUI for a service, its transactions, tasks, and operations is both an art and a science. The value of an MMUI is that it provides choices for interacting with the service. One modality may be preferred for a given security situation (speaker verification versus facial recognition), another for a specific task (entering numbers versus screen navigation). Modalities can work together (user authentication) or be complementary (reprompting for misrecognition, mistyping, or chat). Adding more features increases complexity and cognitive load, but an MMUI allows the creation of a more flexible mental model. Generally, speech is considered serial (selecting options from a menu), while a visual display presents all choices simultaneously. However, an MMUI enables a voice shortcut that lets a user jump ahead (through multiple screens), or lets one utterance fill in multiple fields of a form. A cross-channel banking example starts with a spoken request on a smartphone to display account balances, continues with depositing a check using the phone's camera, and ends with prestaging a cash withdrawal with NFC at an ATM. Modality fusion makes the overall transaction easier and more satisfying.
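
The idea of one utterance filling in multiple fields of a form can be illustrated with a minimal sketch. It assumes a speech recognizer has already turned the request into text, and it handles only one kind of request, a funds transfer. The pattern, the function name fill_transfer_form, and the field names are hypothetical and chosen for illustration; they are not taken from any toolkit mentioned here.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern for a single request type. A production system would
# rely on a recognizer grammar or a language-understanding service instead.
TRANSFER_PATTERN = re.compile(
    r"transfer\s+\$?(?P<amount>\d+(?:\.\d{1,2})?)\s*(?:dollars)?\s+"
    r"from\s+(?P<source>\w+)\s+to\s+(?P<target>\w+)",
    re.IGNORECASE,
)


def fill_transfer_form(utterance: str) -> Optional[Dict[str, str]]:
    """Map one recognized utterance onto three form fields.

    Returns None when the utterance does not match, so the application
    can fall back to the on-screen form and ask for each field in turn.
    """
    match = TRANSFER_PATTERN.search(utterance)
    if match is None:
        return None
    return {
        "amount": match.group("amount"),
        "from_account": match.group("source").lower(),
        "to_account": match.group("target").lower(),
    }


if __name__ == "__main__":
    print(fill_transfer_form("Transfer 200 dollars from checking to savings"))
    print(fill_transfer_form("What is my balance"))
```

Returning None on a miss, rather than guessing, keeps the visual form available as the complementary modality: the user can type or tap whatever the spoken request did not supply.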

Transaction-based services (like voice banking) are ubiquitous. However, progress toward a mobile wallet depends on ongoing testing of MMUI prototypes to identify strengths, mitigate confusion, and seamlessly direct the user back onto the success path. The future of multimodal banking with a smartphone is limited only by the energy and vision of the designers and the banking institutions.


Matthew Yuschik is a mobile solutions architect for Citibank's R&D group, where he designs and tests the Multimodal Mobile Banking UX/UI. He holds eight patents and is on the AVIOS Board of Directors.

