Speechmatics Solves for Entity Formatting
Speechmatics, a speech recognition technology provider, has added Entity Formatting functionality to its Autonomous Speech Recognition (ASR) software.
Using inverse text normalization (ITN), the software tackles one of machine learning's biggest challenges: interpreting how entities such as numbers, currencies, percentages, addresses, dates, and times should appear in written form.
According to the company, entity formatting is notoriously challenging in speech recognition because the way entities are spoken in conversation varies, even between countries that share a language. Telephone numbers are a good example: people might say 'oh' instead of 'zero' or group repeated digits, as in 'triple three.'
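To make the telephone-number case concrete, here is a minimal Python sketch of rule-based inverse text normalization for spoken digit strings. It is an illustration only, not Speechmatics' implementation: the function name spoken_digits_to_text and the two lookup tables are hypothetical, and a production system would also need context to decide when a run of words is a phone number at all.

```python
# Hypothetical lookup tables for illustration; not taken from any Speechmatics API.
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
REPEATS = {"double": 2, "triple": 3}


def spoken_digits_to_text(phrase: str) -> str:
    """Convert a spoken digit sequence into its written digit string."""
    written = []
    repeat = 1  # how many times the next digit should be emitted
    for token in phrase.lower().split():
        if token in REPEATS:
            repeat = REPEATS[token]
        elif token in DIGITS:
            written.append(DIGITS[token] * repeat)
            repeat = 1
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    if repeat != 1:
        # A trailing 'double' or 'triple' has no digit to apply to.
        raise ValueError("repeat word not followed by a digit")
    return "".join(written)


print(spoken_digits_to_text("oh two oh triple three five"))  # 0203335
```

Both 'oh' and 'zero' map to the same written digit, and 'double' or 'triple' simply repeats the digit that follows. Regional conventions beyond these two, and the grouping or punctuation of the final number, are left out of this sketch.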
"Creating a more professional transcript will speed up our customers' workflows by making large numbers easier to read, requiring less human editing," said Katy Wigdahl, CEO of Speechmatics, in a statement. "Context is also critical. There are so many nuances and ambiguities that need to be accounted for in language, such as whether 'pounds' is a reference to weight or currency? And whether 'venti' is being used as the Italian word for 20 or winds?"
Depending on the standardization the customer selects in advance, numbers can appear in a transcript in either written or spoken form.
"This new functionality in our breakthrough Autonomous Speech Recognition will have a decisive impact on our customers working in numerically intensive industries," Wigdahl continued. "Entity formatting has always been a notoriously challenging task for speech recognition, but with this latest update we are delivering best-in-market functionality and bringing significant value to our customers operating in industries where getting numbers right for speech-to-text tasks is mission-critical."