David Gleaves, Business Development Manager, 20/20 Speech
Q Recent legislation in the UK requires that all broadcast material be subtitled by 2009. Why are subtitles required, and how are these goals being met?
A Subtitles are displayed on screen to aid the deaf and hard of hearing. Creating them is very time-consuming: it can take between 16 and 25 hours to subtitle one hour's worth of TV output. Recent legislation requires broadcasters to subtitle increasing amounts of program material; in the UK alone, all broadcast material must be subtitled by 2009. To meet these regulatory targets without a commensurate increase in cost, broadcasters are looking for innovative technology to speed up and automate the creation of subtitles. This is why 20/20 Speech developed a speech recognition system that uses the script and the audio files from broadcast material to place subtitles at the appropriate point in the video, ready for final editing prior to transmission. The system has gone live and is being used by the BBC, which is gaining considerable time savings.
Q What will be the key market drivers for speech technology in the UK during the next few years?
A We see connected mobile computing, combined with the increasing number of mobile workers equipped with these powerful devices, as a key driver for embedded speech technology. More and more workers and consumers are using a computer of some sort while on the move (in a car, for example), away from the traditional office or home environment. We believe that a speech interface will greatly enhance the way in which these people interact with their devices and, in particular, gain access to information in 'eyes busy, hands busy' situations, a common occurrence for the mobile worker. With a speech interface, the mobile worker can keep up to date with email, log workflow, and respond to other applications in a way that existing user interfaces do not allow.
Q What are some of the obstacles in embedding speech technology in devices such as PDAs?
A The main obstacle 20/20 Speech has come across is that the devices we work with are not well equipped for speech input or output. We have done a lot of work on innovative solutions to the problems associated with headsets and/or inbuilt microphones. Our recent work in the medical sector, where we have speech-enabled mobile devices for paramedics (see website for more information), has shown that we are well equipped to overcome these issues and offer real solutions to the marketplace.
Q What impact will mobile computing have on the workforce in the UK over the next five years?
A Something that we are excited about is the launch of packet-based data networks such as GPRS (and eventually 3G). As it becomes easier to connect a mobile device to server-based applications, and as data transfer becomes faster, more reliable and ultimately cheaper, we will see a whole host of new applications that will improve the productivity and efficiency of the mobile workforce. Speech technology, as part of a multimodal user interface, will play a big part in ensuring these productivity gains are realized.
Q What are the areas in which speech technology will have an impact on this trend?
A In particular, messaging applications and access to back office data are areas where speech technology will be vital for mobile workers. At a more general level, a speech interface will give the task-oriented workforce access to their systems and applications in situations where that was not previously possible.
Q What vertical market segments do you see as having the most interest in utilizing speech technology in the short term? Long term?
A The medical sector has been very innovative in the uptake of speech technology, and we are seeing a lot of interest in this marketplace. On a longer timescale, we will see adoption of the technology across a whole host of vertical segments, such as field engineering, couriers and logistics. This will be driven primarily by business applications, but will slowly move towards a consumer focus as the technology gains wider acceptance.
Q Tell us about 20/20 Speech's history.
A 20/20 Speech is a joint venture between QinetiQ (formerly DERA, the Defence Evaluation and Research Agency) and NXT Plc, the licensors of flat-panel loudspeaker technology. 20/20 Speech inherited over 30 years of experience, technology and know-how from the Speech Research Unit within DERA. We now have proprietary speech recognition and text-to-speech synthesis products available and in use.
Q What work are you doing with the government?
A Much of the work we do for the government is for the UK Ministry of Defence and is therefore confidential. A large part of it concerns speech recognition in high-noise environments such as fighter jet cockpits and land-based military vehicles such as tanks.
Q What makes 20/20 Speech unique among its competition?
A Due to our long history of working for the MoD, we have vast experience in deploying speech technology in a wide range of applications and in a number of different noisy environments. We have taken the expertise gained from these military projects and used it to shape our commercial offerings. In this way, we can offer speech technology that works in real applications and in real environments.