Speech Technology Magazine

 

Gordon Renton of Speegle

Gordon Renton of Speegle explains that search is a fundamental tool, but at present users must change focus to use it. Under the pressure of time constraints, an alternative could save a lot of time, and speech is part of that alternative.
Posted Feb 1, 2005

Q. What is Speegle?
A. Speegle is a server-side speech application applied to a search engine. It uses a search engine backend to find results and reads them out loud as a summary, in several different voices chosen to suit the accent the user is likely to find familiar. When users hear a result that interests them, they press the number of that result and are redirected to the corresponding page.
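The flow described above (numbered summaries read aloud, then a number key to jump to a result) can be sketched roughly as follows. This is an illustrative sketch only, not Speegle's actual code: the `SearchResult` type, the function names, and the sample URLs are all hypothetical, and the speech synthesis step itself is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchResult:
    """Hypothetical container for one search hit."""
    title: str
    summary: str
    url: str


def build_spoken_script(results: List[SearchResult]) -> List[str]:
    """Return one numbered line per result, ready to hand to a speech engine."""
    return [
        f"Result {number}. {result.title}. {result.summary}"
        for number, result in enumerate(results, start=1)
    ]


def resolve_key_press(results: List[SearchResult], key: str) -> Optional[str]:
    """Map a pressed number key to the URL of that result, or None if invalid."""
    try:
        index = int(key) - 1
    except ValueError:
        return None  # not a number key
    if 0 <= index < len(results):
        return results[index].url
    return None  # number outside the list of results


# Made-up sample data for illustration.
sample_results = [
    SearchResult("Example A", "First summary.", "https://example.com/a"),
    SearchResult("Example B", "Second summary.", "https://example.com/b"),
]

for line in build_spoken_script(sample_results):
    print(line)
print(resolve_key_press(sample_results, "2"))
```

Numbering the spoken results from 1 keeps the key the user presses identical to the number they heard.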

Q. What is the benefit of using speech on a search engine?
A. Traditional screen readers, which are familiar to many users, are not designed specifically for searches. Search is almost the first point of contact with the Web for the majority of users, and from there they continue finding pages of interest. For many people who have impaired vision or other medical or cognitive problems, speech on a search engine helps them quickly find pages of interest. Because it is universal, it can be used by people who normally have screen reading software installed but are away from their own computers, for instance in an Internet café, and best of all it's free. We have developed several different techniques for gathering information about a Web site, so although we display results, the displayed description may not be the exact one that is read aloud. We read only what we think is relevant to giving a quick summary of that site. Of course, it could also be used by anyone as an "eyes free" tool, searching for Web sites in the background while they work in another application.

Q. What are some of the advantages of using the enhanced search tools on Speegle versus the regular Google search?
A. There are a number of enhancements we are working on to make search more like a background tool and not the direct focus of the screen real estate. Search is a fundamental tool, but at present the user needs to change focus to use it, and under the pressure of time constraints an alternative could save a lot of time. It always seemed odd to us that many applications neglected use of the keyboard. This is one reason why we have been so pleased that our keyboard redirect key has been so well received. As part of our interface it was absolutely necessary, for obvious reasons. The same key can also be used by anyone, even those who love their mouse, which I personally don't. The searches use Google's results at present, but we can adapt to other search engines.

Q. What challenges or difficulties did you face implementing speech and how did you overcome them?
A. We started seriously looking at speech in 2002. At that time all speech delivery required broadband and two or more operations, sometimes lasting several minutes, before the results could be heard. We felt that this needed to be improved and that it had to be universal for the many people who do not have fast connections. With all our resources focused on this problem, we released PanaVox in 2003, our method of delivering Internet speech over narrowband regardless of operating system or browser. It therefore worked on Apple, Linux, Solaris and Windows, and because it would work on 33k modems it was as close to universal as we could get. After the technical problems were overcome, we studied feedback from users and came to the conclusion that, for general acceptance, the quality of the voice needed to be improved, even though people using screen readers actually prefer a more robotic delivery. This preference puzzled us, but it is explained by the way listeners can "tune in" to the accent of a dialect speaker after a very short time, although at first we tend to reject a strong accent and sometimes "turn off." We needed to address this problem because, after all, it was a computer speaking, and users are far less forgiving if they know it's not a real person. If it were a real person speaking on an important topic, we would somehow concentrate a little more, begin to "tune in" and gradually become more comfortable with an unfamiliar accent. We spent a long time making changes to the prosody of the output and developed some advanced tools to make this easy and a one-time-only correction. Speech has come a long way in a very short time and continues to improve. When we released our Speak Perfect voices, we were able to tackle more broadly based applications. Speegle is the first of these applications. We have three voices in use on our technology preview, with the ability to add many more if required.

Q. What other ways do you foresee Speegle using speech moving forward?
A. Speegle achieved worldwide publicity almost immediately after the site went up and has opened up the possibility of using speech outside the narrow confines of accessibility, to which we never thought it should be restricted. Speech is far too important to be restricted, and Speegle has opened many eyes and ears to the possibilities for the future. There are compelling applications aimed at functionality which we are working on and will soon be making available. If nothing else, Speegle has exposed speech technology to several million users who had never heard of it other than via Professor Stephen Hawking. During its short period on the Web, it has spawned many contacts with other speech companies and businesses that would like to work with us, and we will definitely work with some of these companies in the near future.

Q. Are there statistics that you can give about results or user satisfaction with the speech application?
A. We can only say that the Web hits have been beyond our wildest expectations on all of our sites, and they continue to grow, with a large number being repeat visits. There has been a mix of users, with some obviously searching for prurient terms in the hope of hearing something risqué. The large number of foreign hits has encouraged us to develop different languages more quickly than we had originally planned. For many it has been a novelty, but the potential for those who can benefit more directly has been recognized. Overall, the vast majority of the feedback we have had has been very positive, with very few negative comments about the quality of the voices used. Like everything, it is a matter of taste, and we here in Scotland find the Scottish accent rather good. The many blogs will show a spread of results, and thankfully there has been little from those who don't care, so we are polarizing comments rather than being ignored, which is a much better result.

Q. Do you have any additional comments?
A. There are a number of ways we can make speech more universal and bring it out of the rather narrow accessibility function to which many people would rather confine it for the benefit of their own company or interest group. Speech is compelling, instructive, universal, and the most natural form of human communication. Speech technology is just beginning, and we intend to show exactly how it can benefit everyone in the future. With improvements in speech technology, it won't be long before it is very difficult to tell the difference between a synthesized voice and a human voice. The groundwork we have already covered will position us very well to take advantage of this universal tool. As Speegle's marketing says, "It's obvious."
