July 15, 2008
By Moshe Yudkowsky, President, Disaggregate
Industry View

Lessons from the Blogosphere

Microblogging service Twitter is a perfect example of "I-don’t-get-it" technology. Twitter lets you blog—but only 140 characters at a time. The very first and obvious question is: What good is it? Can’t I just compose a short entry on a regular blog?

Actually, when blogs started, I didn’t get them either. After all, a blog is just a Web page, so if I want to update my Web page, I can simply go into the content management system and do it. But after using blogging software for a while, I began to understand that blogs are, above all else, simple. I don’t need to know how to edit a Web page to create a blog entry; I just need to fill out a form.

Twitter lowers the barrier even further. With Twitter, I don’t even have to fire up a Web browser. I can microblog from my cell phone or a small window that sits on my computer’s desktop. Instead of a complete blog entry, I can type a dozen or so words. Even more simply, I don’t have to visit a Web site to read Twitter. Updates stream right to my desktop.

Blogs and microblogs also provide something far more intriguing: the disaggregation of information from a Web page. Blog entries appear on Web pages, but they’re far more than text on a page. Entries are a collection of text, comments, graphics, and incoming and outgoing links. Users have access to them through Web pages and RSS feeds. An entire ecology of Web sites has grown up around them: sites such as Del.icio.us index the entries, while Technorati and Digg rank and search them. Indeed, blogging has evolved into a new way to encapsulate, enhance, index, and share information.

Microblogging, however, takes blog entries and transforms them into something else. Twitter discards comments and tagging in favor of immediacy: short, easy-to-read messages; the ability to instantly send and receive messages; and ubiquitous access from cell phones, instant messaging clients, and specialized software. Twitter also offers a rich application programming interface that lets it communicate with people and software agents. There are literally dozens of mashups and software agents based on Twitter—some silly, some interesting, and some very innovative.

How do the lessons of blogs and Twitter apply to speech technology? I’ve developed a double handful of ideas, but I’ll choose only one lesson from Twitter and another from blogs to complete this column.

Twitter is all about immediacy, and speech interfaces rarely work with any immediacy. They’re interactive, of course, but that’s far from being immediate.

From a user standpoint, speech recognition is boring and tedious, with an endless stream of questions and answers. There’s a world of difference between saying "Make sure George gets paid this week" over your shoulder to the accountant in the next cubicle and using even the best speech-based application as you answer a long series of questions about services, balances, account numbers, and secret passwords. We need to think of new ways to remove the barriers to rapid interactions.

More Questions Than Answers

From blogs, another major lesson—or rather, a series of questions—comes to mind: We think of blogs as a collection of information rather than something stuck on a Web page. Today people announce podcasts through blogs, while the podcasts themselves are data afterthoughts, treated just like the accompanying graphics. Certainly we use speech recognition to search for keywords and concepts within the podcasts, but that’s just one part of what speech technologies can do. What about associating a podcast with the feedback, comments, trackbacks, tags, and other information that blogs take for granted? Can we accept feedback from listeners in audio format, or allow them to edit a podcast as if it were a wiki? What’s the right way to index speech and provide access to it to create new applications that we never thought about before? How can the podcast itself encapsulate important information that can be shared, indexed, tracked, and publicized just like a blog entry?

In this case, I have the questions, if not the answers, but I also have a final observation. I find it interesting, and somewhat scary, that while audio was cheap and easy to store on the Internet well before video was, no one bothered to create a YouTube-type equivalent for audio files. Video may be intrinsically more interesting than audio, but I doubt that’s truly the reason that audio languishes while video marches on.

Moshe Yudkowsky, Ph.D., is president of Disaggregate Consulting and author of The Pebble and the Avalanche: How Taking Things Apart Creates Revolutions. He can be reached at speech@pobox.com.
