<body>

Thursday, June 29, 2006

Cool multimodal video in Google's top 100 List


Here's a fantastic multimodal demo on a tabletop display in which the user combines speech with two-handed interaction to control Google Earth and Warcraft III. I've seen Bill Gates show some tabletop interaction scenarios at a keynote (CES?) sometime in the recent past, but this blows it away as a compelling demo.

The demo is based on the research of Edward Tse, a CS Ph.D. student at the University of Calgary. There is similar work being done at MIT.

You can watch the demo at Google Video.

Wednesday, June 14, 2006

Yet Another Voice Search Patent

Yesterday V-ENABLE issued a press release announcing a mobile voice search patent granted by the USPTO. spkydog hasn't yet had time to peruse the content of the patent, but it isn't clear that this is the first such patent in the mobile space, as the press release claims. Google's recent voice search patent teaches a variety of client devices, including PDAs and telephones.

V-ENABLE's press release.

Tuesday, June 13, 2006

IBM Hires 100+ Speech Researchers?

We've recently posted and speculated about the fact that Yahoo and Google have hired a number of speech folks over the past year or so. This is the first I've heard about IBM beefing up big time in the speech area, at least recently. Not sure who the new hires might be, though we are aware of some who have left IBM recently. Hmmm...

The success of some of these limited-application voice recognition systems has recently prompted the big software heavyweights, Microsoft and IBM, to make further investments. IBM has hired more than a hundred extra speech technology researchers, with the aim of developing a system capable of matching the human level of speech recognition by 2010. And Bill Gates recently said that "we [Microsoft] aim to have computer systems capable of matching a human level of speech recognition by 2011"


Read the ZDNet article this quote was taken from.

Monday, June 12, 2006

Speech/VoiceXML Merger Mania - Where's Google?

The Cisco announcement(s) last week add to the growing list of mergers/acquisitions involving vendors in the speech/VoiceXML industry. Here are the ones I can recall at the moment:

  • Cisco acquires Metreos and Audium
  • Genesys (Alcatel) acquires VoiceGenie Technologies
  • Cantata merges with Excel and Brooktrout, which had acquired Snowshore earlier
  • Voxeo acquires Vocomo
  • HP acquires PipeBeach
  • Microsoft acquires Unveil and more recently picked up Vocalocity's VoiceXML technology
  • Genesys acquires Telera
  • Scansoft acquires Nuance and many other firms.

I'm sure I'm missing some here, but absent from the list are Google and Tellme. Perhaps it's time to resurrect the rumor of Google acquiring Tellme? :-)

Wednesday, June 07, 2006

Burger King Uses Speech Recognition to Take Drive-Thru Orders

Here's a fun one for y'all... after firing their drive-thru attendant and attempting to outsource the job of taking burger orders to a call center in India, this Burger King finally decides to give speech recognition a try.

Watch the video.

Monday, June 05, 2006

Prof. Hawking Reads Business Week?

Here is a rather nice example of how the increasing degree of interactivity on the web makes rather boring, redundant content a bit more interesting. In this recent Business Week technology column on speech recognition, the author regurgitates the same technology summary/predictions journalists have been writing for at least a decade now. The reader comments go on to add more interesting content/analysis than the original article itself. For example, if you think the guy in the cube next door who is constantly on a conference call annoys you, just wait until you hear the dull roar across the cubicles once speech recognition gets as good as folks are predicting it will! Hopefully by then all of us technologists will be telecommuting!

One of the more interesting comments is supposedly supplied by Prof. Hawking himself. Not sure if that's the case, but the point made (conventional ASR does a pretty good job of recognizing conventional TTS) is actually something I've confirmed independently. Unlike Hawking's use of TTS, the typical use case for most of us is to automate testing of speech systems/applications.
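For the curious, here's a rough sketch in Python of what such a TTS-driven test loop might look like. It is only an illustration under stated assumptions: `synthesize` and `recognize` are hypothetical placeholders for whatever TTS and ASR engines you happen to have (they are not any vendor's API), and the pass/fail threshold `max_wer` is an arbitrary choice. The only standard piece is word error rate, computed as word-level edit distance divided by the number of reference words.

```python
from typing import Callable, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def run_tts_asr_test(
    prompts: List[str],
    synthesize: Callable[[str], bytes],   # hypothetical: text -> audio
    recognize: Callable[[bytes], str],    # hypothetical: audio -> text
    max_wer: float = 0.0,                 # assumed threshold, tune to taste
) -> List[Tuple[str, str, float]]:
    """Return (prompt, recognized text, WER) for prompts that exceed max_wer."""
    failures = []
    for prompt in prompts:
        heard = recognize(synthesize(prompt))
        wer = word_error_rate(prompt, heard)
        if wer > max_wer:
            failures.append((prompt, heard, wer))
    return failures
```

Plug in real engines for the two callables and feed it the prompts your application expects to hear; anything that comes back in the failure list is worth a listen.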

Read the article.