<body>

Wednesday, August 30, 2006

Microsoft discontinues...folds... integrates... unveils, and is right on...

For those of you expressing concern that spkydog may have met an untimely demise while chasing the mailman, do relax. We've just been enjoying the waning dog days of summer while we can. Meanwhile, the speech industry keeps chugging along.

Earlier this month, tech journalists reporting on SpeechTek threw us what appeared to be a rather tasty bone when trying to interpret Microsoft's announcements at the conference. Initially, we just about fell out of our koop when reading the first line of an earlier version of this Information Week article, which suggested that "Microsoft plans to discontinue selling speech server". After reading on, we understood that Microsoft is folding the Speech Server product into its Office Communications Server 2007 product line. Whew! We certainly wouldn't want to see Speech Server put out to pasture so soon after Microsoft put its weight behind VoiceXML.

There is a bit less hyperbole in the opening lines of this IT Week article, where the development is described as an unveiling of new speech technologies in Office Communications Server 2007. Nevertheless, the author dedicates at least half of the article to commentary on the now-infamous speech demo Microsoft gave to financial analysts earlier this summer. Talk about beating a dead horse...

In any case, spkydog thinks Microsoft has made a wise move here, which will likely serve both them and the speech industry quite well. Despite the recent negativity and skepticism Microsoft's demo debacle has drawn, spkydog believes speech technology is more than ready for prime time - on our desktops, in our cars, on our PDAs/mobiles and of course our POTS phones, as usual. The bigger issue is getting the masses accustomed to using speech on a regular basis. Those of you who were introduced to using a mouse on your PC as an adult probably have memories of finding it rather cumbersome the first time. Speaking to machines has analogous implications, and probably will for some time.

By integrating speech technology as a feature into systems that will be widely used by the masses (e.g., Windows Vista and Office Communications Server), Microsoft is putting speech technology in front of virtually everybody. Management at Acme, Inc. won't have to sit and spend cycles discussing whether or not it's time to invest in a speech-enabled version of their favorite applications - it's just going to happen. Selling speech technology as a stand-alone enabler is a tough sell.

There are actually very few companies that are in a position to market speech technologies to the masses in the way Microsoft is. Nuance has lots of speech technology, but in terms of apps is limited to the call center. Why buy Dragon when you get almost the same functionality bundled with Office? IBM has lots of apps and a fair amount of speech technology, but you run Microsoft bits on your PC all the time, and IBM bits only part of the time. Google certainly has the application reach, but their speech tech capabilities are not well understood, making them somewhat of a wildcard.

In short, the folks in Redmond seem to be on to a decent strategy for differentiating their platforms and applications with speech technology.

Thursday, August 03, 2006

Google to Release n-gram Models

Today Google announced it will be providing a six-DVD set of n-gram models it has generated from a training corpus of over one trillion words - culled, of course, from public websites.

"We believe that the entire research community can benefit from access to such massive amounts of data. It will advance the state of the art, it will focus research in the promising direction of large-scale, data-driven approaches, and it will allow all research groups, no matter how large or small their computing resources, to play together. That's why we decided to share this enormous dataset with everyone. We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. There are 13,653,070 unique words, after discarding words that appear less than 200 times."
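For readers who want a feel for what "counts for all five-word sequences that appear at least 40 times" means in practice, here is a minimal Python sketch of thresholded n-gram counting on a toy corpus. It is not Google's pipeline: the function name `ngram_counts`, its parameters, and the choice to replace rare words with a placeholder token (rather than dropping them some other way) are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(sentences, n=5, min_word_count=200, min_ngram_count=40, unk="<UNK>"):
    """Count n-grams, keeping only those seen at least min_ngram_count times.

    Hypothetical helper for illustration only. Words seen fewer than
    min_word_count times are replaced by the placeholder `unk`; that
    replacement policy is an assumption of this sketch.
    """
    sentences = [list(s) for s in sentences]

    # First pass: count individual words and build the vocabulary of frequent words.
    word_counts = Counter(w for s in sentences for w in s)
    vocab = {w for w, c in word_counts.items() if c >= min_word_count}

    # Second pass: slide a window of n words over each sentence and count the tuples.
    counts = Counter()
    for s in sentences:
        mapped = [w if w in vocab else unk for w in s]
        for i in range(len(mapped) - n + 1):
            counts[tuple(mapped[i:i + n])] += 1

    # Drop n-grams that fall below the frequency threshold.
    return {gram: c for gram, c in counts.items() if c >= min_ngram_count}


if __name__ == "__main__":
    # Toy corpus with tiny thresholds so the effect is visible.
    toy = [["the", "dog", "barks"]] * 3 + [["the", "cat", "barks"]]
    for gram, c in sorted(ngram_counts(toy, n=2, min_word_count=2, min_ngram_count=2).items()):
        print(" ".join(gram), c)
```

On the toy corpus, "cat" appears only once and is replaced by the placeholder, so the bigrams that involve it fall below the threshold and only "the dog" and "dog barks" survive, each with a count of 3. At the scale Google describes, the same two thresholds (200 for words, 40 for sequences) are what trim a trillion words of text down to something that fits on a handful of DVDs.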


Read the complete article.

Wednesday, August 02, 2006

More VoiceXML-Related Open Source Projects Announced

Yesterday Voxeo announced www.rocketsource.org - an open source project involving three off-the-shelf VoiceXML/CCXML applications. It seems at least one of the applications (Voice Conference Manager) already existed as an open source project, but getting Voxeo behind it certainly won't hurt. I would suggest the project is appropriately named, considering it's the rocket scientists over at Voxeo who are making it happen. I don't think the sun ever sets in that shop!

Read the press release.

Tuesday, August 01, 2006

Microsoft's Unfortunate SR Demo Takes First Place on Google Video

For the past three days, a 1 minute 39 second video clip of Microsoft's infamous speech recognition demo has held the honor of being #1 on the Google Video Top 100 List. This is rather remarkable given the cacophony of often bizarre clips that it must compete with to earn the distinction. Please help me... is an unfortunate speech demo (alas, they happen every day... though not necessarily in financial analyst meetings) really as popular as Diet Coke & Mentos explosions and lip-syncing Chinese adolescents? Does anybody have any insight into the heuristics Google applies to generate the top 100 video list? One wonders if Google is simply having a little fun of their own at Microsoft's expense.

Watch the video.