<body>

Wednesday, October 04, 2006

Speech recognition of unstructured human speech approaching 99% accuracy?

The September 2006 issue of IEEE's Spectrum magazine has a rather interesting article reporting the results of a recent survey conducted among some 700 IEEE Fellows. The objective of the survey was to figure out what IEEE Fellows (not your average Saturday afternoon computer hobbyists!) expect or don't expect in science and technology over the next 10 to 50 years.

When asked the question "Will computer speech recognition of unstructured human speech approach 99% accuracy?", 19.1% responded it was unlikely, while 61.8% responded it was likely. On the follow-up question "When is this likely to occur?", 25.2% indicated in 10 years or less, while 49.5% indicated 11 to 20 years.

The question itself is rather open-ended and requires the respondent to make some assumptions. For example, are we to assume the unstructured human speech is coming from one speaker or many? Is the speech intended for human consumption, or is it assumed the user is speaking into a mic and expecting a computer to transcribe it to text? These sorts of issues make a world of difference in terms of raising or lowering the complexity of the problem.
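The question also doesn't say how "accuracy" would be measured. One common yardstick for transcription is word accuracy: one minus the number of word substitutions, insertions and deletions divided by the number of words actually spoken. Here is a minimal sketch, assuming that's the metric the survey had in mind (the survey doesn't say, and the function names below are just mine for illustration):

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, insertions and deletions
    needed to turn the reference transcript into the hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference word dropped
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (word errors / words in the reference)."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_errors(reference, hypothesis) / n


if __name__ == "__main__":
    spoken = "the cat sat on the mat"
    heard = "the cat sat on a mat"
    print(word_errors(spoken, heard), word_accuracy(spoken, heard))
```

By this yardstick, 99% accuracy works out to roughly one wrong word for every hundred spoken, which is a tall order for free-form conversation.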

In any case, it's interesting to note that almost 20% of a crowd that one would assume consists of some of the brightest researchers and engineers the world has to offer responded negatively to this question. Nevertheless, there's no need to fret. One can think of all kinds of examples where smart people were quite mistaken.

Perhaps the key takeaway here is the observation some of the more prudent Fellows made: "science and technology are unpredictable."

Monday, October 02, 2006

Hands/Eyes Free on Windows Mobile 5.0

Fonix recently announced VoiceCenter 3.1 for Windows Mobile 5.0 - coming to your smartphone soon for only 40 bucks! According to Fonix VP Walt Nawrocki, "speech recognition is a 'must have' to avoid tedious menus and button pressing." So just what is he saying here? Navigating the Windows Mobile 5.0 UI shell involves a bunch of tedious menus and buttons and badly needs a fix? Or, is there a deeper issue here... small mobile devices are inherently difficult for humans to interface with due to their small craniums and large paws?

spkydog would suggest that in the case of Windows Mobile 5.0 there is a little of both going on here. One could argue that iPod users aren't clamoring for a speech interface, but then again iPods have fairly narrow functionality, while smartphones support a wide range of functionality.

Perhaps a more interesting question is whether or not the iPhone will need a speech interface... anybody care to make a guess?

Read about Fonix's VoiceCenter here.