<body>

Wednesday, March 30, 2005

Mobile navigation software gains voice control

Navigon GmbH has added voice control to its mobile navigation software package for Pocket PCs. In addition, Mobile Navigator 5 boasts a completely revised user interface and many new features that make the system easier, more intuitive, and faster to use, according to the company.

The recognizer is an embedded speaker-dependent technology (VoCon 3200) from ScanSoft. Several mobile phone-based GPS navigation products have used speech recognition (via a POTS voice call to a network-based speech server) to enter the destination address, and others have used a human-in-the-loop call center. The fact that this solution uses embedded speech leads me to believe it's multimodal, which would make for a clean UI. Is anybody more familiar with the soon-to-be-released Navigon app?

Read the article.

Monday, March 28, 2005

Nuance Blasts Off!

The International Space Station has a new crew member - Clarissa, a virtual assistant powered by Nuance's ASR engine. Apparently, notebook computers have a tendency to float away from the floating crew members, and speech interfaces are just the ticket for getting tasks done more efficiently. The grammars were constructed using the open source Regulus platform.

The speech UI for the water quality analysis procedures described in the article has me wondering if there might be a better choice of commands. I can see "complete" and "repeat" getting mixed up in a hurry. If they have an "exit" command alongside that "next" command, they may have similar problems.
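For the curious, here is a rough first-pass way to screen a command vocabulary for pairs like these. It is only a sketch: it compares hand-written, approximate phoneme sequences (my own assumption, not anything taken from the Clarissa grammars) using plain edit distance, which is a crude stand-in for real acoustic confusability.

```python
# Approximate ARPAbet-style phoneme sequences, written by hand for illustration.
# These are assumptions, not the pronunciations used by any real recognizer.
PHONEMES = {
    "complete": ["K", "AH", "M", "P", "L", "IY", "T"],
    "repeat": ["R", "IH", "P", "IY", "T"],
    "next": ["N", "EH", "K", "S", "T"],
    "exit": ["EH", "K", "S", "IH", "T"],
}


def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def similarity(word_a, word_b):
    """Score in [0, 1]; higher means the phoneme sequences are closer."""
    a, b = PHONEMES[word_a], PHONEMES[word_b]
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    for pair in [("complete", "repeat"), ("next", "exit")]:
        print(pair, round(similarity(*pair), 2))
```

A score like this only flags candidates worth a closer look. Actual mix-ups depend on stress, the acoustic models, and noise, so real usability testing is still the final word.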

According to the NASA website, this is the first spoken dialogue system in space. Congrats Nuance!

Read the article.
NASA's Clarissa Project site.

Wednesday, March 23, 2005

Common Sense Speech Recognition

Henry Lieberman, a research scientist at MIT's Media Lab, is investigating how to improve speech recognition by using a database of common sense facts. The Open Mind Common Sense Project database contains more than 700,000 facts that have been accumulated over the past 4 years or so. It is used to help choose among recognition results that the speech recognizer returns with "close" confidence scores. The paper reports a 17% reduction in errors and a 7.5% improvement in dictation time when dictating speech in topical areas for which the database contains facts.
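To make the idea concrete, here is a minimal sketch of reranking close-scoring recognition results with a knowledge base. It is not the method from the paper: the toy `RELATED` set, the `margin` threshold, and the pair-counting score are all assumptions made up for illustration.

```python
from itertools import combinations

# Toy stand-in for a common sense database: pairs of words that "go together".
# Hypothetical data, far smaller and simpler than Open Mind Common Sense.
RELATED = {
    frozenset(("beach", "sand")),
    frozenset(("peach", "fruit")),
}

# Hypothetical n-best list from a recognizer: (text, confidence).
N_BEST = [
    ("the peach was covered in sand", 0.81),
    ("the beach was covered in sand", 0.80),
    ("the beach was coveted in hand", 0.55),
]


def plausibility(text, related_pairs):
    """Count word pairs in the hypothesis that the knowledge base says are related."""
    words = sorted(set(text.lower().split()))
    return sum(
        1 for a, b in combinations(words, 2)
        if frozenset((a, b)) in related_pairs
    )


def rerank(hypotheses, related_pairs, margin=0.05):
    """Among hypotheses within `margin` of the top confidence, pick the most plausible.

    Ties on plausibility fall back to recognizer confidence.
    """
    best_conf = max(conf for _, conf in hypotheses)
    close = [(t, c) for t, c in hypotheses if best_conf - c <= margin]
    text, _ = max(close, key=lambda h: (plausibility(h[0], related_pairs), h[1]))
    return text


if __name__ == "__main__":
    print(rerank(N_BEST, RELATED))
```

With this toy data the "beach" sentence wins, because the recognizer barely preferred "peach" and only the beach/sand pair appears in the knowledge base. Setting `margin` to 0 falls back to the recognizer's own top choice.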

Read the IUI '05 paper.
Henry Lieberman's homepage.

Friday, March 11, 2005

Those voices behind the scenes...

Ever wonder about where those pre-recorded prompts come from when your call gets answered by an automated system? A lot of them come from Walsh Media in Chicago. Nailing down the right pre-recorded audio for your speech app is critical to its success, and not at all an easy task!

Thursday, March 10, 2005

Speech recognition saves Prof his job!

In the face of a repetitive strain injury, Prof. Huntley Schaller has been able to use ScanSoft's Dragon NaturallySpeaking dictation engine to get his job done. This is an interesting example of how speech technology, while not perfect, adds significant value for folks with a real need for it.

Read the article.

Wednesday, March 09, 2005

Microsoft Releases Voice Command for Three European Markets

"There's been much grumbling that Microsoft's Voice Command application was only available in the US and Canada, so I was pleased to see that they've released versions for UK English, German and French. The press release is below, and you should be able to find the software on the localized versions of Handango."

Read the article and press release.

Tuesday, March 08, 2005

India's call centers get outsourced to VoiceXML...

Some interesting commentary from a Datamonitor report:

"India's sunrise call centre industry could face a new threat. It is not a country, say a China or Philippines, but technology that is a cause for concern. Speech recognition technologies that were considered to be of little use when responding to customer queries have finally come of age. These are being deployed by large organisations such as Bank of America, Citigroup, Kodak, Prudential, Verizon, Quest, MCI, T-mobile, American Airlines and Continental Airlines to handle customer queries. The fact that speech recognition is maturing rapidly will ensure that it has to be considered as a threat to Indian call centres."

Read the whole article.

Monday, March 07, 2005

More Kudos for XHTML+Voice

Slate recently published a glowing review of the Opera 8 browser beta and its support for XHTML+Voice - the emerging multimodal standard based on mature W3C recommendations - including VoiceXML for coding the speech dialogs. If spkydog is not mistaken, Slate is owned by Microsoft. Rumor has it that Microsoft will soon be announcing VoiceXML support for Speech Server, and rolling X+V support into IE. ;-)

Kidding aside, all indicators are that VoiceXML is THE standard for voice dialog markup, and is rapidly becoming the voice dialog markup of choice for emerging multimodal applications as well.

Read the Slate article.

Wednesday, March 02, 2005

DTMF reigns! ...at least for now...

According to another analyst report (451 Group), the automatic speech recognition market segment represents approximately 5% of the overall market for interactive voice response (IVR) technology and services; 95% of the market still uses DTMF. We're told to expect more mergers and acquisitions in the speech industry. IBM and Microsoft are likely to emerge as the winners in this market because of their strategic commitments to the sector and their considerable resources.

Read more.

ScanSoft ships 43% of all ASR ports worldwide in '04

A recent report from Gartner claims that ScanSoft shipped 43% of all ASR ports worldwide in 2004. That comes to a total of 63.9K ports for the year, which according to Gartner was 20% more than its nearest competitor! You'll of course have to buy the report from Gartner to get the rest of the story.

Read the ScanSoft press release.