Big Blue has a clue when it comes to speech tech!
There are tons of companies involved in speech technology, yet when it comes to R&D, introducing core speech technology enablers, and setting industry trends, there are the big three: IBM, Microsoft, and Nuance/ScanSoft. It's interesting and important to pay close attention to the approach the big three take in the marketplace, as the approach any or all of them take will have direct consequences on where the rest of the herd goes. Last week we touched on the Microsoft approach, and were left somewhat mystified as to why they are apparently heading towards the cliff, at least with regard to speech technology.
In contrast, IBM is altogether a different story, perhaps best summarized in the transcript of this recent TMCnet interview with Bruce Morse, VP of Contact Center Solutions at IBM. Morse provides a crisp sketch of IBM's three-pronged focus in the speech industry: contact centers, multimodal interaction, and embedded speech enablers. None of this is particularly surprising, as each of the big three is essentially targeting these same areas, among other things. What is interesting is that IBM's approach is well positioned to capture speech developer mindshare, while Microsoft's approach is one of brute force, leveraging its lethal Windows control point and not necessarily paying attention to where the market has been going, at least thus far.
Morse mentions VoiceXML, a mature W3C Recommendation that happens to be the industry standard for implementing speech dialogs. Microsoft promotes SALT, a non-standard technology introduced by Microsoft and rubber-stamped by a number of avid VoiceXML supporters (with the exception of IBM) simply out of curiosity and/or fear of not paying lip service to anything Microsoft might be doing. Incidentally, SALT was originally touted as a multimodal dialog markup that supposedly addressed VoiceXML's weakness, which according to Microsoft was that VoiceXML was only suitable for voice-only applications. Ironically, since then, the minuscule take-up of SALT in the marketplace has been for the most part limited to voice-only applications, while millions of copies of multimodal VoiceXML (aka XHTML + Voice) have shipped in Opera 8 (Windows version) with IBM speech technology enablers.
Morse mentions MRCP, an IETF standard in progress that provides open speech resource integration. The MRCP version 1 draft specification alone has enjoyed widespread support in the industry from virtually every major speech resource vendor in the market, with the exception of Microsoft. To be fair, vendors (besides Microsoft) have supported Microsoft's MRCP equivalent (aka SAPI), but not to the degree MRCP has been supported in its comparatively shorter life to date.
Morse cites IBM's enthusiasm for Eclipse and the variety of free speech technology tooling they have introduced on the Eclipse platform. Microsoft of course has its proprietary Microsoft Visual Studio .NET, a not-so-shabby IDE that has hard-to-ignore mindshare of its own. Nevertheless, things are changing rapidly. JBuilder, Visual Studio's only real competitor in the past, simply cost too much: much more than an MSDN Universal subscription, which essentially gave you everything you needed in terms of Microsoft dev tools. Eclipse offers essentially everything JBuilder used to, at the right price! spkydog predicts the Eclipse revolution will eventually provide Microsoft with competition it can't afford to ignore, and this is a good thing, as it assures developers that both Eclipse and Visual Studio will continue to get better.
On that note, we have to agree with a comment on one of our recent postings by IBM's Mr. Jablokov. It's not necessarily a bad thing that Microsoft has not yet woken up to the reality of VoiceXML; it will ensure that VoiceXML keeps getting better. Hopefully, the few SALT survivors will receive a similar benefit, but they ought not hold their breath.