Speech recognition of unstructured human speech approaching 99% accuracy?
The September 2006 issue of IEEE's Spectrum magazine has a rather interesting article reporting the results of a recent survey conducted among some 700 IEEE Fellows. The objective of the survey was to figure out what IEEE Fellows (not your average Saturday afternoon computer hobbyists!) expect or don't expect in science and technology over the next 10 to 50 years.
When asked the question "Will computer speech recognition of unstructured human speech approach 99% accuracy?", 19.1% responded that it was unlikely, while 61.8% responded that it was likely. On the follow-up question, "When is this likely to occur?", 25.2% indicated 10 years or less, while 49.5% indicated 11 to 20 years.
The question itself is rather open-ended and requires the respondent to make some assumptions. For example, are we to assume the unstructured human speech is coming from one speaker or many? Is the speech intended for human consumption, or is it assumed the user is speaking into a mic and expecting a computer to transcribe it to text? These sorts of issues make a world of difference in raising or lowering the complexity of the problem.
In any case, it's interesting to note that almost 20% of a crowd that one would assume consists of some of the brightest researchers/engineers the world has to offer responded negatively to this question. Nevertheless, there's no need to fret. One can think of all kinds of examples where smart people were quite mistaken.
Perhaps the key takeaway here is the observation some of the more prudent Fellows made: "science and technology are unpredictable."