From: deltagreen
Subject: Re: Mac OSX
Date: 14 Dec 2004 08:09:25 -0800

Creating special pronunciations, where a word is represented by a grammar made of a succession of single phonemes, is risky but may be worth a try. You may then end up with a successful grammar match and the times associated with the recognition.
As a general rule though, in today's speech recognition, you should stay away from phoneme recognition level. Some people do need it, for example to draw the lips of a TTS character, but that happens at the TTS level, where they know which phoneme is being spoken at all times since they drive the process. On the speech recognition side, cases where you need to know exactly which phoneme is being spoken at any given time are rare.
I am not aware of what you are trying to achieve, but it is a red flag if you need the phoneme times and scores for your recognition. In speech recognition, phonemes have no use other than as a means of getting to recognized pronunciations.

From: Richard Owlett
Subject: Re: Mac OSX
Date: Tue, 14 Dec 2004 13:56:52 -0600

deltagreen wrote:
> [ possibly significant snip ]
>
> As a general rule though, in today's speech recognition, you should
> stay away from phoneme recognition level.
Can you supply non-technical supportive link(s)? I'm interested in "command and control" and/or "limited vocabulary" applications.
I had a friend who was involved in speech recognition research back in the early 70's. He was thrilled to have 1/2 of a KL-10.
I casually followed some SR in the late 70's / early 80's (discrete rather than continuous).
My 'gut feel' is that the "state of the art" has REGRESSED :(
OK, so ' AND " are significant ;) But I think I somehow represent one potential end user class.
BUT, I also recognize that I may not know enough to pose useful questions.
Please refer me to WEB resources as I *DO NOT* have access to any academic library.