
Current group: comp.ai.

RNNs and LSTMs for Noisy Substring Matching

From: intrest86
Subject: RNNs and LSTMs for Noisy Substring Matching
Date: Mon, 17 Jan 2005 01:55:25 GMT

Hello everyone,

If this isn't the appropriate newsgroup for this post, just let me
know.

Basically, I am looking for a way to feed in a long string of symbols,
then feed in a second, shorter string of symbols, and check whether the
second string is a substring of the first. However, the second string
would be noisy. In the end, I am really just looking for a confidence
value for how likely a match is.

I'm not sure whether this is possible with RNNs or LSTMs. I suspect it
is impractical for classical RNNs, since there could be a long time lag
between feeding in the matching part of the first string and feeding in
the actual substring. That is why I am looking more at LSTMs, which can
store information for a longer period of time without it decaying away.

But is it even possible? The first string could be of significant
size, and it doesn't seem like there is any way it could solve the
problem without somehow storing ALL of the large string, which would
mean an arbitrarily large number of memory cells. On the other hand,
I'm only looking for a probability, so if an NN can give a reasonable
probability, then that would still be success.

OK, I am rambling. Basically, I am new to NNs and am trying to get a
feel for their applicability. I know that they can do pattern
matching, but how about subpatterns?

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to , and ]
[ ask your news administrator to fix the problems with your system. ]
From: Ted Dunning
Subject: Re: RNNs and LSTMs for Noisy Substring Matching
Date: Tue, 18 Jan 2005 19:32:09 GMT

Recurrent neural nets are probably massive overkill for a problem like
this.

You should start with sub-string presence or absence as a proxy for
string edit distance. You can even use the sub-strings to generate
hints for dynamic programming based edit distance systems.
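
As a rough illustration of sub-string presence as a proxy, here is a
minimal Python sketch. The function names, the choice of n = 3, and the
"fraction of query n-grams found" score are assumptions made for this
example, not something specified in the post.

```python
def ngrams(s, n=3):
    """Return the set of all length-n sub-strings of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_presence_score(text, query, n=3):
    """Fraction of the query's n-grams that occur anywhere in text.

    1.0 means every n-gram of the query was found; 0.0 means none was.
    This ignores where the n-grams occur, so it is only a cheap proxy
    for edit distance, not a replacement for it.
    """
    query_grams = ngrams(query, n)
    if not query_grams:
        return 0.0  # query shorter than n: no evidence either way
    text_grams = ngrams(text, n)
    hits = sum(1 for g in query_grams if g in text_grams)
    return hits / len(query_grams)


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog"
    print(ngram_presence_score(text, "quick brown fox"))  # exact sub-string
    print(ngram_presence_score(text, "quick brwn fox"))   # one symbol dropped
    print(ngram_presence_score(text, "zzzzzz"))           # unrelated
```

A single dropped symbol only disturbs the few n-grams that overlap it,
so the score degrades gradually with noise rather than failing outright.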

Remember that weighted sub-string matching is just a handy way of
encoding a Markov model. As such, it has enormous power for doing
string matching sorts of operations.
What made you think that fancy stuff was needed?
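
For the dynamic-programming edit distance mentioned earlier in this
reply, one standard formulation lets the match start anywhere in the
long string at no cost. The sketch below assumes unit costs for
insertion, deletion and substitution; the function names and the crude
mapping from distance to a 0-to-1 confidence value are assumptions for
illustration, not a calibrated probability.

```python
def best_substring_distance(text, query):
    """Smallest edit distance between query and any sub-string of text."""
    # prev[j]: fewest edits needed to align the query prefix processed so
    # far with some sub-string of text that ends at position j.
    prev = [0] * (len(text) + 1)  # an empty query matches anywhere for free
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(text)  # query[:i] against empty text: i edits
        for j, t in enumerate(text, start=1):
            curr[j] = min(
                prev[j - 1] + (q != t),  # match or substitution
                prev[j] + 1,             # query symbol absent from text
                curr[j - 1] + 1,         # extra symbol in text
            )
        prev = curr
    return min(prev)  # the match may end at any position in text


def match_confidence(text, query):
    """Map the best distance to a rough score in [0, 1] (an assumption)."""
    if not query:
        return 1.0
    return 1.0 - best_substring_distance(text, query) / len(query)


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog"
    print(best_substring_distance(text, "quick brwn fox"))
    print(match_confidence(text, "quick brwn fox"))
```

The running time is proportional to len(text) * len(query), and only
two rows of the table are kept, so memory grows with the long string's
length but nothing has to be learned or trained.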

   
