|
|
 | | From: | Dhruva Krishnamurthy | | Subject: | Runtime syntax | | Date: | 25 Dec 2004 20:15:12 -0500 |
|
|
 | Hello, We are trying to support text parsers for slightly differing syntax. I was looking at developing an engine where we can supply the grammar at runtime. I know Yacc/Bison is at compile time, is there some project/working proof of concept to achieve this? Any help is appreciated.
-dhruva
[There was a lot of work on extensible compilers that could extend the grammar on the fly in the 1970s. It's not hard to do, but it turns out not to be very useful. -John]
|
|
 | | From: | Dmitry A. Kazakov | | Subject: | Re: Runtime syntax | | Date: | 29 Dec 2004 01:39:05 -0500 |
|
|
 | On 25 Dec 2004 20:15:12 -0500, Dhruva Krishnamurthy wrote:
> We are trying to support text parsers for slightly differing
> syntax. I was looking at developing an engine where we can supply the
> grammar at runtime. I know Yacc/Bison is at compile time, is there
> some project/working proof of concept to achieve this? Any help is
> appreciated.
http://www.dmitry-kazakov.de/ada/components.htm
was developed with this idea in mind. It is completely table-driven and requires no grammar compilation. It parses infix expressions with brackets. Infix expressions are usually the most complex part of a syntax. In many cases the rest can be covered more or less trivially using recursive descent.
-- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de
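The table-driven idea described above can be illustrated with a minimal sketch. This is not the API of the Ada components linked in the post; it is a generic precedence-climbing parser, and the names `OPERATORS`, `tokenize`, `parse` and `parse_expression` are assumptions made up for the example. The point is only that the operator table is plain data, so it can be supplied or changed at runtime without regenerating a parser.

```python
import re

# Hypothetical operator table: symbol -> (precedence, right_associative).
# It is ordinary data, so it can be loaded or edited at runtime.
OPERATORS = {
    "+": (1, False),
    "-": (1, False),
    "*": (2, False),
    "/": (2, False),
    "^": (3, True),
}

def tokenize(text, operators):
    """Split text into numbers, identifiers, brackets and known operators."""
    parts = [r"\d+", r"\w+", r"[()]"]
    # Longest operators first so multi-character symbols win.
    parts += [re.escape(op) for op in sorted(operators, key=len, reverse=True)]
    pattern = re.compile(r"\s*(" + "|".join(parts) + ")")
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = pattern.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_operand(tokens, operators):
    """Parse a bracketed sub-expression or a single operand token."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens, operators)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("missing closing bracket")
        return inner
    if token == ")" or token in operators:
        raise SyntaxError(f"unexpected token {token!r}")
    return token

def parse(tokens, operators, min_prec=0):
    """Precedence climbing: build a nested-tuple tree, consuming tokens."""
    left = parse_operand(tokens, operators)
    while tokens and tokens[0] in operators:
        prec, right_assoc = operators[tokens[0]]
        if prec < min_prec:
            break
        op = tokens.pop(0)
        next_min = prec if right_assoc else prec + 1
        left = (op, left, parse(tokens, operators, next_min))
    return left

def parse_expression(text, operators):
    """Parse a whole string and reject trailing tokens."""
    tokens = tokenize(text, operators)
    tree = parse(tokens, operators)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree
```

Extending the syntax is then a matter of passing a different table, for example `parse_expression("7 mod 3 + 1", dict(OPERATORS, mod=(2, False)))`; with the original table the same input is rejected because `mod` is treated as an ordinary identifier.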
|
|
 | | From: | Cedric LEMAIRE | | Subject: | Runtime Syntax | | Date: | 29 Dec 2004 01:40:33 -0500 |
|
|
 | > We are trying to support text parsers for slightly differing
> syntax. I was looking at developing an engine where we can supply the
> grammar at runtime. I know Yacc/Bison is at compile time, is there
> some project/working proof of concept to achieve this? Any help is
> appreciated.
If you are working in C++, you can use the C++ runtime library of CodeWorker (distributed under the LGPL and available at "http://www.codeworker.org"). It is both a parser interpreter and a source code generator. The extended-BNF parse scripts can be changed at runtime.
If you just have to select one grammar from several that are already written and frozen, with the selection depending, for instance, on a pre-analysis of the text, and the grammars are almost the same, you can:

- isolate the common BNF rules in a CW parse script and include it in each grammar, which then adds its specific rules:

      #include "common-rules.cwp"
      ...
      specific_rule_1 ::= ...;

- overload some of the common BNF rules (the '#overload' keyword) in the particular grammars, if required:

      #include "common-rules.cwp"
      ...
      #overload common_rule_1 ::= ... ...;

- write the rules as templates:

      // in "common-rules.cwp", somewhere:
      statement ::=
            procedure_call  // a non-terminal
          | assignment      // another non-terminal
          |   // important part of this rule:
              // read an identifier and put it into the
              // variable 'sKeyword'
              #readIdentifier:sKeyword
              // call of a template-like rule:
              // if sKeyword is 'while', the BNF rule
              // statement_keyword<"while"> must have been defined
              statement_keyword<sKeyword>
          | ...
          ;
      ------------------------
      // in one of the particular grammars:
      // add a new statement, the C-like 'for'
      statement_keyword<"for"> ::=
          '(' expression ';' expression ';' expression ')' statement
          ;
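The template-style dispatch described above (read an identifier, then call the rule specialised for that keyword) can be mimicked in a few lines of Python. This is only a rough analogy and not CodeWorker's API: the registry `statement_keyword`, the `rule` decorator, the trivial `expression` rule and the `skip` statement are all assumptions invented for the sketch.

```python
# Hypothetical registry standing in for the statement_keyword<...> rules.
statement_keyword = {}

def rule(keyword):
    """Decorator that registers a parse function for one keyword."""
    def register(func):
        statement_keyword[keyword] = func
        return func
    return register

def expect(tokens, literal):
    """Consume one token and check that it is the given literal."""
    if not tokens or tokens.pop(0) != literal:
        raise SyntaxError(f"expected {literal!r}")

def expression(tokens):
    # Deliberately trivial: an expression is a single token here.
    if not tokens:
        raise SyntaxError("expected expression")
    return tokens.pop(0)

def statement(tokens):
    """Read an identifier, then dispatch to the rule registered for it."""
    if not tokens:
        raise SyntaxError("expected statement")
    keyword = tokens.pop(0)
    handler = statement_keyword.get(keyword)
    if handler is None:
        raise SyntaxError(f"no rule for keyword {keyword!r}")
    return handler(tokens)

# Rules shared by every grammar variant.
@rule("while")
def while_statement(tokens):
    expect(tokens, "(")
    condition = expression(tokens)
    expect(tokens, ")")
    return ("while", condition, statement(tokens))

@rule("skip")
def skip_statement(tokens):
    expect(tokens, ";")
    return ("skip",)

def install_for_statement():
    """One particular grammar adds the C-like 'for' at runtime."""
    @rule("for")
    def for_statement(tokens):
        expect(tokens, "(")
        init = expression(tokens)
        expect(tokens, ";")
        condition = expression(tokens)
        expect(tokens, ";")
        step = expression(tokens)
        expect(tokens, ")")
        return ("for", init, condition, step, statement(tokens))
```

Until `install_for_statement()` has been called, `statement("for ( a ; b ; c ) skip ;".split())` fails with a `SyntaxError`; afterwards the same input parses, which is the behaviour the keyword-indexed rule is meant to give.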
If you want to improve the speed of parsing, you can translate the CW parse scripts to C++, using the switch '-c++' on the command line, and then link them with a C++ application.
I have already used these features to recognize short financial news items: each time a new sentence template appeared, its BNF expression was added to the grammar in charge of detecting interesting news. The improvements were applied dynamically: the grammar changed at runtime, which isn't what you are looking for, but it is implemented similarly in the BNF scripts. The parse tree just held the semantic data resulting from the sentence analysis.
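To make the idea of a grammar that grows at runtime concrete, here is a minimal sketch of a grammar held as plain data and interpreted by a small backtracking recognizer. It is not how CodeWorker is implemented; the function names, the rule names and the sample tokens are all invented for illustration, and the recognizer assumes the grammar has no left recursion.

```python
def match(grammar, symbol, tokens, pos):
    """Yield every position reachable after matching symbol at tokens[pos:]."""
    if symbol not in grammar:
        # Terminal: compare against the next token literally.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    # Non-terminal: try each alternative in turn (backtracking).
    for alternative in grammar[symbol]:
        yield from match_sequence(grammar, alternative, tokens, pos)

def match_sequence(grammar, symbols, tokens, pos):
    """Match a sequence of symbols one after another."""
    if not symbols:
        yield pos
        return
    for middle in match(grammar, symbols[0], tokens, pos):
        yield from match_sequence(grammar, symbols[1:], tokens, middle)

def recognizes(grammar, start, text):
    """True if the whole whitespace-separated text derives from start."""
    tokens = text.split()
    return any(end == len(tokens) for end in match(grammar, start, tokens, 0))

# Hypothetical grammar: rule name -> list of alternatives (symbol lists).
grammar = {
    "news": [["company", "verb", "company"]],
    "company": [["Acme"], ["Globex"]],
    "verb": [["acquires"], ["sues"]],
}
```

A new sentence template is then just another alternative appended to the table, for example `grammar["news"].append(["company", "merges", "with", "company"])`, after which `recognizes(grammar, "news", "Acme merges with Globex")` succeeds without any regeneration step.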
But why write a new engine? If CodeWorker doesn't suit you, I'm sure there are other tools that would meet your needs.
|
|
 | | From: | Nick Maclaren | | Subject: | Re: Runtime syntax | | Date: | 29 Dec 2004 01:39:44 -0500 |
|
|
 | Dhruva Krishnamurthy wrote:
>[There was a lot of work on extensible compilers that could
>extend the grammar on the fly in the 1970s. It's not hard to
>do, but it turns out not to be very useful. -John]
My belief is that this is because the languages were targeted at areas for which it was inappropriate. Around then, I was thinking about really advanced statistical packages, and that is one area where allowing customisable extension could be very useful. Note that this is specifically for complex constants and data structures, so that it would be possible to input them in a reasonably natural, checkable form.
Another area is that of checked command languages (shells, editors, etc.). Most of the work there missed the point that convenience is critical, and the proposals were too painful to use. Most of the successful shells etc. have missed the point that checkability is essential if reliability is a major target, but that viewpoint is so out of fashion as to be heresy.
But I agree that it is only superficially attractive for conventional programming languages - i.e. the actual time wasted by not having it is almost invariably negligible.
Regards, Nick Maclaren.
|
|
|