Current group: comp.compilers

Automatically generating ASTs with XML markup

From:Manuel Collado
Subject:Automatically generating ASTs with XML markup
Date:29 Dec 2004 01:46:07 -0500
I'm trying to grab the abstract syntax of several code samples in
several different languages by using an XML representation. Example
(Modula-2):

From

MODULE Hello;
IMPORT InOut;
BEGIN
InOut.WriteString("Hello, world");
InOut.WriteLn;
END Hello.

get something like


Hello

InOut



InOut.WriteString
"Hello, world"


InOut.WriteLn




Perhaps this could be done by an automatic tool just by writing the
appropriate grammar (EBNF or similar):

::= MODULE ; * *
::= IMPORT ;
::= BEGIN (|||...)* END .
::= [ "(" , * ")" ]
...

Does anybody know such an automatic tool?

Of course, there are a lot of parser/compiler generators:
- yacc/lex
- bison/flex
- antlr
- asf+sdf
- coco/r
- javacc
- jjforester
- codeworker
- goldparser
- adagoop
- front/doggy
- elkhound
- etc. ...

I've attempted to use some of them, but it seems they require writing
specific AST printing code in addition to the grammar. I wonder if the
whole process could be automated by just writing the grammar.
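
To make the question concrete, here is a minimal sketch of what "just writing the grammar" could mean: a small grammar interpreter that wraps every matched nonterminal in an XML element of the same name. It is not taken from any of the tools listed above. The rule names (module, import, body, call, args, more, arg), the grammar notation, and the element names are all assumptions made for illustration, and the tokenizer covers only what the Hello example needs (it silently skips characters it does not recognize).

```python
import re
import xml.etree.ElementTree as ET

KEYWORDS = {"MODULE", "IMPORT", "BEGIN", "END"}
TOKEN_RE = re.compile(r'"[^"]*"|[A-Za-z]\w*(?:\.[A-Za-z]\w*)*|[;.,()]')

# Hypothetical grammar: rule name -> list of alternatives. In an alternative,
# 'x' is a literal, ID and STRING are token classes, and a trailing * or ?
# marks a repeated or optional nonterminal.
GRAMMAR = {
    "module": [["'MODULE'", "ID", "';'", "import*", "body", "ID", "'.'"]],
    "import": [["'IMPORT'", "ID", "';'"]],
    "body":   [["'BEGIN'", "call*", "'END'"]],
    "call":   [["ID", "args?", "';'"]],
    "args":   [["'('", "arg", "more*", "')'"]],
    "more":   [["','", "arg"]],
    "arg":    [["STRING"], ["ID"]],
}

def kind(tok):
    """Classify a token as STRING, ID, or itself (keywords and punctuation)."""
    if tok.startswith('"'):
        return "STRING"
    if tok in KEYWORDS or not tok[0].isalpha():
        return tok
    return "ID"

def parse(rule, toks, pos=0):
    """Try each alternative of `rule`; return (element, new_pos) or None."""
    for alt in GRAMMAR[rule]:
        result = parse_alt(rule, alt, toks, pos)
        if result is not None:
            return result
    return None

def parse_alt(rule, alt, toks, pos):
    node = ET.Element(rule)                  # one XML element per nonterminal
    for item in alt:
        if item.startswith("'"):             # literal: must match, emits nothing
            if pos >= len(toks) or toks[pos] != item[1:-1]:
                return None
            pos += 1
        elif item in ("ID", "STRING"):       # token class: emits a leaf element
            if pos >= len(toks) or kind(toks[pos]) != item:
                return None
            ET.SubElement(node, item.lower()).text = toks[pos]
            pos += 1
        else:                                # nonterminal, possibly with * or ?
            name = item.rstrip("*?")
            matched = 0
            while (sub := parse(name, toks, pos)) is not None:
                node.append(sub[0])
                pos = sub[1]
                matched += 1
                if not item.endswith("*"):
                    break
            if matched == 0 and item[-1] not in "*?":
                return None
    return node, pos

SRC = '''MODULE Hello;
IMPORT InOut;
BEGIN
InOut.WriteString("Hello, world");
InOut.WriteLn;
END Hello.'''

tokens = TOKEN_RE.findall(SRC)
tree, end = parse("module", tokens)
print(ET.tostring(tree, encoding="unicode"))
```

In this sketch the XML output falls out of the grammar alone, with no per-rule printing code. The price is a concrete syntax tree rather than a true AST (for example, the trailing module name still appears), which is where the tools discussed in the replies differ.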

Thanks in advance.
--
------------------------------------------------------------------------
Manuel Collado Machuca | Facultad de Informatica UPM
Universidad Politecnica de Madrid | Campus de Montegancedo
Dep. LSIIS | Boadilla del Monte
Tel.+34-91-336.74.57 Fax.+34-91-336.74.12 | 28660 MADRID - SPAIN
From:Rafael 'Dido' Sevilla
Subject:Re: Automatically generating ASTs with XML markup
Date:30 Dec 2004 00:56:41 -0500
On Wed, Dec 29, 2004 at 01:46:07AM -0500, Manuel Collado wrote:
> I'm trying to grab the abstract syntax of several code samples in
> several different languages by using an XML representation. Example
> (Modula-2):

Look for asdlgen on SourceForge. I believe it is able to do what you
describe.
From:Jürgen_Kahrs
Subject:Re: Automatically generating ASTs with XML markup
Date:30 Dec 2004 00:58:01 -0500
Hello Manuel,

> I'm trying to grab the abstract syntax of several code samples in
> several different languages by using an XML representation.

Terence Parr has tried this in a project with ANTLR:

http://www.cs.usfca.edu/~parrt/course/652/projects-Spring-2004/xml-antlr.html

But as far as I understood it, he also has no
way of automatically converting syntactical
elements into XML markup blocks.

> Perhaps this could be done by an automatic tool just by writing the
> appropriate grammar (EBNF or similar):
>
> ::= MODULE ; * *
> ::= IMPORT ;
> ::= BEGIN (|||...)* END .
> ::= [ "(" , * ")" ]
> ...

If you already have an EBNF grammar, the best tool
for you probably is CoCo/R:

http://www.scifac.ru.ac.za/coco/

You can find an example on the web page. With CoCo/R you have to add
about one line to each EBNF rule to get XML output. The complexity of
this work should be at the student-homework-level.
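
The "one line per rule" idea can be sketched in Python with a hand-written recursive-descent parser. This is not CoCo/R's attributed-grammar notation; the decorator xml_rule, the Parser class, and the rule names are hypothetical, and only the module header and import list are parsed.

```python
import functools

def xml_rule(rule):
    """Hypothetical helper: wrap whatever a rule emits in <rule>...</rule>."""
    @functools.wraps(rule)
    def wrapper(self):
        self.out.append(f"<{rule.__name__}>")
        rule(self)
        self.out.append(f"</{rule.__name__}>")
    return wrapper

class Parser:
    def __init__(self, tokens):
        self.toks, self.pos, self.out = tokens, 0, []

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r} at token {self.pos}")
        self.pos += 1

    def ident(self):
        tok = self.peek()
        if tok is None or not tok.isidentifier():
            raise SyntaxError(f"expected identifier at token {self.pos}")
        self.out.append(f"<id>{tok}</id>")
        self.pos += 1

    @xml_rule                        # the only XML-specific line in this rule
    def import_decl(self):           # import_decl ::= IMPORT ident ";"
        self.expect("IMPORT")
        self.ident()
        self.expect(";")

    @xml_rule                        # again, one extra line
    def module(self):                # module ::= MODULE ident ";" import_decl*
        self.expect("MODULE")
        self.ident()
        self.expect(";")
        while self.peek() == "IMPORT":
            self.import_decl()

p = Parser(["MODULE", "Hello", ";", "IMPORT", "InOut", ";"])
p.module()
print("".join(p.out))
```

Each rule body stays a plain transcription of its EBNF rule; the XML tagging is confined to the decorator line above it.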

Once you have a program's source code converted to XML, you can build
a compiler in XSL or XMLgawk. But I wonder if the advantage of having
the intermediate trees of the compiler in XML is really worth the
effort.
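
As a small illustration of processing such an XML tree afterwards, the sketch below uses Python's xml.etree.ElementTree instead of XSL or XMLgawk. The element and attribute names (module, import, call, proc) are assumptions, since no particular XML vocabulary is fixed in this thread. It checks that every module qualifier used in a call appears in the import list.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML form of the Hello example; the vocabulary is an assumption.
AST_XML = """
<module name="Hello">
  <import>InOut</import>
  <body>
    <call proc="InOut.WriteString"><arg>"Hello, world"</arg></call>
    <call proc="InOut.WriteLn"/>
  </body>
</module>
"""

def undeclared_qualifiers(root):
    """Return module qualifiers used in calls but missing from the imports."""
    imported = {imp.text for imp in root.iter("import")}
    used = {
        call.get("proc").split(".")[0]
        for call in root.iter("call")
        if "." in call.get("proc", "")       # unqualified calls are ignored
    }
    return sorted(used - imported)

root = ET.fromstring(AST_XML)
print(undeclared_qualifiers(root))
```

A check like this is a few lines once the tree is in XML, which is the practical argument for the representation; whether that outweighs the conversion effort is the open question raised above.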
From:Alexey Demakov
Subject:Re: Automatically generating ASTs with XML markup
Date:30 Dec 2004 01:00:18 -0500
I'm planning to implement this feature in the TreeDL tool
(http://treedl.sourceforge.net). But you have to specify the AST
structure explicitly rather than a BNF (in fact, the same AST can be
built from different BNFs), in the form of

node Program
{
child ID name;
child Import* importList;
child Declaration* declList;
child Body body;
}

node Import
{
child ID name;
}

node Body
{
child Stmt* stmtList;
// name is not required in AST,
// it is used only before AST construction
}

node Call
{
child ID name;
child Expr* expr;
}

node ID
{
attribute string name;
}

abstract node Stmt
{
}

node IfStmt : Stmt { ... }
node ForStmt : Stmt { ... }

Describing the AST structure takes about the same effort as writing a BNF
but gives more control over the AST: you can specify children names and
skip unneeded parts (such as the final name in a block).
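
To show how an explicit AST description can drive XML output without per-node printing code, here is a rough Python analogue. It is not TreeDL syntax or TreeDL output: the dataclasses mirror a subset of the node declarations above (declList is omitted and call arguments are simplified to IDs), and to_xml is a hypothetical generic serializer.

```python
from dataclasses import dataclass, field, fields
from typing import List
import xml.etree.ElementTree as ET

@dataclass
class ID:
    name: str                          # 'attribute string name'

@dataclass
class Import:
    name: ID                           # 'child ID name'

@dataclass
class Call:
    name: ID
    expr: List[ID] = field(default_factory=list)   # simplified: IDs only

@dataclass
class Body:
    stmtList: List[Call] = field(default_factory=list)

@dataclass
class Program:
    name: ID
    importList: List[Import]
    body: Body

def to_xml(node):
    """Serialize any node: string fields become XML attributes, child fields
    become an element named after the field holding the child node(s)."""
    elem = ET.Element(type(node).__name__)
    for f in fields(node):
        value = getattr(node, f.name)
        if isinstance(value, str):
            elem.set(f.name, value)
        else:
            holder = ET.SubElement(elem, f.name)
            children = value if isinstance(value, list) else [value]
            for child in children:
                holder.append(to_xml(child))
    return elem

prog = Program(
    name=ID("Hello"),
    importList=[Import(ID("InOut"))],
    body=Body([Call(ID("InOut.WriteLn"))]),
)
root = to_xml(prog)
print(ET.tostring(root, encoding="unicode"))
```

The serializer never mentions a concrete node type, so adding a new node kind only requires a new declaration.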

The AST structure can be translated into Java or C#.
If you're interested, write to me at all-x@users.sourceforge.net

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com

----- Original Message -----
From: "Manuel Collado"
Sent: Wednesday, December 29, 2004 9:46 AM
Subject: Automatically generating ASTs with XML markup


> I'm trying to grab the abstract syntax of several code samples in
> several different languages by using an XML representation.
From:Joel Jones
Subject:Re: Automatically generating ASTs with XML markup
Date:30 Dec 2004 01:03:59 -0500
>I'm trying to grab the abstract syntax of several code samples in
>several different languages by using an XML representation.

My student, Crutcher Dunnavant, has written a tool that solves exactly
this problem. Given a lexer/parser specification in XML, through a
series of XSL transformations it produces a C or Java lexer and parser
that produces an XML tree. The specification can include
user-specified tree rewrites to modify the concrete syntax tree into
an AST. The URL for the tool is at:

http://monket.samedi-studios.com/software/turtles/
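
The idea of rewriting a concrete syntax tree into an AST can be sketched as follows. This is not the tool's own rewrite notation (which works through XSL transformations); it is a plain Python stand-in, and the tag names kw, punct, and stmt, as well as the DROP and COLLAPSE sets, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical concrete syntax tree: every token and every rule is an element.
CST = """<module><kw>MODULE</kw><id>Hello</id><punct>;</punct>
<import><kw>IMPORT</kw><id>InOut</id><punct>;</punct></import>
<body><kw>BEGIN</kw>
<stmt><call><id>InOut.WriteLn</id></call><punct>;</punct></stmt>
<kw>END</kw></body>
<id>Hello</id><punct>.</punct></module>"""

DROP = {"kw", "punct"}     # tokens that carry no meaning in the AST
COLLAPSE = {"stmt"}        # wrapper rules whose children are spliced upward

def rewrite(node):
    """Return a rewritten copy of `node` with noise tokens and wrappers removed."""
    new = ET.Element(node.tag, node.attrib)
    if len(node) == 0:
        new.text = node.text           # keep text only on leaf elements
    for child in node:
        if child.tag in DROP:
            continue
        sub = rewrite(child)
        if child.tag in COLLAPSE:
            new.extend(list(sub))      # keep the wrapper's children only
        else:
            new.append(sub)
    return new

ast = rewrite(ET.fromstring(CST))
print(ET.tostring(ast, encoding="unicode"))
```

Keeping the rewrites as data (here two sets of tag names) rather than code is what lets the parser specification remain the single source of truth.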

Joel Jones jones@cs.ua.edu
Department of Computer Science http://cs.ua.edu/~jones
University of Alabama (205) 348-1618