|
|
 | | From: | Sigurd Lerstad | | Subject: | Building C/C++ compilers | | Date: | 16 Dec 2004 00:50:21 -0500 |
|
|
 | Hello,
I'm trying to make a C/C++ compiler.
I'm using flex and bison, and I have the book "Modern Compiler Implementation in C".
Some have commented that this book uses a non-traditional approach. Can someone be more specific why it's untraditional?
In the phase that produces the Abstract Syntax Tree (AST) from the source, C/C++ has a difficult case involving an ambiguity:
g * h
This can either declare a variable h of type pointer to g, or multiply the variable g by h.
In the above-mentioned book, Appel first builds a syntax tree and then does semantic analysis (type checking, etc.), but it seems to me that because of the above ambiguity, AST building and semantic analysis must be performed at the same time?
C++ makes it even more difficult, consider this example:
class myclass
{
    int method()
    {
        g* h;
    }

    typedef int g;
};
Now, g is declared later in the source, which complicates things even more.
Does anyone have any pointers on overcoming these two issues, particularly in light of using bison as the parser?
thanks,
-- Sigurd Lerstad
[Parsing C++ using a LALR parser is quite difficult. This has come up many times before. -John]
|
|
 | | From: | Vidar Hokstad | | Subject: | Re: Building C/C++ compilers | | Date: | 17 Dec 2004 00:28:53 -0500 |
|
|
 | Sigurd Lerstad wrote:
> In the phase to produce Abstract Syntax Tree (AST) from the source. C/C++
> has a difficult case that has an ambiguity
>
> g * h
>
> this can mean declare a variable pointer to g named h, or the variable g
> multiplied with h;
>
> In the above mentioned book, Appel first builds a syntax tree and then does
> semantic analysis, i.e. type checking etc.
> but it seems to me that because of the above ambiguity, ast building and
> semantic analysis must be performed at the same time ?
I haven't read the book in question, but with regard to building the AST, this really depends on HOW the AST is represented. If, on parsing "g * h", the syntax tree just consists of some node for "*" with nodes for "g" and "h" attached to it, without a clear indication of which meaning of "*" applies, then no semantic analysis may be needed at that stage, because as long as one of the uses is syntactically valid it doesn't really make a huge difference.
If you wish to explicitly treat "pointer to" and "multiplication" differently in your AST, then you'll need to do more work. Many people would probably consider allowing the ambiguity described above in an AST as a hack.
During semantic analysis you'd then have the information to infer which use of "*" was meant, and may want to annotate the AST further to simplify later use.
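A minimal sketch of that idea in C++, assuming a hypothetical StarNode type and a plain set of known type names (none of these names come from the book or from bison): the parser records only the shape of "g * h", and a later semantic pass annotates the node with the meaning it infers.

```cpp
#include <cassert>
#include <set>
#include <string>

// How a "g * h" node is currently understood (hypothetical names).
enum class StarKind { Ambiguous, PointerDecl, Multiply };

struct StarNode {
    std::string lhs;  // left operand, e.g. "g"
    std::string rhs;  // right operand, e.g. "h"
    StarKind kind;    // annotation filled in by semantic analysis
};

// Parser side: record the shape only, without deciding what "*" means.
StarNode parse_star(const std::string& lhs, const std::string& rhs) {
    return StarNode{lhs, rhs, StarKind::Ambiguous};
}

// Semantic analysis side: if the left operand names a type, the node is a
// pointer declaration; otherwise it is treated as a multiplication.
void resolve(StarNode& node, const std::set<std::string>& type_names) {
    assert(node.kind == StarKind::Ambiguous);  // resolve each node once
    node.kind = type_names.count(node.lhs) != 0 ? StarKind::PointerDecl
                                                : StarKind::Multiply;
}
```

Because resolution happens after parsing, the set of type names can be complete by the time resolve is called, which is what makes this approach tolerant of names declared later in the source.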
Vidar
|
|
 | | From: | David Lindauer | | Subject: | Re: Building C/C++ compilers | | Date: | 17 Dec 2004 00:42:50 -0500 |
|
|
 | Sigurd Lerstad wrote:
> In the phase to produce Abstract Syntax Tree (AST) from the source. C/C++
> has a difficult case that has an ambiguity
>
> g * h
One way of handling it is to keep track of symbol types; then you can tell from whether g is a type identifier or a regular variable what the deal is with this. That is different from type checking all the identifiers found in, say, an expression or argument list. Another way is to introduce an 'ambiguous' version of the '*' operator, which flags the fact that a resolution is required when you type-check the tree.
> C++ makes it even more difficult, consider this example:
>
> class myclass
> {
>     int method()
>     {
>         g* h;
>     }
>
>     typedef int g;
> };
It gets even worse. Consider this:
class myclass
{
    myclass method(myclass var)
    {
        var.a = this->a;
        return var;
    }
    int a, b, c;
};
Now neither the size of myclass nor its members are known at the time the method is encountered, so there is no way to completely specify what is going on at that time. You can either create a very 'loose' syntax tree and check it for validity later, or you can just defer even trying to compile the method until the class declaration is complete.
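The second option, deferring method bodies until the class is complete, might be sketched as follows. ClassInfo, defer_body, finish_class and the check_body callback are all hypothetical names used for illustration; the callback stands in for whatever real parsing and type checking the compiler would do.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct ClassInfo {
    std::map<std::string, std::string> members;  // member name -> type name
    std::vector<std::string> deferred_bodies;    // method bodies, not yet compiled
    bool complete = false;                       // set at the closing brace
};

// Record a data member as soon as its declaration is seen.
void declare_member(ClassInfo& c, const std::string& name, const std::string& type) {
    assert(!c.complete);
    c.members[name] = type;
}

// Stash the raw text of a method body instead of compiling it immediately.
void defer_body(ClassInfo& c, const std::string& body_text) {
    assert(!c.complete);
    c.deferred_bodies.push_back(body_text);
}

// At the closing brace every member is known, so the stored bodies can now
// be handed to the (hypothetical) checker. Returns how many bodies passed.
std::size_t finish_class(
    ClassInfo& c,
    const std::function<bool(const ClassInfo&, const std::string&)>& check_body) {
    c.complete = true;
    std::size_t accepted = 0;
    for (const std::string& body : c.deferred_bodies) {
        if (check_body(c, body)) ++accepted;
    }
    c.deferred_bodies.clear();
    return accepted;
}
```

In the myclass example, the body of method would be stored before a, b and c are declared, and only checked once finish_class runs.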
David
|
|
 | | From: | Vidya Praveen | | Subject: | Execution Profiling? | | Date: | 19 Dec 2004 23:49:08 -0500 |
|
|
 | Hi
I am presently involved in a study on execution profiling. I have the gprof paper on call graph profiling. I am interested in knowing about other kinds of profiling, like the one in LCC (I think it's called expression profiling). Can someone refer me to some papers or material on the web regarding this? I also want to know about static profiling. Thank you!
VP
|
|
 | | From: | Robert J. Simpson | | Subject: | Re: Building C/C++ compilers | | Date: | 17 Dec 2004 00:33:24 -0500 |
|
|
 | "Sigurd Lerstad" wrote
> I'm trying to make a C/C++ compiler.
> [...]
>
> In the phase to produce Abstract Syntax Tree (AST) from the source. C/C++
> has a difficult case that has an ambiguity
>
> g * h
>
> this can mean declare a variable pointer to g named h, or the variable g
> multiplied with h;
>
> In the above mentioned book, Appel first builds a syntax tree and then
> does semantic analysis, i.e. type checking etc.
> but it seems to me that because of the above ambiguity, ast building and
> semantic analysis must be performed at the same time ?
I would advise you to do symbol lookup before parsing. That way you will get a TYPENAME token or an IDENTIFIER token.
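As a rough sketch of what that lookup could look like, assuming a hypothetical SymbolTable class and classify function (in a flex/bison setup, the action for the identifier rule would typically do something equivalent before returning its token):

```cpp
#include <cassert>
#include <set>
#include <string>

// The two token kinds the parser would see for a name.
enum class Token { TYPENAME, IDENTIFIER };

// Hypothetical symbol table that only remembers which names are types.
class SymbolTable {
public:
    void add_typedef(const std::string& name) { types_.insert(name); }
    bool is_type(const std::string& name) const { return types_.count(name) != 0; }

private:
    std::set<std::string> types_;
};

// Called by the lexer for every name it scans: the symbol table, not the
// spelling of the name, decides which token the parser receives.
Token classify(const std::string& lexeme, const SymbolTable& symbols) {
    assert(!lexeme.empty());
    return symbols.is_type(lexeme) ? Token::TYPENAME : Token::IDENTIFIER;
}
```

With g registered as a typedef, "g * h" reaches the grammar as TYPENAME '*' IDENTIFIER, so the declaration and the multiplication no longer share a token sequence. Note that this only works when the typedef has been seen before the use.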
> C++ makes it even more difficult, consider this example:
>
> class myclass
> {
>     int method()
>     {
>         g* h;
>     }
>
>     typedef int g;
> };
>
> Now, g is declared later in the source, which complicates things even
> more.
Yes, you have to consider the function body to occur 'after' the declarations. I don't know if it is even possible to do this with bison. You could parse it as 'g times h' and then manually convert when you find g is a type. My preferred way of doing this is to use a two-pass approach. The first pass can be very crude, e.g. just match '{' and '}', or you can do a full parse and discard the result.
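The crude first pass could be as small as a brace matcher that lets the compiler skip over a function body and come back to it later. The sketch below uses a hypothetical matching_brace helper and, as a loudly stated simplification, ignores braces that appear inside string literals, character literals and comments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of the '}' that closes the '{' at position `open`,
// or std::string::npos if the braces are unbalanced.
// Assumption: no braces occur inside literals or comments.
std::size_t matching_brace(const std::string& src, std::size_t open) {
    assert(open < src.size() && src[open] == '{');
    int depth = 0;
    for (std::size_t i = open; i < src.size(); ++i) {
        if (src[i] == '{') {
            ++depth;
        } else if (src[i] == '}' && --depth == 0) {
            return i;
        }
    }
    return std::string::npos;
}
```

The first pass would record the span between the two indices for each method body, collect the class-level declarations, and leave the recorded spans for the second pass.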
> Does anyone have any pointers to me overcoming these two issues,
> also on the light of using bison as the parser.
Personally I never use yacc or bison where the language is defined. They are very useful when developing a language but it's such a small proportion of the total work in writing a compiler and it really restricts what you can do.
Rob.
|
|
|