Typedef parsing problem in a C language grammar

Nov 28, 2012 at 12:24 PM

Hi,

 

I need a hint on how to handle what is known as the "typedef parsing" problem.

I am writing a grammar for the C language, covering only the declaration part. I found this grammar for lex/yacc:

http://www.lysator.liu.se/c/ANSI-C-grammar-y.html

and I am translating it for Irony.

 

The problem is with the non-terminal type_specifier:

 

type_specifier
    : VOID
    | CHAR
    | SHORT
    | INT
    | LONG
    | FLOAT
    | DOUBLE
    | SIGNED
    | UNSIGNED
    | struct_or_union_specifier
    | enum_specifier
    | TYPE_NAME
    ;

 

The last token, TYPE_NAME, is handled in the lex/yacc version by a small C function:

 

int check_type()
{
    /*
     * pseudo code --- this is what it should check
     *
     *     if (yytext == type_name)
     *         return(TYPE_NAME);
     *
     *     return(IDENTIFIER);
     */

    /*
     * it actually will only return IDENTIFIER
     */

    return(IDENTIFIER);
}

The function is meant to report an identifier as a TYPE_NAME if it is found in a symbol table.
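As an illustration of the lookup that the pseudo code describes, here is a minimal sketch. It is not part of the linked grammar: the table layout, the helper add_type_name(), and the token values are assumptions made for the example, and check_type() takes the lexeme as a parameter instead of reading the lex global yytext.

```c
#include <assert.h>
#include <string.h>

/* Placeholder token codes (assumed values); in a real yacc build
 * these would come from the generated y.tab.h. */
enum { IDENTIFIER = 258, TYPE_NAME = 259 };

#define MAX_TYPE_NAMES 256
#define MAX_NAME_LEN   64

/* Hypothetical symbol table: a flat list of names introduced by typedef. */
static char type_names[MAX_TYPE_NAMES][MAX_NAME_LEN];
static int  type_name_count = 0;

/* Hypothetical helper, meant to be called from the grammar action that
 * reduces a typedef declaration.
 * Returns 0 on success, -1 if the table is full or the name is too long. */
int add_type_name(const char *name)
{
    assert(name != NULL);
    if (type_name_count >= MAX_TYPE_NAMES || strlen(name) >= MAX_NAME_LEN)
        return -1;
    strcpy(type_names[type_name_count++], name);
    return 0;
}

/* Variant of check_type(): returns TYPE_NAME if the lexeme was
 * registered by an earlier typedef, IDENTIFIER otherwise. */
int check_type(const char *text)
{
    int i;

    assert(text != NULL);
    for (i = 0; i < type_name_count; i++) {
        if (strcmp(type_names[i], text) == 0)
            return TYPE_NAME;
    }
    return IDENTIFIER;
}
```

With this sketch, check_type("MYINT") yields IDENTIFIER until add_type_name("MYINT") has been called, and TYPE_NAME afterwards. A linear scan is enough for a small example; a real table would also need scoping.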

 

So, my questions are:

 

- What is the best way in Irony to populate a symbol table during parsing?

- How can I use that symbol table to "transform" a token from identifier to type_name?

For the sake of clarity, here is some sample C code:

 

typedef int MYINT; // here MYINT is an identifier; afterwards I put it in the symbol table
void f( MYINT aa); // here MYINT is a type_name; I have to translate the identifier token to type_name

 

TIA

Coordinator
Nov 29, 2012 at 5:29 PM

I would advise against this: do not try to catch the type-name/identifier difference during parsing. Recognizing certain names as types, as opposed to functions, structs, or variables, is in fact part of the semantics of the language, the product of semantic analysis, whereas the parser deals with syntax. This is better done after you have parsed the file and have all program elements available in the parse tree.

Historically, compilers for C and similar languages were built in a single-pass manner, because machines in the 70s did not have enough memory to hold the parse tree/AST along with everything else, so they had to do everything in one pass. Doing this is quite a challenge, and it will add significantly to the complexity of your solution, reducing maintainability. There is no need to do this anymore. So treat everything as an identifier, produce the parse tree, then work with the parse tree (using a visitor/iterator) and build all the symbol tables.
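To make the idea concrete, here is a rough sketch of such a post-parse pass. It is written in C only to match the snippets earlier in the thread. The node layout, the names (Node, TypeTable, resolve) and the convention that the last identifier under a typedef node is the newly declared name are all assumptions for illustration, not Irony's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node kinds for a declaration-only parse tree. */
typedef enum { NODE_UNIT, NODE_TYPEDEF, NODE_DECL, NODE_IDENT } NodeKind;

/* Classification assigned to identifier nodes by the semantic pass. */
typedef enum { SYM_UNKNOWN, SYM_TYPE_NAME, SYM_PLAIN } SymClass;

typedef struct Node {
    NodeKind     kind;
    const char  *text;   /* lexeme, used by NODE_IDENT only */
    SymClass     cls;    /* filled in by resolve() */
    struct Node *child;  /* first child */
    struct Node *next;   /* next sibling */
} Node;

#define MAX_TYPES 128

/* Symbol table built after parsing: names introduced by typedef. */
typedef struct {
    const char *names[MAX_TYPES];
    int         count;
} TypeTable;

static int is_type(const TypeTable *t, const char *name)
{
    int i;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0)
            return 1;
    }
    return 0;
}

/* Depth-first walk in source order. The parser produced plain identifier
 * nodes everywhere; this pass decides which ones are type names. */
void resolve(Node *n, TypeTable *t)
{
    Node *c, *last = NULL;

    assert(t != NULL);
    if (n == NULL)
        return;
    if (n->kind == NODE_IDENT)
        n->cls = is_type(t, n->text) ? SYM_TYPE_NAME : SYM_PLAIN;
    for (c = n->child; c != NULL; c = c->next) {
        resolve(c, t);
        if (c->kind == NODE_IDENT)
            last = c;
    }
    /* Assumed convention: the last identifier directly under a typedef
     * node is the name being declared, so it is registered afterwards. */
    if (n->kind == NODE_TYPEDEF && last != NULL && t->count < MAX_TYPES) {
        t->names[t->count++] = last->text;
        last->cls = SYM_TYPE_NAME;
    }
}
```

For the sample from the question, a tree with a typedef node for MYINT followed by a declaration node holding the identifiers f, MYINT and aa would end up with one table entry, the second MYINT classified as a type name, and f and aa left as plain identifiers. The grammar itself never needs a separate TYPE_NAME token.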

Roman

Nov 30, 2012 at 7:28 AM

Thank you, Roman.

I will try to apply your advice. I think I will have to modify the grammar I found.