Complex Parse Tree (Lua Grammar)

Jun 22, 2009 at 11:27 AM
Edited Jun 22, 2009 at 11:29 AM

Hello all,

Thanks to salec's example (vslua) and a few modifications, I managed to have Irony parse a complex Lua chunk.

After a brief moment of joy, I realized that the resulting Parse Tree is a bit too complex to be traversed in order to emit instructions.

Here's an example of the resulting parse tree for two simple expressions. Boldfaced output shows what I meant about complexity.

The first statement is relatively easily parsed. I believe Grammar.MarkTransient is a way to create simpler trees, but I'm sure there's a better way to do it.

I tried to mark many non-terminals as transient but the result didn't exactly get better. If anyone feels brave enough to help, I can provide the grammar file if needed - haven't posted it because it was a bit too long. Thanks for any advice or clarification.

x = { }
x["Key"] = "Value"

*
NT0
                |var
                        |x (identifier)
                |*
                |= [Symbol]
                |expr
                        |table constructor
                                |{ [Symbol]
                                |field list?
                                |} [Symbol]
                |*
                |;?
NT0
                |var
                        |prefix expr
                                |var
                                        |x (identifier)
                        |[ [Symbol]
                        |expr
                                |Key (string)
                        |] [Symbol]
                |*
                |= [Symbol]
                |expr
                        |Value (string)
                |*
                |;?


        
    
Coordinator
Jun 23, 2009 at 11:57 PM
Edited Jun 24, 2009 at 12:21 AM

Well, I guess you have some unreasonable expectations about what is involved in full compilation process. As far as I understand, you try to take a parse tree and traverse it while generating IL code - that's not gonna work. The parse tree (concrete syntax tree) is just a tree representation of the source text, it is far from representing the runtime code structure in any aspect. It has too much noise (you identify it as "too-complex" term), but essentially you must go through at least one step - generate AST (abstract syntax tree), in which nodes represent something "executable". AST does not necessarily match parse tree, it removes the noise and reorganizes the whole structure. This is at least the first step.

In your example I see non-terminals with auto-assigned names (NT0), these are created by Irony grammar processor automatically, and this should never happen. This might work for quick parsing to support VS integration, but for full compilation that wouldn't work. You need to declare all non-terminals explicitly and assign AstNodeType property to each of them.

You need to define AST nodes (classes), and then instantiate them to get the AST tree. A good way to flexibly create nodes is NodeCreator methods that can be attached to non-terminals. Look at Scheme sample in old "release" version for some ideas. Remember this is just a sketch, only a few AST node types were defined by Irony, just to run simple programs. Full set of AST nodes, with ability to generate some intermediate codes will be provided in the future, I'm working on it. For now, you have to define the yourself.

Once you get the AST tree, the next step is some tree analysis (like variables, scopes, etc), and then possibly code generation. Each AST node class should be able to generate appropriate code. Built-in Irony support for this is in some distant future

As for simplifying the tree, Irony does some tricks like removing transient nodes. The candidates are usually some "generalized" nodes like "expr" in your case: expr node does not exist, instead you would have several types particular expr types like binExpr, unaryExpr, funcCall expr, etc.

Hope this helps

Roman

Jun 24, 2009 at 9:24 AM

Hello Roman,

Thanks for the detailed answer, I know I wasn't too clear so I apologise for any misunderstanding.

Knowing that the AST part of Irony isn't ready yet, I was just looking for a workaround using the parse tree as a reference for further processing.

Clearly I was missing an important part, that is being able to build an AST on my own, which is what I'll be doing now; quite clear coming to think of it.

 

Let me thank you again for your great work with Irony and sorry if my question came across a bit odd.

Roberto