Create AST Help

Feb 7, 2012 at 11:24 PM
Edited Feb 8, 2012 at 9:52 PM

OK, I am going to narrow down the problem to one line of code:

The following code works on this expression: Code >= "000000000" AND ContainerCode <  fn(200000000)

    public class CSharpGrammar : InterpretedLanguageGrammar
    {
        public CSharpGrammar()
            : base(caseSensitive: false)
        {
            var number = new NumberLiteral("number");
            var stringLiteral = new StringLiteral("string");
            stringLiteral.AddStartEnd("\"", StringOptions.NoEscapes);
            var identifier = new IdentifierTerminal("identifier");

            var Expr = new NonTerminal("expression");
            var Column = new NonTerminal("column", typeof(IdentifierNode));
            var BinExpr = new NonTerminal("binexpr", typeof(BinaryOperationNode));
            var ParExpr = new NonTerminal("parexpr");
            var FunctionCall = new NonTerminal("FunctionCall", typeof(FunctionCallNode));
            var Constant = new NonTerminal("Constant", typeof(LiteralValueNode));
            var BinOp = new NonTerminal("binop", "operator");
            var UnExpr = new NonTerminal("UnExpr", typeof(UnaryOperationNode));
            var ArgList = new NonTerminal("ArgList", typeof(ExpressionListNode));
            var UnOp = new NonTerminal("UnOp", "operator");
            var comma = ToTerm(",");
            var caret = ToTerm("^");

            // BNF rules
            this.Root = BinExpr;

            // components of an expression
            Constant.Rule = number | stringLiteral;
            Column.Rule = identifier; // MakePlusRule(Column, caret, identifier);
            FunctionCall.Rule = identifier + "(" + ArgList + ")";
            FunctionCall.NodeCaptionTemplate = "call #{0}(...)";
            ArgList.Rule = MakeStarRule(ArgList, comma, Expr);

            // Expression
            Expr.Rule = UnExpr | BinExpr | Column | Constant | FunctionCall | ParExpr;

            // composite expressions
            UnExpr.Rule = UnOp + Expr;
            BinExpr.Rule = Expr + BinOp + Expr;
            ParExpr.Rule = "(" + Expr + ")";

            UnOp.Rule = ToTerm("+") | "-";

            BinOp.Rule = ToTerm("+") | "-" | "*" | "/" | "%" | "=" | ">" | "<" | ">=" | "<=" | "<>" | "!=" | "!<" | "!>";

            // operator precedence (higher number binds tighter)
            RegisterOperators(10, "*", "/", "%");
            RegisterOperators(9, "+", "-");
            RegisterOperators(8, "=", ">", "<", ">=", "<=", "<>", "!=", "!<", "!>", "LIKE", "IN");
            RegisterOperators(7, "AND");
            RegisterOperators(6, "OR");

            MarkPunctuation("^", ",", "(", ")");
            RegisterBracePair("(", ")");
            MarkTransient(Expr, Constant, Column, UnOp, BinOp);

            this.LanguageFlags = LanguageFlags.CreateAst;
        }
    }

But I want it to work on this expression: Code >= "000000000" AND Container^Code <  fn(200000000)

So I changed this:

            Column.Rule = identifier; // MakePlusRule(Column, caret, identifier);

to this:

            Column.Rule = MakePlusRule(Column, caret, identifier);

But that does not work, as it gives me an Error: List non-terminals cannot be marked transient; list: (column)

What can I do??

 

Thanks!

Coordinator
Feb 7, 2012 at 11:44 PM

If you are doing this to translate the expression into SQL, then you do NOT need an AST; you can use the parse tree. An AST (especially the standard interpreter nodes that you reference) should be used for interpreting the script as a scripting program. Look at the SearchGrammar demo: it converts the input expression into an FTS query without an AST, by working directly with the parse tree.

Feb 8, 2012 at 12:13 AM

You are correct, rivantsov, and thanks for your quick response, but I actually need to do a bit more than that, including possibly optimizing the AST and then converting it into an expression tree.

Feb 8, 2012 at 9:54 PM

rivantsov, I have changed the question to allow for an easier answer. I would appreciate your input.

Thanks!

Coordinator
Feb 8, 2012 at 10:10 PM

Remove "Column" from the MarkTransient call at the end. Once you do this, you have to provide an AST node type for the Column non-terminal.

Also, I see you register "AND" and "OR" as operators, but I do not see them listed in BinOp.Rule. They should be there, I guess.
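Applied to the grammar in the question, the two changes might look like the fragment below. This is only a sketch: it is meant to sit inside the grammar constructor, it reuses the variable names from the question (Expr, Column, BinOp, caret, identifier), and it is not a complete program on its own.

```csharp
// Fragment for the grammar constructor; all names come from the question's code.

// 1. Add the AND / OR keywords, which are already registered as operators,
//    to the operator rule so the parser can actually match them.
BinOp.Rule = ToTerm("+") | "-" | "*" | "/" | "%"
           | "=" | ">" | "<" | ">=" | "<=" | "<>" | "!=" | "!<" | "!>"
           | "AND" | "OR";

// 2. Column becomes a caret-separated list, so it can no longer be transient.
Column.Rule = MakePlusRule(Column, caret, identifier);
MarkTransient(Expr, Constant, UnOp, BinOp);   // Column removed from this call
```

Because Column is no longer transient, the node type passed to its NonTerminal constructor is the one that will be instantiated for it when the AST is built.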

Feb 8, 2012 at 10:13 PM
Edited Feb 8, 2012 at 10:15 PM

Thanks again, the "AND" and "OR" were omitted accidentally during copy/paste.

I understood "provide an AST node type to Column non-terminal" to mean the following code change:

var Column = new NonTerminal("column", typeof(AstNode));

This worked!!

 

Thanks!

 


Coordinator
Feb 8, 2012 at 10:18 PM

Well, what I should have said is "the IdentifierNode type no longer fits your Column non-terminal". You should create your own AST node type that knows how to "interpret" the column definition parsed from the code. That column definition is now a list of identifiers separated by carets. An AST node for the interpreter should "know" how to evaluate a particular language construct, such as an identifier or a function call. Naturally, Irony has no node that can interpret your column expression, so you have to create it and specify its type in the declaration of the Column non-terminal. Just look at the other AST node classes and follow the pattern.

CSharpGrammar - is it really C#?!

Coordinator
Feb 8, 2012 at 10:38 PM

AstNode is a base class; it would allow you to build the AST in GrammarExplorer, but it would not work in the interpreter.

Feb 8, 2012 at 10:41 PM

I am currently examining the ExpressionListNode, as it seems to be the closest, but each item would not be evaluated separately... still getting my head wrapped around it.

Coordinator
Feb 8, 2012 at 10:44 PM

I think MemberAccessNode is better. ExpressionList is a list of independent evaluations; what you have here, I guess, is something similar to "obj.Prop" access.

Feb 8, 2012 at 10:49 PM

Actually, MemberAccessNode is perfect!

Now I'm having some issues when I expand my expression to something like this:

( Code >= "000000000" AND Container^Code <  fn(200000000) ) OR Value < 3

Coordinator
Feb 8, 2012 at 10:52 PM

What issues?

Feb 8, 2012 at 10:55 PM

I get a "Root AST node is null, cannot evaluate script. Create AST tree first." 

 

BTW I really appreciate your help with this, this is saving me hours...

Coordinator
Feb 8, 2012 at 10:58 PM

Set the root to Expr instead of BinExpr:

    this.Root = Expr;

 

 

Feb 8, 2012 at 11:24 PM

Now I get a null reference exception in fmGrammarExplorer at:

    private void ParseSample() {
      ClearParserOutput();
      if (_parser == null || !_parser.Language.CanParse()) return;
      _parseTree = null;
      GC.Collect(); //to avoid disruption of perf times with occasional collections
      _parser.Context.TracingEnabled = chkParserTrace.Checked;
      try {
        _parser.Parse(txtSource.Text, "<source>");  // <-- HERE

---

Here's my full code if you'd like to try it:

    public class ORSQLGrammar : InterpretedLanguageGrammar
    {
        public ORSQLGrammar()
            : base(caseSensitive: false)
        {
            var number = new NumberLiteral("number");
            var stringLiteral = new StringLiteral("string");
            stringLiteral.AddStartEnd("\"", StringOptions.NoEscapes);

            var identifier = new IdentifierTerminal("identifier");
            //var column = CreateColumnIdentifier("Column");

            var Expr = new NonTerminal("expression");
            //var Term = new NonTerminal("Term");

            var Column = new NonTerminal("column", typeof(MemberAccessNode));
            var BinExpr = new NonTerminal("binexpr", typeof(BinaryOperationNode));
            var ParExpr = new NonTerminal("parexpr");
            var FunctionCall = new NonTerminal("FunctionCall", typeof(FunctionCallNode));
            var Constant = new NonTerminal("Constant", typeof(LiteralValueNode));
            var BinOp = new NonTerminal("binop", "operator");
            var UnExpr = new NonTerminal("UnExpr", typeof(UnaryOperationNode));
            var ArgList = new NonTerminal("ArgList", typeof(ExpressionListNode));
            var UnOp = new NonTerminal("UnOp", "operator");
            var comma = ToTerm(",");
            var caret = ToTerm("^");

            // BNF rules
            this.Root = Expr;

            // components of an expression
            Constant.Rule = number | stringLiteral;
            Column.Rule = MakePlusRule(Column, caret, identifier);
            FunctionCall.Rule = identifier + "(" + ArgList + ")";
            FunctionCall.NodeCaptionTemplate = "call #{0}(...)";
            ArgList.Rule = MakeStarRule(ArgList, comma, Expr);

            Expr.Rule = UnExpr | BinExpr | Column | Constant | FunctionCall | ParExpr;

            // composite expressions
            UnExpr.Rule = UnOp + Expr;
            BinExpr.Rule = Expr + BinOp + Expr;
            ParExpr.Rule = "(" + Expr + ")";

            UnOp.Rule = ToTerm("+") | "-";

            BinOp.Rule = ToTerm("+") | "-" | "*" | "/" | "%" // arithmetic
                 | "=" | ">" | "<" | ">=" | "<=" | "<>" | "!=" | "!<" | "!>" // comparison
                 | "AND" | "OR" | "LIKE";

            RegisterOperators(10, "*", "/", "%");
            RegisterOperators(9, "+", "-");
            RegisterOperators(8, "=", ">", "<", ">=", "<=", "<>", "!=", "!<", "!>", "LIKE", "IN");
            RegisterOperators(7, "AND");
            RegisterOperators(6, "OR");

            MarkPunctuation("^", ",", "(", ")");
            RegisterBracePair("(", ")");
            MarkTransient(Expr, Constant, UnOp, BinOp);

            this.LanguageFlags = LanguageFlags.CreateAst;
        }
    }

 

 

Coordinator
Feb 8, 2012 at 11:34 PM
Edited Feb 8, 2012 at 11:34 PM

Yeah, that's probably a manifestation of an error that has already been reported. Somewhere during refactoring a verification method was lost. It used to verify that every non-terminal or terminal either has an AstNode type specified, has an AstNodeCreator set, or is marked transient. With this check gone, the AST builder blows up when there is an under-specified term. Try breaking on the exception and see which TreeNode, and which Term (by name), causes the failure. That term is most likely missing its AstNode type.

Edit: I will be fixing it soon, I promise
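To make the lost check concrete, here is a minimal, self-contained sketch of that kind of validation. It deliberately does not use Irony's own types: TermInfo, AstConfigCheck and their members are hypothetical stand-ins invented for illustration, and only the three conditions (node type, node creator, transient) come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a grammar term; NOT an Irony type.
public sealed class TermInfo
{
    public string Name { get; set; }
    public Type AstNodeType { get; set; }       // null when no node type was given
    public bool HasAstNodeCreator { get; set; } // true when a creator callback is set
    public bool IsTransient { get; set; }       // true when marked transient
}

public static class AstConfigCheck
{
    // Returns the names of all terms that satisfy none of the three conditions.
    public static List<string> FindUnderSpecified(IEnumerable<TermInfo> terms)
    {
        return terms
            .Where(t => t.AstNodeType == null && !t.HasAstNodeCreator && !t.IsTransient)
            .Select(t => t.Name)
            .ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        // typeof(object) is only a placeholder for a real AST node class.
        var terms = new[]
        {
            new TermInfo { Name = "expression", IsTransient = true },
            new TermInfo { Name = "binexpr", AstNodeType = typeof(object) },
            new TermInfo { Name = "parexpr" }, // nothing configured
        };

        foreach (var name in AstConfigCheck.FindUnderSpecified(terms))
            Console.WriteLine("Missing AST configuration: " + name);
    }
}
```

Run against a model of the grammar posted above, a check like this would report the under-specified term by name up front, instead of leaving it to surface as a null reference during AST construction.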

Feb 8, 2012 at 11:41 PM

Excellent tip. I was able to find out that ParExpr was missing from the transient list.

Putting it in there got things working again.

pure magic! lol
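
For reference, the fix described in this last post would presumably be a one-line change to the MarkTransient call in the ORSQLGrammar constructor posted earlier. This is a sketch: the thread does not show the exact line the poster ended up with, so the argument order here is an assumption.

```csharp
// Fragment for the ORSQLGrammar constructor; names come from the posted grammar.
// ParExpr has no AST node type, so it is marked transient along with the others.
MarkTransient(Expr, Constant, UnOp, BinOp, ParExpr);
```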