Strange problem with simple grammer

Mar 10, 2013 at 8:56 PM
I've run into a very strange problem with a grammer I'm working on:

The grammer is:
    public class MyGrammar : Grammar
    {
        public MyGrammar()
        {
            var expr = new NonTerminal("expr"); 
            var exp0 = new NonTerminal("exp0");
            var exp = new NonTerminal("exp");

            var ID = new IdentifierTerminal("id");
            var NUMBER = new NumberLiteral("number", NumberOptions.AllowSign);
            var COMMA = ToTerm(",");
            var ASSIGN = ToTerm("=");
            var PLUS = ToTerm("+");
            var MINUS = ToTerm("-");
            var TIMES = ToTerm("*");
            var DIV = ToTerm("/");

            RegisterOperators(1, ASSIGN);
            RegisterOperators(30, PLUS, MINUS);
            RegisterOperators(40, TIMES, DIV);

            var assign = new NonTerminal("assign");
            assign.Rule = ID + ASSIGN + expr;

            var exps = new NonTerminal("exps");
            exps.Rule = MakeStarRule(exps, COMMA, expr);

            var apply = new NonTerminal("apply");
            apply.Rule = (exp0 + "(" + exps + ")");

            exp0.Rule = 
                "(" + expr + ")"
                | ID
                | apply;

            var binOp = new NonTerminal("binop");
            binOp.Rule = TIMES | DIV | PLUS | MINUS;

            var infixOp = new NonTerminal("infix");
            infixOp.Rule = exp + binOp + exp;
            exp.Rule = exp0 
                | infixOp
                | assign
                |NUMBER;
            expr.Rule = exp;

            MarkPunctuation("(", ")");
            RegisterBracePair("(", ")");
            MarkPunctuation(COMMA);
            this.MarkTransient(binOp);

            this.Root = expr;
        }
   }
And it fails on this expression:
f(a-1)+f(a-1)
with error: Syntax error, expected: ,, )
(the error happens on the second '-')
  • If I remove the MarkTransient(binOp) statement I don't get an error on the expression, but then the operator precedence does not work.
  • If I change the Rule of infixOp to expr + binOp + expr then I don't get the error, but I can't use this solution because in the actual grammer the rule of expr is more complicated.
Can some body help me with this problem?

Thanks,

Nadav
Coordinator
Mar 10, 2013 at 9:25 PM
shouldn't infixOp be:
plusMinus + expr
?
where plusMinus is "+" | "-"

Next, if you treat '-5' as infix op applied to 5, then remove AllowSign from NumberLiteral. It's either one or another. That's most likely what happens, parser recognizes -1 as signed number.
finally give better names to variables (exp, expr, exp0) are not very descriptive.
Once you finished, run Grammar explorer and check grammar errors- do you have any conflicts?
Mar 11, 2013 at 12:09 AM
Edited Mar 11, 2013 at 12:11 AM
Hi Rivantsov,
Thanks for your reply.

a. I have run Grammer explorer on my grammer and there are no conflicts (that was the first thing I did)
b. the problem is not the minus sign before the number.
My grammer correctly parses a-1 as infixOp with left operand a, operator '-' and right operand 1.
Also I get the same error if I try to parse a*f(b*c)

The problem seems to happen when the operator before the function has the same precedence or a higher precedence than the operator inside the ( ).
i.e. a*f(b*c) fails, a*f(b+c) fails, a+f(b*c) parses correctly.

Also when I look at the parser trace I see that when the parsing succeeds the parser reduces the '*' between b & c to binOp.
When the parsing fails the parser performs a precedence comparison, which returns TRUE and the parser reduce to expr which does not look right to me...

Nadav
Coordinator
Mar 11, 2013 at 4:03 AM
sorry, was wrong about infix op, confused with prefix op.
yes, it looks like it is related to precedence, it is being invoked in wrong place. Surprisingly similar arrangement works well in ExpressionEvaluator grammar.
Try
2 * abs(5-1)
parses/executes successfully in sampleExprEvaluator, but fails to parse in your grammar
trying to understand what is so different.
By the way, I think I found a bug while investigating this. In PrecedenceBasedParserAction.CheckMustReduce, the method should not look at entire stack, but only at the length of reduce production. Fixing this would most likely fix your problem (but it requires making method non-static), but then again - why it works ok in expr evaluator?

Roman
Coordinator
Mar 11, 2013 at 4:32 AM
Ok, it's a bug, here's quick fix. Replace PrecedenceBasedParserAction.CheckMustReduce with the folllowing:
    private bool CheckMustReduce(ParsingContext context) {
      var input = context.CurrentParserInput;
      var stackCount = context.ParserStack.Count;
      var prodLength = _reduceAction.Production.RValues.Count;
      for (int i = 1; i <= prodLength; i++) {
        var prevNode = context.ParserStack[stackCount - i];
        if (prevNode == null) continue;
        if (prevNode.Precedence == BnfTerm.NoPrecedence) continue;
        //if previous operator has the same precedence then use associativity
        if (prevNode.Precedence == input.Precedence)
          return (input.Associativity == Associativity.Left); //if true then Reduce
        else
          return (prevNode.Precedence > input.Precedence); //if true then Reduce
      }
      //If no operators found on the stack, do shift
      return false;
    }
I'm still trying to see why it worked ok in expr evaluator. I will push the fix later
thank you
Roman
Mar 11, 2013 at 7:30 AM

Hi Rivantsov,

Thanks for the quick response, I'll try it tonight and see if it solves the problem.

I have to say, I've only been using Irony for 2 days, but so far it looks great!

The grammar explorer is brilliant!

I've been porting a grammar I've implemented using another tool to Irony and after 1 day of work it's practically finished (including building the AST).

Thanks,

Nadav

From: rivantsov [email removed]
Sent: Monday, March 11, 2013 5:32 AM
To: Nadav Popplewell
Subject: Re: Strange problem with simple grammer [irony:436072]

From: rivantsov

Ok, it's a bug, here's quick fix. Replace PrecedenceBasedParserAction.CheckMustReduce with the folllowing:

    private bool CheckMustReduce(ParsingContext context) {
      var input = context.CurrentParserInput;
      var stackCount = context.ParserStack.Count;
      var prodLength = _reduceAction.Production.RValues.Count;
      for (int i = 1; i <= prodLength; i++) {
        var prevNode = context.ParserStack[stackCount - i];
        if (prevNode == null) continue;
        if (prevNode.Precedence == BnfTerm.NoPrecedence) continue;
        //if previous operator has the same precedence then use associativity
        if (prevNode.Precedence == input.Precedence)
          return (input.Associativity == Associativity.Left); //if true then Reduce
        else
          return (prevNode.Precedence > input.Precedence); //if true then Reduce
      }
      //If no operators found on the stack, do shift
      return false;
    }

I'm still trying to see why it worked ok in expr evaluator. I will push the fix later
thank you
Roman

Mar 11, 2013 at 1:44 PM
Hi Nadav,

I didn't understand the need for "expr", so I removed it and it just works:
        public MyGrammar()
        {
            var exp0 = new NonTerminal("exp0");
            var exp = new NonTerminal("exp");

            var ID = new IdentifierTerminal("id");
            var NUMBER = new NumberLiteral("number", NumberOptions.AllowSign);
            var COMMA = ToTerm(",");
            var ASSIGN = ToTerm("=");
            var PLUS = ToTerm("+");
            var MINUS = ToTerm("-");
            var TIMES = ToTerm("*");
            var DIV = ToTerm("/");

            RegisterOperators(1, ASSIGN);
            RegisterOperators(30, PLUS, MINUS);
            RegisterOperators(40, TIMES, DIV);

            var assign = new NonTerminal("assign");
            assign.Rule = ID + ASSIGN + exp;

            var exps = new NonTerminal("exps");
            exps.Rule = MakeStarRule(exps, COMMA, exp);

            var apply = new NonTerminal("apply");
            apply.Rule = (exp0 + "(" + exps + ")");

            exp0.Rule =
                "(" + exp + ")"
                | ID
                | apply;

            var binOp = new NonTerminal("binop");
            binOp.Rule = TIMES | DIV | PLUS | MINUS;

            var infixOp = new NonTerminal("infix");
            infixOp.Rule = exp + binOp + exp;
            exp.Rule = exp0
                | infixOp
                | assign
                | NUMBER;

            MarkPunctuation("(", ")");
            RegisterBracePair("(", ")");

            MarkPunctuation(COMMA);
            this.MarkTransient(binOp);

            this.Root = exp;
        }
    }
Cheers,
Daniel
Mar 11, 2013 at 1:59 PM

Yes, I know that if you remove expr the program is solved.

It's also solved if you replace infixOp.Rule = exp + binOp + exp with infixOp.Rule = expr + binOp + expr

(Which I find strange because if expr.Rule=exp then exp+binOp+exp should be the same as expr+binOp+expr, right?)

Anyway, Expr is there because the grammar is part of a bigger grammar where there are statements that are valid in a statement (or inside (..)) that not valid as an operand to a binary operator.

(for example, in the bigger grammar I use with id1=exp1,id2=exp2 do exp3 to define temp variables)

And even if expr is not needed, the grammar is still valid and has no conflicts so it should work…

Nadav

From: dkuppitz [email removed]
Sent: Monday, March 11, 2013 2:44 PM
To: Nadav Popplewell
Subject: Re: Strange problem with simple grammer [irony:436072]

From: dkuppitz

Hi Nadav,

I didn't understand the need for "expr", so I removed it and it just works:

 
        public MyGrammar()
        {
            var exp0 = new NonTerminal("exp0");
            var exp = new NonTerminal("exp");
 
            var ID = new IdentifierTerminal("id");
            var NUMBER = new NumberLiteral("number", NumberOptions.AllowSign);
            var COMMA = ToTerm(",");
            var ASSIGN = ToTerm("=");
            var PLUS = ToTerm("+");
            var MINUS = ToTerm("-");
            var TIMES = ToTerm("*");
            var DIV = ToTerm("/");
 
            RegisterOperators(1, ASSIGN);
            RegisterOperators(30, PLUS, MINUS);
            RegisterOperators(40, TIMES, DIV);
 
            var assign = new NonTerminal("assign");
            assign.Rule = ID + ASSIGN + exp;
 
            var exps = new NonTerminal("exps");
            exps.Rule = MakeStarRule(exps, COMMA, exp);
 
            var apply = new NonTerminal("apply");
            apply.Rule = (exp0 + "(" + exps + ")");
 
            exp0.Rule =
                "(" + exp + ")"
                | ID
                | apply;
 
            var binOp = new NonTerminal("binop");
            binOp.Rule = TIMES | DIV | PLUS | MINUS;
 
            var infixOp = new NonTerminal("infix");
            infixOp.Rule = exp + binOp + exp;
            exp.Rule = exp0
                | infixOp
                | assign
                | NUMBER;
 
            MarkPunctuation("(", ")");
            RegisterBracePair("(", ")");
 
            MarkPunctuation(COMMA);
            this.MarkTransient(binOp);
 
            this.Root = exp;
        }
    }

Cheers,
Daniel

Mar 11, 2013 at 5:45 PM
Hi Rivantsov,

I've tested the bug fix and it works great.

I've managed to parse all the test cases I've got for the my language.
Well, almost :)
There still is one problem (that is not related to this problem).
In my langauge there are two keyworks: with & with*
I defined
var TIMES=ToTerm("*");
var WITH=ToTerm("with");
var WITHREC=ToTerm("with*");
but Irony parses with* as WITH+TIMES instead of WITHREC
Is there any way around this?

(Right now I just changed replaced WITHREC with WITH+TIMES in all the rules that use it and it works, but it also accepts with *, which it shouldn't... )

Thanks,
Nadav
Coordinator
Mar 11, 2013 at 5:50 PM
I guess the parser prefers by default to treat 'with' as variable and '' as operator. Try declaring 'with' as reserved word using MarkReservedWords
Roman
Mar 11, 2013 at 6:19 PM
No, it didn't fix it.

Nadav


From: rivantsov [notifications@codeplex.com]
Sent: Monday, March 11, 2013 6:52 PM
To: Nadav Popplewell
Subject: Re: Strange problem with simple grammer [irony:436072]

From: rivantsov

I guess the parser prefers by default to treat 'with' as variable and '*' as operator. Try declaring 'with*' as reserved word using MarkReservedWords
Roman
Coordinator
Mar 12, 2013 at 6:15 AM
Edited Mar 12, 2013 at 6:18 AM
hmm.. here's what I did. added two new expressions to grammar:
      var WITH = ToTerm("with");
      var WITHREC = ToTerm("with*");
      MarkReservedWords("with*");
      var withOp = new NonTerminal("withOp");
      withOp.Rule = WITH + TIMES + ID;
      var withRecOp = new NonTerminal("withRecOp");
      withRecOp.Rule = WITHREC + ID;

      exp.Rule = exp0
          | infixOp
          | assign
          | NUMBER
          | withOp
          | withRecOp;
      expr.Rule = exp;

and everything just works, I've tried with the following input:

f(with * me, with* u)

the first arg shows as "withOp", the second as "withRecOp"
Mar 12, 2013 at 8:09 AM

I'll try it again…

Thanks,

Nadav

From: rivantsov [email removed]
Sent: Tuesday, March 12, 2013 7:16 AM
To: Nadav Popplewell
Subject: Re: Strange problem with simple grammer [irony:436072]

From: rivantsov

hmm.. here's what I did. added two new expressions to grammar:

      var WITH = ToTerm("with");
      var WITHREC = ToTerm("with*");
      MarkReservedWords("with*");
      var withOp = new NonTerminal("withOp");
      withOp.Rule = WITH + TIMES + ID;
      var withRecOp = new NonTerminal("withRecOp");
      withRecOp.Rule = WITHREC + ID;
 
      exp.Rule = exp0
          | infixOp
          | assign
          | NUMBER
          | withOp
          | withRecOp;
      expr.Rule = exp;

and everything just works, I've tried with the following input:

f(with * me, with* u)

the first arg shows as "withOp", the second as "withRecOp"

Coordinator
Mar 12, 2013 at 8:13 AM
yeah, here's what happens without MarkReservedWorsd:
Input :
with* me

'with' is initially scanned as "id" (variable), so the expression is interpreted as [id multOp id] - completely valid in your grammar. 'with*' keyword is not even checked - Id matches input, because identifiers have higher priority than keywords (but not reserved words). So scanner recognizes 'with' as identifier; then later it is matched against keyword list, and the Term is replaced with Keyword.
That arrangement seems strange (a bit), but it is like this because it works better for most common cases - made this way based on many try/fails to find 'best' solution working in most cases, without extra tuning. For cases like yours, you can use Reserved words facility - these are tried first, they have higher priority, and it would work as expected for you
Roman
Mar 13, 2013 at 6:38 AM
Hi Roman,

I tried it and it works.
(I don't know why it didn't work when I tried it before).

Thanks,

Nadav