not= treated differently than !=

Dec 6, 2012 at 5:41 PM

I have a grammar that I have simplified down to:


    [Language("Example", "", "showcase problem with not=")]
    public class Example : Irony.Parsing.Grammar
    {
        public Example()
        {
            //1. Terminals
            Terminal num = new NumberLiteral("number");

            //2. Non-Terminals
            var Expr = new NonTerminal("expr");
            var BinOp = new NonTerminal("binOp", "operator");
            var unOp = new NonTerminal("unOp", "operator");
            var BinExpr = new NonTerminal("binExpr");
            var unExpr = new NonTerminal("unExpr");
            var exitLoop = new NonTerminal("exitLoop");
            var program = new NonTerminal("program");

            //3. BNF rules
            Expr.Rule = num | BinExpr | unExpr | "(" + Expr + ")";
            BinOp.Rule = ToTerm("=") | "!=" | "not=";
            BinOp.Precedence = 20;
            unOp.Rule = ToTerm("not") | "!";
            unOp.Precedence = 10;
            exitLoop.Rule = ToTerm("exit") + "when" + Expr;
            BinExpr.Rule = Expr + BinOp + Expr;
            unExpr.Rule = unOp + Expr;

            program.Rule = Expr + program | exitLoop + program | Empty;
            this.Root = program;

            //4. Set operator precedence and associativity
            RegisterOperators(20, Associativity.Left, "=", "not=", "!=");
            RegisterOperators(10, Associativity.Left, "not", "!");

            //5. Register parentheses as punctuation symbols so they will not appear in the syntax tree
            MarkPunctuation("(", ")", ",");
            RegisterBracePair("(", ")");
            MarkTransient(Expr, BinOp, unOp);

            this.LanguageFlags = LanguageFlags.NewLineBeforeEOF;
        }
    }

I'm having problems with the following four lines of code, which should all parse to identical ASTs:

exit when (3 not= 4)
exit when (3 != 4)
exit when 3 != 4
exit when 3 not= 4

The first three parse correctly (the expression is the entire "3 != 4" part), but the last one fails. The parser seems to reach the 3, see "not" next, and assume that a new line starts there. If expressions can't stand alone on a line, this doesn't happen and the input parses correctly. I'm wondering why the parser treats "not" and "!" differently, since they have exactly the same rules. (The MarkReservedWords call for "not" makes no difference whether or not I include it; it's really there just in case.)

Dec 6, 2012 at 6:44 PM
Edited Dec 6, 2012 at 6:45 PM

Try MarkReservedWords("not=")
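
For anyone reading later, here is a minimal sketch of where that call could go. It is a trimmed-down variant of the grammar from the question, not the original: the class name ExampleFixed and the reduced rule set are assumptions made for illustration. The likely reason it helps is that reserved words get a higher priority in the scanner, so "not=" is matched as one token before "not" can be; that explanation is an assumption about Irony's internals and is not confirmed in this thread.

```csharp
using Irony.Parsing;

// Illustrative variant of the grammar from the question (class name is hypothetical).
[Language("ExampleFixed", "", "not= registered as a reserved word")]
public class ExampleFixed : Grammar
{
    public ExampleFixed()
    {
        // Terminals and non-terminals, as in the original post
        var num = new NumberLiteral("number");
        var expr = new NonTerminal("expr");
        var binOp = new NonTerminal("binOp", "operator");
        var unOp = new NonTerminal("unOp", "operator");
        var binExpr = new NonTerminal("binExpr");
        var unExpr = new NonTerminal("unExpr");

        // Rules: "not" is a unary operator, "not=" a binary one
        expr.Rule = num | binExpr | unExpr | "(" + expr + ")";
        binOp.Rule = ToTerm("=") | "!=" | "not=";
        unOp.Rule = ToTerm("not") | "!";
        binExpr.Rule = expr + binOp + expr;
        unExpr.Rule = unOp + expr;
        this.Root = expr;

        RegisterOperators(20, Associativity.Left, "=", "not=", "!=");
        RegisterOperators(10, Associativity.Left, "not", "!");
        MarkPunctuation("(", ")");
        MarkTransient(expr, binOp, unOp);

        // The suggested fix: register "not=" itself as a reserved word
        MarkReservedWords("not=");
    }
}
```

With this in place, an input such as "3 not= 4" should scan as number, "not=", number, rather than stopping at "not".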

Another thing: it seems there must be grammar errors (shift-reduce conflicts). Are there any?

Dec 6, 2012 at 7:49 PM

The MarkReservedWords works beautifully, thank you very much. There were no grammar errors, but I'm going to be honest and tell you that I don't really know where to begin to fix shift-reduce conflicts. Is there any material you could point me to to help?

Dec 6, 2012 at 7:59 PM

Well, Google... Or read any compiler book that has a chapter on LR/LALR parsing. I would recommend either the Dragon Book (Aho, Ullman) or Parsing Techniques (Grune, Jacobs).

To fix conflicts, either slightly refactor the grammar rules (get rid of optional elements and list the alternatives explicitly) or add hints (PreferShiftHere()) to force certain actions (as in the typical 'dangling else' conflict). See the sample grammars.
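
As a rough illustration of the hint approach, the sketch below shows a toy 'dangling else' grammar. It is a hypothetical example written for this thread, not one of the shipped sample grammars; the class name IfElseGrammar and all rule names are made up. Without the hint, the parser has a shift-reduce conflict when it sees "else" after an inner "if ... then stmt": it could reduce the inner statement or shift the "else". PreferShiftHere() tells it to shift, which binds each "else" to the nearest "if".

```csharp
using Irony.Parsing;

// Hypothetical toy grammar illustrating the dangling-else conflict.
public class IfElseGrammar : Grammar
{
    public IfElseGrammar()
    {
        var id = new IdentifierTerminal("id");
        var stmt = new NonTerminal("stmt");
        var ifStmt = new NonTerminal("ifStmt");
        var ifElseStmt = new NonTerminal("ifElseStmt");

        stmt.Rule = id | ifStmt | ifElseStmt;
        ifStmt.Rule = ToTerm("if") + id + "then" + stmt;
        // The hint sits exactly where the conflict occurs: just before "else".
        ifElseStmt.Rule = ToTerm("if") + id + "then" + stmt
                          + PreferShiftHere() + "else" + stmt;
        this.Root = stmt;

        // Keep the keywords from being scanned as identifiers.
        MarkReservedWords("if", "then", "else");
    }
}
```

The other route mentioned above, refactoring, would mean restructuring the rules so that the ambiguity never arises; the hint is usually the smaller change.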