Struggling with precedence

Jul 27, 2011 at 1:54 PM
Edited Jul 27, 2011 at 4:59 PM

Hi

I am trying to parse the very simple text "1+1=2"!

An extract from my grammar is as follows:


            expression.Rule = term Or binaryExpression Or equalityExpression
            term.Rule = literalExpression
            literalExpression.Rule = number Or identifier
            binaryOperator.Rule = ToTerm("+") Or "-" Or "/" Or "*" Or "^"
            binaryExpression.Rule = expression + binaryOperator + expression
            equalityOperator.Rule = ToTerm("=") Or "!=" Or "<>" Or ">=" Or "<=" Or ">" Or "<"
            equalityExpression.Rule = expression + equalityOperator + expression

            'Expression precedence
            binaryExpression.Precedence = 20
            equalityExpression.Precedence = 10

            'Operator precedence           
            RegisterOperators(30, "*", "/", "%")
            RegisterOperators(20, "+", "-")
            RegisterOperators(10, "=", ">", "<", ">=", "<=", "<>", "!=")

Atm, the parser is producing the following tree

Root:                   BinaryExpression

Children: LiteralExpression   EqualityExpression

whereas I want it to generate

Root:                  EqualityExpression

Children: BinaryExpression    LiteralExpression

 

Where am I going wrong? I tried reversing the precedence betweent the binary and equality operators and expressions and it didn't seem to make any difference, so I wondered if I have set my grammar up incorrectly

Many thx in advance

Simon

Coordinator
Jul 27, 2011 at 5:28 PM

Seems strange, everything looks ok, except the lines:

 'Expression precedence
            binaryExpression.Precedence = 20
            equalityExpression.Precedence = 10

remove these, they are not needed, you should set precedence only on operator symbols, not on non-terminals

Jul 27, 2011 at 5:47 PM
Edited Jul 27, 2011 at 5:51 PM

OK, thx ... I'll post my full grammar in case there is something else conflicting. As I step the code, when, in FindActionForStateAndInput, the parser is processing the "+" I am surprised to see the action it chooses ({Operator, shift to S21/reduce on 'listExpression -> compoundExpression '.}) in comparison to the available actions when the "=" is encountered ({Shift to S27}) - but this is most likely my lack of understanding - I just couldn't see why any reference was being made to listExpression in the context.

Anyway, here's my grammar

            'Terminals
            Dim number = New NumberLiteral("number")
            Dim identifier = New IdentifierTerminal("identifier")
            Dim quotedLiteral = New StringLiteral("quoted", "'", StringOptions.AllowsDoubledQuote)
            Dim comma = ToTerm(",", "comma")
            Dim dataitem = New RegexBasedTerminal("dataitem", "M[1-9][0-9]{0,2}E[1-9][0-9]{0,2}I[1-9][0-9]{0,2}\b")

            'Non-terminals
            Dim compoundExpression = New NonTerminal("compoundExpression", GetType(ExpressionNode))
            Dim simpleExpression = New NonTerminal("simpleExpression")
            Dim unaryExpression = New NonTerminal("unaryExpression", GetType(UnaryExpressionNode))
            Dim literalExpression = New NonTerminal("literalExpression", GetType(LiteralExpressionNode))
            Dim binaryExpression = New NonTerminal("binaryExpression", GetType(BinaryExpressionNode))
            Dim bracketedExpression = New NonTerminal("bracketedExpression", GetType(BracketedExpressionNode))
            Dim listExpression = New NonTerminal("listExpression", GetType(ListExpressionNode))
            Dim functionExpression = New NonTerminal("functionExpression", GetType(FunctionExpressionNode))
            Dim unaryOperator = New NonTerminal("unaryOperator", GetType(UnaryOperationNode))
            Dim binaryOperator = New NonTerminal("binaryOperator", GetType(BinaryOperatorNode))
            Dim dataitemExpression = New NonTerminal("dataitemExpression", GetType(DataItemExpressionNode))
            Dim quotedExpression = New NonTerminal("quotedExpression", GetType(QuotedExpressionNode))
            Dim functionNameExpression = New NonTerminal("functionNameExpression", GetType(FunctionNameExpressionNode))
            Dim equalityExpression = New NonTerminal("equalityExpression", GetType(EqualityExpressionNode))
            Dim equalityOperator = New NonTerminal("equalityOperator", GetType(EqualityOperatorNode))

            'BNF Rules
            compoundExpression.Rule = simpleExpression Or bracketedExpression Or functionExpression Or binaryExpression Or equalityExpression Or listExpression 
            simpleExpression.Rule = dataitemExpression Or literalExpression Or quotedExpression Or functionNameExpression
            literalExpression.Rule = number Or identifier
            bracketedExpression.Rule = "(" + compoundExpression + ")"        
            binaryOperator.Rule = ToTerm("+") Or "-" Or "/" Or "*" Or "^"
            binaryExpression.Rule = compoundExpression + binaryOperator + compoundExpression
            equalityOperator.Rule = ToTerm("=") Or "!=" Or "<>" Or ">=" Or "<=" Or ">" Or "<"
            equalityExpression.Rule = compoundExpression + equalityOperator + compoundExpression
            functionExpression.Rule = functionNameExpression + bracketedExpression
            listExpression.Rule = MakePlusRule(listExpression, comma, compoundExpression)
            dataitemExpression.Rule = "[" + dataitem + "]" Or dataitem
            quotedExpression.Rule = quotedLiteral
            functionNameExpression.Rule = FunctionNames()
          

            'Terminal priority
            identifier.Priority = 10
            dataitem.Priority = 20

            'Operator precedence           
            RegisterOperators(10, "*", "/", "%")
            RegisterOperators(9, "+", "-")
            RegisterOperators(8, "=", ">", "<", ">=", "<=", "<>", "!=", "!<", "!>")

            MarkPunctuation("(", ")", ".")
            MarkTransient(simpleExpression, compoundExpression)


            Me.Root = compoundExpression
            Me.LanguageFlags = LanguageFlags.CreateAst

The other thing which may (or may not be salient) is that the  EqualityExpressionNode is inheriting from the BinaryExpressionNode

Thx very much again

S

Coordinator
Jul 27, 2011 at 5:51 PM

remove setting Priority on anything (identifier, dataitem) and try again

Coordinator
Jul 27, 2011 at 5:53 PM

another question - when you load grammar in GrammarExplorer, does it show any conflicts in Grammar Errors page?

Jul 27, 2011 at 5:54 PM

Thx but unfortunately it makes no difference

S

Jul 27, 2011 at 6:03 PM
Edited Jul 27, 2011 at 6:04 PM

Thx, yes ... there are 9 grammar conflicts. Sorry, I'd been just using the VS test tool

Coordinator
Jul 27, 2011 at 6:08 PM

Well, that explains it. you should NOT start parsing until you fix all conflicts. Your grammar is ambiguous, and then parser chooses an arbitrary path, out of several available, and apparently not the one you like. Fix the conflicts first! Try to understand why they happen, what's their nature, fix the grammar accordingly

Roman

Jul 27, 2011 at 6:10 PM

ok... I know that they are about a different product, but they do teach the nuances of writing grammars:

http://vimeo.com/groups/29150/videos

In particular, watch video 3 on operator precedence:

http://vimeo.com/groups/29150/videos/8138418

Then it will be more clear on how to refactor the code. I have written grammars using irony, in particular to decompose sql statement, including operator precedence in the where part of the select, and it works, so I know that the product is working.

Jul 27, 2011 at 6:21 PM

Thx very much

S