How can I solve ambiguity

Jul 29, 2009 at 8:51 AM

Hello,

I have an old project with a simple basic language.
I tried to port the parser under Irony for develop some new feature.
My grammar has some ambiguity and I would like to know
how can I solve this with Irony.

To demonstrate my problem I cut down my grammar to the minimum.
When I parse the basic code snippet below I got SubProgramCall
instead of ArrayAssignment. What should I do to got the expected node?
We have a lot of old basic sources, so modify the grammar
(and modify the sources) is not an option.

The subroutines is predefined. So the parser may know if
the given identifier in an expression is subroutine or not.
But how can I give this information to the parser?

Thanks,
Zoltan

Code snippet:

for i=1 to 10
  a(i) = 1
next

The tree result:

<ParseTree>
  <Node Term="BasicFile">
    <Node Term="SequentialStatement">
      <Node Term="ForStatement">
        <Node Term="for" Terminal="SymbolTerminal" Value="for" />
        <Node Term="identifier" Terminal="IdentifierTerminal" Value="i" />
        <Node Term="=" Terminal="SymbolTerminal" />
        <Node Term="number" Terminal="NumberLiteral" Value="1" />
        <Node Term="to" Terminal="SymbolTerminal" />
        <Node Term="number" Terminal="NumberLiteral" Value="10" />
        <Node Term="SequentialStatementList">
          <Node Term="SequentialStatement">
            <Node Term="SubprogramCallWithParameters">
              <Node Term="identifier" Terminal="IdentifierTerminal" Value="a" />
              <Node Term="ExpressionList">
                <Node Term="BinExpr">
                  <Node Term="ParExpr">
                    <Node Term="identifier" Terminal="IdentifierTerminal" Value="i" />
                  </Node>
                  <Node Term="=" Terminal="SymbolTerminal" />
                  <Node Term="number" Terminal="NumberLiteral" Value="1" />
                </Node>
              </Node>
            </Node>
          </Node>
        </Node>
        <Node Term="next" Terminal="SymbolTerminal" Value="next" />
      </Node>
    </Node>
  </Node>
</ParseTree>

The Grammar:

    public class BasicGrammar : Grammar
    {
        public BasicGrammar()
            : base(false)
        {
            // 1. Terminals
            var number = new NumberLiteral("number");
            var Identifier = new IdentifierTerminal("identifier");
            // 2. Keywords
            SymbolTerminal _for_ = Symbol("for");
            SymbolTerminal _to_ = Symbol("to");
            SymbolTerminal _step_ = Symbol("step");
            SymbolTerminal _next_ = Symbol("next");
            SymbolTerminal _equal_ = Symbol("=");
            SymbolTerminal _left_ = Symbol("(");
            SymbolTerminal _right_ = Symbol(")");
            SymbolTerminal _comma_ = Symbol(",");
            var BasicFile = new NonTerminal("BasicFile");
            var AssignmentOrCall = new NonTerminal("AssignmentOrCall");
            var ArrayAssignment = new NonTerminal("ArrayAssignment");
            var Array = new NonTerminal("ArrayVariable");
            var BinExpr = new NonTerminal("BinExpr");
            var ForStatement = new NonTerminal("ForStatement");
            var ParExpr = new NonTerminal("ParExpr");
            var Expression = new NonTerminal("Expression");
            var BasicLine = new NonTerminal("BasicLine");
            var BasicItem = new NonTerminal("BasicItem");
            var SequentialItem = new NonTerminal("SequentialItem");
            var SequentialStatementList = new NonTerminal("SequentialStatementList");
            var SequentialStatement = new NonTerminal("SequentialStatement");
            var SimpleStatement = new NonTerminal("SimpleStatement");
            var Assignment = new NonTerminal("Assignment");
            var SubprogramCallWithParameters = new NonTerminal("SubprogramCallWithParameters");
            var ExpressionList = new NonTerminal("ExpressionList");
            var Term = new NonTerminal("Term");
            var BinOp = new NonTerminal("BinOp");
            var ExpressionOpt = new NonTerminal("ExpressionOpt");
            // 4. The Grammar
            BasicFile.Rule = MakeStarRule(BasicFile, SequentialStatement);
            this.Root = BasicFile;       // Set grammar root
            // 5. Rules
            SequentialStatement.Rule =
                SimpleStatement + NewLine
                | ForStatement
                ;
            SimpleStatement.Rule =
                Identifier  // subprogram call without parameters
                | ArrayAssignment
                | SubprogramCallWithParameters
                ;
            ForStatement.Rule =
                _for_ + Identifier + _equal_ + Expression + _to_ + Expression
                + NewLine
                + SequentialStatementList
                + _next_
                + NewLine
                ;
            SequentialStatementList.Rule = MakeStarRule(SequentialStatementList, SequentialStatement);
            SubprogramCallWithParameters.Rule =
                Identifier + ExpressionList
                ;
            ArrayAssignment.Rule = Identifier + _left_ + Expression + _right_ + _equal_ + Expression;
            ExpressionList.Rule = MakePlusRule(ExpressionList, _comma_, Expression);
            Expression.Rule = Term | BinExpr;
            ParExpr.Rule = _left_ + Expression + _right_;
            BinExpr.Rule = Expression + BinOp + Expression;
            Term.Rule = number| ParExpr | Identifier | Array;
            BinOp.Rule = Symbol("+") | "-" | "*" | "/" | "=";
            Array.Rule = Identifier + _left_ + Expression + _right_;
            ExpressionOpt.Rule = Empty | _comma_ + Expression;
            Delimiters = "#(),:+-*/%&|^!~<>=";
            RegisterOperators(70, "*", "/");
            RegisterOperators(60, "+", "-");
            RegisterOperators(40, "=");
            RegisterPunctuation(",", "(", ")", ":", "#");
            LineTerminators = "\r\n";
            WhitespaceChars = " \t\r\n\v";
            MarkTransient(Term, Expression);
            this.SetLanguageFlags(LanguageFlags.NewLineBeforeEOF | LanguageFlags.AutoDetectTransient | LanguageFlags.CreateAst | LanguageFlags.AutoDetectKeywords);
        }
    }
Coordinator
Jul 29, 2009 at 5:55 PM

Obviously, you cannot do it - I mean, resolve ambiguity at parsing stage. Because array element access and function call look identical in Basic, it is clearly a SEMANTIC difference, not syntactic. Therefore, parser cannot do this.

In your code snippet, you have array element on the left side of the assignment, so theoretically very advanced parser could detect that this is array, not function call.

But for code like

x = a(b,c)

you cannot decide what it is until you know all surrounding semantic context. The disambiguation can be done only at code analysis phase, after parsing, or even directly at runtime by the interpreter.

The solution is to merge array and function call into one non-terminal, and leave it to later stage to make a decision

Roman