problem in the SQL grammar (NOT LIKE)

Aug 30, 2011 at 8:33 AM


i'm trying to parse SQL.

i'm having a problem with the "NOT LIKE" operator.

i'm using "SQL, version 89".

this sentence is parsed as it should

SELECT col FROM tbl WHERE var1 <> '2' or var2 LIKE 'VIR*' and var3 = '5'

              |--------------- LIKE ---------------|
       |------OR------|                      |----AND----|
 |--- <> ---|        var2                  VIR*     |--- = ---|
var1        '2'                                         var3       '5'

but this not

SELECT col FROM tbl WHERE var1 <> '2' or var2 NOT LIKE 'VIR*' and var3 = '5'

 |--- <> ---|                 |----NOT LIKE----|
var1        '2'             var2         |----AND----|
                                            VIR*     |--- = ---| 
                                                    var3        '5'


If you have an idea...

Thanks a lot!


Aug 30, 2011 at 9:41 PM

yeah, it's a problem. BTW, I think even the first statement is parsed incorrectly.

The problem is with correct precedence handling - assigning/inheriting proper prec value to BinOp instances - it is not transient because of composite operators: "NOT LIKE" and "NOT IN", and that's what makes this grammar special. Looks like Irony is missing a proper facility for handling this corner case.

I"m looking into this, can't offer quick fix for now. 

Aug 31, 2011 at 4:21 AM

Here's a temp fix. Change the last lines of SQL grammar to the following:

      RegisterOperators(10, "*", "/", "%"); 
      RegisterOperators(9, "+", "-");
      RegisterOperators(8, "=", ">", "<", ">=", "<=", "<>", "!=", "!<", "!>", "LIKE", "IN");
      RegisterOperators(7, "^", "&", "|");
      RegisterOperators(6, NOT); 
      RegisterOperators(5, "AND");
      RegisterOperators(4, "OR");

      MarkPunctuation(",", "(", ")");
      MarkPunctuation(asOpt, semiOpt);
      //Note: we cannot declare binOp as transient because it includes operators "NOT LIKE", "NOT IN" consisting of two tokens. 
      // Transient non-terminals cannot have more than one non-punctuation child nodes.
      // Instead, we mark binOp as Operator, so that it inherits precedence value from it's children, and this precedence is used
      // in conflict resolution when binOp node is sitting on the stack
      base.MarkTransient(stmt, term, asOpt, aliasOpt, stmtLine, expression, unOp, tuple);
} //end of constructor

Also, in CoreParser.cs, replace the following method:


    private ParseTreeNode ReduceRegularNode(ParserAction action) {
      var childCount = action.ReduceProduction.RValues.Count; 
      int firstChildIndex = Context.ParserStack.Count - childCount;
      var span = ComputeNewNodeSpan(childCount);
      var newNode = new ParseTreeNode(action.ReduceProduction, span);
      var newIsOp = newNode.Term.Flags.HasFlag(TermFlags.IsOperator); 
      for(int i = 0; i < childCount; i++) {
        var childNode = Context.ParserStack[firstChildIndex + i];
          continue; //skip punctuation or empty transient nodes
        CheckCreateAstNode(childNode); //AST nodes for lists and for terminals are created here 
        //Inherit precedence and associativity, to cover a standard case: BinOp->+|-|*|/; 
        // BinOp node should inherit precedence from underlying operator symbol. Keep in mind special case of SQL operator "NOT LIKE" which consists
        // of 2 tokens. We therefore inherit "max" precedence from any children
        if(newIsOp && childNode.Precedence != BnfTerm.NoPrecedence && childNode.Precedence > newNode.Precedence) {
          newNode.Precedence = childNode.Precedence;
          newNode.Associativity = childNode.Associativity;
      }//for i
      return newNode;     

That should fix it. I will submit the fix in the next code drop. Also, I think of another solution for situation like this (allowing terminals that represent sequence of keywords with arbitrary whitespace in between: "NOT LIKE") - that would solve the problem



Aug 31, 2011 at 1:18 PM

thank you very much, it works perfectly.