Need help with grammar.

May 26, 2009 at 5:24 PM

I'm trying to create a grammar with syntax similar to a SQL WHERE clause, but much simpler. It should basically allow me to specify "conditions" that are joined with ANDs, ORs, NOTs and parentheses. Here is a list of possible "conditions":

  • <column name> = <value>
  • <column name> != <value>
  • <column name> <> <value>
  • <column name> between <value> and <value>
  • <column name> < <value>
  • <column name> > <value>
  • <column name> <= <value>
  • <column name> >= <value>
  • <column name> is null
  • <column name> is not null
  • <column name> child of <value>
  • <column name> not child of <value>
  • <column name> like <value>
  • <column name> not like <value>
  • <column name> begins with <value>
  • <column name> not begins with <value>

Column name is an arbitrary identifier, and <value> is either a single/double quoted string, a number, or a token in the form {1} where 1 is an arbitrary integer.

As you can see, the "conditions" are not recursive; they are all terminal. Here's my attempt at a grammar:

  var xOr = new NonTerminal("Or");
  var xAnd = new NonTerminal("And");
  var xNot = new NonTerminal("Not");
  var xParenthesis = new NonTerminal("Parenthesis");
  var xComparison = new NonTerminal("Comparison");

  var xBetween = new NonTerminal("Between");
  var xChildOf = new NonTerminal("ChildOf");
  var xEquals = new NonTerminal("Equals");
  var xNotEquals = new NonTerminal("NotEquals");
  var xGreater = new NonTerminal("Greater");
  var xGreaterEqual = new NonTerminal("GreaterEqual");
  var xLess = new NonTerminal("Less");
  var xLessEqual = new NonTerminal("LessEqual");
  var xBeginsWith = new NonTerminal("BeginsWith");
  var xNotBeginsWith = new NonTerminal("NotBeginsWith");
  var xNull = new NonTerminal("IsNull");
  var xNotNull = new NonTerminal("IsNotNull");
  var xLike = new NonTerminal("Like");
  var xNotLike = new NonTerminal("NotLike");
   

  var xValue = new NonTerminal("Value");
  var xColumn = new IdentifierTerminal("ColumnIdentifier");
  var xPlaceholder = new NonTerminal("Placeholder");

  var opOr = Symbol("or");
  opOr.SetOption(TermOptions.IsNonGrammar);

   
  xValue.Rule =
      new StringLiteral("String", "'", StringFlags.AllowsAllEscapes | StringFlags.AllowsDoubledQuote | StringFlags.HasEscapes | StringFlags.AllowsLineBreak)
      | new NumberLiteral("Number", NumberFlags.AllowSign | NumberFlags.AllowStartEndDot)
      | xPlaceholder;

  xPlaceholder.Rule = "{" + new NumberLiteral("PlaceholderIndex", NumberFlags.IntOnly) + "}";

  xBetween.Rule = xColumn + "between" + xValue + "and" + xValue;
  xChildOf.Rule = xColumn + "child" + "of" + xValue;
  xEquals.Rule = xColumn + "=" + xValue | xValue + "=" + xColumn;
  xNotEquals.Rule = xColumn + "!=" + xValue | xValue + "!=" + xColumn | xColumn + "<>" + xValue | xValue + "<>" + xColumn;
  xGreater.Rule = xColumn + ">" + xValue | xValue + "<" + xColumn;
  xGreaterEqual.Rule = xColumn + ">=" + xValue | xValue + "<=" + xColumn;
  xLess.Rule = xColumn + "<" + xValue | xValue + ">" + xColumn;
  xLessEqual.Rule = xColumn + "<=" + xValue | xValue + ">=" + xColumn;
  xBeginsWith.Rule = xColumn + "begins" + "with" + xValue;
  xNotBeginsWith.Rule = xColumn + "not" + "begins" + "with" + xValue;
  xNull.Rule = xColumn + "is" + "null";
  xNotNull.Rule = xColumn + "is" + "not" + "null";
  xLike.Rule = xColumn + "like" + xValue;
  xNotLike.Rule = xColumn + "not" + "like" + xValue;

  xComparison.Rule =
      xBetween
      | xChildOf
      | xEquals
      | xNotEquals
      | xGreater
      | xGreaterEqual
      | xLess
      | xLessEqual
      | xBeginsWith
      | xNotBeginsWith
      | xNull
      | xNotNull
      | xLike
      | xNotLike;


  xOr.Rule = xOr + opOr + xAnd | xAnd;
  xAnd.Rule = xAnd + "AND" + xNot | xNot;
  xNot.Rule = Symbol("NOT") + xParenthesis | xParenthesis;
  xParenthesis.Rule = xComparison | "(" + xOr + ")";


  this.Root = xOr;

Now, the problem: when I try this grammar in the sample Grammar Explorer, all the keywords ("and", "or", "begins", "with", etc.) are recognized as instances of ColumnIdentifier. How can this be?

Coordinator
May 26, 2009 at 6:18 PM
Edited May 26, 2009 at 6:24 PM

What Irony version are you using: the one from the Downloads page or the one from the Source Code page?

You should be using the latest one from the Source Code page; it should work correctly there. Try to follow the SQL sample.

Roman

May 26, 2009 at 7:28 PM

OK, I'll try. Currently I'm using the one on the front page.

Anyway, I checked out the SQL sample, and there is one small difference: I have some "operators" that consist of two keywords, like "is null" or "child of". The SQL sample doesn't have anything like that.

May 27, 2009 at 9:29 AM

Here's another question:

I have a node like:

 xPlaceholder.Rule = "{" + new NumberLiteral("PlaceholderIndex", NumberFlags.IntOnly) + "}";

In the parse tree the nodes are reflected like:

Placeholder
     |
     +------ 123 (PlaceholderIndex)

Is it possible to have the node represented simply by PlaceholderIndex? The parent node (Placeholder) is pretty useless, because it will always have exactly one child, PlaceholderIndex, which also contains the information I need.

Coordinator
May 27, 2009 at 3:41 PM

Mark xPlaceholder as transient:

MarkTransient(xPlaceholder);

You should also register the curly braces as punctuation, but I guess you're already doing so.
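
To illustrate why both steps matter, here is a small stand-alone sketch of the effect on the tree. It does not use Irony and is not how Irony implements this; the Node type and the Collapse helper are hypothetical names made up for the illustration. The assumption is that punctuation is dropped from the tree first, and that a transient node left with exactly one child is then replaced by that child.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a parse tree node; not an Irony type.
class Node
{
    public string Name;
    public bool IsPunctuation;   // e.g. "{" and "}"
    public bool IsTransient;     // e.g. Placeholder
    public List<Node> Children = new List<Node>();

    public Node(string name, params Node[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

static class TreeDemo
{
    // Drops punctuation children, then replaces a transient node
    // that is left with exactly one child by that child.
    public static Node Collapse(Node node)
    {
        node.Children = node.Children
            .Where(c => !c.IsPunctuation)
            .Select(c => Collapse(c))
            .ToList();

        if (node.IsTransient && node.Children.Count == 1)
            return node.Children[0];
        return node;
    }

    static void Main()
    {
        var placeholder = new Node("Placeholder",
            new Node("{") { IsPunctuation = true },
            new Node("123 (PlaceholderIndex)"),
            new Node("}") { IsPunctuation = true })
        { IsTransient = true };

        // Prints "123 (PlaceholderIndex)"
        Console.WriteLine(Collapse(placeholder).Name);
    }
}
```

In this sketch, if the braces were not flagged as punctuation, Placeholder would keep three children and would not be collapsed, which is one way to see why the two settings go together.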

Roman

May 28, 2009 at 12:31 PM

Yes, that did it! :)

Actually, I'm registering all my literals as punctuation, like this:

  private SymbolTerminal PSymbol(string text)
  {
      var ret = this.Symbol(text);
      ret.SetOption(TermOptions.IsPunctuation);
      return ret;
  }

and then later:

  xBetween.Rule = xColumn + PSymbol("between") + xValue + PSymbol("and") + xValue;

That's OK, isn't it?

Coordinator
May 28, 2009 at 3:20 PM

There's a slightly more efficient way to do this. Just use string literals "as is" in grammar rules, like this:

xBetween.Rule = xColumn + "between" + xValue + "and" + xValue;

and at the end call RegisterPunctuation to register all symbols at once:

RegisterPunctuation("between", "and", "or", ....);