Logical operator

May 22, 2014 at 8:24 PM
Hi guys,

I am creating a parser for a Mongo-style query language.

I have created a parser that can take an input string like this and create LINQ expressions:
name$eq:Roman,age$gt:10
public MongoQueryGrammer()
        {

            var expressionList = new NonTerminal("expressionList");

            var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
            var value = new DsvLiteral("value", TypeCode.String);
            var expression = new NonTerminal("expression");
            var binexpr = new NonTerminal("binexpr");
            var binoperator = new NonTerminal("binoperator");

            expressionList.Rule = MakePlusRule(expressionList, null, expression);

            expression.Rule = binexpr;
            binexpr.Rule = identifier + binoperator + value;
            binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$le:" | "$gt:" | "$ge:";
            Root = expressionList;
        }
It interprets the commas as logical ANDs. This was fine for the first version. Now I want to add support for both AND and OR logical operators, and also parentheses to override operator precedence.

So the new input string will look like this:
name$eq:Roman$and:age$gt:10
or
(name$eq:Roman$or:lastname$eq:Smith)$and:age$gt:10
But I can't really figure out how to express this in my grammar. What I have now looks like this. But using the grammar explorer, I can see that the AND and OR are not parsed correctly. I haven't even gotten to the parentheses yet.
    public class MongoQueryGrammer : Grammar
    {
        public MongoQueryGrammer() : base (false) //Case Insensitive
        {
            var expressionList = new NonTerminal("expressionList");

            var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
            var value = new DsvLiteral("value", TypeCode.String);
            var binexpr = new NonTerminal("binexpr");
            var logicalexpr = new NonTerminal("logicalexpr");
            var binoperator = new NonTerminal("binoperator");
            var logicaloperator = new NonTerminal("logicaloperator");

            expressionList.Rule = logicalexpr | binexpr;

            logicalexpr.Rule = binexpr + logicaloperator + binexpr;
            logicalexpr.ErrorRule = SyntaxError + ";";
            binexpr.Rule = identifier + binoperator + value;
            binexpr.ErrorRule = SyntaxError + ";";
            binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$lte:" | "$gt:" | "$gte:" | "$ne:";
            logicaloperator.Rule = ToTerm("$or:") | "$and:";
            Root = expressionList;

            RegisterOperators(1, "$and:");
            RegisterOperators(2, "$or:");
            RegisterOperators(3, "$eq:", "$lt:", "$lte:", "$gt:", "$gte:", "$ne:");
        }
    }
What am I getting wrong here?
Coordinator
May 22, 2014 at 9:33 PM
I think your problem is DsvLiteral: it should be used only for comma-separated text files, not for regular input. Use identifier, NumberLiteral and StringLiteral instead.
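A minimal sketch of that suggestion, not the coordinator's own code. It assumes the standard Irony terminal classes NumberLiteral and StringLiteral; the class name QuerySketchGrammar and the extra "value" non-terminal are made up for illustration.

```csharp
using Irony.Parsing;

// Hypothetical grammar: same binexpr shape as in the question,
// but built from regular terminals instead of DsvLiteral.
public class QuerySketchGrammar : Grammar
{
    public QuerySketchGrammar() : base(false) // case insensitive
    {
        var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
        var number = new NumberLiteral("number");
        // StringLiteral needs a start/end symbol, here a double quote.
        var text = new StringLiteral("text", "\"");

        var value = new NonTerminal("value");
        var binexpr = new NonTerminal("binexpr");
        var binoperator = new NonTerminal("binoperator");

        // A value is a bare identifier, a number, or a quoted string.
        value.Rule = identifier | number | text;
        binexpr.Rule = identifier + binoperator + value;
        binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$lte:" | "$gt:" | "$gte:" | "$ne:";

        Root = binexpr;
    }
}
```

With this shape, unquoted values are limited to whatever the identifier and number terminals accept, so values containing spaces or dashes would need quotes.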
May 22, 2014 at 9:59 PM
That was the problem. Thanks!

But is there a way to use a StringLiteral without start and end symbols? If you look at the input string, it doesn't have quotes around the name.

I can use TerminalFactory.CreateCSharpIdentifier, which will work for most cases, but not if there are spaces or dashes. I want the string to terminate when it encounters another operator or EOF.
May 22, 2014 at 10:01 PM
I want to use dashes when specifying a date. e.g. 2004-04-08
Coordinator
May 22, 2014 at 10:04 PM
You can look at FreeTextLiteral: it works like 'any symbol until you hit certain terminator char(s)'. Then hook into ValidateToken, analyze the token in code, and decide whether it's a date literal in a certain format, an identifier, etc.
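A rough sketch of how that could look, under some assumptions: that ValidateToken is an event on the terminal whose arguments expose the scanned token as args.Token, and that Token.Value can be assigned. The class name and the yyyy-MM-dd format are illustrative choices, not something stated in this thread.

```csharp
using System;
using System.Globalization;
using Irony.Parsing;

// Hypothetical grammar showing only the value terminal and its validation hook.
public class FreeTextSketchGrammar : Grammar
{
    public FreeTextSketchGrammar() : base(false) // case insensitive
    {
        var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
        var binexpr = new NonTerminal("binexpr");
        var binoperator = new NonTerminal("binoperator");

        // Reads everything up to the next "$" or ")", or up to end of input.
        var value = new FreeTextLiteral("value", FreeTextOptions.AllowEof, "$", ")");

        // Assumption: args.Token is the token just scanned by this terminal.
        value.ValidateToken += (sender, args) =>
        {
            var raw = args.Token.ValueString.Trim();
            DateTime date;
            // If the text is a date such as 2004-04-08, store a DateTime as
            // the token value; otherwise leave the raw string untouched.
            if (DateTime.TryParseExact(raw, "yyyy-MM-dd", CultureInfo.InvariantCulture,
                                       DateTimeStyles.None, out date))
            {
                args.Token.Value = date;
            }
        };

        binexpr.Rule = identifier + binoperator + value;
        binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$lte:" | "$gt:" | "$gte:" | "$ne:";

        Root = binexpr;
    }
}
```

Code that later walks the parse tree can then check whether the token value is a DateTime or a string and build the LINQ expression accordingly.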
May 22, 2014 at 11:13 PM
That worked! And I can also add in parentheses.

While the examples I have tried work, I still have two problems.

1) The operator precedence for AND and OR isn't honored. Shouldn't RegisterOperators ensure that? It just splits in the middle.

2) I get a grammar conflict, even though things seem to work. "Reduce-reduce conflict. State S24, lookaheads: $and: $or: EOF ). Selected reduce on first production in conflict set."
        public MongoQueryGrammer() : base (false) //Case Insensitive
        {
            var expressionList = new NonTerminal("expressionList");
            var expression = new NonTerminal("expression");
            
            var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
            var binexpr = new NonTerminal("binexpr");
            var logicalexpr = new NonTerminal("logicalexpr");
            var binoperator = new NonTerminal("binoperator");
            var logicaloperator = new NonTerminal("logicaloperator");
            var parexpr = new NonTerminal("parexpr");
            var value = new FreeTextLiteral("value", FreeTextOptions.AllowEof, "$", ")");
            value.Escapes.Add(@"\\", @"\");
            value.Escapes.Add(@"\)", @")");

            this.MarkPunctuation("(", ")");

            expressionList.Rule = MakePlusRule(expressionList, logicaloperator, expression);

            expression.Rule = parexpr | logicalexpr | logicalexpr + logicaloperator + logicalexpr | logicalexpr + logicaloperator + binexpr | binexpr;

            logicalexpr.Rule = binexpr + logicaloperator + binexpr;
            logicalexpr.ErrorRule = SyntaxError + ";";
            binexpr.Rule = identifier + binoperator + value;
            binexpr.ErrorRule = SyntaxError + ";";
            binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$lte:" | "$gt:" | "$gte:" | "$ne:";
            logicaloperator.Rule = ToTerm("$and:") | "$or:";

            parexpr.Rule = "(" + logicalexpr + ")" | "(" + binexpr + ")";

            Root = expressionList;

            RegisterOperators(3, "$and:");
            RegisterOperators(2, "$or:");
            RegisterOperators(1, "$eq:", "$lt:", "$lte:", "$gt:", "$gte:", "$ne:");

            this.MarkTransient(parexpr);
        }
Coordinator
May 23, 2014 at 1:59 AM
Mark logicalOp and binOp as Transient. This works around a bug that is still there, and it will fix operator precedence.
As for the conflict, look at the items in the conflict state and try to understand what causes it.
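For comparison, here is a sketch of the recursive shape that Irony's own sample expression grammars tend to use. It is a restructuring of the poster's grammar, not something the coordinator prescribed. It assumes that a higher number passed to RegisterOperators binds tighter; the class name is made up.

```csharp
using Irony.Parsing;

// Hypothetical grammar: one recursive expression rule plus operator precedence.
public class PrecedenceSketchGrammar : Grammar
{
    public PrecedenceSketchGrammar() : base(false) // case insensitive
    {
        var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
        var value = new FreeTextLiteral("value", FreeTextOptions.AllowEof, "$", ")");

        var expression = new NonTerminal("expression");
        var binexpr = new NonTerminal("binexpr");
        var logicalexpr = new NonTerminal("logicalexpr");
        var parexpr = new NonTerminal("parexpr");
        var binoperator = new NonTerminal("binoperator");
        var logicaloperator = new NonTerminal("logicaloperator");

        // A single recursive rule covers any mix of comparisons,
        // AND/OR chains and parentheses, in any position.
        expression.Rule = binexpr | logicalexpr | parexpr;
        logicalexpr.Rule = expression + logicaloperator + expression;
        parexpr.Rule = "(" + expression + ")";
        binexpr.Rule = identifier + binoperator + value;
        binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$lte:" | "$gt:" | "$gte:" | "$ne:";
        logicaloperator.Rule = ToTerm("$and:") | "$or:";

        Root = expression;

        // Assumption: the higher value wins, so $and: groups before $or:.
        RegisterOperators(1, "$or:");
        RegisterOperators(2, "$and:");

        MarkPunctuation("(", ")");
        // Transient operator nodes let the parser see the operator symbol
        // directly when it resolves precedence.
        MarkTransient(expression, parexpr, binoperator, logicaloperator);
    }
}
```

Because logicalexpr refers back to expression on both sides, the rule is ambiguous by design, and the registered precedences are what decide how a chain like a$or:b$and:c is grouped.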
May 23, 2014 at 5:47 AM
Marking the operators as transient makes everything look nicer, but it didn't fix the operator precedence.

The new message I get about the grammar error is this:
State S26 (Inadequate)
  Reduce-reduce conflicts on inputs: $and: $or: EOF )
  Reduce items:
    binexpr -> SYNTAX_ERROR ; · [identifier $and: $or: EOF )]
    logicalexpr -> SYNTAX_ERROR ; · [$and: $or: EOF )]
  Transitions: 
Which I guess means I haven't declared all possible combinations of my expressions? Is that correct?

Is this also the reason why I can only use parentheses around the first part of my expression and not around later parts?
Coordinator
May 27, 2014 at 5:44 PM
The problem is that you have an ErrorRule on both logicalExpr and binExpr, and together they result in the conflict. Replace both with a single ErrorRule on expression.
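A sketch of what that change might look like. It reuses the recovery token ";" from the poster's grammar; the class name and the simplified recursive rules are illustrative assumptions.

```csharp
using Irony.Parsing;

// Hypothetical grammar with exactly one ErrorRule, placed on expression.
public class ErrorRuleSketchGrammar : Grammar
{
    public ErrorRuleSketchGrammar() : base(false) // case insensitive
    {
        var identifier = TerminalFactory.CreateCSharpIdentifier("identifier");
        var value = new FreeTextLiteral("value", FreeTextOptions.AllowEof, "$", ")");

        var expression = new NonTerminal("expression");
        var binexpr = new NonTerminal("binexpr");
        var logicalexpr = new NonTerminal("logicalexpr");
        var binoperator = new NonTerminal("binoperator");
        var logicaloperator = new NonTerminal("logicaloperator");

        expression.Rule = binexpr | logicalexpr;
        logicalexpr.Rule = expression + logicaloperator + expression;
        binexpr.Rule = identifier + binoperator + value;
        binoperator.Rule = ToTerm("$eq:") | "$lt:" | "$lte:" | "$gt:" | "$gte:" | "$ne:";
        logicaloperator.Rule = ToTerm("$and:") | "$or:";

        // The only ErrorRule in the grammar: after SYNTAX_ERROR ";" there is
        // a single production to reduce, so the two reduce items reported
        // for state S26 can no longer compete.
        expression.ErrorRule = SyntaxError + ";";

        Root = expression;

        RegisterOperators(1, "$or:");
        RegisterOperators(2, "$and:");
        MarkTransient(binoperator, logicaloperator);
    }
}
```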