Defining Constants (null/true/false)

Sep 9, 2012 at 2:19 AM
Edited Sep 9, 2012 at 2:24 AM

Hi,

I tried to create a grammar for a small JSON-like language. In order to define null/true/false, I thought ConstantTerminal would be the way to go. However, I'm running into problems. First I tried to define the grammar like this:

    [Language("Problem", "0.1", "Problem Grammar...")]
    public class ProblemGrammar : Grammar
    {
        public ProblemGrammar()
            : base(caseSensitive: true)
        {
            var nullConst = new ConstantTerminal("null", typeof(LiteralValueNode));
            nullConst.Add("null", null);

            var numberLit = new NumberLiteral("number", NumberOptions.None, typeof(LiteralValueNode));

            var type = new IdentifierTerminal("type", IdOptions.None);
            var field = new IdentifierTerminal("field", IdOptions.None);

            var comma = ToTerm(",");
            var optComma = comma.Q();

            var map = new NonTerminal("map");
            var @object = new NonTerminal("object");

            var value = new NonTerminal("value");
            var valuePair = new NonTerminal("valuePair");
            var fieldValuePair = new NonTerminal("fieldValuePair");
            var valuePairList = new NonTerminal("valuePairList");
            var fieldValuePairList = new NonTerminal("fieldValuePairList");
            
            value.Rule = nullConst | numberLit | map | @object;
            
            valuePair.Rule = value + ToTerm(":") + value;
            fieldValuePair.Rule = field + ToTerm(":") + value;

            valuePairList.Rule = MakePlusRule(valuePairList, comma, valuePair);
            fieldValuePairList.Rule = MakePlusRule(fieldValuePairList, comma, fieldValuePair);
            
            map.Rule = (ToTerm("{") + valuePairList + optComma + ToTerm("}"));
            @object.Rule = type + (ToTerm("{") + fieldValuePairList + optComma + ToTerm("}"));

            MarkPunctuation("{", "}");

            base.Root = value;
        }
    }

At first glance, this works well. For example, the following is parsed fine:

MyObject {
    field0: 1,
    field1: 2,
    field2: 3,
}

However, the null constant does not work as expected:

MyObject {
    field0: null,
    field1: 2,
    field2: 3,
}

This reports "SyntaxError, expected: {". This means "null" is matched as a type identifier rather than a constant.

Sep 9, 2012 at 2:32 AM
Edited Sep 9, 2012 at 2:41 AM

In an attempt to fix this, I tried changing the priority of the constant.

nullConst.Priority = TerminalPriority.ReservedWords;

This allows the previous examples to parse. However, it introduces a new problem. For example, this is no longer valid:

nullary {
    field0: null,
    field1: 2,
    field2: 3,
}

Now, the parser matches the "null" in "nullary" and decides that it is a constant. Everything after "null" yields "SyntaxError, unexpected input". However, I would like it to be matched as a type identifier. While this is less of a problem than the first one, it is still annoying.

So in short, my question is: How should I modify my code in order to match "null" as a constant and "nullary" as a type identifier?

Sep 9, 2012 at 3:08 PM
Edited Sep 9, 2012 at 3:22 PM

In order to focus on the issue, I have simplified the grammar:

    [Language("ProblemGrammar", "0.1", "Problem Grammar...")]
    public class ProblemGrammar : Grammar
    {
        public ProblemGrammar()
            : base(caseSensitive: true)
        {
            var nullConst = new ConstantTerminal("null", typeof(LiteralValueNode));
            nullConst.Add("null", null);
            //nullConst.Priority = TerminalPriority.ReservedWords;

            var type = new IdentifierTerminal("type", IdOptions.None);
            var field = new IdentifierTerminal("field", IdOptions.None);

            var comma = ToTerm(",");

            var value = new NonTerminal("value");
            var fieldValuePair = new NonTerminal("fieldValuePair");
            var fieldValuePairList = new NonTerminal("fieldValuePairList");
            var @object = new NonTerminal("object");

            value.Rule = nullConst | @object;
            fieldValuePair.Rule = field + ToTerm(":") + value;
            fieldValuePairList.Rule = MakePlusRule(fieldValuePairList, comma, fieldValuePair);
            @object.Rule = type + "{" + fieldValuePairList + "}";

            MarkPunctuation(",", ":", "{", "}");

            base.Root = value;
        }
    }

And this is my minimal test case:

nullary{a:null}

I would like this to parse as:

value
    + object
        + nullary (type)
        + fieldValuePairList
            + fieldValuePair
                + a (field)
                + value
                    + (null)

Should not be too hard, right?

Coordinator
Sep 11, 2012 at 4:50 PM

Sorry for the late reply. I will get to it in a few days. For now, use the same approach as in the Json grammar sample: do not use ConstantTerminal, just use keywords, and for the interpreter define values for them in the Globals dictionary. I will investigate the issue soon.

Roman

Sep 12, 2012 at 3:06 AM

I had a look at the Json grammar sample. Using the ToTerm method does indeed solve the issue. However, I have been scratching my head over why exactly that is. In particular, I cannot explain the following behaviour (but maybe it will help you track down the issue):

Since ConstantTerminal did not work and the ToTerm method did, I assumed that this had something to do with the implementation of ConstantTerminal vs. KeyTerm. However, it turns out that this is not the case. Creating a KeyTerm manually results in the same behaviour as with ConstantTerminal. The desired result seems to depend solely on the presence of the term in the KeyTermTable of the grammar. In other words, this works:

var nullConst = new KeyTerm("null", "null");
KeyTerms["null"] = nullConst;

But this does not:

var nullConst = new KeyTerm("null", "null");
//KeyTerms["null"] = nullConst;

Sep 15, 2012 at 8:56 PM

I may have been wrong about the KeyTermTable, I'm not sure anymore at this point. Anyway, I managed to do most of what I wanted to. There is only one problem that remains, concerning MakeListRule. My lists can be empty and I would really like to allow trailing delimiters for my lists. However, currently setting both TermListOptions.AllowEmpty and TermListOptions.AllowTrailingDelimiter creates a "Shift-reduce conflict". Here is an example:

    [Language("ProblemGrammar2", "0.1", "Problem Grammar...")]
    public class ProblemGrammar2 : Grammar
    {
        public ProblemGrammar2()
            : base(caseSensitive: true)
        {
            var delimiter = ToTerm(",");

            var number = new NumberLiteral("number", NumberOptions.None);
            var numberList = new NonTerminal("numberList");

            var listOptions = TermListOptions.AllowEmpty | TermListOptions.AllowTrailingDelimiter;
            numberList.Rule = MakeListRule(numberList, delimiter, number, listOptions);

            var list = new NonTerminal("list");
            list.Rule = "[" + numberList + "]";

            MarkPunctuation(",", "[", "]");

            base.Root = list;
        }
    }

You'll probably know how to fix this. For now, this seems to work:

        protected BnfExpression MakeCustomListRule(NonTerminal list, BnfTerm delimiter, BnfTerm listMember)
        {
            var emptyListContent = new NonTerminal(listMember.Name + "-");
            emptyListContent.Rule = Empty;

            var fullListContent = new NonTerminal(listMember.Name + "+");
            fullListContent.Rule = (listMember) | (fullListContent + delimiter + listMember);

            var listContent = new NonTerminal(listMember.Name + "*");
            listContent.Rule = (emptyListContent + delimiter.Q()) | (fullListContent + delimiter.Q());

            emptyListContent.SetFlag(TermFlags.IsList);
            emptyListContent.SetFlag(TermFlags.NoAstNode);

            fullListContent.SetFlag(TermFlags.IsList);
            fullListContent.SetFlag(TermFlags.NoAstNode);

            listContent.SetFlag(TermFlags.IsListContainer);
            listContent.SetFlag(TermFlags.NoAstNode);

            return listContent;
        }

Sep 22, 2012 at 8:42 PM
Edited Sep 22, 2012 at 8:42 PM
kloffy wrote:

There is only one problem that remains, concerning MakeListRule. My lists can be empty and I would really like to allow trailing delimiters for my lists. However, currently setting both TermListOptions.AllowEmpty and TermListOptions.AllowTrailingDelimiter creates a "Shift-reduce conflict". 

No response to this? Is my observation correct or am I doing things wrong? Will this be supported/fixed? Is the work-around that I proposed a good idea?

Sep 22, 2012 at 10:22 PM

It is unnecessary to eliminate all shift-reduce and reduce-reduce conflicts; doing so needlessly bloats the grammar. The PreferShiftHere() and ReduceHere() functions exist to resolve the conflicts appropriately while remaining efficient.

Sep 23, 2012 at 12:32 AM
Edited Sep 23, 2012 at 12:35 AM

Ok, but the example grammar that I posted does not work. (Even if I add TermListOptions.AddPreferShiftHint.)

For example, the following list:

[1,]

Yields a "Syntax error, expected: number". The only way I can get it to work is using my custom make list rule method.

Coordinator
Sep 23, 2012 at 4:05 AM

Confirmed, it's a flaw in Grammar.MakeListRule. Change it to the following:

    protected BnfExpression MakeListRule(NonTerminal list, BnfTerm delimiter, BnfTerm listMember, TermListOptions options = TermListOptions.PlusList) {
      //If it is a star-list (allows empty), then we first build plus-list
      var isPlusList = !options.IsSet(TermListOptions.AllowEmpty);
      var allowTrailingDelim = options.IsSet(TermListOptions.AllowTrailingDelimiter) && delimiter != null;
      NonTerminal plusList = isPlusList ? list : new NonTerminal(listMember.Name + "+");
      //"list" is the real list for which we will construct expression - it is either extra plus-list or original listNonTerminal. 
      // In the latter case we will use it later to construct expression for listNonTerminal
      plusList.SetFlag(TermFlags.IsList);
      plusList.Rule = plusList;  // rule => list
      if (delimiter != null)
        plusList.Rule += delimiter;  // rule => list + delim
      if (options.IsSet(TermListOptions.AddPreferShiftHint))
        plusList.Rule += PreferShiftHere(); // rule => list + delim + PreferShiftHere()
      plusList.Rule += listMember;          // rule => list + delim + PreferShiftHere() + elem
      plusList.Rule |= listMember;        // rule => list + delim + PreferShiftHere() + elem | elem
      if (isPlusList) {
        // if we build plus list - we're almost done; plusList == list
        // add trailing delimiter if necessary; for star list we'll add it to final expression
        if (allowTrailingDelim)
          plusList.Rule |= list + delimiter; // rule => list + delim + PreferShiftHere() + elem | elem | list + delim
      } else {
        // Setup list.Rule using plus-list we just created
        list.Rule = Empty | plusList;
        if (allowTrailingDelim)
          list.Rule |= plusList + delimiter | delimiter;
        plusList.SetFlag(TermFlags.NoAstNode);
        list.SetFlag(TermFlags.IsListContainer); //indicates that real list is one level lower
      } 
      return list.Rule; 
    }//method

That should work now. I will push the update soon. Sorry, I never thought of this scenario (a star list that allows a trailing delimiter).

Roman
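
To make the effect of the corrected rule concrete, here is a minimal sketch of which token sequences it describes. It is a hand-written check, not Irony's parser, and the function name IsStarListWithTrailingDelimiter is hypothetical. It assumes the input is already split into tokens, with "," as the delimiter and any other token counting as a list member.

```csharp
using System;

// Accepts exactly the token sequences described by the corrected rule:
//   list     -> Empty | plusList | plusList + delim | delim
//   plusList -> plusList + delim + member | member
// Assumption: "," is the delimiter and every other token is a member.
bool IsStarListWithTrailingDelimiter(string[] tokens)
{
    const string Delim = ",";
    int i = 0;
    if (i < tokens.Length && tokens[i] != Delim)
    {
        i++; // first member
        // Consume (delim member) pairs for as long as a member follows the delimiter.
        while (i + 1 < tokens.Length && tokens[i] == Delim && tokens[i + 1] != Delim)
            i += 2;
    }
    // Optional single trailing (or lone) delimiter.
    if (i < tokens.Length && tokens[i] == Delim) i++;
    return i == tokens.Length;
}

// The content of "[1,]" from the failing example, without the brackets.
Console.WriteLine(IsStarListWithTrailingDelimiter(new[] { "1", "," })); // True
```

Note that the rule as written also accepts a lone delimiter, so the content of "[,]" passes this check as well.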

Coordinator
Sep 23, 2012 at 4:51 AM
Edited Sep 23, 2012 at 6:53 AM

About your original problem with constants: the original idea was to use ConstantTerminal for "special looking" constants. The comment at the top of ConstantTerminal says:

    //This terminal allows to declare a set of constants in the input language
    // It should be used when constant symbols do not look like normal identifiers; e.g. in Scheme, #t, #f are true/false
    // constants, and they don't fit into Scheme identifier pattern.

But I see the problem: the terminal should work even if it is accidentally used in cases like yours. Here's the fix. Add the following line at the end of the ConstantTerminal constructor:

      this.Priority = TerminalPriority.High; //constants have priority over normal identifiers

And change TryMatch implementation to:

    public override Token TryMatch(ParsingContext context, ISourceStream source) {
      string text = source.Text;
      foreach (var entry in Constants) {
        source.PreviewPosition = source.Position;
        var constant = entry.Key;
        if (source.PreviewPosition + constant.Length > text.Length) continue;
        if (source.MatchSymbol(constant)) {
          source.PreviewPosition += constant.Length;
          if (!this.Grammar.IsWhitespaceOrDelimiter(source.PreviewChar))
            continue; //make sure it is delimiter
          return source.CreateToken(this.OutputTerminal, entry.Value);
        }
      }
      return null;
    }

Basically, I added a check that the constant is followed by a delimiter or whitespace. That should work now. Thanks for finding this.

Roman
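
The delimiter check can be illustrated outside of Irony with a small standalone sketch. The helper names below are hypothetical, and the delimiter set, as well as treating the end of the input as a boundary, are assumptions made for this example rather than the behaviour of Grammar.IsWhitespaceOrDelimiter.

```csharp
using System;

// Hypothetical stand-in for Grammar.IsWhitespaceOrDelimiter.
// Assumption: this delimiter set and the '\0' end-of-input marker are chosen for the example.
bool IsWhitespaceOrDelimiter(char ch) =>
    ch == '\0' || char.IsWhiteSpace(ch) || ",:{}[]".IndexOf(ch) >= 0;

// Returns true when `constant` occurs at `position` and is followed by
// whitespace, a delimiter, or the end of the input.
bool MatchesConstant(string text, int position, string constant)
{
    if (position + constant.Length > text.Length) return false;
    if (string.CompareOrdinal(text, position, constant, 0, constant.Length) != 0) return false;
    int next = position + constant.Length;
    char following = next < text.Length ? text[next] : '\0'; // '\0' marks end of input
    return IsWhitespaceOrDelimiter(following);
}

// "null" at position 0 is followed by 'a', so it is only a prefix of "nullary".
Console.WriteLine(MatchesConstant("nullary{a:null}", 0, "null"));  // False
// "null" at position 10 is followed by '}', so it is a whole constant.
Console.WriteLine(MatchesConstant("nullary{a:null}", 10, "null")); // True
```

With this kind of check, the "null" in "nullary" is rejected as a constant and stays available to the identifier terminal, while the standalone "null" is still matched.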

Sep 29, 2012 at 1:02 PM

Thank you for fixing this issue. It works perfectly now!