How to define syntax

Aug 2, 2010 at 9:24 PM
Edited Aug 3, 2010 at 2:55 PM

Hi, I am new to language processing and I want to create a parser with Irony for the following syntax: name1:value1 name2:value2 name3:value3 ... where each name is the name of an XML element and each value is the value of that element, which can also include spaces. I have tried to modify the included samples like this:

    public TestGrammar()
    {
        var name = new IdentifierTerminal("name");
        var value = CreateTerm("value");

        var queries = new NonTerminal("queries");
        var query = new NonTerminal("query");
        queries.Rule = MakePlusRule(queries, null, query);
        query.Rule = name + ":" + value;
        Root = queries;
    }

    private IdentifierTerminal CreateTerm(string name)
    {
        IdentifierTerminal term = new IdentifierTerminal(name, "!@#$%^*_'.?-", "!@#$%^*_'.?0123456789");
        term.CharCategories.AddRange(new[]
                                         {
                                             UnicodeCategory.UppercaseLetter, //Ul
                                             UnicodeCategory.LowercaseLetter, //Ll
                                             UnicodeCategory.TitlecaseLetter, //Lt
                                             UnicodeCategory.ModifierLetter, //Lm
                                             UnicodeCategory.OtherLetter, //Lo
                                             UnicodeCategory.LetterNumber, //Nl
                                             UnicodeCategory.DecimalDigitNumber, //Nd
                                             UnicodeCategory.ConnectorPunctuation, //Pc
                                             UnicodeCategory.SpacingCombiningMark, //Mc
                                             UnicodeCategory.NonSpacingMark, //Mn
                                             UnicodeCategory.Format //Cf
                                         });
        //StartCharCategories are the same
        term.StartCharCategories.AddRange(term.CharCategories);
        return term;
    }

but this doesn't work if the values include spaces. Can this be done (using Irony) without modifying the syntax (like adding quotes around values)? Many thanks!

Aug 3, 2010 at 8:38 AM

I can think of 3 easy solutions:

  • You escape the spaces in the values with something like "\s". I believe this requires some extra code to turn the escape sequences back into spaces.
  • You use delimiters for the string:

      public TestGrammar()
      {
         var name = CreateTerm("name");
         var value = new StringLiteral("value", "\"", StringOptions.AllowsDoubledQuote | StringOptions.NoEscapes);
         var queries = new NonTerminal("queries");
         var query = new NonTerminal("query");

         queries.Rule = MakePlusRule(queries, null, query);
         query.Rule = Empty | name + ":" + value;

         Root = queries;
      }

  • You place each query on a different line:

      public TestGrammar()
      {
         var name = CreateTerm("name");
         var value = new FreeTextLiteral("value", FreeTextOptions.AllowEof, "\r", "\n");
         var queries = new NonTerminal("queries");
         var query = new NonTerminal("query");

         queries.Rule = MakePlusRule(queries, null, query);
         query.Rule = Empty | name + ":" + value;

         Root = queries;
      }
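
As for the first option (escaping spaces), the extra code could be as small as the sketch below. It is plain C# and does not use Irony at all: ValueEscaping, Escape and Unescape are hypothetical names made up for this illustration. The idea is to call Unescape on the text of each value token after parsing. I am also assuming that the value terminal accepts the backslash character, which the CreateTerm above does not do as written, so "\\" would probably need to be added to its extra-character strings.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of Irony): encodes spaces as "\s" so a value
// contains no whitespace, and decodes them again after parsing.
public static class ValueEscaping
{
    // Turns each space into "\s" and each backslash into "\\".
    public static string Escape(string raw)
    {
        var sb = new StringBuilder(raw.Length);
        foreach (char c in raw)
        {
            if (c == ' ') sb.Append("\\s");
            else if (c == '\\') sb.Append("\\\\");
            else sb.Append(c);
        }
        return sb.ToString();
    }

    // Reverses Escape: "\s" becomes a space and "\\" becomes a single backslash.
    public static string Unescape(string escaped)
    {
        var sb = new StringBuilder(escaped.Length);
        for (int i = 0; i < escaped.Length; i++)
        {
            char c = escaped[i];
            if (c == '\\' && i + 1 < escaped.Length)
            {
                char next = escaped[++i];
                if (next == 's') sb.Append(' ');
                else if (next == '\\') sb.Append('\\');
                else sb.Append('\\').Append(next); // unknown escape: keep it unchanged
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this, the input would be written as name1:John\sSmith, and ValueEscaping.Unescape would be applied to the matched value text to get the original value back.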

The code here is not perfect and has not been thoroughly tested, but it might give you a good starting point.

Good luck!