
Simple XML Grammar doesn't work as expected

Mar 14, 2013 at 12:50 AM
Edited Mar 14, 2013 at 12:52 AM
I've been trying to write a simple XML grammar using Irony. The grammar is defined as follows:
this.Root = tagList;
tagList.Rule = this.MakeStarRule(tagList, tag);
tag.Rule = conditionBlock;

ifTagOpen.Rule = < + "if" + "condition" + equals + "'a'" + >;
ifTagClose.Rule = < + slash + "if" + >;
elseIfTag.Rule = < + "else" + "if" + equals + "'a'" + >;
elseTagClose.Rule = < + slash + "else" + >;

conditionBlock.Rule = ifTagOpen + ifTagCloseElseIf | ifTagOpen + tag + ifTagCloseElseIf;
elseIfTagBlock.Rule = elseIfTag + elseTagClose;
ifTagCloseElseIf.Rule = ifTagClose | ifTagClose + elseIfTagBlock | Empty;
Now, I want Irony to parse code like this (which works):
<if condition='a'><if condition='a'></if><else if='a'></else></if>
But I also want input like this to parse:
<if condition='a'><if condition='a'></if></if>
and this is where I got stuck, because this won't parse due to the second alternative of the last rule:
ifTagCloseElseIf.Rule = ifTagClose | ifTagClose + elseIfTagBlock | Empty;
I suspect the reason is that Irony looks ahead only one token at a time: when it encounters the closing if tag, it applies the rule ifTagClose + elseIfTagBlock instead of the plain ifTagClose because of the following < token, and as a result it expects </if><else if='a'>.

My question is, is there any workaround to this? Or should I use something else instead of Irony?
Mar 14, 2013 at 3:20 AM
That's quite a strange way to express this. I'm pretty sure you have grammar conflicts (are there any shown in Grammar Explorer?), and one of them is surely of the 'dangling-else' type (google it). Look at how if-then-else is expressed in one of the sample languages (like C# or miniPython), and do something similar.
Mar 14, 2013 at 2:59 PM
Edited Mar 14, 2013 at 3:09 PM
Thank you for your answer. There are no grammar conflicts. Are you sure that it's a dangling-else problem? If it were, there would be conflicts, if my understanding of the dangling-else problem is correct. Also, if I change the fourth through seventh rules as follows:
ifTagOpen.Rule = <  + "a" + "condition" + equals + "'i'" + >;
ifTagClose.Rule = <  + slash + "a" + >;
elseIfTag.Rule = <  + "b" + "a" + equals + "'i'" + >;
elseTagClose.Rule = <  + slash + "a" + >;
it still doesn't perform as expected even though there are no "if" and/or "else" keywords. I think the problem lies in the "<" sign at the start of every terminal. To my mind, this is similar to the problem with generic types (List<string>) as defined in the C# grammar shipped with Irony. Am I at least close?
Mar 14, 2013 at 6:00 PM
Edited Mar 14, 2013 at 6:25 PM
I think I found the reason why it doesn't work. There indeed is a conflict, but Irony Grammar Explorer doesn't show it. I tried to define the grammar in GOLD parser, which found the conflict and even solved it correctly. So if I'm right, I need to replace the last rule with this one:
ifTagCloseElseIf.Rule = ifTagClose | ifTagClose + this.PreferShiftHere() + elseIfTagBlock | Empty;
That, however, does not solve the problem. Am I doing something wrong, or is this maybe a bug in Irony?
Mar 15, 2013 at 3:58 AM
At first sight, the problem lies in literal concatenation. For example, "a" + "condition" will be "acondition" and "b" + "a" will be "ba". I don't think that's what you wanted. Try using ToTerm("b") + ToTerm("a") instead.
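To illustrate the concatenation point, here is a minimal, self-contained sketch that does not use Irony at all. The `Term` class is a hypothetical stand-in with overloaded `+` operators (it is not Irony's `BnfTerm`), and `ConcatDemo` is likewise made up for the example. It only shows the general C# behaviour: `+` is evaluated left to right, so two adjacent plain strings are joined by ordinary string concatenation before any overloaded operator gets a chance to run.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a grammar term; this is NOT Irony's BnfTerm.
public sealed class Term
{
    // The individual pieces that ended up in the expression.
    public List<string> Parts { get; } = new List<string>();

    public Term(string text) { Parts.Add(text); }

    private Term(IEnumerable<string> left, IEnumerable<string> right)
    {
        Parts.AddRange(left);
        Parts.AddRange(right);
    }

    // Overloads so that terms and plain strings can be mixed with '+'.
    public static Term operator +(Term a, Term b) => new Term(a.Parts, b.Parts);
    public static Term operator +(Term a, string b) => new Term(a.Parts, new[] { b });
    public static Term operator +(string a, Term b) => new Term(new[] { a }, b.Parts);
}

public static class ConcatDemo
{
    // Two string literals come first: "a" + "condition" is plain string
    // concatenation, so the result has only two parts.
    public static Term LiteralsFirst()
    {
        var eq = new Term("=");
        return "a" + "condition" + eq;
    }

    // A Term comes first: every '+' uses an overloaded operator,
    // so each literal stays a separate part.
    public static Term TermFirst()
    {
        var eq = new Term("=");
        return new Term("a") + "condition" + eq;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", LiteralsFirst().Parts)); // acondition | =
        Console.WriteLine(string.Join(" | ", TermFirst().Parts));     // a | condition | =
    }
}
```

Note that the merge only happens when both operands of a given `+` are plain strings. If the leftmost operand is already a term object, the literals that follow it stay separate, so whether this applies to a particular rule depends on what the expression starts with.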
Mar 21, 2013 at 5:20 AM
From your sample above, you want to parse something like this:
<if condition='a'><if condition='a'></if><else if='a'></else></if>

Well, this is a really messy arrangement, IMHO. In one case 'if' is an element name, in the other it is an attribute name. This is the root of your problems, and it results in messy rules that you can't figure out (and I couldn't, honestly).
I can suggest a grammar that parses 'if' elements that have nested ifs and optional 'else' blocks:
      var stmtList = new NonTerminal("stmtlist");
      var stmt = new NonTerminal("stmt");
      var stmtOpt = new NonTerminal("stmtOpt");
      var ifStmt = new NonTerminal("ifStmt");
      var ifHead = new NonTerminal("ifHead");
      var elseBlockOpt = new NonTerminal("elseBlockOpt");

      var openIf = ToTerm("<if");
      var endIf = ToTerm("</if>");
      var elseTag = ToTerm("<else/>");
      var beep = ToTerm("<beep/>");
      var condExpr = ToTerm("'a'");

      this.Root = stmtList;
      stmtList.Rule = this.MakeStarRule(stmtList, stmt);
      stmt.Rule = ifStmt | beep;
      stmtOpt.Rule = stmt | Empty; 
      ifStmt.Rule = ifHead + stmtOpt + elseBlockOpt + endIf;
      ifHead.Rule = openIf + "condition" + "=" + condExpr + ">";
      elseBlockOpt.Rule = Empty | elseTag + stmtOpt;

It successfully parses the following input:

<if condition='a'>
<if condition='a'></if>
<if condition='a'><beep/></if>