Maybe optionals aren't working how they should

Nov 26, 2010 at 6:25 PM

Imagine two rules:

A.Rule = C + (Empty | D) + E;

B.Rule = C + D + E;
B.Rule |= C + E;

To the untrained eye (e.g. mine), these two rules should be equivalent. But they aren't. Most of the conflicts I have revolve around optional parts (be they star rules or optional rules). I can resolve them by changing them from the form of rule A into rule B.

I understand the technical reason why this happens but is there a way to fix the state generator so that these two rules are the same?


Nov 26, 2010 at 7:21 PM

This automatic resolution is known and called GLR:

this is still in TODO list - GLR elements... 

The trouble is that you got to be really careful about relying on GLR to resolve conflicts; some conflicts (like "dangling else") are intrinsic to grammar and must be resolved explicitly by extra resolution rules; others (like the ones you encounter) are result of limitations of parsing method. 

OSLO parser (now defunct and abandoned by MS) used GLR, but did not detect properly the nature of the conflict, or even reported conflicts, hoping to resolve them at parsing time through GLR... this approach seems unacceptable to me.

So the proper implementation requires some research


Nov 26, 2010 at 8:25 PM

I agree that the GLR approach is unacceptable.  I prefer to know if my grammar is ambiguous.  

My point is that if I can resolve a conflict by spelling out the options (as in my original post) then there shouldn't have been a conflict in the first place because I haven't (or shouldn't have) changed the grammar at all, only how I express it.


Nov 27, 2010 at 7:48 AM
Edited Nov 27, 2010 at 3:28 PM

"Spelling out the options" - you simply express the GLR paths to take - with middle element and without it. 

I was actually a bit imprecise. This is in fact Non-Canonical Parser (not GLR), when you try to continue parsing with both alternatives in mind. The main difference between NonCanonical and GLR parser is that in NonCanonical parser we try to modify the state automaton (at parser construction time) to "delay" conflict resolution by parsing both alternatives. In GLR, we do not modify state machine (change productions, etc), we use the same LALR state set, but we explore various alternatives (either in parallel or sequentially) until we find error-free path. 

I spent about 6 months last year researching and implementing non-canonical algorithms in Irony; it did not work well, parser builder was going into infinite loops, number of states often exploded, and it never worked beyond primitive samples. I did release the code, it had been in Irony for a while (as an option), but finally I pulled the plug and removed it. Maybe I'll have more luck with GLR

A small comment on your last statement:  "I haven't ..... changed the grammar at all, only how I express it." Well, you did in fact change the grammar, what you did not change is the language described by the grammar. For a given language, there are many grammars that describe (define) it. Next, there are several parsing algorithms (LR, LL, LALR, etc); parser implementing an algorithm is guided by the language grammar, but this grammar is not just any grammar out of many that describe the language. Each method requires special kind of grammar, with certain properties. For ex, LL parser cannot handle left recursion in grammar rules (it goes into infinite loop). So usually, when you construct a parser for a language, you go into 2 steps:

1 - obtain/write a grammar describing the language;

2 - adjust the grammar (restructure it) to fit the parsing method.

So that's exactly what you do now - adjusting the grammar to fit LALR parser. It is part of a normal, standard process, nothing unusual. 


Nov 28, 2010 at 12:12 AM
Edited Nov 28, 2010 at 12:13 AM

Ok.  That makes sense.  I have noticed that Irony does seem to put off making a decision for as long as possible.  For example, the following rules don't conflict:


A.Rule = B + C + D + "Bob" + F;
G.Rule = B + C + D + "Jim" + F;


The more I work with Irony the more impressed I am.  Thanks for your patience with me. :-)