This project has moved and is read-only. For the latest updates, please go here.

How to understand parser states for a simple ambiguity

Jan 14, 2012 at 12:28 PM
Edited Jan 14, 2012 at 12:35 PM


Think this is a simple one ... but I can't see it.

I am writing a simple sql dialect to permit none, one or multiple joins





I think, the offending part of my grammar is here


fromClause.Rule = FROM + eventIdentifier + joinClauseList;
joinClause.Rule = JOIN + eventIdentifier;
joinClauseList.Rule = MakeStarRule(joinClauseList, joinClause);


If I reference the joinClause in the fromClause it is fine, but the joinClauseList seems to be problematic. The parser states indicated in the ambiguity are

State S31
  Reduce items:
    JoinClauseListOptional -> JoinClauseListOptional JoinClause ·

State S32
  Shift items:
    JoinClause -> JOIN ·eventIdentifier
    eventIdentifier -> ·standardIdentifier
    standardIdentifier -> ·id_simple
  Transitions: eventIdentifier->S49, standardIdentifier->S24, id_simple->S10

State S33 (Inadequate)
  Shift items:
    JoinClauseListOptional -> JoinClauseListOptional ·JoinClause
    JoinClause -> ·JOIN eventIdentifier
  Reduce items:
    fromClause -> FROM eventIdentifier JoinClauseListOptional · [WHERE JOIN EOF]
  Transitions: JoinClause->S31, JOIN->S32

I'd be grateful if someone could just explain how I should read these parser states to understand the ambiguity. AFAICS, if the parser comes across a JOIN token if doesn't know where to go next. Is that correct or is it telling me something else?

Many thx




Jan 14, 2012 at 6:17 PM
Edited Jan 14, 2012 at 6:20 PM

I see that JOIN appears as a lookahead in reduce item in state S33. That's the problem, so where it's coming from? My guess is that your language allows (Select ... From) as join element, so the following is legal:

Select From A Join B Join Select From C Join D. 

If this is the case, then parser conflict is that it does not know what D is joined to - to C only, or to A-B-(SelectFromC)

Same thing as with dangling else. You can fix it by adding PreferShiftHere() before Join in the rule for JoinClause. This will tell parser to interpret the sample as "(C JOIN D)", so we join to the smallest ending chunk of expression.


If my guess is wrong, then trace where this JOIN lookahead is coming from - this is the source of ambiguity. Or publish more details.


Jan 15, 2012 at 8:24 AM
Edited Jan 15, 2012 at 8:55 AM

Thx very much again, Roman. Very helpful. That worked well. Can I just ask you a further question about this?

I found another discussion here speaking to the same issue

In there you suggest to use a ReduceHere() at the end of the join

Join a
Join b
Join c

gets parsed as (a Join b) Join c

a) does using a ReduceHere() at the end of the join statement equate to the PreferShiftHere() at the start?
b) what would I need to do to achieve a join (b join c)? fyi, I'm still not sure which I need for my grammar

Many thx again, for the excellent tool.


Jan 16, 2012 at 2:18 AM


answering your questions. Yes, my advice in previous post is really opposite in effect to what I just advised. For some reason, at this time long ago I thought this was more "natural" join, while today it seems to me the other way is more appropriate. Time to see SQL standard I think. You can do it either way. 

Your question a. : putting ReduceHere and PreferShiftHere have an OPPOSITE effect, each telling parser to do opposite things in conflict state. 

b. to achieve "a join (b join c)" you need to put ReduceThis() at the end of joinClause expression

Jan 16, 2012 at 6:46 AM

Excellent. Thx very much once again.