NonTerminal precedence

Jun 14, 2010 at 3:47 PM
Edited Jun 14, 2010 at 3:51 PM
Hi,

I'm new to Irony, I love it, but there is one thing I can't figure out. This is probably a newbie question, but I can't find any help in the forums.

What I'd like to do is something similar to Google search. For example if I search for "operator" on this forum I would type:
"site:irony.codeplex.com operator"

I could also use Google to search for a web address (stupid, I know :)), so I could type something like
"http://www.google.com"

What I want is for every "site:..." to become a Site node and everything else, even if it contains a colon, to end up in a Title node. I can't figure out how to give higher precedence to a non-terminal.

Here is my code so far:

var Term = new IdentifierTerminal("term", ":/.", "");
var Phrase = new StringLiteral("Phrase", "\"");
var ImpliedAnd = new ImpliedSymbolTerminal("ImpliedAnd");

var Sentence = new NonTerminal("Sentence");
var BinarySentence = new NonTerminal("BinarySentence", typeof(BinaryNode));
var Expression = new NonTerminal("Expression");
var PrimaryExpression = new NonTerminal("PrimaryExpression");
var BinaryExpression = new NonTerminal("BinaryExpression", typeof(BinaryNode));
var ParenthesizedExpression = new NonTerminal("ParenthesizedExpression");
var Title = new NonTerminal("Title", typeof(TitleNode));
var BinaryOp = new NonTerminal("BinaryOp");
var Site = new NonTerminal("Site", typeof(SiteNode));
Root = Sentence;

Sentence.Rule = Expression;
Expression.Rule = PrimaryExpression | BinaryExpression;
BinaryExpression.Rule = Expression + BinaryOp + Expression;
BinaryOp.Rule = ImpliedAnd | "and" | "or";

PrimaryExpression.Rule = Site | ParenthesizedExpression | Title;
Site.Rule = ToTerm("site:") + Term | ToTerm("site:") + Phrase;
ParenthesizedExpression.Rule = "(" + Expression + ")";
Title.Rule = Term;

RegisterOperators(10, "or");
RegisterOperators(20, "and");
RegisterOperators(20, ImpliedAnd);

MarkPunctuation(new string[] { "(", ")"});
MarkTransient(new[] { PrimaryExpression, Expression, ParenthesizedExpression, BinaryOp, Sentence});

MarkReservedWords("site:");
LanguageFlags = LanguageFlags.CreateAst;

If I use this code only Title nodes are created, even if I parse "site:www.google.com", probably because I allow colon in IdentifierTerminal for Title.

How can I exclude "site:" from the IdentifierTerminal used for Title?
Is there any other way to do this? Do I need to create a custom terminal? What is MarkReservedWords used for (could it help me in this case)?
Coordinator
Jun 14, 2010 at 10:01 PM

You definitely have an ambiguous grammar; Irony picks one route by default, and it is not the one you want. As a first guess, try setting the priority of "site:" higher, like this:

var site = ToTerm("site:"); //put this in declarations and use it in rules

site.Priority = 100; //should be higher than identifier's priority

 

I don't think you need to register "site" as a reserved word with MarkReservedWords. Another thing: with MarkTransient you can pass the list directly, without creating an array with new[].

Let me know if it helps

Roman

 

Jun 15, 2010 at 11:50 AM
Edited Jun 15, 2010 at 11:54 AM
Hi,

Setting the priority didn't help; the Title node always takes over. My code is now:

var Term = new IdentifierTerminal("term", ":/.", "");
var Phrase = new StringLiteral("Phrase", "\"");
var ImpliedAnd = new ImpliedSymbolTerminal("ImpliedAnd");

var Sentence = new NonTerminal("Sentence");
var BinarySentence = new NonTerminal("BinarySentence", typeof(BinaryNode));
var Expression = new NonTerminal("Expression");
var PrimaryExpression = new NonTerminal("PrimaryExpression");
var BinaryExpression = new NonTerminal("BinaryExpression", typeof(BinaryNode));
var ParenthesizedExpression = new NonTerminal("ParenthesizedExpression");
var Title = new NonTerminal("Title", typeof(TitleNode));
var BinaryOp = new NonTerminal("BinaryOp");
var Site = new NonTerminal("Site", typeof(SiteNode));
var SiteTerm = ToTerm("site:");
SiteTerm.Priority = 100;
Root = Sentence;

Sentence.Rule = Expression;
Expression.Rule = PrimaryExpression | BinaryExpression;
BinaryExpression.Rule = Expression + BinaryOp + Expression;
BinaryOp.Rule = ImpliedAnd | "and" | "or";

PrimaryExpression.Rule = Site | ParenthesizedExpression | Title;
Site.Rule = SiteTerm + Term | SiteTerm + Phrase;
ParenthesizedExpression.Rule = "(" + Expression + ")";
Title.Rule = Term;

RegisterOperators(10, "or");
RegisterOperators(20, "and");
RegisterOperators(20, ImpliedAnd);

MarkPunctuation(new string[] { "(", ")" });
MarkTransient(PrimaryExpression, Expression, ParenthesizedExpression, BinaryOp, Sentence);

LanguageFlags = LanguageFlags.CreateAst;


I thought maybe I could set a priority on the Site non-terminal itself,

var Site = new NonTerminal("Site", typeof(SiteNode));

but I could only find a Precedence property, and setting it didn't help either. What else can I do?
Coordinator
Jun 15, 2010 at 10:45 PM

I'll look and try your grammar tonight

Roman

Coordinator
Jun 16, 2010 at 5:44 AM
Edited Jun 16, 2010 at 5:46 AM

OK, got it. Just two things:

SiteTerm.AllowAlphaAfterKeyword = true;       

Term.Priority = Terminal.LowestPriority;

I completely forgot about this AllowAlpha... flag; it was a while ago. This was the main problem. With your original grammar it works OK if there's a space after "site:", but if there's no space, the default rule is that a keyword cannot be immediately followed by something, which makes sense most of the time. Setting this flag overrides that rule.
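To make the default rule concrete, here is a minimal, self-contained sketch of the behavior described above. It is not Irony's code: `TryMatchKeyword` is a hypothetical helper, and it assumes the check rejects a keyword when the next character is a letter or digit (the exact test Irony applies is not shown in this thread).

```csharp
using System;

static class KeywordMatchSketch
{
    // Hypothetical helper, not part of Irony: tries to match `keyword` at `pos`.
    // Assumption: when allowAlphaAfterKeyword is false, the match is rejected
    // if the character right after the keyword is a letter or digit.
    static bool TryMatchKeyword(string input, int pos, string keyword, bool allowAlphaAfterKeyword)
    {
        // The keyword text itself must be present at the current position.
        if (string.CompareOrdinal(input, pos, keyword, 0, keyword.Length) != 0)
            return false;

        int next = pos + keyword.Length;
        if (!allowAlphaAfterKeyword && next < input.Length && char.IsLetterOrDigit(input[next]))
            return false;

        return true;
    }

    static void Main()
    {
        // No space after the keyword, flag off: rejected, so the identifier wins.
        Console.WriteLine(TryMatchKeyword("site:www.google.com", 0, "site:", false));  // False
        // A space after the keyword, flag off: accepted.
        Console.WriteLine(TryMatchKeyword("site: www.google.com", 0, "site:", false)); // True
        // No space, flag on: accepted.
        Console.WriteLine(TryMatchKeyword("site:www.google.com", 0, "site:", true));   // True
    }
}
```

The first call mirrors the failing input from the question: once the keyword is rejected, the only terminal left that can consume "site:www.google.com" is the identifier, which is why only Title nodes appeared.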

The other line sets the lowest priority on the normal identifier (rather than a higher priority on the KeyTerm, which you don't need to do). There seems to be an inconsistency in the scanner constructor: it automatically assigns priorities to keywords, overwriting the values the developer set in the grammar. I will change it to overwrite them only if they were not assigned explicitly.
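For placement, the two lines would sit among the terminal declarations of the grammar posted earlier in the thread. The fragment below is only a sketch of that placement, not a complete program: it belongs inside the grammar's constructor, reuses the names from the earlier posts, and takes `Terminal.LowestPriority` as it is named in this thread (other Irony versions may name it differently).

```csharp
// Fragment for the grammar constructor; Site, Title and Phrase are declared
// as in the earlier post.
var Term = new IdentifierTerminal("term", ":/.", "");
// Lowest priority on the identifier, so that "site:" can win over it.
Term.Priority = Terminal.LowestPriority;

var SiteTerm = ToTerm("site:");
// Allow "site:" to be followed directly by a letter, as in "site:www.google.com".
SiteTerm.AllowAlphaAfterKeyword = true;

// Rules unchanged from the earlier post.
Site.Rule = SiteTerm + Term | SiteTerm + Phrase;
Title.Rule = Term;
```

The explicit `SiteTerm.Priority = 100;` from the second post is left out here, since the reply says it is not needed.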

Let me know how it works for you

Roman

 

Jun 16, 2010 at 9:16 AM
Works like a charm, thank you very much.
I would never have figured out that I needed AllowAlphaAfterKeyword; I always assumed the problem was with priority.
Thanks again.