Is there a way to specify a grammar rule as "anything but x or y or z" or not x?

Sep 18, 2011 at 10:23 AM
Edited Sep 19, 2011 at 5:51 PM

I'm new to Irony and am finding it very powerful and flexible.  Currently, I'm looking for a way to set up a rule that matches "anything but x or y or z".  Is there an operator or construct to support NOT?

I have a template language that is kind of asp/c# like.  An example template looks like:

#region "Methods"

///--------------------------------------------------------------------------------
/// <summary>This method gets the <%=BLLClassName%> instance from the database,
/// using its primary key values.</summary>
///--------------------------------------------------------------------------------
public override void Load()
{
	Load(<%=TAB 1%>
<%
	foreach (Property where IsPrimaryKeyMember == true)
	{
		<%&<%=BLLPropertyName%>, %>
	}
	<%&0);%>
%>
}<%=TAB -1%>
#endregion "Methods"

In the literal text sections of the template, it's important to retain all of the white space, so I have a preparsing stage that would store these "blobs" of text.

The result of preparsing and translating the above template example would look like:

<%=CONTENT 1%><%=BLLClassName%><%=CONTENT 2%><%=TAB 1%>
<%
	foreach (Property where IsPrimaryKeyMember == true)
	{
		<%&<%=BLLPropertyName%><%=CONTENT 3%>%>
	}
	<%&<%=CONTENT 4%>%>
%>
<%=CONTENT 5%><%=TAB -1%>
<%=CONTENT 6%>
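
For readers following along, here is a minimal sketch of the "store the blob, leave a placeholder" step described above. The class name BlobStore and its methods are hypothetical (they are not part of Irony or of my actual preparser); the sketch only assumes that placeholders are numbered from 1, as in the output shown.

```csharp
using System.Collections.Generic;

// Hypothetical helper: keeps literal text verbatim and hands back a placeholder.
public class BlobStore
{
    private readonly Dictionary<int, string> _blobs = new Dictionary<int, string>();

    // Stores the text unchanged (whitespace and line breaks included) and
    // returns the placeholder that takes its place in the template.
    public string Add(string text)
    {
        int index = _blobs.Count + 1;   // assumption: numbering starts at 1
        _blobs[index] = text;
        return "<%=CONTENT " + index + "%>";
    }

    // Looks up the original text for a placeholder index.
    public string Get(int index)
    {
        return _blobs[index];
    }
}
```

Each literal section goes through Add while the template is being scanned, and Get returns the original text when the translated template is expanded.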
 

The template grammar and the parsing that follows the preparsing stage give me no trouble; the preparsing grammar does.  I'm trying to define a rule that matches anything except the symbols, whitespace and line terminators included.

The preparsing grammar looks like the following (the blob rule is the one I'm having trouble with):

	public partial class TemplateContentPreGrammar : Grammar
	{
		public TemplateContentPreGrammar()
		{
			this.LanguageFlags |= LanguageFlags.CreateAst;
			this.GrammarComments = "This grammar is used to preparse a template.\r\n" +
								   "This preparser replaces blob text with <%=CONTENT n%>, where n\r\n" +
								   "is the index to the actual text stored in a dictionary.\r\n";

			#region "Symbols, Punctuation, etc."

			// symbols
			KeyTerm evalOpen = ToTerm("<%", "evalOpen");
			KeyTerm contentPropOpen = ToTerm("<%=", "contentPropOpen");
			KeyTerm outputPropOpen = ToTerm("<%>", "outputPropOpen");
			KeyTerm appendOpen = ToTerm("<%&", "appendOpen");
			KeyTerm appendLineOpen = ToTerm("<%+", "appendLineOpen");
			KeyTerm close = ToTerm("%>", "close");

			// nothing is whitespace for the preprocessor
			this.WhitespaceChars = "";
			#endregion

			#region "Nodes"
			// high level nodes
			var template = new NonTerminal("template");
			var templateBlock = new NonTerminal("templateBlock");
			var paragraph = new NonTerminal("paragraph");
			var paragraphBlock = new NonTerminal("paragraphBlock");
			var evaluation = new NonTerminal("evaluation");
			var evaluationBlock = new NonTerminal("evaluationBlock");

			// statements
			var appendStatement = new NonTerminal("appendStatement");
			var appendLineStatement = new NonTerminal("appendLineStatement");

			// properties
			var property = new NonTerminal("property");
			var contentProperty = new NonTerminal("contentProperty");
			var outputProperty = new NonTerminal("outputProperty");

			// text and symbols
			var blob = new NonTerminal("blob");
			var symbol = new NonTerminal("symbol");
			#endregion

			#region "Rules"
			// a template consists of any number of template blocks
			template.Rule = MakeStarRule(template, null, templateBlock);

			// a template block is an evaluation or a paragraph
			templateBlock.Rule = evalOpen + evaluation + close | paragraph;

			// a paragraph consists of any number of paragraph blocks
			paragraph.Rule = MakeStarRule(paragraph, null, paragraphBlock);

			// a paragraph block is a property or a blob of text
			paragraphBlock.Rule = property | blob;

			// an evaluation consists of any number of evaluation blocks
			evaluation.Rule = MakeStarRule(evaluation, null, evaluationBlock);

			// an evaluation block is an append statement, append line statement or a blob of text
			evaluationBlock.Rule = appendStatement | appendLineStatement | blob;

			// an append statement includes a paragraph to append
			appendStatement.Rule = appendOpen + paragraph + close;

			// an append line statement includes a paragraph to append as a new line
			appendLineStatement.Rule = appendLineOpen + paragraph + close;

			// a property is a content property or an output property
			property.Rule = contentProperty | outputProperty;

			// a content property contains a blob of text between the content property delimiters
			contentProperty.Rule = contentPropOpen + blob + close;

			// an output property contains a blob of text between the output property delimiters
			outputProperty.Rule = outputPropOpen + blob + close;

			// a symbol is any of the recognized "directives"
			symbol.Rule = evalOpen | contentPropOpen | outputPropOpen | appendOpen | appendLineOpen | close;

			// a blob is anything but a symbol
			blob.Rule = symbol;

			#endregion

			// the template is the root of the grammar
			this.Root = template;

			// mark nodes to filter from the parse tree
			this.MarkTransient(templateBlock, paragraphBlock, evaluationBlock, property);
		}
	}

Thanks!


Sep 19, 2011 at 5:28 PM

I guess the answer is no.  Using RegexBasedTerminal helped define the "blob", but I'm still unable to get a token to go beyond end of line.  So at this point I'm doing preparsing through a custom technique.
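
As an illustration of the "anything but a symbol" idea outside of Irony, here is a rough sketch using plain .NET regular expressions. The class and method names are made up for the example. A negative lookahead expresses "not <% and not %>". In .NET, "." does not match a newline unless RegexOptions.Singleline is set, which may be related to the end-of-line limit I ran into, though I have not confirmed how RegexBasedTerminal builds its regex.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BlobRegexSketch
{
    // At the current position, match either a symbol or a blob.
    // A blob is one or more characters, none of which starts "<%" or "%>".
    // All open symbols (<%, <%=, <%>, <%&, <%+) begin with "<%".
    private static readonly Regex Token = new Regex(
        @"\G(?:(?<symbol><%[=>&+]?|%>)|(?<blob>(?:(?!<%|%>).)+))",
        RegexOptions.Singleline);   // lets "." cross line terminators

    // Returns the blobs in order; symbols are consumed and skipped.
    public static List<string> FindBlobs(string template)
    {
        var blobs = new List<string>();
        int pos = 0;
        while (pos < template.Length)
        {
            Match m = Token.Match(template, pos);
            if (!m.Success) break;   // defensive; each position starts a symbol or a blob
            if (m.Groups["blob"].Success) blobs.Add(m.Value);
            pos += m.Length;
        }
        return blobs;
    }
}
```

Note that the text between a property's delimiters (for example the BLLClassName in a content property) also comes back as a blob, which matches the contentProperty rule in the grammar above.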

Coordinator
Sep 19, 2011 at 6:28 PM

Well, parsing templates is a tricky area, and I admit that Irony does not have direct support for it.

I think the proper way is to treat template text as special comment-like terminals injected into the main text, which consists of your evaluation scriptlets. Take the template source and add "%>" at the beginning. Then define your "templateText" terminal as a quoted string that starts with "%>" and ends with "<%". Like this:

 

%>public override void Load()
{
	Load(<%=TAB 1%>
<%
	foreach (Property where IsPrimaryKeyMember == true)
..

... and so on. The literal fragments, each running from "%>" to the next "<%" (shown in red in the original post), are now "strings" embedded in the main text. Then you write your grammar around this modified source.

I'm planning to add more direct support for template parsing in the future.
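
To make the idea concrete without involving Irony, here is a rough sketch in plain C#. The class and method names are invented for the illustration, and appending a closing "<%" at the end is an extra assumption so that the last fragment is terminated too. In a real grammar the "%>" ... "<%" pair would be handled by a terminal rather than by a regex.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateTextSketch
{
    // A "%>" ... "<%" pair with the shortest possible body in between.
    // Singleline lets the body span several lines.
    private static readonly Regex TemplateText =
        new Regex(@"%>(.*?)<%", RegexOptions.Singleline);

    public static List<string> LiteralFragments(string template)
    {
        // Prepend "%>" as suggested above; the trailing "<%" is an
        // assumption of this sketch so the final fragment is closed as well.
        string wrapped = "%>" + template + "<%";

        var fragments = new List<string>();
        foreach (Match m in TemplateText.Matches(wrapped))
            fragments.Add(m.Groups[1].Value);
        return fragments;
    }
}
```

For the input "Load(<%=TAB 1%>\n<% foreach %>\n}" this yields three fragments: "Load(", "\n" and "\n}". The sketch does not recognize literal text nested inside an append statement such as <%& ... %>, so templates that use those need additional handling.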

Sep 19, 2011 at 11:51 PM

Hi Roman,

Thanks for the suggestion, and thanks for bringing Irony into this world!

I updated my template grammar, requiring explicit symbols to start and end literal text, and defining a terminal based on CommentTerminal.  It works great!  The template grammar is now a little more verbose, but the preparser (and separate grammar) is no longer necessary, and it will be easier to match errors to the template and color areas of the template text.

Thanks again,

Dave