Constraining a non-terminal to a single line

Mar 19, 2012 at 7:09 PM

I'm working on a parser for email headers, and have run into a bit of trouble with messages that have blank subjects.  What I want is for the parser to see a line that starts with "Subject:", and accept everything until the next line terminator as the message's subject.

What I have right now is:


var text = new FreeTextLiteral("text", FreeTextOptions.AllowEof, "\r\n", "\r", "\n") {Priority = TerminalPriority.Low};
var subject = new NonTerminal("subject");
subject.Rule = ToTerm("subject:") + text;

This works fine when the subject contains some text.  When the subject line is blank, though, the 'text' rule captures the next non-blank line instead.  What can be done to prevent this from happening?


Mar 20, 2012 at 4:56 PM

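The problem is that the scanner skips whitespace, line breaks included, after reading "subject:", so when the subject is blank the free-text terminal starts matching on the next line. Make "subject:" a prefix of the FreeTextLiteral instead.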
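To illustrate the effect, here is a toy model in plain C#. It is not Irony's scanner and uses none of Irony's API; the class and method names (ScannerSketch, SubjectAsTwoTokens, SubjectAsOneToken) are made up for this sketch. It assumes the input starts with the keyword and that line breaks count as skippable whitespace between tokens.

```csharp
using System;

// Toy model only: NOT Irony's implementation, just the whitespace-skipping effect.
public static class ScannerSketch
{
    static readonly char[] LineEnds = { '\r', '\n' };

    // Reads from 'pos' up to (not including) the next line terminator or end of input.
    static string ReadToLineEnd(string input, int pos)
    {
        int end = input.IndexOfAny(LineEnds, pos);
        return end < 0 ? input.Substring(pos) : input.Substring(pos, end - pos);
    }

    // Keyword and text as two separate tokens: whitespace between them,
    // including line breaks, is skipped before the free text starts.
    public static string SubjectAsTwoTokens(string input, string keyword)
    {
        int pos = keyword.Length; // input is assumed to start with the keyword
        while (pos < input.Length && char.IsWhiteSpace(input[pos])) pos++;
        return ReadToLineEnd(input, pos);
    }

    // Keyword as a prefix of the free-text token: nothing is skipped in between,
    // so the text ends at the first line terminator after the keyword.
    public static string SubjectAsOneToken(string input, string keyword)
    {
        return ReadToLineEnd(input, keyword.Length).Trim();
    }

    public static void Main()
    {
        string headers = "subject:\r\nfrom: alice@example.com\r\n";
        Console.WriteLine("[" + SubjectAsTwoTokens(headers, "subject:") + "]");
        Console.WriteLine("[" + SubjectAsOneToken(headers, "subject:") + "]");
    }
}
```

With a blank subject, the two-token version returns the following header line, while the prefix version returns an empty string. When the subject has text, both return the same thing.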

Mar 20, 2012 at 6:02 PM
Edited Mar 20, 2012 at 6:04 PM

I changed the code to:

var subjectText = new FreeTextLiteral("text", FreeTextOptions.AllowEof, "\r\n", "\r", "\n") 
    { Firsts = new StringSet { "subject:" } };
var subject = new NonTerminal("subject");
subject.Rule = subjectText;

and it's working well now, thanks!

Out of curiosity: Style considerations aside, does leaving that vestigial nonterminal hurt anything?

Mar 20, 2012 at 11:12 PM

No, it doesn't hurt anything; it's harmless.