Constraining a non-terminal to a single line

Mar 19, 2012 at 8:09 PM

I'm working on a parser for email headers, and have run into a bit of trouble with messages that have blank subjects.  What I want is for the parser to see a line that starts with "Subject:", and accept everything until the next line terminator as the message's subject.

What I have right now is:

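// Free text up to the next line terminator (or end of input).
var text = new FreeTextLiteral("text", FreeTextOptions.AllowEof, "\r\n", "\r", "\n") {Priority = TerminalPriority.Low};
var subject = new NonTerminal("subject");
// Keyword and text are two separate tokens here.
subject.Rule = ToTerm("subject:") + text;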

This works fine when the subject contains some text. When the subject line is blank, though, the 'text' terminal captures the next non-blank line instead. What can be done to prevent this from happening?

Coordinator
Mar 20, 2012 at 5:56 PM

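The problem is that the scanner skips whitespace after reading "subject:". With a blank subject, that skipped whitespace includes the line break, so the free text starts on the next line. Make "subject:" a prefix of the FreeTextLiteral instead.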
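
To make the difference concrete, here is a minimal standalone sketch that imitates the two behaviors with plain string handling. It does not use Irony at all: the class and method names (ScannerSketch, KeywordThenText, PrefixedText) are made up for illustration, and the case-insensitive match is an assumption based on the grammar matching "Subject:" with "subject:".

```csharp
using System;

static class ScannerSketch
{
    static readonly char[] LineEnds = { '\r', '\n' };

    // Reads from 'pos' up to (not including) the next line terminator or end of input.
    static string ReadToLineEnd(string input, int pos)
    {
        int end = input.IndexOfAny(LineEnds, pos);
        return end < 0 ? input.Substring(pos) : input.Substring(pos, end - pos);
    }

    // Two-token approach: match the keyword, skip whitespace (line breaks included),
    // then read free text. This imitates the behavior described in the question.
    public static string KeywordThenText(string input, string keyword)
    {
        if (!input.StartsWith(keyword, StringComparison.OrdinalIgnoreCase)) return null;
        int pos = keyword.Length;
        while (pos < input.Length && char.IsWhiteSpace(input[pos])) pos++;
        return ReadToLineEnd(input, pos);
    }

    // Single-token approach: the keyword is a prefix of the free-text token,
    // so nothing is skipped between the keyword and the text.
    public static string PrefixedText(string input, string prefix)
    {
        if (!input.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)) return null;
        return ReadToLineEnd(input, prefix.Length);
    }

    static void Main()
    {
        string headers = "Subject:\r\nFrom: alice@example.com\r\n";
        Console.WriteLine("[" + KeywordThenText(headers, "subject:") + "]"); // [From: alice@example.com]
        Console.WriteLine("[" + PrefixedText(headers, "subject:") + "]");    // []
    }
}
```

With a blank subject, the first method runs past the line break and returns the From line, while the second returns an empty string. Note that in this sketch the second method keeps any leading space after the colon, so a caller would probably want to trim the result.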

Mar 20, 2012 at 7:02 PM
Edited Mar 20, 2012 at 7:04 PM

I changed the code to:

// "subject:" is now a prefix of the free-text token itself.
var subjectText = new FreeTextLiteral("text", FreeTextOptions.AllowEof, "\r\n", "\r", "\n")
    { Firsts = new StringSet { "subject:" } };
var subject = new NonTerminal("subject");
subject.Rule = subjectText;

and it's working well now, thanks!

Out of curiosity: Style considerations aside, does leaving that vestigial nonterminal hurt anything?

Coordinator
Mar 21, 2012 at 12:12 AM

No, it doesn't hurt anything; it's harmless.