Significant whitespace between non-terminals

Apr 1, 2010 at 6:18 PM

Hello. I am trying to write a CSS parser. In CSS, two selectors can mean completely different things depending on the spaces between their parts, for example:

[title="my-title"] .my-class #my-id

The production rule is (note no spaces inside).

simple_selector : element_name [ HASH | class | attrib | pseudo ]* | [ HASH | class | attrib | pseudo ]+

My implementation is wrong because the plus operator ignores whitespace:
simpleSelector.Rule = (elementName + (hash | @class | attribute | pseudo).Star())
| (hash | @class | attribute | pseudo).Plus();

What is the best way to handle this case?


Apr 1, 2010 at 9:24 PM
Edited Apr 1, 2010 at 9:25 PM

This reminds me of an earlier discussion thread with Roman, where I recommended adding to Irony's parser the ability to disallow whitespace between two terminals. He found the suggestion interesting and said it would be done with a grammar hint, such as NoWhiteSpace(), and that syntactic sugar could be added as shorthand, using something like the & character. It's probably low on his list, and I haven't had time to attempt the change myself.

Anyhow, as Irony stands now, I think your best bet would be to create a custom terminal that pays close attention to the special CSS characters ( [ . # ) and whitespace. Unfortunately, I don't think this would be a fun custom terminal to code, but I could be wrong.

Apr 1, 2010 at 10:43 PM

Yes, that's true: we've already discussed the no-whitespace option, and it is still on my to-do list. But for this case, I don't see how whitespace is relevant. Your BNF expression does not indicate that spaces are involved at all. Can you please elaborate on this? I suspect you have a different kind of problem.

General note: don't use the Plus() and Star() methods. They are obsolete and I will remove them soon; use MakePlusRule and MakeStarRule instead.
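For illustration, the rule from the first post could be rewritten with those methods roughly as follows. This is only a sketch: it assumes the code sits inside the Grammar constructor and that simpleSelector, elementName, hash, @class, attribute and pseudo are already defined as in the original post. The non-terminal names qualifier, qualifierPlus and qualifierStar are made up for the example. It only replaces the obsolete calls and does not change how whitespace is handled.

```csharp
// Fragment for the Grammar constructor (requires: using Irony.Parsing;).
// simpleSelector, elementName, hash, @class, attribute and pseudo are assumed
// to be defined elsewhere, as in the original post.

// Hypothetical helper non-terminals, introduced only for this sketch.
var qualifier = new NonTerminal("qualifier");
var qualifierPlus = new NonTerminal("qualifierPlus");   // one or more qualifiers
var qualifierStar = new NonTerminal("qualifierStar");   // zero or more qualifiers

// A single qualifier: HASH | class | attrib | pseudo
qualifier.Rule = hash | @class | attribute | pseudo;

// List rules built with MakePlusRule / MakeStarRule instead of Plus() / Star().
qualifierPlus.Rule = MakePlusRule(qualifierPlus, qualifier);
qualifierStar.Rule = MakeStarRule(qualifierStar, qualifier);

// simple_selector : element_name qualifier* | qualifier+
simpleSelector.Rule = (elementName + qualifierStar) | qualifierPlus;
```

Each list needs its own non-terminal, because MakePlusRule and MakeStarRule take the list non-terminal as their first argument.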


Apr 27, 2010 at 1:29 PM

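Dubrovsky, I think the second example CSS selector you give is what's termed a "selector" in the grammar, not a "simple_selector". In the CSS grammar, whitespace between simple selectors acts as the descendant combinator, so it is handled by the "selector" rule rather than inside "simple_selector".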