First of all, kudos for the nice work on Irony.
I'm trying to implement a parser for a small home-made language. Among other things, I would like to define a new Terminal for boolean literals (TRUE or FALSE). I've tried several approaches, but none is really satisfying. (Note: in the following I'll talk about recognizing "TRUE", but the same holds for "FALSE", of course.)
- Define BoolLiteral as a non-terminal:
NonTerminal BoolLiteral = new NonTerminal("BoolLiteral");
BoolLiteral.Rule = Symbol("TRUE") | "FALSE";
This one works, but bool literals are then not leaves of the tree, which makes walking the tree for semantic analysis, interpretation or code generation quite painful, and it is not consistent with NumberTerminal or StringLiteral.
- Define my own BoolLiteral terminal class, either by writing it directly or by using a ConstantSetTerminal instance. This would look more familiar, but it doesn't work: in the scanner's terminal lookup table (_data.TerminalLookup), I then have two entries for the "T" prefix ("T" as in "TRUE"). Those two entries are IdentifierTerminal, then my BoolLiteral terminal, IN THAT ORDER. Since terminals are evaluated in that order, and both terminals match the same string with the same length ("TRUE "), the token is interpreted as an Identifier, not as a BoolLiteral.
Adding "TRUE" or "FALSE" to the ReservedWords property of IdentifierTerminal doesn't help, because in that case IdentifierTerminal emits a ReservedWord token, and I do not know what to do with it.
One solution is to set the Priority of my BoolLiteral to something greater than 0, but I find this a rather fragile approach (maintaining such priorities among numerous terminals would be tedious and error-prone).
Is the Priority setting indeed the right solution? In that case, shouldn't IdentifierTerminal have its Priority set to something very low (Int32.MinValue) by default, so that identifiers are always the "last match" among the various terminals?
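To make the tie concrete, here is a minimal standalone sketch of the behaviour described above. It is not Irony's scanner code: ToyTerminal, ToyScanner and the two matcher functions are hypothetical names invented for this illustration, and the selection rule (longest match, then higher priority, then list order) is only my reading of what I observe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a scanner terminal; NOT an Irony class.
public sealed class ToyTerminal
{
    public string Name { get; }
    public int Priority { get; }
    // Returns the length of the match at the start of the input, or 0 for no match.
    public Func<string, int> Match { get; }

    public ToyTerminal(string name, int priority, Func<string, int> match)
    {
        Name = name;
        Priority = priority;
        Match = match;
    }
}

public static class ToyScanner
{
    // Assumed rule: the longest match wins; on equal length the higher
    // priority wins; on equal priority the terminal listed first wins.
    public static string Pick(IReadOnlyList<ToyTerminal> terminals, string input)
    {
        ToyTerminal best = null;
        int bestLength = 0;
        foreach (var terminal in terminals)
        {
            int length = terminal.Match(input);
            if (length == 0) continue;
            bool better = best == null
                || length > bestLength
                || (length == bestLength && terminal.Priority > best.Priority);
            if (better) { best = terminal; bestLength = length; }
        }
        return best?.Name;
    }

    // A letter followed by letters or digits.
    public static int MatchIdentifier(string input)
    {
        int i = 0;
        while (i < input.Length
               && (char.IsLetter(input[i]) || (i > 0 && char.IsDigit(input[i]))))
            i++;
        return i;
    }

    // Exactly TRUE or FALSE, not followed by another letter or digit.
    public static int MatchBool(string input)
    {
        foreach (var word in new[] { "TRUE", "FALSE" })
        {
            bool atWordEnd = input.Length == word.Length
                || (input.Length > word.Length && !char.IsLetterOrDigit(input[word.Length]));
            if (input.StartsWith(word, StringComparison.Ordinal) && atWordEnd)
                return word.Length;
        }
        return 0;
    }
}
```

With both terminals at priority 0 and the identifier listed first, Pick returns "Identifier" for the input "TRUE "; raising the bool terminal's priority to 1 flips the result to "BoolLiteral". That is exactly why the fix works, and also why it depends on every other terminal's priority staying lower.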
Feb 7, 2008 at 4:17 AM
Generally, boolean constants are global, predeclared, read-only boolean variables, so that's how you should recognize them: as normal variables. It is after parsing, when you analyze variable scopes and definition locations (local/parameter/global), that you would recognize that this TRUE variable is a global variable with the value "true", predefined by the runtime. So let it be treated as a normal identifier at scanning/parsing time. Sorry, after-parse processing is not there yet, so you'll have to do it yourself, at least until my next code drop. See also the other discussion about processing AST trees.
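A minimal sketch of what such a do-it-yourself after-parse step might look like, assuming you keep your own symbol table. VariableInfo, Scope and CreateGlobal are hypothetical names made up for this example; none of them are Irony classes, and how you obtain identifier names from the parse tree is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical symbol-table entry; NOT part of Irony.
public sealed class VariableInfo
{
    public string Name { get; }
    public object Value { get; }
    public bool IsReadOnly { get; }

    public VariableInfo(string name, object value, bool isReadOnly)
    {
        Name = name;
        Value = value;
        IsReadOnly = isReadOnly;
    }
}

// Hypothetical lexical scope with a link to its enclosing scope.
public sealed class Scope
{
    private readonly Dictionary<string, VariableInfo> _variables =
        new Dictionary<string, VariableInfo>(StringComparer.Ordinal);
    private readonly Scope _parent;

    public Scope(Scope parent = null) { _parent = parent; }

    public void Declare(VariableInfo variable)
    {
        _variables[variable.Name] = variable;
    }

    // Walks outward from the innermost scope; returns null if the name is undeclared.
    public VariableInfo Resolve(string name)
    {
        for (var scope = this; scope != null; scope = scope._parent)
        {
            if (scope._variables.TryGetValue(name, out var variable))
                return variable;
        }
        return null;
    }

    // The global scope predeclares TRUE and FALSE as read-only variables,
    // so the scanner and parser can treat them as ordinary identifiers.
    public static Scope CreateGlobal()
    {
        var global = new Scope();
        global.Declare(new VariableInfo("TRUE", true, isReadOnly: true));
        global.Declare(new VariableInfo("FALSE", false, isReadOnly: true));
        return global;
    }
}
```

When the tree walker meets an identifier node, it would call Resolve on the current scope; a result that is read-only and declared in the global scope can then be folded into a constant, and an assignment to it reported as an error.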