Nested parsers

Apr 20, 2010 at 10:44 AM

Hello. I need to parse strings that contain structured data, like CSV.
Example:

constant FBGA256 : PIN_MAP_STRING :=

    "IOD3   : D3  , IOC2   : C2  , "&

    "IOE3   : E3  , IOC3   : C3  , "&

    "IOE4   : E4  , IOD2   : D2";


Depending on the constant's type, the string may contain different data (not only CSV).

 

What is the best way to implement such parsing?

Is it possible to do that in one pass, with one grammar?

 

Apr 20, 2010 at 6:26 PM
Edited Apr 20, 2010 at 6:55 PM

Hey K0zer,

Irony has a great literal called DsvLiteral just for parsing delimiter-separated values.  An example of this can be found under Irony.Samples > DataGrammars > SampleCsvGrammar.cs.

However, this may not be the best approach for your problem.  I believe just a straight grammar with MakePlusRule should suffice for the mapping pairs. I have not tested the following but it may be a good start.

 

MAP_PAIR.Rule = IO_ID + ":" + ID;  // IOD3 : D3     [IO_ID and ID would have to be declared depending on your rules; these may require a CustomTerminal]

MAP_PAIR_SET.Rule = MakePlusRule(MAP_PAIR_SET, ToTerm(","), MAP_PAIR); // IOD3 : D3 , IOC2 : C2   [the delimiter goes before the list member]

MAP_PAIR_STRING.Rule = ToTerm("\"") + MAP_PAIR_SET + ToTerm("\""); // "IOD3 : D3 , IOC2 : C2"   [a trailing comma before the closing quote is not covered by this rule]

MAP_PAIR_LINES.Rule = MakePlusRule(MAP_PAIR_LINES, ToTerm("&"), MAP_PAIR_STRING); // "IOD3 : D3 , IOC2 : C2" & "IOE3 : E3 , IOC3 : C3"

MAP_TYPE.Rule = TYPE + (Empty | ":=" + MAP_PAIR_LINES) + ";";

CONST_STMNT.Rule = ToTerm("constant") + NAME + ":" + MAP_TYPE; // NAME would probably be an IdentifierTerminal

Let me know if you have any further questions.

-MindCore

Apr 21, 2010 at 10:44 PM
Edited Apr 22, 2010 at 8:51 AM

The thing is, a quoted string can be split not only at a comma but at any position, and the content of the split strings has to be parsed as a whole.

Example:

constant FBGA256 : PIN_MAP_STRING := "IOD3   : D3 , IO" & "C2   : C2"; // for "IOD3: D3, IOC2: C2"

 

So the grammar for the string would be:

MAP_PAIR.Rule = IO_ID + ":" + ID;

MAP_PAIR_SET.Rule = MakePlusRule(MAP_PAIR_SET, ToTerm(","), MAP_PAIR);

And the grammar for the constant statement:

CONST_STMNT.Rule = ToTerm("constant") + NAME + ":" + TYPE + ":=" + quotedStringConcat;

 

The question is, can I use a single grammar to parse the entire expression in one pass?
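
For comparison, the two-stage reading of the rules above (join the quoted fragments first, then parse the pairs) can be sketched in plain C#. This is only an illustration and does not use Irony: the class and method names (PinMapTwoStage, JoinFragments, ParsePairs) are made up for this example, and in a real setup the second stage would be the MAP_PAIR_SET grammar rather than string splitting. Note that stage 1 just collects the quoted fragments and does not check that they are actually joined by "&".

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Irony.
public static class PinMapTwoStage
{
    // Stage 1: glue the quoted fragments of a concatenation ("abc" & "def")
    // into one string, regardless of where the split falls.
    public static string JoinFragments(string concat)
    {
        var joined = new StringBuilder();
        foreach (Match m in Regex.Matches(concat, "\"([^\"]*)\""))
            joined.Append(m.Groups[1].Value);
        return joined.ToString();
    }

    // Stage 2: split the joined text into (name, pin) pairs.
    public static List<KeyValuePair<string, string>> ParsePairs(string text)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (string item in text.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (item.Trim().Length == 0) continue; // tolerate a trailing ", "
            string[] parts = item.Split(':');
            if (parts.Length != 2)
                throw new FormatException("Expected NAME : PIN, got '" + item.Trim() + "'");
            pairs.Add(new KeyValuePair<string, string>(parts[0].Trim(), parts[1].Trim()));
        }
        return pairs;
    }
}
```

Feeding it the split example, JoinFragments("\"IOD3   : D3 , IO\" & \"C2   : C2\"") gives back the text IOD3   : D3 , IOC2   : C2, and ParsePairs on that text yields the pairs (IOD3, D3) and (IOC2, C2).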

 

Apr 22, 2010 at 12:24 PM
I believe this can be done in one pass; however, to accomplish it you will have to create a CustomTerminal. This terminal would trigger on the first double quote and terminate on the quote that is followed by a semicolon. The character sets, such as IOD3 and D3, would then become child tokens of the parent token, based on the colons and the previous token. The tree of tokens would look something like this:

MAP
|_____ IOD3
|      |_______D3
|
|______IOC2
       |_______C2

This would allow you to parse even concatenated strings in a single pass; however, you would have to add additional logic to verify that the concatenation is correct (i.e. looks like " & ") and, if not, return an Error token.

-MindCore
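
The scanning part of that idea (start at the first quote, accept only " & " between fragments, stop at the quote followed by a semicolon) might look roughly like the sketch below. It is plain C# and deliberately leaves out the Irony side: how the logic is hooked into a custom terminal and how child tokens or an Error token are produced is not shown, and the names ConcatScanner and ScanConcatenation are invented for this example. A null return stands in for the Error token.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Irony.
public static class ConcatScanner
{
    // Scans from the opening quote at 'start' up to the quote that is followed by ';'.
    // Returns the joined string content, or null if the concatenation is malformed.
    public static string ScanConcatenation(string source, int start)
    {
        if (start >= source.Length || source[start] != '"') return null;
        var content = new StringBuilder();
        int pos = start + 1;
        while (pos < source.Length)
        {
            char c = source[pos++];
            if (c != '"') { content.Append(c); continue; }

            // A quote closed the current fragment: decide what follows it.
            pos = SkipWhitespace(source, pos);
            if (pos < source.Length && source[pos] == ';')
                return content.ToString();   // end of the constant
            if (pos >= source.Length || source[pos] != '&')
                return null;                 // neither ';' nor '&' after the quote
            pos = SkipWhitespace(source, pos + 1);
            if (pos >= source.Length || source[pos] != '"')
                return null;                 // '&' is not followed by another fragment
            pos++;                           // step into the next fragment
        }
        return null;                         // input ended inside a fragment
    }

    private static int SkipWhitespace(string s, int pos)
    {
        while (pos < s.Length && char.IsWhiteSpace(s[pos])) pos++;
        return pos;
    }
}
```

The joined content it returns could then be split on commas and colons to build the IOD3/D3 and IOC2/C2 children shown in the tree.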