Custom Terminal

Jun 15, 2009 at 11:38 PM

 

 

Hello,

 

I’m new to Irony and I’m trying to use it to interpret a PLC program. For now, I’m only interested in the XML tree of the PLC program. 

 

I’m using the Irony version from the Download page (November 5 2008) and I have some difficulties with the Custom Terminal. 

 

In the header of a PLC program, we can assign a title to the program using this syntax:

 

TITLE = MAIN CALL OF THE PROCESS

 

This is my implementation of a Custom Terminal to retrieve the title’s string (everything on the line after the '='):

 

// Get the string up to the EOL
static Token TitleMatchHandler(Terminal terminal, CompilerContext context, ISourceStream source)
{
  Token result = null;
  int newPosition = source.Text.IndexOf("\r", source.Position);
  if (newPosition >= source.Position)
  {
    source.Position = newPosition;
    Token tkn = Token.Create(terminal, context, source.TokenStart, source.GetLexeme());
    result = tkn;
  }
  return result;
}

 

Here is how I’m using the Custom Terminal:

CustomTerminal titleTerminal = new CustomTerminal("titleTerminal", TitleMatchHandler, null);
NonTerminal title = new NonTerminal("title");
title.Rule = Symbol("TITLE") + "=" + titleTerminal;

 

My problem: the Custom Terminal handler is never called, and I get a parser error.

Is my handler function for the terminal implemented correctly?

 

Thank You

 

François

 

 

Coordinator
Jun 16, 2009 at 12:52 AM

First of all, I would change the terminal definition to include the "=" char, and pass this char as the third constructor parameter (prefixes). Check whether the handler is actually called in this case. Inside the handler, exclude the leading "=" from the result string. If the handler is still not called, try setting the Priority property on the terminal to something like 10.

Now the behavior would depend on whether you use "=" anywhere else, for example in an assignment statement. The problem is that the scanner will now try to parse every "=" char as the beginning of the Title terminal. The scanner needs assistance from the parser to resolve the conflict. The latest version on the Source page handles this much better: in case of a conflict like this, the scanner asks the parser which terminals are expected at the current position. So try the latest version; it might work better. You would still need to specify "=" as the starting symbol of your title.

Let me know if it works.

Roman

 

Jun 17, 2009 at 10:27 AM

 

Hello Roman,

With the version of November 2008, if I include the "=" char in my terminal and add it to the prefixes of the terminal, the handler is called and it works fine. However, if I remove the first character from the result string using the following code:

Token tkn = Token.Create(terminal, context, source.TokenStart, source.GetLexeme().TrimStart(' ', '\t', '='));

the parser fails when analyzing the next token.

 

With the new version of Irony (25364) everything seems to work fine using this handler:

private static Token TitleMatchHandler(Terminal terminal, CompilerContext context, ISourceStream source) {
  Token result = null;
  char[] endOfLineChars = {'\r', '\n'};
  int newPosition = source.Text.IndexOfAny(endOfLineChars, source.Position);
  if (newPosition >= source.Position) {
    source.Position = newPosition;
    string title = source.GetLexeme();
    var tkn = new Token(terminal, source.TokenStart, title, title.TrimStart(' ', '\t', '='));
    result = tkn;
  }
  return result;
}
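
For readers who want to try the string handling outside of Irony, here is a minimal, self-contained sketch of the same logic: find the end of the line, take the raw lexeme, and strip the leading "=" and blanks. The TitleLexeme class and its methods are hypothetical names used only for illustration; they are not part of Irony. Unlike the handler above, this sketch also treats the end of the text as the end of the title, on the assumption that a TITLE line could be the last line of a file with no line break after it.

```csharp
using System;

static class TitleLexeme
{
    // Hypothetical helper (not part of Irony): returns the index where the
    // title lexeme ends, i.e. the first '\r' or '\n' at or after 'start',
    // or the end of the text when no line break follows.
    public static int FindLexemeEnd(string text, int start)
    {
        int end = text.IndexOfAny(new[] { '\r', '\n' }, start);
        return end >= 0 ? end : text.Length;
    }

    // Returns the title value: the raw lexeme starting at 'start' with the
    // leading '=', spaces and tabs removed (same TrimStart call as the handler).
    public static string ExtractValue(string text, int start)
    {
        int end = FindLexemeEnd(text, start);
        string lexeme = text.Substring(start, end - start);
        return lexeme.TrimStart(' ', '\t', '=');
    }
}

static class Program
{
    static void Main()
    {
        string source = "TITLE = MAIN CALL OF THE PROCESS\r\nNEXT LINE";
        int start = source.IndexOf('=');   // where the "=" prefix would match
        Console.WriteLine(TitleLexeme.ExtractValue(source, start));
        // prints "MAIN CALL OF THE PROCESS"
    }
}
```

Keeping the raw lexeme and the trimmed value separate mirrors the handler above, where the token text stays intact and only the token value is trimmed.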

 

Thank you for your help,

François J.

 

Jun 18, 2009 at 1:19 PM

Hello,

I have a question concerning the IdentifierTerminal:

How is it possible to use some 'special characters' like a dot (.) or the German umlauts (ä, ö, ü) within an IdentifierTerminal? (e.g. "konto.vermögen")
The string which is used for parsing is UTF-8 encoded.

Thanks for your help,
- Tobias

Coordinator
Jun 18, 2009 at 4:20 PM

You should provide these extra characters as constructor parameters - there are separate lists of extra chars for the first char and for all chars that are allowed in an identifier. In addition, you may want to add the entire German alphabet, in case similar letters like "a" or "o" in the German charset are actually different chars (different char codes) from their English counterparts, which I guess is your case.
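
As a rough illustration of the point about char codes, the sketch below shows that the umlauts are distinct characters from their unaccented counterparts, and how two separate "extra chars" lists (one for the first char, one for all chars) can make an identifier such as "konto.vermögen" acceptable. The IdentifierCheck class is a hypothetical stand-in written for this example; it does not use Irony and is not a description of how IdentifierTerminal is implemented. It assumes a default rule of ASCII letters, digits and underscore only.

```csharp
using System;
using System.Linq;

static class IdentifierCheck
{
    static bool IsAsciiLetter(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

    // Hypothetical identifier rule: ASCII letters and '_' may start an
    // identifier, ASCII digits may follow; anything else must be listed
    // in extraFirstChars (first position) or extraChars (later positions).
    public static bool IsIdentifier(string s, string extraFirstChars, string extraChars)
    {
        if (string.IsNullOrEmpty(s)) return false;
        char first = s[0];
        if (!(IsAsciiLetter(first) || first == '_' || extraFirstChars.IndexOf(first) >= 0))
            return false;
        return s.Skip(1).All(c =>
            IsAsciiLetter(c) || (c >= '0' && c <= '9') || c == '_' || extraChars.IndexOf(c) >= 0);
    }
}

static class Program
{
    static void Main()
    {
        // German letters written as escapes: ä ö ü Ä Ö Ü ß
        string german = "\u00E4\u00F6\u00FC\u00C4\u00D6\u00DC\u00DF";
        string name = "konto.verm\u00F6gen";

        Console.WriteLine((int)'a');        // 97
        Console.WriteLine((int)'\u00E4');   // 228, a different char code

        // Without extra chars, the dot and the umlaut are rejected.
        Console.WriteLine(IdentifierCheck.IsIdentifier(name, "", ""));               // False
        // Umlauts allowed everywhere, the dot only after the first char.
        Console.WriteLine(IdentifierCheck.IsIdentifier(name, german, german + "."));  // True
    }
}
```

Strings like german and german + "." are the kind of values that could be passed as the extra-character constructor parameters mentioned above.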