Problem with comment grammar

May 12, 2009 at 1:46 PM


This is another bug that has cropped up since the recent build and I'm not entirely sure what I can do to fix it.  In Lua, a line comment is "--"  while the beginning of a block comment is "--[[".  Using an older build of Irony, I was able to correctly deduce everytime which items were line comments and which items are block comments.  It seems now it will always read as a line comment and never a block comment.  I've tried increasing the priority of the block comment terminal but that did not have any effect.  Any ideas?


May 12, 2009 at 4:01 PM

I'll have a look and let you know; I think I know where the problem may be...

May 12, 2009 at 4:22 PM
Edited May 12, 2009 at 4:24 PM

Not sure why it's happening for you. I've tried to add "block comment" to SQL grammar and it works fine. Try yourself, change comment definitions in SQL grammar to the following:

      var comment = new CommentTerminal("comment", "/*", "*/");
      var lineComment = new CommentTerminal("line_comment", "--", "\n", "\r\n");
      var blockComment = new CommentTerminal("block_comment", "--[[", "]]");

Then run Grammar Explorer and try the following script with SQL grammar:

-- line comment
FROM Product
 block comment
-- line comment
where A oR B and C OR D + X * 5;

Everything works fine, block comment is recognized. I've tried changing the order of adding to NonGrammarTerminals - works ok in every case


May 12, 2009 at 4:43 PM

This is interesting.  I tried testing this out in the Grammar Explorer and it was correctly recognizing the line comments and block comments.  However, when I go to run the language addon in Visual Studio, it is not working.  I stepped through the tokens that are returned by Parser.Scanner.VsReadToken() and it appears that both --[[ and --]] are read as 2 separate line-comments and not as a single block comment.  Is it possible that VsReadToken is where the problem is?

May 12, 2009 at 5:19 PM

it might be. can you send me your sources so I can play with it and fix the remaining issues with integration?

May 12, 2009 at 5:39 PM

Here ya go:

May 12, 2009 at 6:34 PM

got it, will look at it

May 18, 2009 at 6:35 PM

Any luck finding the issue?  It seems that VsReadToken is the culprit as it does not take into account any tokens on other lines that could modify the context (such as a block comment).  I verified this by removing the line comment terminal and only having the block comment terminal.  When the start/end tokens were on separate lines, but coloring was wrong.  But when both tokens were on the same line, the coloring was correct.  It really stinks that Visual Studio's standard line colorizing process only gives you the source for the specific line it wants to colorize.

May 18, 2009 at 7:19 PM

Just caught the issue :) 

The problem resides in VsScannerStateMap used in CompilerContext which uses bitfields to specify various bits and peices of information.  Well 2 of the fields (TerminalIndex, and TerminalSubType) both refer to specific indices (with 0 being a valid index!).  This means that whenever the first multiline token is found and the index in ScannerState is set to 0, the Value of ScannerState will still remain 0 (which is wrong since everything checks if Value is nonzero).  As a quick fix, I increment both fields by 1 and decrement them when they are used.  This is not the most elegant fix but it works :)  I'll post a patch once I figure out how to generate a patch from tortoisesvn.

May 18, 2009 at 8:17 PM

Yes, I've found and fixed this problem already, but there's some more trouble with StringLiteral. I'm trying to revive the integration tests in ScannerTests.cs (now called IntegrationTests). Need to complete all this and then submit the code. Sorry for the delay, had been busy with a lot of side issues (day-time job, you know...)

Will try to push it today or tomorrow


May 18, 2009 at 8:29 PM

Sounds good :)  Let me know if there is any are in particular you'd like some testing on.

May 19, 2009 at 5:14 AM

ok, uploaded, try it now. It looks like the whole integration thing needs a good refactoring

May 19, 2009 at 3:00 PM

There seems to be an ArgumentOutOfRangeException occuring when I type in the following line:

asdf --[[ comment
nutha comment --]] flibby

I noticed that "nutha comment" is not colored green until a third line is created under this.  The exception seems to be a problem with the TokenStart property associated with SourceStream.


I agree that the integration thing needs some refactoring.  I dislike the fact that Visual Studio gives you only the line modified and a state integer to use when coloring a line.  While this may be a perceived performance optimization, it seems like the loops you must jump through in order to make it work defeats the purpose but alas there's nothing that can be done in that respect.  It seems currently that the VS Integration is too tightly connected with the Irony compiler and parser which will ultimately distort the code and limit the flexibility of the colorizer.  It would be nicer to have this functionality in its own class or structure and still work in the single line/state limitation.  This way we can use it as a boiler plate that can be expanded.  

For example, I'm looking into building a CodeDOM from the ParseTree that Irony generates and then using my own implementation of the IVsColorizer interface, colorize the lines based on that DOM.  This way I can colorize not only base on language semantics but also based on content such as declared or missing variables, methods, etc.  Also, while the brace matching functionality that is currently in place is nice, it's still limiting as it cannot support multiple or colorize missing braces in a different color.  

This all will take time though but so far I really like the progress being made :)

May 19, 2009 at 4:58 PM

thanks, will try to see into that exception.

Yea, I would like to do just that - move the VsScan functionality away from Scanner, and handle it in more general way; although this partial scanning stuff should still remain in terminals. My plan is to switch to scanning using scanner's pipeline which includes token filters; it would be even better to actually involve parser, even in this case of line scanning; parser would simply work in start-stop mode. In this case you'd have much more info from parser context. That still needs to happen in some future, will try my best.

thanks again