Custom & FreeTextLiteral Terminals are not sorted by priority

Feb 3, 2010 at 6:58 PM
Edited Feb 3, 2010 at 7:05 PM

This is in reference to the work item #6291:

The latest code drop (change set #42496 on Jan 26, 2010) still does not fix this issue.

I decided to investigate this more thoroughly, and here is what I found:

In ScannerDataBuilder::BuildTerminalsLookupTable():

The _grammarData.Terminals contains ALL of the terminals in ALPHABETIC order.  (I haven't traced this to confirm that it is sorted this way, but it always appears this
way when I'm stepping through the function in the debugger.)

Since _grammar.FallbackTerminals is populated by sequentially going through _grammarData.Terminals and extracting the ones that have no "first"
characters defined, _grammar.FallbackTerminals is in ALPHABETIC order as well.

Since _data.TerminalsLookup.FallbackTerminals is populated by a call to AddRange(_grammar.FallbackTerminals), it too is in ALPHABETIC order.

(Meanwhile, the rest of the Terminals that DO have "first" characters are sorted into lists based on the "first" character and stored
in the _data.TerminalsLookup table.)

For each of the "first" characters that are in the lookup table (with the list of terminals that have this "first" character), the _grammar.FallbackTerminals
list is added and they are all sorted by reverse priority -- which is great!

However, the problem arises later in Scanner::ComputeCurrentTerminals() at Line #141 in Scanner.cs:

The Data.TerminalsLookup table is asked for the list of possible terminals based on the PreviewChar.  However, the index operator says that if the
preview char is not a stored key into the lookup table, then return the list of FallbackTerminals (which is sorted ALPHABETICALLY).

Basically, whenever there is NO terminal defined that specifically has the preview char as it's "first" character, then the preview char will not be in the
lookup table.  This causes it to return the FallbackTerminals which are sorted ALPHABETICALLY.  This causes a problem when the result returned
(In this case, the FallbackTerminals) is assumed to be sorted in reverse priority order.  The result is that they are tried in alphabetical order.

(Technically, since the _grammar.FallbackTerminals is a TerminalSet, and not a TerminalList, it can end up in any arbitrary order [sets have no order],
but every time I ran it, it retained the original alphabetical ordering.  Either way, it's not guaranteed to be reverse priority order).

My proposed solution is to simply sort the _data.TerminalsLookup.FallbackTerminals list right after it is populated on line #70:

private void BuildTerminalsLookupTable() {
  foreach (Terminal term in _grammarData.Terminals) {
    IList<string> firsts = term.GetFirsts();
    if (firsts == null || firsts.Count == 0) {
      continue; //foreach term
    //Go through prefixes one-by-one
    foreach (string prefix in firsts) {
      if (string.IsNullOrEmpty(prefix)) continue;
      //Calculate hash key for the prefix
      char hashKey = prefix[0];
        AddTerminalToLookup(hashKey, term);
      else {
        AddTerminalToLookup(char.ToLower(hashKey), term);
        AddTerminalToLookup(char.ToUpper(hashKey), term);
    }//foreach prefix in firsts
  }//foreach term

  //Add Fallback terminals to TerminalsLookup special field that contains list to return as default, when there is no key char in the table
  // Sort the FallbackTerminals in reverse priority order
  //Now add Fallback terminals to every list, then sort lists by reverse priority
  // so that terminal with higher priority comes first in the list
  foreach(TerminalList list in _data.TerminalsLookup.Values) {
    if(list.Count > 1)
  }//foreach list

After minor testing, I believe this completely fixes the issue.  One thing I haven't looked into is whether or not _grammar.FallbackTerminals is ever used later -- it's still not in reverse priority order.
I don't know if that could potentially cause issues or not.

As a side note, I happened to look at the Terminal.ByPriorityReverse function and wondered why you didn't just write it as:

public static int ByPriorityReverse(Terminal x, Terminal y)
  return y.Priority - x.Priority;
Feb 4, 2010 at 5:16 PM

thanks for catching it, will look into this. Shoot, it never occurred to me that Fallback terminals may not be always of "lowest" priorities. There's another bug there i believe, related to OutputTerminal field, I wll check it also and fix it (hopefully finally fix it)