
Resolve ambiguous term

Jun 28, 2012 at 1:52 AM

Hi, I have been using this and made some changes to the rules. Most of them cause parse state errors, but it works. The problem is ambiguous parsing of statements when it comes to local_variable_type and member_access.

This version uses member_access as the local variable type, but that fails when it meets an array type. So I replaced member_access with type_ref, and that works fine.

For example, suppose we have this class:

public class A
{
      public A _f1;
      public A M1() { return this; }
}

Let's look at some member access expressions:

_f1.M1()._f1._f1....... works fine :)

But the next one fails, because the parser jumps to the local variable type state when the second segment is an identifier:

_f1._f1.M1(); fails at "(", where the parser expects an identifier. That means the parser jumped into the type_ref state, and no type uses (), right?

My question: how do I jump to the other state, which should be member_access?

Jun 28, 2012 at 6:17 PM


It is really hard to understand what's going on from your explanations, but I will give you my best guesses.

.. Most of them make parse state error, but it is works...

I guess you see grammar errors (conflicts) in Grammar Explorer, but you think it's OK. It is NOT OK. You should not start parsing until you have fixed/resolved all grammar conflicts. All your troubles come from this.
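As a rough illustration of why conflicts matter, here is a minimal sketch that checks a grammar for conflicts in code rather than in Grammar Explorer. The toy grammar, AmbiguousGrammar, and the helper GrammarCheck.HasConflicts are hypothetical names made up for this example; they are not part of the C# sample. The grammar wraps one identifier terminal in two non-terminals, so the parser has to choose between them too early, and by LALR(1) reasoning that should show up as a reduce-reduce conflict.

```csharp
using System;
using Irony.Parsing;

// Hypothetical toy grammar: two non-terminals wrap the same identifier
// terminal, so the parser must pick one before it has seen enough input.
public class AmbiguousGrammar : Grammar
{
    public AmbiguousGrammar()
    {
        var identifier = new IdentifierTerminal("identifier");
        var typeName = new NonTerminal("type_name");
        var memberAccess = new NonTerminal("member_access");
        var declaration = new NonTerminal("declaration");
        var callStatement = new NonTerminal("call_statement");
        var statement = new NonTerminal("statement");

        // Both rules start with a dotted chain of identifiers.
        typeName.Rule = identifier | typeName + "." + identifier;
        memberAccess.Rule = identifier | memberAccess + "." + identifier;
        declaration.Rule = typeName + identifier + ";";
        callStatement.Rule = memberAccess + "(" + ")" + ";";
        statement.Rule = declaration | callStatement;
        Root = statement;
    }
}

public static class GrammarCheck
{
    // Builds the parser automaton, prints every reported grammar error,
    // and returns true when the worst one is a conflict or more severe.
    public static bool HasConflicts(Grammar grammar)
    {
        var language = new LanguageData(grammar);
        foreach (var error in language.Errors)
            Console.WriteLine(error.Level + ": " + error.Message);
        return language.ErrorLevel >= GrammarErrorLevel.Conflict;
    }

    public static void Main()
    {
        if (HasConflicts(new AmbiguousGrammar()))
            Console.WriteLine("Grammar has conflicts; fix them before parsing.");
    }
}
```

A check like this could run before any call to Parser.Parse, so that sample code is never parsed with an automaton built from a conflicting grammar.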

... jump to other state...

There is no such thing as jumping to another state. The parser is a state machine built from your grammar. Fix the grammar.

My guess is that your conflicts come from one of two missteps:

1. You defined two identifier terminals: one for types, one for variables/methods/fields. They are identical to the parsing engine, and it has very little information to decide which is which when it sees a sequence of characters.

2. You have one terminal for types and variables, but you have non-terminals for Type and MemberName that "wrap" this terminal. The trouble for the parser is that it has to decide which of these the identifier it has just received belongs to, and your grammar is ambiguous: it forces the parser to choose too early, before it has seen the rest of the expression. So get rid of these non-terminals (Type and MemberName and whatever else you have), and encode the expressions for declarations using the identifier terminal directly.

Fix all conflicts before you start parsing samples. Look at other sample grammars, like c#, for clues.
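To make the idea of not choosing too early more concrete, here is a minimal sketch, not taken from the C# sample grammar. All names in it (DeferredDecisionGrammar, name, call, member and so on) are hypothetical. It assumes a drastically reduced language with no array types, generics or arguments: a single dotted name rule serves as the prefix of both declarations and expressions, so the parser only commits once it sees the token after the name. By hand analysis this should be free of LALR(1) conflicts, but it is worth confirming in Grammar Explorer.

```csharp
using System;
using Irony.Parsing;

// Hypothetical toy grammar: declarations and expressions share one
// dotted-name rule, so the decision is deferred to the next token.
public class DeferredDecisionGrammar : Grammar
{
    public DeferredDecisionGrammar()
    {
        var identifier = new IdentifierTerminal("identifier");
        var name = new NonTerminal("name");
        var call = new NonTerminal("call");
        var member = new NonTerminal("member");
        var expr = new NonTerminal("expr");
        var declaration = new NonTerminal("declaration");
        var exprStatement = new NonTerminal("expr_statement");
        var statement = new NonTerminal("statement");

        // a.b.c : the only rule that consumes a plain chain of identifiers.
        name.Rule = identifier | name + "." + identifier;
        // expr() : a call on anything that is already an expression.
        call.Rule = expr + "(" + ")";
        // Member access is a separate rule only after a call, e.g. a.M().b.c
        member.Rule = call + "." + identifier | member + "." + identifier;
        expr.Rule = name | call | member;
        // name followed by an identifier is a declaration: A.B x;
        declaration.Rule = name + identifier + ";";
        exprStatement.Rule = expr + ";";
        statement.Rule = declaration | exprStatement;
        Root = statement;
    }
}

public static class Program
{
    public static void Main()
    {
        var parser = new Parser(new LanguageData(new DeferredDecisionGrammar()));
        string[] samples = { "_f1._f1.M1();", "_f1.M1()._f1._f1;", "A.B x;" };
        foreach (var source in samples)
        {
            ParseTree tree = parser.Parse(source);
            Console.WriteLine(source + " -> " + (tree.HasErrors() ? "error" : "ok"));
        }
    }
}
```

The design choice is that after a dotted name the next token settles the question: an identifier means a declaration, while "(" or ";" means an expression. Real C# needs more than this (for example, "[" can start either an array rank specifier or an indexer), which is where the sample grammar gets complicated.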




Jun 29, 2012 at 2:20 AM
Edited Jun 29, 2012 at 2:29 AM

Well, I don't really understand your code or your explanation. I know the grammar's parse states must have no errors, because the code makes its decisions from those states, right? But what I need is a parser that can parse C# code into symbols, and luckily it works, meaning there are no errors in the parse output.

There are some problems in the latest version, where statements that should be simple C# throw errors during parsing.

Example :

  • member_access.Rule = identifier_ext + member_access_segments_opt;

         That rule only works for simple member access and throws an error on statements like these:

         ((ClassA)id).method();   -> parenthesized member access

         typeof(ClassA).ToString() -> typeof access

         So I changed your rule into something like this:


 member_access.Rule = "this" + member_access_segments_opt | "base" + member_access_segments_opt |
                typeof_expression + member_access_segments_opt |
                //    qual_name_with_targs + member_access_segments_opt |
                identifier_or_builtin + member_access_segments_opt |
                parenthesized_expression + member_access_segments_opt;
  • local_variable_type.Rule = member_access | "var";
         When this meets an array type declaration like String[], it throws an error, so I was forced to change it as below:
    local_variable_type.Rule = type_ref | "var";
  • type_ref.Rule = type_or_void + qmark_opt + rank_specifiers_opt + typearg_or_gendimspec_list;
         I changed this, since only a method return type can be void:
         type_ref.Rule = qual_name_with_targs + qmark_opt + rank_specifiers_opt + typearg_or_gendimspec_list;
  • typecast_expression.Rule = parenthesized_expression + primary_expression;
         changed into :
         typecast_expression.Rule = typecast_target + primary_expression;
         typecast_target.Rule = Lpar + type_ref + Rpar;
  • method_declaration.Rule = member_header + type_ref + qual_name_with_targs  // + type_parameter_list.Q()
                 + formal_parameter_list_par + type_parameter_constraints_clauses_opt + method_body;
          changed into :
    method_declaration.Rule = member_header + "void" + qual_name_with_targs
              + formal_parameter_list_par + type_parameter_constraints_clauses_opt + method_body |
                member_header + type_ref + qual_name_with_targs
              + formal_parameter_list_par + type_parameter_constraints_clauses_opt + method_body;
All of these changes produce some conflicts in the parse states, but the parser can parse all my code and I get the symbols :)

BUT, ....

It has one problem: it cannot parse this statement (it treats the member_access as a local_variable_declaration):

property1.property2.method1(); // --> error on "(". I checked: the parser treats it as a local_variable_declaration,
// because the second segment, property2, meets the criteria of qual_name_segment
// in type_ref -> qual_name_with_targs -> qual_name_segment.

But it works on something like:

property1.method1().property2; // --> method1() is parsed as member_access_segment ->
// argument_list_par, since it found ().

So, my conclusions are:
  1. The parser only looks from one token to the next and matches the first rule, then just throws an error when the next token does not match that rule.
  2. The parser should cache one complete statement from start to end, so that when an error is found in the middle, it can retry parsing from the beginning of the statement using another available rule.

OK, that is all I can explain. I hope to find a solution for this :)

Jul 2, 2012 at 5:39 PM

Well, I'm afraid I can't help you here. As far as I finally understood, you're playing with the C# sample grammar and expect it to parse all kinds of C# expressions. It does not work that way: the C# grammar is just a limited sample, and it is far from a complete C# parser. Finishing it would require a lot of tweaking, custom actions (token preview) and other stuff, and definitely an advanced level of expertise in LALR parsing algorithms. Do not take this sample too seriously.