Sample of Standard Pascal in Irony

Sep 29, 2009 at 3:31 PM

I decided to take some time and port some code I found on the web over to the Irony framework. I am delighted with the outcome so far.

Below, please find an implementation of ISO 7185:1990 Standard Pascal. I have also included some useful links if someone wants to improve upon or correct what I've done so far. This was meant to be a learning experience for me, so any feedback would be greatly appreciated.

Thanks,
MindCore

StandardPascalGrammar.cs  (add file in the 020.Irony.Samples project folder)

#region Useful Links
/* **********************************************************************************
 * 
 * http://standardpascal.org/iso7185rules.html  (Summary)
 * http://www.pascal-central.com/docs/iso7185.pdf  (Full Doc)
 * 
 * http://www.moorecad.com/standardpascal/pascal.y  (YACC)
 * http://www.moorecad.com/standardpascal/pascal.l  (LEXER)
 * 
 * **********************************************************************************/
#endregion

using System;
using System.Collections.Generic;
using System.Text;
using Irony.Parsing;
using Irony.Ast;

namespace Irony.Samples
{

  [Language("Standard Pascal", "1990", "ISO-7185:1990 Standard Pascal")]
  public partial class StandardPascalGrammar : Grammar
  {

    public StandardPascalGrammar(): base(false)
    {
      this.GrammarComments = @"Sample implementation of ISO-7185 Standard Pascal";

      this.MarkReservedWords("and", "array", "begin", "case", "const", "div", "do");
      this.MarkReservedWords("downto", "else", "end", "file", "for", "function");
      this.MarkReservedWords("goto", "if", "in", "label", "mod", "nil", "not" ,"of");
      this.MarkReservedWords("or", "packed", "procedure", "program", "record");
      this.MarkReservedWords("repeat", "set", "then", "to", "type", "until", "var");
      this.MarkReservedWords("while", "with");

      #region 1. Terminals

      var identifier = new IdentifierTerminal("Identifier", IdFlags.NameIncludesPrefix);
      identifier.AddPrefix(Strings.AllLatinLetters, IdFlags.None);   // [a-zA-Z][a-zA-Z0-9]*

      var charcode = new NumberLiteral("CharacterCode", NumberFlags.IntOnly);
      charcode.AddPrefix("#", NumberFlags.None); // \#[0-9]+

      var character_string = new StringLiteral("CharacterString", @"'", StringFlags.AllowsLineBreak | StringFlags.NoEscapes); // '({NQUOTE}|'')+'  (flags are combined with |, not &)

      var digit_sequence = new NumberLiteral("DigitSequence", NumberFlags.IntOnly); // [0-9]+

      var real_number = new NumberLiteral("RealNumber", NumberFlags.HasDot); //[0-9]+"."[0-9]+

      var comment1 = new CommentTerminal("Comment", "(*", "*)");
      NonGrammarTerminals.Add(comment1);

      var comment2 = new CommentTerminal("Comment", "{", "}");
      NonGrammarTerminals.Add(comment2);

      #endregion
          
      #region 2. Non-terminal

      var file = new NonTerminal("FILE");
      var comment = new NonTerminal("COMMENT");
      var program = new NonTerminal("PROGRAM");
      var program_heading = new NonTerminal("PROGRAM_HEADING");
      var identifier_list = new NonTerminal("IDENTIFIER_LIST");
      var block = new NonTerminal("BLOCK");
      var module = new NonTerminal("MODULE");
      var label_declaration_part = new NonTerminal("LABEL_DECLARATION_PART");
      var label_list = new NonTerminal("LABEL_LIST");
      var label = new NonTerminal("LABEL");
      var constant_definition_part = new NonTerminal("CONSTANT_DEFINITION_PART");
      var constant_list = new NonTerminal("CONSTANT_LIST");
      var constant_definition = new NonTerminal("CONSTANT_DEFINITION");
      var cexpression = new NonTerminal("C_EXPRESSION");
      var csimple_expression = new NonTerminal("C_SIMPLE_EXPRESSION");
      var cterm = new NonTerminal("C_TERM");
      var cfactor = new NonTerminal("C_FACTOR");
      var cexponentiation = new NonTerminal("C_EXPONENTIATION");
      var cprimary = new NonTerminal("C_PRIMARY");
      var constant = new NonTerminal("CONSTANT");
      var non_string = new NonTerminal("NON_STRING");
      var type_definition_part = new NonTerminal("TYPE_DEFINITION_PART");
      var type_definition_list = new NonTerminal("TYPE_DEFINITION_LIST");
      var type_definition = new NonTerminal("TYPE_DEFINITION");
      var type_denoter = new NonTerminal("TYPE_DENOTER");
      var new_type = new NonTerminal("NEW_TYPE");
      var new_ordinal_type = new NonTerminal("NEW_ORDINAL_TYPE");
      var enumerated_type = new NonTerminal("ENUMERATED_TYPE");
      var subrange_type = new NonTerminal("SUBRANGE_TYPE");
      var new_structured_type = new NonTerminal("NEW_STRUCTURED_TYPE");
      var structured_type = new NonTerminal("STRUCTURED_TYPE");
      var array_type = new NonTerminal("ARRAY_TYPE");
      var index_list = new NonTerminal("INDEX_LIST");
      var index_type = new NonTerminal("INDEX_TYPE");
      var ordinal_type = new NonTerminal("ORDINAL_TYPE");
      var component_type = new NonTerminal("COMPONENT_TYPE");
      var record_type = new NonTerminal("RECORD_TYPE");
      var record_section_list = new NonTerminal("RECORD_SECTION_LIST");
      var record_section = new NonTerminal("RECORD_SECTION");
      var variant_part = new NonTerminal("VARIANT_PART");
      var variant_selector = new NonTerminal("VARIANT_SELECTOR");
      var variant_list = new NonTerminal("VARIANT_LIST");
      var variant = new NonTerminal("VARIANT");
      var case_constant_list = new NonTerminal("CASE_CONSTANT_LIST");
      var case_constant = new NonTerminal("CASE_CONSTANT");
      var tag_field = new NonTerminal("TAG_FIELD");
      var tag_type = new NonTerminal("TAG_TYPE");
      var set_type = new NonTerminal("SET_TYPE");
      var base_type = new NonTerminal("BASE_TYPE");
      var file_type = new NonTerminal("FILE_TYPE");
      var new_pointer_type = new NonTerminal("NEW_POINTER_TYPE");
      var domain_type = new NonTerminal("DOMAIN_TYPE");
      var variable_declaration_part = new NonTerminal("VARIABLE_DECLARATION_PART");
      var variable_declaration_list = new NonTerminal("VARIABLE_DECLARATION_LIST");
      var variable_declaration = new NonTerminal("VARIABLE_DECLARATION");
      var procedure_and_function_declaration_part = new NonTerminal("PROCEDURE_AND_FUNCTION_DECLARATION_PART");
      var proc_or_func_declaration_list = new NonTerminal("PROC_OR_FUNC_DECLARATION_LIST");
      var proc_or_func_declaration = new NonTerminal("PROC_OR_FUNC_DECLARATION");
      var procedure_declaration = new NonTerminal("PROCEDURE_DECLARATION"); 
      var procedure_heading = new NonTerminal("PROCEDURE_HEADING"); 
      var directive = new NonTerminal("DIRECTIVE");
      var formal_parameter_list = new NonTerminal("FORMAL_PARAMETER_LIST");
      var formal_parameter_section_list = new NonTerminal("FORMAL_PARAMETER_SECTION_LIST");
      var formal_parameter_section = new NonTerminal("FORMAL_PARAMETER_SECTION");
      var value_parameter_specification = new NonTerminal("VALUE_PARAMETER_SPECIFICATION");
      var variable_parameter_specification = new NonTerminal("VARIABLE_PARAMETER_SPECIFICATION");
      var procedural_parameter_specification = new NonTerminal("PROCEDURAL_PARAMETER_SPECIFICATION");
      var functional_parameter_specification = new NonTerminal("FUNCTIONAL_PARAMETER_SPECIFICATION");
      var procedure_identification = new NonTerminal("PROCEDURE_IDENTIFICATION");
      var procedure_block = new NonTerminal("PROCEDURE_BLOCK");
      var function_declaration = new NonTerminal("FUNCTION_DECLARATION");
      var function_heading = new NonTerminal("FUNCTION_HEADING");
      var result_type = new NonTerminal("RESULT_TYPE");
      var function_identification = new NonTerminal("FUNCTION_IDENTIFICATION");
      var function_block = new NonTerminal("FUNCTION_BLOCK");
      var statement_part = new NonTerminal("STATEMENT_PART");
      var compound_statement = new NonTerminal("COMPOUND_STATEMENT");
      var statement_sequence = new NonTerminal("STATEMENT_SEQUENCE");
      var statement = new NonTerminal("STATEMENT");
      var open_statement = new NonTerminal("OPEN_STATEMENT");
      var closed_statement = new NonTerminal("CLOSED_STATEMENT");
      var non_labeled_closed_statement = new NonTerminal("NON_LABELED_CLOSED_STATEMENT");
      var non_labeled_open_statement = new NonTerminal("NON_LABELED_OPEN_STATEMENT");    
      var repeat_statement = new NonTerminal("REPEAT_STATEMENT");
      var open_while_statement = new NonTerminal("OPEN_WHILE_STATEMENT");
      var closed_while_statement = new NonTerminal("CLOSED_WHILE_STATEMENT");
      var open_for_statement = new NonTerminal("OPEN_FOR_STATEMENT");
      var closed_for_statement = new NonTerminal("CLOSED_FOR_STATEMENT");
      var open_with_statement = new NonTerminal("OPEN_WITH_STATEMENT");
      var closed_with_statement = new NonTerminal("CLOSED_WITH_STATEMENT");
      var open_if_statement = new NonTerminal("OPEN_IF_STATEMENT");
      var closed_if_statement = new NonTerminal("CLOSED_IF_STATEMENT");
      var assignment_statement = new NonTerminal("ASSIGNMENT_STATEMENT");
      var variable_access = new NonTerminal("VARIABLE_ACCESS");
      var indexed_variable = new NonTerminal("INDEXED_VARIABLE");
      var index_expression_list = new NonTerminal("INDEX_EXPRESSION_LIST");
      var index_expression = new NonTerminal("INDEX_EXPRESSION");
      var field_designator = new NonTerminal("FIELD_DESIGNATOR");
      var procedure_statement = new NonTerminal("PROCEDURE_STATEMENT");
      var parms = new NonTerminal("PARAMETERS");
      var actual_parameter_list = new NonTerminal("ACTUAL_PARAMETER_LIST");
      var actual_parameter = new NonTerminal("ACTUAL_PARAMETER");
      var goto_statement = new NonTerminal("GOTO_STATEMENT");
      var case_statement = new NonTerminal("CASE_STATEMENT");
      var case_index = new NonTerminal("CASE_INDEX");
      var case_list_element_list = new NonTerminal("CASE_LIST_ELEMENT_LIST");
      var case_list_element = new NonTerminal("CASE_LIST_ELEMENT");
      var otherwise_part = new NonTerminal("OTHERWISE_PART");
      var control_variable = new NonTerminal("CONTROL_VARIABLE");
      var initial_value = new NonTerminal("INITIAL_VALUE");
      var direction = new NonTerminal("DIRECTION");
      var final_value = new NonTerminal("FINAL_VALUE");
      var record_variable_list = new NonTerminal("RECORD_VARIABLE_LIST");
      var boolean_expression = new NonTerminal("BOOLEAN_EXPRESSION");
      var expression = new NonTerminal("EXPRESSION");
      var simple_expression = new NonTerminal("SIMPLE_EXPRESSION");
      var term = new NonTerminal("TERM");
      var factor = new NonTerminal("FACTOR");
      var exponentiation = new NonTerminal("EXPONENTIATION");
      var primary = new NonTerminal("PRIMARY");
      var unsigned_constant = new NonTerminal("UNSIGNED_CONSTANT");
      var unsigned_number = new NonTerminal("UNSIGNED_NUMBER");
      var unsigned_integer = new NonTerminal("UNSIGNED_INTEGER");
      var unsigned_real = new NonTerminal("UNSIGNED_REAL");
      var function_designator = new NonTerminal("FUNCTION_DESIGNATOR");
      var set_constructor = new NonTerminal("SET_CONSTRUCTOR");
      var member_designator_list = new NonTerminal("MEMBER_DESIGNATOR_LIST");
      var member_designator = new NonTerminal("MEMBER_DESIGNATOR");
      var adding_operator = new NonTerminal("ADD_OP");
      var multiplying_operator = new NonTerminal("MUL_OP");
      var relational_operator = new NonTerminal("REL_OP");
      var sign = new NonTerminal("SIGN");

      #endregion

      #region 3. BNF Rules

      sign.Rule = Symbol("+") | "-";

      relational_operator.Rule = Symbol("=") | "<>" | "<" | ">" | "<=" | ">=" | "in";

      multiplying_operator.Rule = Symbol("*") | "/" | "div" | "mod" | "and";

      adding_operator.Rule = Symbol("+") | "-" | "or";

      member_designator.Rule = 
        member_designator + ".." + expression | expression;

      member_designator_list.Rule = 
        MakePlusRule(member_designator_list, Symbol(","), member_designator);

      set_constructor.Rule = 
          "[" + member_designator_list + "]" | "[" + "]" |
          "(." + member_designator_list + ".)" | "(." + ".)";

      /* functions with no params will be handled by plain identifier */
      function_designator.Rule = identifier + parms;

      unsigned_real.Rule = real_number;

      unsigned_integer.Rule = digit_sequence;

      unsigned_number.Rule = unsigned_integer | unsigned_real;

      unsigned_constant.Rule = 
        unsigned_number | character_string | charcode | "nil";

      primary.Rule =
        variable_access | 
        unsigned_constant | 
        function_designator | 
        set_constructor | 
        "(" + expression + ")" |
        "not" + primary;

      exponentiation.Rule = primary | primary + "**" + exponentiation;

      factor.Rule = sign + factor | exponentiation;

      term.Rule = factor | term + multiplying_operator +  factor;

      simple_expression.Rule = term | simple_expression + adding_operator +  term;

      expression.Rule = 
        simple_expression | simple_expression + relational_operator + simple_expression;

      boolean_expression.Rule = expression;

      record_variable_list.Rule = 
        MakePlusRule(record_variable_list, Symbol(","), variable_access);

      final_value.Rule = expression;
      
      direction.Rule = Symbol("to") | "downto";

      initial_value.Rule = expression;

      control_variable.Rule = identifier;

      otherwise_part.Rule = Symbol("otherwise") | "otherwise" + ":";

      case_list_element.Rule = case_constant_list + ":" + statement;

      case_list_element_list.Rule = 
        MakePlusRule(case_list_element_list, Symbol(";"), case_list_element);

      case_index.Rule = expression;

      case_statement.Rule = 
          "case" + case_index + "of" + case_list_element_list + "end" | 
          "case" + case_index + "of" + case_list_element_list + ";" + "end" |
          "case" + case_index + "of" + case_list_element_list + ";" + otherwise_part + statement + "end" |
          "case" + case_index + "of" + case_list_element_list + ";" + otherwise_part + statement + ";" + "end";

      goto_statement.Rule = "goto" + label;

      /*
       * this forces you to check all this to be sure that only write and
       * writeln use the 2nd and 3rd forms, you really can't do it easily in
       * the grammar, especially since write and writeln aren't reserved
       */
      actual_parameter.Rule =
          expression |
          expression + ":" + expression |
          expression + ":" + expression + ":" + expression;

      actual_parameter_list.Rule = 
        MakePlusRule(actual_parameter_list, Symbol(","), actual_parameter);

      parms.Rule = "(" + actual_parameter_list + ")";

      procedure_statement.Rule = identifier + parms | identifier;

      field_designator.Rule = variable_access + "." + identifier;

      index_expression.Rule = expression;
      
      index_expression_list.Rule = 
        MakePlusRule(index_expression_list, Symbol(","), index_expression);

      indexed_variable.Rule = 
          variable_access + "[" + index_expression_list + "]" |
          variable_access + "(." + index_expression_list + ".)";

      variable_access.Rule = 
          identifier | 
          indexed_variable |
          field_designator | 
          variable_access + (Symbol("^") | "->" | "@");

      assignment_statement.Rule = 
        variable_access + ":=" + expression;

      closed_if_statement.Rule = 
        "if" + boolean_expression + "then" + closed_statement + PreferShiftHere() + "else" + closed_statement;

      open_if_statement.Rule = 
        "if" + boolean_expression + "then" + statement | 
        "if" + boolean_expression + "then" + closed_statement + PreferShiftHere() + "else" + open_statement;

      closed_with_statement.Rule = 
        "with" + record_variable_list + "do" + closed_statement;

      open_with_statement.Rule = 
        "with" + record_variable_list + "do" + open_statement;

      closed_for_statement.Rule = 
        "for" + control_variable + ":=" + initial_value + direction + final_value + "do" + closed_statement;

      open_for_statement.Rule = 
        "for" + control_variable + ":=" + initial_value + direction + final_value + "do" + open_statement;

      closed_while_statement.Rule = 
        "while" + boolean_expression + "do" + closed_statement;

      open_while_statement.Rule = 
        "while" + boolean_expression + "do" + open_statement;

      repeat_statement.Rule = 
        "repeat" + statement_sequence + "until" + boolean_expression;

      non_labeled_open_statement.Rule =  
          open_with_statement | 
          open_if_statement | 
          open_while_statement | 
          open_for_statement;

      non_labeled_closed_statement.Rule = 
          assignment_statement | 
          procedure_statement | 
          goto_statement | 
          compound_statement | 
          case_statement | 
          repeat_statement | 
          closed_with_statement | 
          closed_if_statement | 
          closed_while_statement | 
          closed_for_statement | 
          Empty;

      closed_statement.Rule = 
        label + ":" + non_labeled_closed_statement | 
        non_labeled_closed_statement;

      open_statement.Rule = 
        label + ":" + non_labeled_open_statement | 
        non_labeled_open_statement;

      statement.Rule = open_statement | closed_statement;

      statement_sequence.Rule = 
        statement_sequence + ";" + statement | statement;

      compound_statement.Rule = "begin" + statement_sequence + "end" ;

      statement_part.Rule = compound_statement;

      function_block.Rule = block;

      function_identification.Rule = "function" + identifier;

      result_type.Rule = identifier;

      function_heading.Rule = 
          "function" + identifier + ":" + result_type |
          "function" + identifier + formal_parameter_list + ":" + result_type;

      function_declaration.Rule =
          function_heading + ";" + directive |
          function_identification + ";" + function_block | 
          function_heading + ";" + function_block;

      procedure_block.Rule = block ;

      procedure_identification.Rule = "procedure" + identifier;

      functional_parameter_specification.Rule = function_heading;

      procedural_parameter_specification.Rule = procedure_heading;

      variable_parameter_specification.Rule = 
        "var" + identifier_list + ":" + identifier;

      value_parameter_specification.Rule = 
        identifier_list + ":" + identifier;

      formal_parameter_section.Rule = 
            value_parameter_specification | 
            variable_parameter_specification | 
            procedural_parameter_specification | 
            functional_parameter_specification;

      formal_parameter_section_list.Rule = 
        MakePlusRule(formal_parameter_section_list, Symbol(";"), formal_parameter_section);

      formal_parameter_list.Rule = 
        "(" + formal_parameter_section_list + ")" ;

      directive.Rule = Symbol("forward") | "extern" | "external";

      procedure_heading.Rule = 
        procedure_identification | 
        procedure_identification + formal_parameter_list;

      procedure_declaration.Rule = 
        procedure_heading + ";" + directive | 
        procedure_heading + ";" + procedure_block;

      proc_or_func_declaration.Rule = 
        procedure_declaration | function_declaration;

      proc_or_func_declaration_list.Rule = 
        MakePlusRule(proc_or_func_declaration_list, Symbol(";"), proc_or_func_declaration);

      procedure_and_function_declaration_part.Rule = 
        proc_or_func_declaration_list + ";" | Empty;

      variable_declaration.Rule = 
        identifier_list + ":" + type_denoter;
      
      variable_declaration_list.Rule = 
        MakePlusRule(variable_declaration_list, Symbol(";"), variable_declaration);

      variable_declaration_part.Rule = 
        "var" + variable_declaration_list + ";" | Empty;

      domain_type.Rule = identifier;

      new_pointer_type.Rule = (Symbol("^") | "->" | "@") + domain_type;

      file_type.Rule = "file" + "of"+ component_type;

      base_type.Rule = ordinal_type;

      set_type.Rule = "set" + "of" + base_type;

      tag_type.Rule = identifier ;

      tag_field.Rule = identifier ;

      case_constant.Rule = 
        constant | constant + ".." + constant;

      case_constant_list.Rule = 
        MakePlusRule(case_constant_list, Symbol(","), case_constant);

      variant.Rule =  
        case_constant_list + ":" + "(" + record_section_list + ")" | 
        case_constant_list + ":" + "(" + record_section_list + ";" + variant_part + ")" | 
        case_constant_list + ":" + "(" + variant_part + ")";

      variant_list.Rule = 
        MakePlusRule(variant_list, Symbol(";"), variant);

      variant_selector.Rule = 
        tag_field + ":" + tag_type | tag_type;

      variant_part.Rule = 
        "case" + variant_selector + "of" + variant_list + ";" | 
        "case" + variant_selector + "of" + variant_list | 
        Empty;

      record_section.Rule = 
        identifier_list + ":" + type_denoter;

      record_section_list.Rule = 
        MakePlusRule(record_section_list, Symbol(";"), record_section);

      record_type.Rule = 
        "record" + record_section_list + "end" | 
        "record" + record_section_list + ";" + variant_part + "end" |
        "record" + variant_part + "end";

      component_type.Rule = type_denoter;

      ordinal_type.Rule = new_ordinal_type | identifier;

      index_type.Rule = ordinal_type;
      
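      index_list.Rule = 
        MakePlusRule(index_list, Symbol(","), index_type);
      
      // Pascal array types use brackets and comma-separated index types: array [1..10, char] of integer
      array_type.Rule = 
        "array" + "[" + index_list + "]" + "of" + component_type |
        "array" + "(." + index_list + ".)" + "of" + component_type;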

      structured_type.Rule = 
        array_type | 
        record_type | 
        set_type | 
        file_type;

      new_structured_type.Rule = 
        structured_type | "packed" + structured_type;

      subrange_type.Rule = constant + ".." + constant;

      enumerated_type.Rule = "(" + identifier_list + ")";

      new_ordinal_type.Rule = 
        enumerated_type| 
        subrange_type;

      new_type.Rule = 
        new_ordinal_type | 
        new_structured_type | 
        new_pointer_type;

      type_denoter.Rule = identifier | new_type;

      type_definition.Rule = identifier + "=" + type_denoter + ";";

      type_definition_list.Rule = 
        MakePlusRule(type_definition_list, type_definition);
      
      type_definition_part.Rule = 
        "type" + type_definition_list | Empty;

      non_string.Rule = 
        digit_sequence | 
        identifier | 
        real_number;

      constant.Rule = 
        non_string | 
        sign + non_string | 
        character_string;

      cprimary.Rule = 
        identifier | 
        "(" + cexpression + ")" | 
        unsigned_constant | 
        "not" + cprimary;

      cexponentiation.Rule = 
        cprimary | cprimary + "**" + cexponentiation;

      cfactor.Rule = 
        sign + cfactor | cexponentiation;

      cterm.Rule = 
        cfactor | cterm + multiplying_operator + cfactor;

      csimple_expression.Rule =  
        cterm | csimple_expression + adding_operator + cterm;

      cexpression.Rule = 
        csimple_expression | 
        csimple_expression + relational_operator + csimple_expression;

      constant_definition.Rule = 
        identifier + "=" + cexpression + ";";
      
      constant_list.Rule = 
        MakePlusRule(constant_list, constant_definition);

      constant_definition_part.Rule = 
        "const" + constant_list | Empty;

      label.Rule =  digit_sequence;

      label_list.Rule = 
        MakePlusRule(label_list, Symbol(","), label);

      label_declaration_part.Rule = 
        "label" + label_list + ";" | Empty;

      module.Rule = 
        constant_definition_part + 
        type_definition_part +
        variable_declaration_part + 
        procedure_and_function_declaration_part;

      block.Rule = 
        label_declaration_part + 
        constant_definition_part + 
        type_definition_part +
        variable_declaration_part + 
        procedure_and_function_declaration_part +
        statement_part;

      identifier_list.Rule = 
        MakePlusRule(identifier_list, Symbol(","), identifier);

      program_heading.Rule = 
        "program" + identifier | 
        "program" + identifier + "(" + identifier_list + ")";

      program.Rule = program_heading + ";" + block + ".";

      file.Rule = program | module;

      #endregion

      #region 4. Set starting symbol

      this.Root = file;

      #endregion

      #region 5. Operators precedence

      // Higher number binds tighter; in Pascal the relational operators bind loosest.
      this.RegisterOperators(1, "=", "<>", ">", "<", ">=", "<=", "in");
      this.RegisterOperators(2, "+", "-", "or");
      this.RegisterOperators(3, "*", "/", "div", "mod", "and");
      this.RegisterOperators(4, Associativity.Right, "**");

      #endregion

      #region 6. Punctuation symbols

      this.RegisterPunctuation(";", ",", ".", "..", "(", ")", "{", "}", "[", "]", ":");

      this.RegisterBracePair("(.", ".)");
      this.RegisterBracePair("[", "]");
      this.RegisterBracePair("{", "}");
      this.RegisterBracePair("(*", "*)");

      #endregion
    }
  }
}

 

Sep 30, 2009 at 8:47 AM

Thanks a lot for sharing, much to learn from this. :)

Coordinator
Oct 2, 2009 at 7:18 AM
Edited Oct 2, 2009 at 9:54 PM

Nice work! Will definitely add it to sample grammars - with 99bottles.pas of course!

Did you get any parser conflicts? If not, that's a great result!

Just from a quick look, without compiling, a few comments:

identifier - you don't need the .AddPrefix line; by default an identifier expects a letter as the first character, followed by letters/digits

charcode literal - you specify # as a prefix, but a prefix is actually optional for a number, which means the parser may read " 123" as a charcode with the # prefix skipped.

It is better to define it as a non-terminal with a rule like "#" + intNumber
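
The optional-prefix problem can be shown in isolation with a small stand-alone sketch. The regular expressions below only stand in for the scanner behavior described here; they are an assumption for illustration, not how Irony's NumberLiteral is implemented, and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Stand-alone illustration only; none of these names come from Irony.
public static class CharCodeSketch
{
    // Models a number terminal whose "#" prefix is optional:
    // a bare digit sequence is accepted as a character code too.
    public static bool MatchesWithOptionalPrefix(string text) =>
        Regex.IsMatch(text, @"^#?[0-9]+$");

    // Models the intended Pascal form, where "#" is required.
    public static bool MatchesWithMandatoryPrefix(string text) =>
        Regex.IsMatch(text, @"^#[0-9]+$");
}
```

With the optional form, a plain "123" is indistinguishable from a character code, which is the ambiguity a mandatory "#" avoids.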

realnumber - you should not specify the HasDot flag; it is for internal use and is set when the scanner actually sees the dot. I know it is confusing, I will refactor it.

Binary operation non-terminals and operators (+, -, *, / etc.) - you should make use of operator precedence handling in Irony; then you can avoid all these extra non-terminals like cterm, csimple_expression, cexpression etc. Just use a single "binary_operator" non-terminal and specify precedence values for the operator symbols

What's the deal with the open/closed versions of the statements? Didn't quite get it... can it be simplified somehow?
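
To illustrate the general idea outside of Irony, here is a minimal stand-alone sketch of precedence-driven parsing (precedence climbing). One loop plus a precedence table replaces the separate term / simple_expression / expression layers. The class name, the table, and the space-separated token format are all assumptions made for this example; it is not Irony's implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone sketch; none of these names come from Irony.
public static class PrecedenceSketch
{
    // Operator -> (precedence, right-associative). A higher number binds tighter.
    static readonly Dictionary<string, (int Prec, bool RightAssoc)> Ops =
        new Dictionary<string, (int Prec, bool RightAssoc)>(StringComparer.OrdinalIgnoreCase)
        {
            ["="] = (1, false), ["<>"] = (1, false), ["<"] = (1, false), [">"] = (1, false),
            ["+"] = (2, false), ["-"] = (2, false), ["or"] = (2, false),
            ["*"] = (3, false), ["/"] = (3, false), ["div"] = (3, false),
            ["mod"] = (3, false), ["and"] = (3, false),
            ["**"] = (4, true),
        };

    // Parses space-separated tokens and returns a fully parenthesized string
    // that makes the chosen grouping visible.
    public static string Group(string source)
    {
        string[] tokens = source.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int pos = 0;
        string result = ParseExpression(tokens, ref pos, 1);
        if (pos != tokens.Length)
            throw new FormatException("Unexpected token: " + tokens[pos]);
        return result;
    }

    static string ParseExpression(string[] tokens, ref int pos, int minPrec)
    {
        string left = ParseOperand(tokens, ref pos);
        // Keep consuming operators that bind at least as tightly as minPrec.
        while (pos < tokens.Length
               && Ops.TryGetValue(tokens[pos], out var op)
               && op.Prec >= minPrec)
        {
            string symbol = tokens[pos++];
            // Left-associative operators require a strictly tighter right-hand side.
            int nextMin = op.RightAssoc ? op.Prec : op.Prec + 1;
            string right = ParseExpression(tokens, ref pos, nextMin);
            left = "(" + left + " " + symbol + " " + right + ")";
        }
        return left;
    }

    static string ParseOperand(string[] tokens, ref int pos)
    {
        if (pos >= tokens.Length)
            throw new FormatException("Operand expected at end of input");
        string token = tokens[pos++];
        if (Ops.ContainsKey(token))
            throw new FormatException("Operand expected but found operator: " + token);
        return token;
    }
}
```

For example, `PrecedenceSketch.Group("1 + 2 * 3")` groups the multiplication first, and `PrecedenceSketch.Group("2 ** 3 ** 2")` groups to the right because `**` is marked right-associative.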

thanks again

I will play with it myself when I have time and place it into samples

Roman

 

 

Oct 2, 2009 at 5:15 PM

Hey Roman!

Thanks for the feedback. I now see your points on the identifier, the charcode literal, and the realnumber.

One question about your point on charcode: wouldn't using a NonTerminal "#" + int allow whitespace between the two tokens? And if so, is there a way to create a mandatory prefix?

For the binary operations, you are exactly right. I was just doing a straight port of the YACC/Lexer to Irony. In another project I am working on, I reduced all of these extra rules down to a couple of rules and used the precedence. It is much cleaner and makes much more sense.

Also, just like with the binary operators, I'm not sure what the deal is with the open/closed versions of the statements. I'm sure it can be simplified.

Once time becomes available, I may go back through and optimize the grammar for Irony. I did get some successful parses with this version; however, most Pascal source I found on the web was Turbo Pascal rather than Standard Pascal, and the two have a few differences (e.g. the uses statement, which could easily be added).

Thanks,

MindCore

 

Coordinator
Oct 4, 2009 at 6:18 PM

About the char literal with # prefix - you're right about whitespace, it becomes allowed in this case. Mandatory prefix - not sure, will think about this... In any case, the trouble is that you are trying to use NumberLiteral for this, so the output value is a number, while it would be logical to expect the .NET Char type. Will think about this.

thanks

Roman

Oct 7, 2009 at 2:33 PM

Roman,

I've given what I previously said some thought and actually came to an alternate solution. I'm not sure of the complexities involved in implementing something like this, but is it possible to add a new operator that doesn't allow whitespace between the tokens? Currently, it looks like the | and + symbols are the only two operators, so I would like to suggest the & symbol.

Scenarios:
"#" & charcode   -  would match to "#27", but would not match to "# 27"
Or in HTML, character literals like this
"&" & charcode & ";"  -  which would be used to capture things like "&nbsp;" or "&amp;"

The major problem I see with this suggestion is rule precedence.

Example:
If a developer had the following two rules, which one would win?
term1.Rule = "#" + charcode;
term2.Rule = "#" & charcode;


Just some thoughts, let me know what you think.

Thanks,
MindCore

Coordinator
Oct 8, 2009 at 4:16 PM

That's an interesting suggestion, and I will look into this. However, this & operator would probably come as an enhancement, syntactic sugar over a basic implementation, which should probably be a grammar hint: something like a NoWhitespace() method injected inside the expression to signal to the parser that it shouldn't skip whitespace here before starting to scan the token. The & operator would then simply work as a combination of "+ NoWhitespace() + ". The problem now is to figure out how to modify the parser/scanner to handle this situation. I think it's possible, and I will give it a try.
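
As a rough stand-alone sketch of that idea (not Irony code, and not a proposal for the actual implementation), the toy model below treats & as + with a "no whitespace before this element" flag attached to the right-hand side. The Seq class, its Lit and Digits members, and the Matches method are all hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-model of a rule expression; this is NOT Irony's API.
public sealed class Seq
{
    // One element of a sequence. Literal == null means "one or more digits".
    private sealed class Part
    {
        public readonly string Literal;
        public readonly bool Glued;   // true: no whitespace may precede this element
        public Part(string literal, bool glued) { Literal = literal; Glued = glued; }
    }

    private readonly List<Part> _parts;
    private Seq(List<Part> parts) { _parts = parts; }

    public static Seq Lit(string text) => new Seq(new List<Part> { new Part(text, false) });
    public static Seq Digits => new Seq(new List<Part> { new Part(null, false) });

    // "+" concatenates and allows whitespace between the operands.
    public static Seq operator +(Seq a, Seq b) => Concat(a, b, false);

    // "&" is "+" plus the no-whitespace hint on the first element of b.
    public static Seq operator &(Seq a, Seq b) => Concat(a, b, true);

    private static Seq Concat(Seq a, Seq b, bool glue)
    {
        var parts = new List<Part>(a._parts);
        for (int i = 0; i < b._parts.Count; i++)
            parts.Add(i == 0 ? new Part(b._parts[0].Literal, glue) : b._parts[i]);
        return new Seq(parts);
    }

    // Returns true when the whole input matches the sequence.
    public bool Matches(string input)
    {
        int pos = 0;
        foreach (Part part in _parts)
        {
            // Whitespace is skipped before an element unless it is glued.
            if (!part.Glued)
                while (pos < input.Length && char.IsWhiteSpace(input[pos])) pos++;

            if (part.Literal != null)
            {
                if (!input.Substring(pos).StartsWith(part.Literal, StringComparison.Ordinal))
                    return false;
                pos += part.Literal.Length;
            }
            else
            {
                int start = pos;
                while (pos < input.Length && char.IsDigit(input[pos])) pos++;
                if (pos == start) return false;
            }
        }
        while (pos < input.Length && char.IsWhiteSpace(input[pos])) pos++;
        return pos == input.Length;
    }
}
```

In this model `Seq.Lit("#") + Seq.Digits` accepts both "#27" and "# 27", while `Seq.Lit("#") & Seq.Digits` accepts only "#27". One thing to keep in mind for a real design is that C# gives + higher precedence than &, so mixed expressions such as `a & b + c` group as `a & (b + c)`.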

thanks for the suggestion!

Roman