P21: Implement fast compiled delegate to create AST nodes

Jul 20, 2011 at 12:44 AM

Roman,

I have had a look at your item P21.  I have got some code working and I have tested it against the ExpressionEvaluator Grammar.

While we all know that Reflection is expensive, it doesn't appear to be a hugely expensive part of the parse time.  On initial (unscientific) tests of roughly 100,000 tokens, I get a parse time of 78ms with Reflection and 78ms with my method.  My method usually runs at least 10x faster than reflection, including the dictionary lookup - those are real results I see elsewhere on other projects.

It is not quite as fast as a pure compiled method, but it requires minimal code change and sticks with the simple requirement of an AST node type with a parameterless constructor, so it should be compatible with all existing grammars.

The basis of my code change:

      // Previous approach, replaced by the compiled delegate below:
      //nodeInfo.AstNode =  Activator.CreateInstance(nodeType);

      // ASTNodeConstructorList is a Dictionary<Type, NewASTNode> cache keyed by node type;
      // NewASTNode is a parameterless delegate that returns the new node instance.
      if (ASTNodeConstructorList.ContainsKey(nodeType))
      {
          // Reuse the delegate compiled on an earlier visit to this node type.
          ASTNodeConstructor = ASTNodeConstructorList[nodeType];
      }
      else
      {
          // Emit a tiny dynamic method that just calls the parameterless constructor.
          ConstructorInfo target = nodeType.GetConstructor(Type.EmptyTypes);
          DynamicMethod ASTNodeMethod = new DynamicMethod("New", nodeType, null);
          ILGenerator il = ASTNodeMethod.GetILGenerator();
          il.Emit(OpCodes.Newobj, target);
          il.Emit(OpCodes.Ret);
          ASTNodeConstructor = (NewASTNode)ASTNodeMethod.CreateDelegate(typeof(NewASTNode));
          ASTNodeConstructorList.Add(nodeType, ASTNodeConstructor);
      }
      nodeInfo.AstNode = ASTNodeConstructor.Invoke();

So either I have missed something or there is a lot of overhead somewhere else?

What would you like to do? Any suggestions?  Would you really want to include this and take on the additional complexity, or stick with the simpler Activator call for the moment?

 

Coordinator
Jul 20, 2011 at 1:06 AM

There's one suspicious place: the call to the Invoke method in the last line. Invoke is inherited from MethodBase and in fact uses reflection to make the call.

I think you should call directly:

ASTNodeConstructor(); 

 

see code example here:

http://msdn.microsoft.com/en-us/library/system.reflection.emit.dynamicmethod.aspx

 

Jul 20, 2011 at 11:09 AM

Interesting spot.  Thank you.  I have used the Invoke call consistently on my other projects, so even if it is using reflection, the hit is nowhere near as large as Activator's.  What I find interesting is that the code example on your link also uses Invoke, and I cannot see any examples of a direct call.  But then again, it is a very difficult example to follow because of all the writes to the console.  In previous tests I witnessed a very slight deterioration in performance (about 10%) over calling "new" directly, which I had always put down to using a delegate.

I have made the change you suggested and it has made no detectable difference.  I still run at 78ms.  I will produce a slightly more scientific test later to directly compare Activator, the dynamic method called via Invoke, and the dynamic method called directly.
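
For anyone wanting to reproduce the comparison, this is a minimal sketch of the kind of test I have in mind (the SampleNode type, NewASTNode delegate and class names here are just placeholders, not the Irony types):

      using System;
      using System.Diagnostics;
      using System.Reflection;
      using System.Reflection.Emit;

      public class SampleNode { }   // placeholder AST node with a parameterless constructor

      public class AstCreationBenchmark
      {
          delegate object NewASTNode();

          public static void Main()
          {
              Type nodeType = typeof(SampleNode);

              // Build the compiled constructor delegate once, as in the patch.
              ConstructorInfo ctor = nodeType.GetConstructor(Type.EmptyTypes);
              DynamicMethod method = new DynamicMethod("New", typeof(object), Type.EmptyTypes);
              ILGenerator il = method.GetILGenerator();
              il.Emit(OpCodes.Newobj, ctor);
              il.Emit(OpCodes.Ret);
              NewASTNode create = (NewASTNode)method.CreateDelegate(typeof(NewASTNode));

              const int count = 1000000;

              Stopwatch sw = Stopwatch.StartNew();
              for (int i = 0; i < count; i++) Activator.CreateInstance(nodeType);
              Console.WriteLine("Activator:          {0} ms", sw.ElapsedMilliseconds);

              sw = Stopwatch.StartNew();
              for (int i = 0; i < count; i++) create.Invoke();
              Console.WriteLine("Delegate, Invoke(): {0} ms", sw.ElapsedMilliseconds);

              sw = Stopwatch.StartNew();
              for (int i = 0; i < count; i++) create();
              Console.WriteLine("Delegate, direct:   {0} ms", sw.ElapsedMilliseconds);
          }
      }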

Coordinator
Jul 20, 2011 at 3:55 PM

Hi

About the MSDN sample - search for the following lines:

    HelloDelegate hi = (HelloDelegate) hello.CreateDelegate(typeof(HelloDelegate));

    // Use the delegate to execute the dynamic method.
    Console.WriteLine("\r\nUse the delegate to execute the dynamic method:");
    int retval = hi("\r\nHello, World!", 42);

And right after that, there's use of invoke and a comment: 

    // Invoke the dynamic method using the arguments. This is much
    // slower than using the delegate, because you must create an
    // array to contain the arguments, and value-type arguments
    // must be boxed.
    object objRet = hello.Invoke(null, BindingFlags.ExactBinding, null, invokeArgs, new CultureInfo("en-us"));

 

So it should be faster to invoke directly, even if marginally. Also, I think even with Invoke it should be faster than using Activator.

About not seeing perf improvements. Note that in your version you look up the delegate in the dictionary - this is substantial effort. Additionally, the dictionary lookup itself is not efficient. First, the "if dict.ContainsKey(key) then v = dict[key]" pattern requires two dictionary lookups - better to use TryGetValue, which requires a single lookup. Secondly, using AstNode as a key in a dictionary requires looking at the AstNode.GetHashCode() implementation; if you use the one inherited from Object, it is really slow and not a very good hash. So the perf results show that the cost of using Activator is about the same as two dictionary lookups.
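
For reference, the single-lookup form would look roughly like this (BuildConstructorDelegate is just a stand-in for the IL-emitting code from your snippet, not an existing method):

      NewASTNode ASTNodeConstructor;
      if (!ASTNodeConstructorList.TryGetValue(nodeType, out ASTNodeConstructor))
      {
          // Not cached yet - emit the IL and cache it; one dictionary lookup either way.
          ASTNodeConstructor = BuildConstructorDelegate(nodeType);
          ASTNodeConstructorList.Add(nodeType, ASTNodeConstructor);
      }
      nodeInfo.AstNode = ASTNodeConstructor();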

So let's get rid of the dictionary. Please declare a delegate, add a field (of delegate type) to BnfTerm, assign it at parser startup for all terms, and then use it for creating nodes. Let's see if it brings any visible perf improvement.
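
Roughly what I have in mind - the names below are only illustrative, not the actual Irony members:

      // A delegate for creating an AST node via its parameterless constructor.
      public delegate object AstNodeCreator();

      // On BnfTerm (sketch): a field holding the compiled constructor for this term.
      public AstNodeCreator CreateAstNode;

      // At parser startup, for every term that produces an AST node:
      //     term.CreateAstNode = BuildConstructorDelegate(term.AstNodeType);
      // When building a node - no dictionary lookup at all:
      //     nodeInfo.AstNode = term.CreateAstNode();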

Overall, I wouldn't worry even if we don't see perf differences. First, the perf of this piece is obscured by the load of all the other parser activities. But mainly, I would prefer to make this change anyway, because the old code using Activator is an "itching point", an obvious "can be done better" spot for anyone reading the code, and it makes a bad impression of the project as a whole. So let's go for it. Let me know the results, please.

Thank you

Roman

 

 

Jul 20, 2011 at 5:09 PM

Thanks for the reply!

The pattern I have used is one I use elsewhere.  Even with Invoke, I can assure you that it is a lot faster.  I can say with some confidence that the Dictionary + Invoke + dynamic method pattern I have used runs at a minimum 5 to 10 times faster than Activator.  But I will do some direct performance tests and post them.  Having said that, there is still about a factor of 3 improvement to be found to match a direct "new" statement, so if we can get that as well then this has been worthwhile.  An "itching point" is as good an excuse as I need.  I just wanted to make sure that was what you wanted; others I know wouldn't.

Bad impression on the project as a whole?  Not in my book. 

What I expect is that the performance trade-off becomes the one-time cost of initially producing the IL delegates, rather than the two dictionary lookups per node.  Which means that on smaller parse trees it is a performance hit and on larger parse trees it is a performance gain.

MSDN sample: oh yes, I see it now.  Yes that will be marginally faster. Useful to know and I have already modified both this code and my other projects!

Dictionary lookups: TryGetValue is just a pattern I have never got into using.  A very good point on the double lookup, and I will have a look at that.  In the past I have measured the ContainsKey method to take about 1/3 of the time of the v=dict[key] lookup, which is why I never really went looking for an alternative such as TryGetValue.  But that is still time that can be recovered.

Object hashes: I am using Type as the dictionary key, not the AST node.  So I was expecting (is that sensible?) that to be a good hash, a direct method, and not to require boxing.  But I take your point: it is still a hash operation and then an equality operation. Twice.

Getting rid of the Dictionary: I didn't want to put it in until I noticed what the GetAstNodeType function was doing - apparently for StringLiterals and template strings?  Without understanding more, and knowing that even with a dictionary I would still get a massive performance improvement, I put it in just to get it working.  I was working on the assumption that I would only know for sure the type I need to construct after GetAstNodeType.

Am I guaranteed that the AST node type will be totally predictable at parser startup, even for StringLiterals? (I worked on the assumption that the GetAstNodeType method has a purpose.)  If I have to do an equality check on the type before constructing, then it is probably worthwhile just leaving the dictionary in there?

Thanks for the clarification and the results are on the way...