When NOT to Performance Tune Your Application

On a recent project, a colleague told me about a SQL query generated by Entity Framework that had gotten ridiculously out of hand. Entity Framework makes it pretty easy to build simple data access on top of the Table Per (Sub) Type pattern.

What this means is that you may have both a Student and an Instructor derived from a Person, and can query to retrieve a strongly typed object. So here’s where performance and optimization come in. There are a couple of ways to query against this data model.

Method 1: Implicit Typing

var query = from p in Persons
            where p.PersonID.Equals(_personID)
            select p;

Method 2: Explicit Typing

var query = from p in Persons.OfType<Instructor>()
            where p.PersonID.Equals(_personID)
            select p;

They seem pretty similar; however, there’s quite a significant difference in what gets generated. By using Method 2, you wind up letting Entity Framework know exactly what table it’s querying against, which means your SQL code looks something like this:

SELECT
    PersonID,
    Column1,
    Column2,
    Column3
FROM
    Instructor
WHERE
    PersonID = @PersonID

However, when you don’t specify the type, Entity Framework constructs a SQL query intended to make SQL go and figure it out. Keep in mind that there’s no automatic discriminator column; it figures out the type based on the primary key, the ID column (more on this in a minute). The generated code looks something like this:

SELECT
    PersonID,
    Column1 as [0x01],
    Column2 as [0x02],
    CASE WHEN [1x01] IS NOT NULL THEN CAST([1x01] AS INT) END,
    ... many more case, casts, for every column in every table ...
    
FROM
    Person
    UNION ALL SELECT 
        PersonID as [1x01],
        Column1 as [1x02],
        Column2 as [1x03]
        UNION ALL SELECT
                ... many more unions for every table ...
WHERE
    PersonID = @PersonID
        

Now you can see that this query gets very complex as a product of:

a) the number of subtypes

b) the number of columns for each type

The generated query pulls in every column from every subtype table, and looks for the one where the key isn’t null; that’s the “winner” subtype. I imagine (I haven’t tried this yet) that having a nested subtype involved here (like BusinessStudent in the linked example above) would cause an even uglier nesting of one union within another.

Now back to the point of this article: performance. How bad is what we see above? In an empirical example, using JetBrains’ dotTrace and NUnit tests, I observed averages of:

Method 1: 126ms for the query to run

Method 2: 25ms for the query to run

I then discovered the benefits of precompiling Entity Framework view code to optimize the SQL generation. This bought me a gain of roughly 26% in performance for these specific empirical examples.

Method 1: 100ms

Method 2: 20ms

Now we have roughly 80ms to play with. We don’t know the type that we’re retrieving, but we want the optimized query of Method 2. If the code needed to get from Method 1 to Method 2 costs more than 80ms, then the performance “fix” will be worse than the problem.

So far, given the constraints of EF (for now) and the Table Per (Sub) Type pattern, the only solution that comes to mind is reflection. This would involve storing the type as a discriminator column of sorts, reflecting on that type, and calling the generic Persons.OfType<T>() method via reflection. This costs us an extra query plus reflection, neither of which is cheap. A separate empirical example (not the same code as the first) brings the total cost to ~350ms, a net performance loss of 250ms.

Method 1’s performance would have to degrade (through additional columns/subtypes) by ~250ms more in order to justify rolling a custom discriminator and reflecting to grab the subtype.
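The break-even reasoning above can be written out as a few lines of arithmetic. This is only a sketch in JavaScript using the timings quoted in this post; the variable names and the worthOptimizing helper are made up for illustration, and since the ~350ms figure came from a separate test, the comparison is approximate.

```javascript
// Timings quoted in the post, in milliseconds (after precompiling views).
var method1Ms = 100;          // untyped query against Persons
var method2Ms = 20;           // typed query via OfType<T>()
var workaroundTotalMs = 350;  // discriminator lookup + reflection + typed query

// Time available for any workaround before it stops paying for itself.
var budgetMs = method1Ms - method2Ms;

// How much slower the reflection workaround is than simply using Method 1.
var netLossMs = workaroundTotalMs - method1Ms;

// Hypothetical helper: a change is only worth making if it ends up faster.
function worthOptimizing(currentMs, candidateMs) {
  return candidateMs < currentMs;
}

console.log("budget: " + budgetMs + "ms, net loss: " + netLossMs + "ms");
console.log("worth it today? " + worthOptimizing(method1Ms, workaroundTotalMs));
```

Re-running the same check as subtypes and columns are added shows the point at which Method 1 becomes slow enough to justify the workaround.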

This was a pretty interesting exercise in when not to make performance optimizations that you know will need to be done long-term.

Posted on 4/30/2009 7:01:00 AM by Jason Nadal


Categories: development | performance | software


Javascript Function Body Equality Checking

If you need to check to see if two functions are the same instance, you can just check the variables for equality:

 

var f = function() {
  alert("hello world");
};

// Both variables point at the same function object.
var f1 = f;
var f2 = f;
var boolTest = (f1 === f2); // true

 

However, if you need to check whether the bodies of those functions are equal (or both contain some text, etc.; essentially, just working with the actual source text of the function), you can convert each function to a string to get its text, then compare the strings:

function fnsAreEqual(f1, f2){
  return String(f1) === String(f2);
} 

var boolTest2 = fnsAreEqual(
  function(){ alert("sameFn"); }, 
  function(){ alert("sameFn"); }
  ); 
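One caveat worth sketching: the string comparison works on the source text, so two functions that differ only in whitespace will not match. The fnsAreEqualIgnoringWhitespace helper below is a hypothetical, deliberately crude variant. It strips all whitespace, including whitespace inside string literals, so treat it as an illustration rather than a robust check.

```javascript
function fnsAreEqual(f1, f2) {
  return String(f1) === String(f2);
}

// Hypothetical variant: compare the source text with all whitespace removed.
function fnsAreEqualIgnoringWhitespace(f1, f2) {
  var strip = function (f) {
    return String(f).replace(/\s+/g, "");
  };
  return strip(f1) === strip(f2);
}

var oneLine = function () { return 1; };
var multiLine = function () {
  return 1;
};

var strictMatch = fnsAreEqual(oneLine, multiLine);                  // layout differs
var looseMatch = fnsAreEqualIgnoringWhitespace(oneLine, multiLine); // layout ignored
```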

updated: fixed JavaScript formatting

Posted on 4/27/2009 7:40:00 AM by Jason Nadal


Categories: javaScript | performance


Is my code smelly?

I've run my open source FileCombiner app through NDepend and dotTrace, and the results were startling on one hand and expected on the other.

Here are the first impressions:

dotTrace:

  • Discovered a huge performance hit surrounding my WriteByte code. While attempting to make something that's extremely granular for unit testing, I obviously lost sight of the overall speed goal.
  • Essentially what's going on is 29 million individual calls to the WriteByte code -- I need to optimize this (probably by using a buffer), but now I have measured performance as a benchmark to improve on.
  • I can now prove that improvements to code occurred, that they have a net positive effect, and the scale by which those improvements exist.
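The buffering idea mentioned above can be sketched roughly as follows. This is not the FileCombiner code (which is .NET); it is a JavaScript illustration where createCountingSink and createBufferedWriter are hypothetical names, and the sink only counts calls instead of touching a file. The 4096-byte buffer size is an arbitrary assumption.

```javascript
// Hypothetical sink standing in for an expensive per-call write.
function createCountingSink() {
  var sink = { calls: 0, bytes: 0 };
  sink.write = function (chunk) {
    sink.calls += 1;
    sink.bytes += chunk.length;
  };
  return sink;
}

// Collects single bytes and forwards them to the sink in chunks of `size`.
function createBufferedWriter(sink, size) {
  var buffer = new Uint8Array(size);
  var used = 0;
  function flush() {
    if (used > 0) {
      sink.write(buffer.slice(0, used)); // copy, so the buffer can be reused
      used = 0;
    }
  }
  return {
    writeByte: function (b) {
      buffer[used++] = b;
      if (used === size) { flush(); }
    },
    flush: flush
  };
}

var direct = createCountingSink();
var buffered = createCountingSink();
var writer = createBufferedWriter(buffered, 4096);

for (var i = 0; i < 100000; i++) {
  direct.write([i & 0xff]);    // one sink call per byte
  writer.writeByte(i & 0xff);  // sink is only called when the buffer fills
}
writer.flush(); // push out the final partial chunk

console.log("direct calls: " + direct.calls + ", buffered calls: " + buffered.calls);
```

Both sinks receive the same number of bytes; only the number of underlying calls changes, which is the number a profiler run would be expected to show dropping.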

NDepend:

  • This one's tougher. A lot of the measures here I don't yet fully understand.
  • The good:
    • 28% comment ratio (1:1 would be 50%)
    • Distance is extremely low (0.07). Code nicely straddles the safe area between the "zone of uselessness" and the "zone of pain"
    • Most classes have great numbers
  • The bad:
    • App is marked as high instability
    • High instability seems centered around the FilePartJoiner class
    • Trial/Open Source edition of NDepend does not allow import of NCover reports (that was a disappointment)
    • High levels of Efferent Coupling -- need to discover why. Generally this means that a class is tightly coupled to another class, but NDepend states that it filters out framework classes. What's interesting is that this is the main WinForm class. I wonder if the correct behavior here is to remove that class from processing, and how that would affect the net numbers.
    • The "CoverageExcludeAttribute" attribute that I've declared, which NCover is set up to ignore, is showing up as incredibly evil to NDepend. I need to figure out how to exclude that attribute from processing in NDepend! This may be two birds with one stone if I am able to make NDepend recognize that attribute, as the classes marked with it seem to be the pain points in the app.
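For reference, the instability and distance numbers above follow the usual package metric definitions: instability is I = Ce / (Ca + Ce), and distance from the main sequence is D = |A + I - 1|, where A is abstractness. The sketch below uses made-up inputs, not values from the FileCombiner report, and the function names are my own.

```javascript
// Instability: 0 means fully stable, 1 means fully unstable.
// afferent (Ca) = incoming dependencies, efferent (Ce) = outgoing dependencies.
function instability(afferent, efferent) {
  var total = afferent + efferent;
  return total === 0 ? 0 : efferent / total;
}

// Normalized distance from the main sequence (the line A + I = 1).
function distanceFromMainSequence(abstractness, inst) {
  return Math.abs(abstractness + inst - 1);
}

// Hypothetical inputs: few incoming dependencies, many outgoing ones,
// and mostly concrete types.
var exampleInstability = instability(1, 9);
var exampleDistance = distanceFromMainSequence(0.2, exampleInstability);

console.log("I = " + exampleInstability.toFixed(2) + ", D = " + exampleDistance.toFixed(2));
```

With inputs like these, instability is high while the distance stays small, so a "high instability" flag and a low distance score can plausibly show up in the same report.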

My next steps are to resolve the performance issue, as well as reconfigure NDepend to avoid the CoverageExclude attributes and rerun the NDepend results. This may be followed by moving all library-type classes out to a separate DLL. This was going to be a later stage, but I may be going against the grain of proper dependency standards by holding off until later.

I'm definitely liking the process here, but it'll be interesting to attempt this same evaluation on an enterprise level application.

 

Posted on 10/8/2008 8:37:00 AM by Jason Nadal


Categories: development | performance
