Wednesday, May 15, 2013

Yet Another ORM Micro Benchmark, part 1/3, introduction

I’ve been experimenting with Entity Framework 5 Code First lately, and one of my concerns was the performance of bare SELECTs, as most of the time my applications just select data.

In the past, Entity Framework was never a fast mapper; in fact, my earlier experiments suggested it was one of the most sluggish. As it turned out, there are quite a few factors a reliable benchmark has to consider regarding Entity Framework:

  1. EF relies on metadata. A reliable test should not measure the time it takes to build that metadata but rather focus only on reading and materializing data.
  2. EF relies on “pregenerated” views, the mapping views it uses to translate queries into store SQL. Similarly to metadata, view generation takes time, so it should not be measured either.
  3. EF is supposedly faster on .NET 4.5, while my earlier tests were performed on .NET 4.0.
  4. It matters whether EF tracks changes to your entities or not.

A fair test should therefore account for all these factors and make sure that no single one of them spoils the results – for example by warming the context up outside the measured interval and by being explicit about whether change tracking is on.
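Below is a minimal sketch of those two mitigations in EF5 Code First. The model, context and method names are made up purely for illustration and are not the benchmark code itself (which follows in part 2):

using System;
using System.Data.Entity;
using System.Linq;

// Hypothetical Code First model, used only to illustrate the idea;
// the real benchmark model is shown in part 2.
public class Parent
{
    public int    ID         { get; set; }
    public string ParentName { get; set; }
}

public class Child
{
    public int    ID        { get; set; }
    public int    ID_PARENT { get; set; }
    public string ChildName { get; set; }
}

public class BenchContext : DbContext
{
    public DbSet<Parent> Parents  { get; set; }
    public DbSet<Child>  Children { get; set; }
}

public static class EfWarmup
{
    public static void WarmUpAndQuery()
    {
        using ( var ctx = new BenchContext() )
        {
            // The first query pays the metadata and view-generation cost,
            // so it is issued once before any timing starts.
            ctx.Parents.FirstOrDefault();

            // AsNoTracking() materializes entities without registering them
            // in the change tracker - the right mode for a read-only SELECT test.
            var children = ctx.Children.AsNoTracking().Take( 10 ).ToList();
        }
    }
}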

What other, similar benchmarks seem to omit is that reading performance behaves differently depending on the amount of data you actually read. There is a big difference between reading 1 entity and reading 100 or 10000 entities when you compare different approaches.

I therefore decided to create a test environment with a database containing two simple tables:

CREATE TABLE [dbo].[Child](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ID_PARENT] [int] NOT NULL,
    [ChildName] [nvarchar](150) NOT NULL,
 CONSTRAINT [PK_Child] PRIMARY KEY CLUSTERED
(
    [ID] ASC
)
)
 
CREATE TABLE [dbo].[Parent](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ParentName] [nvarchar](150) NOT NULL,
 CONSTRAINT [PK_Parent] PRIMARY KEY CLUSTERED
(
    [ID] ASC
)
)
 
ALTER TABLE [dbo].[Child] WITH CHECK 
  ADD CONSTRAINT [FK_Child_Parent] FOREIGN KEY([ID_PARENT])
    REFERENCES [dbo].[Parent] ([ID])

As you can see, there is a Parent table, a Child table and a foreign key between the two. Both tables were filled with over 100k random records.
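The seeding itself is not part of the benchmark. Just to give an idea, filling the tables could look roughly like this (the connection string, names and row counts below are placeholders, not the exact script I used):

using System;
using System.Data.SqlClient;

// Rough seeding sketch; inserts one parent and one matching child per iteration.
public static class Seeder
{
    public static void Fill( string connectionString, int parentCount )
    {
        var rnd = new Random();

        using ( var conn = new SqlConnection( connectionString ) )
        {
            conn.Open();

            for ( int i = 0; i < parentCount; i++ )
            {
                int parentId;

                // insert a parent and grab its identity value
                using ( var cmd = new SqlCommand(
                    "INSERT INTO dbo.Parent ( ParentName ) VALUES ( @name ); " +
                    "SELECT CAST( SCOPE_IDENTITY() AS int );", conn ) )
                {
                    cmd.Parameters.AddWithValue( "@name", "Parent " + rnd.Next() );
                    parentId = (int)cmd.ExecuteScalar();
                }

                // insert a child pointing at that parent
                using ( var cmd = new SqlCommand(
                    "INSERT INTO dbo.Child ( ID_PARENT, ChildName ) VALUES ( @id, @name );", conn ) )
                {
                    cmd.Parameters.AddWithValue( "@id", parentId );
                    cmd.Parameters.AddWithValue( "@name", "Child " + rnd.Next() );
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}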

Now introducing the contestants:

  1. (ADO) Bare ADO.NET SqlDataReader which materializes objects by fetching columns from the data reader and creating new instances manually (see the sketch right after this list)
  2. (Linq2SQL) A Linq2SQL data context generated from the database by sqlmetal.exe
  3. (EF5 Model First) An EF5 ObjectContext created from the database using the designer
  4. (EF5 Code First) An EF5 DbContext created manually, with entities written by hand to match the database structure
  5. (nHibernate) An NHibernate model created manually to match the database structure
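To make the first contestant more concrete, this is roughly what manual materialization from a SqlDataReader looks like (the DTO and method names are illustrative; the real test code follows in the next post):

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Plain class the rows are materialized into.
public class ChildDto
{
    public int    ID        { get; set; }
    public int    ID_PARENT { get; set; }
    public string ChildName { get; set; }
}

public static class AdoSelect
{
    public static List<ChildDto> SelectTop( string connectionString, int count )
    {
        var result = new List<ChildDto>();

        using ( var conn = new SqlConnection( connectionString ) )
        using ( var cmd  = new SqlCommand( "SELECT TOP (@n) ID, ID_PARENT, ChildName FROM dbo.Child", conn ) )
        {
            cmd.Parameters.AddWithValue( "@n", count );
            conn.Open();

            using ( var reader = cmd.ExecuteReader() )
            {
                while ( reader.Read() )
                {
                    // manual materialization: fetch columns by ordinal, new up the instance
                    result.Add( new ChildDto
                    {
                        ID        = reader.GetInt32( 0 ),
                        ID_PARENT = reader.GetInt32( 1 ),
                        ChildName = reader.GetString( 2 )
                    } );
                }
            }
        }

        return result;
    }
}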

I am going to implement tests for all contestants using C# and VS2012. Each test will implement a simple interface so that I can create and run them in a single environment:

public interface IPerformanceTest
{
    string Category { get; }
    string Name     { get; }
 
    void PerformTest( out DateTime StartedAt, out DateTime EndedAt );
}

The idea of my tests is not just to select data but to perform selects that fetch the TOP 1, 10, 100, 1000 and 10000 entities, using Linq where applicable. The assumption is that no matter which tool you use, you expect a consistent query language, Linq.
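As an illustration, a single test for the EF5 Code First contestant could look more or less like this, reusing the hypothetical BenchContext from the earlier sketch (again, the actual implementations come in part 2):

using System;
using System.Linq;

// Illustrative only - one parameterized test per contestant and row count.
public class EFCodeFirstSelectTest : IPerformanceTest
{
    private readonly int _rowCount;

    public EFCodeFirstSelectTest( int rowCount )
    {
        _rowCount = rowCount;
    }

    public string Category { get { return "EF5 Code First"; } }
    public string Name     { get { return "SELECT TOP " + _rowCount; } }

    public void PerformTest( out DateTime StartedAt, out DateTime EndedAt )
    {
        using ( var ctx = new BenchContext() )   // hypothetical context from the earlier sketch
        {
            StartedAt = DateTime.Now;

            // the same Linq query shape is used for every Linq-capable contestant
            var children = ctx.Children.AsNoTracking().Take( _rowCount ).ToList();

            EndedAt = DateTime.Now;
        }
    }
}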

Also, because selecting 1 or 10 records is too quick to measure reliably, my test runner will repeat each test 1000 times. This should also make it possible to observe how different query mechanisms behave as the number of fetched rows changes.
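The runner itself can then be little more than a loop that repeats each test and sums the reported intervals; a sketch of that idea (details of the real runner may differ):

using System;
using System.Collections.Generic;

// Repeat-and-sum runner sketch for the IPerformanceTest interface above.
public static class TestRunner
{
    private const int Repetitions = 1000;

    public static void Run( IEnumerable<IPerformanceTest> tests )
    {
        foreach ( var test in tests )
        {
            var total = TimeSpan.Zero;

            for ( int i = 0; i < Repetitions; i++ )
            {
                DateTime startedAt, endedAt;
                test.PerformTest( out startedAt, out endedAt );
                total += endedAt - startedAt;
            }

            Console.WriteLine( "{0} / {1}: {2} ms total over {3} runs",
                test.Category, test.Name, total.TotalMilliseconds, Repetitions );
        }
    }
}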

As for the test environment – it is Win7 x64 with SQL Server 2012, 8 GB RAM and a 4-core i7 processor. Tests will be performed on .NET 4.5 in Release mode, and the runner is executed directly from the OS shell rather than from VS2012.

In the next post I will show how all the tests are implemented. Then, in the third post, we will analyze and comment on the results.
