Friday, October 11, 2013

Soft Delete pattern for Entity Framework Code First

Some time ago I blogged about how to implement the Soft Delete pattern with NHibernate. This time I am going to show how to do the same with Entity Framework Code First.

(A side note: I really like EFCF; I like its simplicity and the migration infrastructure. I tend to favor EFCF over other ORMs lately.)

I’ve spent some time looking for a working solution and trying to come up with something on my own. There are solutions that almost work, like the one by Zoran Maksimovic from his post “Entity Framework – Applying Global Filters”. Zoran’s approach involves cleverly replacing DbSets with FilteredDbSets internally in the DbContext. These FilteredDbSets have filtering predicates attached so that filtering occurs upon data retrieval. Unfortunately, this approach misses the fact that filtering should also be applied to navigation properties. Specifically, this works correctly in Zoran’s approach:

// both loops correctly iterate over non-deleted entities only
foreach ( var child in ctx.Child ) ...
foreach ( var parent in ctx.Parent ) ...

but this fails miserably:

foreach ( var parent in ctx.Parent )       // ok
  foreach ( var child in parent.Children ) // oops, deleted entities are included!
     ...
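
The reason a set-level filter cannot help here can be seen even without EF. The following is only an in-memory analogy using LINQ to Objects; the ParentItem and ChildItem classes and the FilteredParents method are hypothetical stand-ins, not Zoran's code. Filtering the root collection says nothing about the collections hanging off each element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified in-memory stand-ins for the entities; no EF involved.
public class ChildItem
{
    public string Name { get; set; }
    public bool IsDeleted { get; set; }
}

public class ParentItem
{
    public string Name { get; set; }
    public bool IsDeleted { get; set; }
    public List<ChildItem> Children { get; set; }
}

public static class NavigationFilterDemo
{
    // Mimics a set-level filter: only the root collection is filtered.
    public static IEnumerable<ParentItem> FilteredParents( IEnumerable<ParentItem> all )
    {
        return all.Where( p => !p.IsDeleted );
    }

    // Walks parent -> children the way the nested foreach in the post does.
    public static List<string> ReachableChildNames( IEnumerable<ParentItem> all )
    {
        return FilteredParents( all )
            .SelectMany( p => p.Children )
            .Select( c => c.Name )
            .ToList();
    }

    public static void Main()
    {
        var parents = new List<ParentItem>
        {
            new ParentItem
            {
                Name = "P1",
                Children = new List<ChildItem>
                {
                    new ChildItem { Name = "C1" },
                    new ChildItem { Name = "C2", IsDeleted = true }
                }
            },
            new ParentItem
            {
                Name = "P2",
                IsDeleted = true,
                Children = new List<ChildItem> { new ChildItem { Name = "C3" } }
            }
        };

        // prints "C1, C2": the deleted parent P2 is filtered out,
        // but the deleted child C2 still comes through.
        Console.WriteLine( string.Join( ", ", ReachableChildNames( parents ) ) );
    }
}
```

To close the gap, the filter has to be part of how the entity itself is mapped, not just attached to the top-level set.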

However, another solution has been proposed by the StackOverflow user Colin. It relies on a discriminator column, which is normally used when mapping class hierarchies to tell apart the different types of entities stored in the same table. Links to the original StackOverflow answers are in the comment above the Context class below.

My job here is merely:

  • cleaning this up so that it compiles
  • making it a little bit more general as the original approach makes some assumptions (a common base class for all entities where the primary key is always called “ID”)
  • adding a cache for the metadata so that all the metadata searching doesn’t have to be repeated over and over

All the credit goes to Colin, though.

Let’s start with entities:

public class Child
{
    public long ID { get; set; }
 
    public string ChildName { get; set; }
 
    public bool IsDeleted { get; set; }
 
    public virtual Parent Parent { get; set; }
}
 
public class Parent
{
    public long ID { get; set; }
 
    public string ParentName { get; set; }
 
    public bool IsDeleted { get; set; }
 
    public virtual ICollection<Child> Children { get; set; }
}

Nothing unusual here, as all the Soft Delete logic lives in the DbContext:

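// Namespaces assume EF 5. In EF 6, use System.Data.Entity.Core.Metadata.Edm and
// System.Data.Entity.Core.Objects instead; EntityState lives in System.Data.Entity there.
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Data.Metadata.Edm;
using System.Data.Objects;
using System.Data.SqlClient;
using System.Linq;

/// <summary>
/// http://stackoverflow.com/questions/19246067/ef-5-conditional-mapping/19248216#19248216
/// http://stackoverflow.com/questions/12698793/soft-delete-entity-framework-code-first/18985828#18985828
/// </summary>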
public class Context : DbContext
{
    public DbSet<Child>  Child { get; set; }
    public DbSet<Parent> Parent { get; set; }
 
    public Context()
    {
        Database.SetInitializer<Context>( new MigrateDatabaseToLatestVersion<Context, Configuration>() );
    }
 
    protected override void OnModelCreating( DbModelBuilder modelBuilder )
    {
        modelBuilder.Entity<Child>()
            .Map( m => m.Requires( "IsDeleted" ).HasValue( false ) )
            .Ignore( m => m.IsDeleted );
        modelBuilder.Entity<Parent>()
            .Map( m => m.Requires( "IsDeleted" ).HasValue( false ) )
            .Ignore( m => m.IsDeleted );
 
        modelBuilder.Conventions.Remove<System.Data.Entity.ModelConfiguration.Conventions.PluralizingTableNameConvention>();
    }
 
    public override int SaveChanges()
    {
        // Take a snapshot first, since SoftDelete changes the state of each entry
        foreach ( var entry in ChangeTracker.Entries()
                  .Where( p => p.State == EntityState.Deleted )
                  .ToList() )
            SoftDelete( entry );
 
        return base.SaveChanges();
    }
 
    private void SoftDelete( DbEntityEntry entry )
    {
        Type entryEntityType = entry.Entity.GetType();
 
        string tableName      = GetTableName( entryEntityType );
        string primaryKeyName = GetPrimaryKeyName( entryEntityType );
 
        string deletequery =
            string.Format(
                "UPDATE {0} SET IsDeleted = 1 WHERE {1} = @id",
                    tableName, primaryKeyName );
 
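        // primaryKeyName is the key column name from the storage model; using it to index
        // OriginalValues assumes the key property and its column share the same name
        Database.ExecuteSqlCommand(
            deletequery,
            new SqlParameter( "@id", entry.OriginalValues[primaryKeyName] ) );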
 
        //Marking it Unchanged prevents the hard delete
        //entry.State = EntityState.Unchanged;
        //So does setting it to Detached:
        //And that is what EF does when it deletes an item
        //http://msdn.microsoft.com/en-us/data/jj592676.aspx
        entry.State = EntityState.Detached;
    }
 
    // Shared by all context instances; note that Dictionary is not thread-safe
    private static readonly Dictionary<Type, EntitySetBase> _mappingCache =
       new Dictionary<Type, EntitySetBase>();
 
    private EntitySetBase GetEntitySet( Type type )
    {
        if ( !_mappingCache.ContainsKey( type ) )
        {
            ObjectContext octx = ( (IObjectContextAdapter)this ).ObjectContext;
 
            string typeName = ObjectContext.GetObjectType( type ).Name;
 
            var es = octx.MetadataWorkspace
                            .GetItemCollection( DataSpace.SSpace )
                            .GetItems<EntityContainer>()
                            .SelectMany( c => c.BaseEntitySets
                                            .Where( e => e.Name == typeName ) )
                            .FirstOrDefault();
 
            if ( es == null )
                throw new ArgumentException(
                    string.Format( "Entity set not found for type '{0}'", typeName ), "type" );
 
            _mappingCache.Add( type, es );
        }
 
        return _mappingCache[type];
    }
 
    private string GetTableName( Type type )
    {
        EntitySetBase es = GetEntitySet( type );
 
        return string.Format( "[{0}].[{1}]", 
            es.MetadataProperties["Schema"].Value, 
            es.MetadataProperties["Table"].Value );
    }
 
    private string GetPrimaryKeyName( Type type )
    {
        EntitySetBase es = GetEntitySet( type );
 
        return es.ElementType.KeyMembers[0].Name;
    }
}

A couple of explanations.

First, the mapping. Declaring IsDeleted as a discriminator column that must equal false makes EF treat only undeleted rows as entities. EF adds this predicate to every query it generates, including queries that load navigation properties.

modelBuilder.Entity<Child>()
    .Map( m => m.Requires( "IsDeleted" ).HasValue( false ) )
    ...

But then the IsDeleted property has to be excluded from the mapping, because a column that serves as a discriminator cannot be mapped to a property at the same time:

modelBuilder.Entity<Child>()
    ...
    .Ignore( m => m.IsDeleted );

This is enough to make EF generate correct queries; you can ignore the rest for a moment and just try it.

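Second, the data saving. Filtering the data is not enough; Soft Delete also requires that deleting an entity only marks it as deleted. This is done in the overridden SaveChanges method. For each entity marked as deleted in EF’s change tracker, we manually update its row in the database and then set its state to Detached (just like EF’s SaveChanges does after a real delete). Note that, unless an ambient transaction is in place, ExecuteSqlCommand runs each UPDATE immediately, outside the transaction that base.SaveChanges later opens, so a failure in base.SaveChanges does not roll the soft deletes back.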
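
To make the manual update concrete, here is a small sketch of how the statement could be assembled. The SoftDeleteSql class and its Quote and Build methods are hypothetical helpers, not part of Colin's code, and the "dbo" schema is an assumption (the SQL Server default). Unlike the code above, this variation also brackets the key column name and escapes closing brackets.

```csharp
using System;

// Hypothetical helper, not part of the original post.
public static class SoftDeleteSql
{
    // Wraps an identifier in square brackets, doubling any closing bracket,
    // which is how T-SQL escapes bracket-delimited identifiers.
    public static string Quote( string identifier )
    {
        if ( string.IsNullOrEmpty( identifier ) )
            throw new ArgumentException( "Identifier must not be empty", "identifier" );

        return "[" + identifier.Replace( "]", "]]" ) + "]";
    }

    // Builds the parameterized UPDATE that flags a single row as deleted.
    public static string Build( string schema, string table, string keyColumn )
    {
        return string.Format(
            "UPDATE {0}.{1} SET IsDeleted = 1 WHERE {2} = @id",
            Quote( schema ), Quote( table ), Quote( keyColumn ) );
    }

    public static void Main()
    {
        // prints: UPDATE [dbo].[Child] SET IsDeleted = 1 WHERE [ID] = @id
        Console.WriteLine( Build( "dbo", "Child", "ID" ) );
    }
}
```

Only the identifiers are formatted into the text; the key value itself still travels as the @id parameter, exactly as in SoftDelete.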

Third, the caching stuff: GetEntitySet, GetTableName and GetPrimaryKeyName. These read the metadata so that the query that marks the data as deleted can include the correct table name and the correct primary key name for a given entity type.
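
One thing to keep in mind is that the cache is static, so it is shared by all context instances, and a plain Dictionary is not safe for concurrent writes. If contexts are used from several threads (as in a web application), a ConcurrentDictionary could be used instead. The sketch below shows the idea in isolation; PerTypeCache is a hypothetical helper, and the string-producing factory merely stands in for the metadata search done in GetEntitySet.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, not part of the original post: a thread-safe per-type cache.
// Under contention GetOrAdd may invoke the factory more than once for the same key,
// but only one result is stored and returned to all callers.
public sealed class PerTypeCache<TValue>
{
    private readonly ConcurrentDictionary<Type, TValue> _items =
        new ConcurrentDictionary<Type, TValue>();

    private readonly Func<Type, TValue> _factory;

    public PerTypeCache( Func<Type, TValue> factory )
    {
        if ( factory == null )
            throw new ArgumentNullException( "factory" );

        _factory = factory;
    }

    public TValue Get( Type type )
    {
        return _items.GetOrAdd( type, _factory );
    }
}

public static class PerTypeCacheDemo
{
    public static void Main()
    {
        int lookups = 0;

        // Stand-in for the expensive metadata search.
        var cache = new PerTypeCache<string>( t =>
        {
            lookups++;
            return "[dbo].[" + t.Name + "]";
        } );

        Console.WriteLine( cache.Get( typeof( string ) ) ); // computed by the factory
        Console.WriteLine( cache.Get( typeof( string ) ) ); // served from the cache
        Console.WriteLine( lookups );                       // prints 1 in this single-threaded run
    }
}
```

In the Context class this would amount to replacing the ContainsKey/Add pair with a single GetOrAdd call on a ConcurrentDictionary<Type, EntitySetBase>.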

And that is it. Deleting the data

var child = ctx.Child.FirstOrDefault( c => c.ID == 123 );
ctx.Child.Remove( child );
ctx.SaveChanges();

correctly updates its state to deleted (IsDeleted=1) instead of physically deleting it from the database.