Object-Oriented Software Development: November 2007

Wednesday, November 21, 2007

The Definition of an Antipattern

A friend of mine has studied antipatterns extensively and asked me "What would be the best way to identify an antipattern in my code"?

To increase his confusion, I said "Almost everything from your code you've just shown to me is considered an antipattern".

Then I've added "Relax, this is not true. It's just you falling into paranoia of antipatterns".

Tuesday, November 20, 2007

Loose coupling of embedded DetailsViews

I often embed a DetailsView inside yet another DetailsView. Typical scenario - there is a "Person DetailsView" and "Address DetailsView". The former one has one TemplateField with the latter embedded.

Since I always work with objects, I also use ObjectDataSources. The ObjectDataSource will provide a business object as the data source for the control if and only if the underlying data provider provides the correct business object.

Usually I just add three sections to the ObjectDataSource - SelectParameters, UpdateParamteters and DeleteParameters each one providing exactly one parameter, the ID of the underlying object.

In case of one DetailsView embedded inside another one this does not work - the first DetailsView consumes the ID of the person supplied by the Select Parameter but the embedded DetailsView should use the Person's ID_ADDRESS value as the ID.

To overcome this difficulty I've just implemented a simple solution where one of the Page's controls has to implement following interface:

   1: namespace Vulcan.Uczniowie.UILayer

   2: {

   3:     public interface IParamValueProvider

   4:     {

   5:         object ProvideValue( string ParamName );

   6:     }

   7: }

and the SelectParameter just retrieves the value supplied by the control:

   1: namespace Vulcan.Uczniowie.UIHelpers

   2: {

   3:     [DefaultProperty( "ParamName" )]

   4:     public class ParamValueProviderParameter :

   5:        QueryStringParameter

   6:     {

   7:         private string paramName;

   8:         public string ParamName

   9:         {

  10:             get

  11:             {

  12:                 return paramName;

  13:             }

  14:             set

  15:             {

  16:                 paramName = value;

  17:             }

  18:         }

19:

  20:         protected override object Evaluate(

  21:             System.Web.HttpContext context,

  22:             System.Web.UI.Control control )

  23:         {

  24:             if ( context != null &&

  25:                  context.Handler is Page

  26:                 )

  27:             {

  28:                 IParamValueProvider Provider =

  29:                     FindParamValueProvider( context.Handler as Page );

  30:                 if ( Provider != null )

  31:                     return Provider.ProvideValue( ParamName );

32:

  33:                 return null;

  34:             }

35:

  36:             return null;

  37:         }

38:

  39:         private IParamValueProvider FindParamValueProvider( Control Control )

  40:         {

  41:             if ( Control != null )

  42:                 foreach ( Control Child in Control.Controls )

  43:                 {

  44:                     if ( Child is IParamValueProvider )

  45:                         return (IParamValueProvider)Child;

46:

  47:                     IParamValueProvider RecursiveRet = FindParamValueProvider( Child );

  48:                     if ( RecursiveRet != null )

  49:                         return RecursiveRet;

  50:                 }

51:

  52:             return null;

  53:         }

  54:     }

  55: }

The top level Details View just implements the interface:

   1: #region IParamValueProvider Members

2:

   3:    // implemented in the top level DetailsView

   4:    public object ProvideValue( string ParamName )

   5:    {

   6:        switch ( ParamName )

   7:        {

   8:            // somehow provide ID_ADDRESS for the embedded DetailsView

   9:            case "ID_ADDRESS" :

  10:            if ( ParameterHelper.GetParameterValue(

  11:                   this.DataSource.SelectParameters["ID"] ) != null )

  12:            {

  13:               int id = Convert.ToInt32(

  14:                 ParameterHelper.GetParameterValue(

  15:                   this.DataSource.SelectParameters["ID"] ) );

  16:               Person Item = Person.Retrieve( Global.Model, id );

17:

  18:               return Item.ID_ADDRESS;

  19:            }

  20:            break;

  21:        }

22:

  23:        return null;

  24:    }

25:

  26: #endregion

and the embedded one defines the SelectParameter as:

   1: <SelectParameters>

   2:     <vlc:ParamValueProviderParameter Name="ID" ParamName="ID_ADDRESS" />

   3: </SelectParameters>

This way we have a perfect loose-coupling between the top level and embedded views - both communicate using the interface IParamValueProvider.

To complete the example I have to explain and provide code for the ParameterHelper.GetParameterValue method.

You see, the Select/Update/Delete parameters are directly available only inside the ObjectDataSource's Selecting/Inserting/Deleting event handlers! If you are to retrieve the current value of a parameter outside one of these handlers you will learn that there's no CurrentValue property on the Parameter class. Or rather, there is but it's private!

Therefore the ParameterHelper aims to retrieve the value so that it is available from any context, not only from within one of handlers I've mentioned.

   1: namespace Vulcan.Application.UIHelpers

   2: {

   3:     public class ParameterHelper

   4:     {

   5:         public static object GetParameterValue( Parameter Parameter )

   6:         {

   7:             return Parameter.GetType().InvokeMember(

   8:                 "ParameterValue",

   9:                 System.Reflection.BindingFlags.GetProperty |

  10:                 System.Reflection.BindingFlags.NonPublic |

  11:                 System.Reflection.BindingFlags.Instance,

  12:                 null, Parameter, null );

  13:         }

  14:     }

  15: }

PersistentStatePage with Event Validation

Few months ago I've came upon a great article "Persisting the state of a web page" by Martin Jericho.

The idea is very cool and useful - the viewstate of a web page can be stored on the server-side and restored on demand. Please refer to the article for further explanations if you are not familiar with Martin's solution.

As a kind of a hack, the solution is not perfect. There are few small problems with the implementation (for example, I've added ".ToLower()" in few places where addresses are compared) but the main issue with Martin's approach is connected with the ASP.NET event validation mechanism.

You see, to reduce the risk of cross-site request forgery attack, ASP.NET is rather paranoic when validating the input coming from the browser. Specifically, the validation signature is placed in the hidden __EVENTVALIDATION field and the signature is used to validate if the request is vaild.

Since in Martin's solution the viewstate is restored from the external resource, the event validation must be turned off, otherwise you get the "Invalid postback or callback argument" exception. However, turning off the event validation is a huge security risk.

I belive I have found a way to have the event validation turned on and still be able to use Martin's solution.

What's the problem, doc?

First of all, why it does not work with event validation turned on?

It seems that this just does not work:

   1: private void LoadPostData(Control control, ArrayList modifiedControls) {

   2: {

   3:   ..

   4:   // Call the framework's LoadPostData on this control using the name

   5:   // attribute as the post data key:

   6:   if (((IPostBackDataHandler)control).LoadPostData(nameAttribute,PostData))

   7:       modifiedControls.Add(control);

The LoadPostData method, implemented internally in the .NET library, does the validation and just does not accept the external viewstate provided upon the state retrieval.

Solution? Well, almost

If we just were able to disable the validation only for these request which involve the persisted viewstate and keep validation on for all other valid requests... Unfortunately, it seems that the EnableEventValidation property is another paranoiac - if you see the inner implementation

   1: public virtual void set_EnableEventValidation(bool value)

   2: {

   3:     if (base.ControlState > ControlState.FrameworkInitialized)

   4:     {

   5:         throw new InvalidOperationException( ... );

   6:     }

   7:     this._enableEventValidation = value;

   8: }

then you will realise that the property is somewhat special - it cannot be switched always on your demand but rather before the engine initializes.

Few experiments reveal that the constructor is a good place to switch validation on/off and this is what Martin does - he turns the validation off in the constructor.

However, yet another place where you are allowed to turn the validation on/off is the DeterminePostBackMode method. The new implementation would be:

   1: protected override NameValueCollection DeterminePostBackMode() {

   2:     pageState=LoadPageState(Request.Url);

3:

   4:     // this line turns the validation off but only when the state

   5:     // is actually restored. otherwise the validation should remain

   6:     // turned on.

   7:     if ( IsRestoredPageState ) EnableEventValidation = false;

8:

   9:     NameValueCollection normalReturnObject = base.DeterminePostBackMode();

10:

What's interesting is that after this small enhancement, the code works as expected. Well, almost.

Yet another issue

It looks like turning the validation off does in fact two things. Not only it prevents the validation of the request on the server side but also prevents the __EVENTVALIDATION hidden field to be appended to the response.

This makes the above solution only partially succesfull - even though the restored page is accepted on the server side, since the response does not contain the validation signature, the page will likely fail on another postback! The engine will just see that the validation is turned on (it is, by default), there is no state to restore (it has been restored one postback ago) but the __EVENTVALIDATION is missing (since the validation was turned off last time the page had been processed on the server). Guess what? You will get the "Invalid postback or callback"!

What we need is then not only to turn the validation off before the state is retrieved but also to turn it on after it is retrieved, so that the __EVENTVALIDATION signature is appended to the response!

But how do we enable the validation if the EnableEventValiation property cannot be altered after the state is retrieved?

Well, using ... reflection.

I know, the hack is dirty but it works. It seems that the EnableEventValidation property is just a wrapper on the _enableEventProperty internal boolean field. So just after the state is retrieved, we just turn on the validation:

   1: override protected void OnLoad(EventArgs e) {

   2:     // this is Martin's code

   3:     if (IsRestoredPageState) {

   4:         // Populate controls with PostData, saving a list of those that were modified:

   5:         ArrayList modifiedControls=new ArrayList();

   6:         LoadPostData(this,modifiedControls);

   7:         // Raise PostDataChanged event on all modified controls:

   8:         foreach (IPostBackDataHandler control in modifiedControls)

   9:             control.RaisePostDataChangedEvent();

10:

  11:         // and this is my dirty hack which turns the validation on

  12:         // after the state is retrieved so that the

  13:         // __EVENTVALIDATION is correctly appended to the response

  14:         FieldInfo fi = typeof(Page)

  15:            .GetField( "_enableEventValidation",

  16:                 BindingFlags.NonPublic | BindingFlags.Instance );

  17:         if ( fi != null )

  18:             fi.SetValue( this, true );

  19:     }

  20:     base.OnLoad(e);

  21: }

Well, this is all. Two small modifications, the first one in the DeterminePostbackMode and second one in the OnLoad. It works correctly in my test application, please feel free to share your experiences.

Friday, November 16, 2007

Complete OLAP infrastructure without Microsoft Analysis Services, part 1

Few weeks spent in the OLAP world searching for answer for following question: How to be able to do OLAP with no additional tools available?

Specifically: are we able to build a full OLAP infrastructure for clients who do not have Microsoft Analysis Services or any other enterprise OLAP severs?

For few last years our company used Microsoft Excel to perform online analysis of the relational data. As you know, Excel's capable of building pivot tables for any kind of relational data. There are three problems with Excel:

it does not scale well (usually, several hundred thousands of records in the data source is enough to kill Excel)
it does not like to be executed inside the application server (due to security and scalability reasons)
you do rely on Excel and this can be too much for some clients

Few years ago Microsoft released Microsoft SQL Server 2000 PivotTable Services, a COM-based solution which could be used to built offline analytical databases (offline cubes) and process them with no Analysis Services available. The library is however not maintaned anymore.

Today, we have a new king of the ring: the Microsoft ADOMD.NET library. This one, and other core components of the infrastructure I will present are available on the Feature Pack for Microsoft SQL Server 2005 page.

First, to visualize our goals, let's examine following picture:

The left-most path is obvious and preferable for these implementations where the Analysis Services can be used as the analytical processing tool and any fully fledged external tool can be used to browse the analytical data.

The right-most path would be ideal for these implementations where neither Analytical Services nor any other external tool can be used but there is still demand to be able to do OLAP in favor of direct relational database querying.

Are you interested in the right-most path? Do you think it is impossible? Or you did try it but failed somewhere and were unable to complete the scenario? Or maybe you are just interested how to do it?

Just a remark before we start - this is not going to be the OLAP tutorial. I assume that you are aware of what OLAP is and why this is important to be able to browse multidimensional analytical data instead of heavy and ineffective queries on a relational database. If you are not aware of OLAP advantages over traditional database querying then stop reading right now because you will just not understand what I am talking about and why this is so effective and so cool to see how OLAP works and how it works with no Analysis Services in the background.

Thursday, November 15, 2007

Three typical maintenance queries for SQL Server vs Oracle

1. How to get the name of user currently connected to the engine?

MSSQL: SELECT SYSTEM_USER

Oracle: select user from dual

2. How to list schemas/databases available to user currently connected to the engine?

MSSQL: sp_helpdb

Oracle: SELECT * FROM ALL_USERS

3. How to retrieve the size of selected schema/database?

MSSQL: look at the column [db_size] returned by sp_helpdb

Oracle: select Sum(bytes) from dba_segments where owner=SCHEMANAME group by tablespace_name,owner

Tuesday, November 13, 2007

Seven Oracle Database peculiarities for Microsor SQL Server Developer

Few weeks ago I've finally managed to complete the Oracle Database driver for my custom Object Relational Mapping engine (I hope to write a tutorial on writing your own ORM engines soon) and the next release of one of our applications finally supports three database systems: Microsoft SQL Server, Oracle Database and the Microsoft JET Engine. Three different database systems from one executable, switchable in the runtime.

'What's the big deal' - you could ask - 'there are plenty of ORM engines here so why did you waste you time to implement yet another one'?

Well, I hope to write more on it soon, nevertheless the core of the engine dates 2003 when I had no general idea of the ORM pattern and there were definitely no .NET ORMs available.

The engine evolved through years and has been used in few succesive business application widely used. In the same time I've used NHibernate, Willson's ORM and DevExpress XPO for other projects and 've been playing with few more so I think I can clearly see the pros and cons of variety of engines.

But back to the main topic. I suppose that you reading this are the .NET developer who usually works with the Microsoft SQL Server. However your goal, given to you by someone else or by yourself, is to finally get in touch with this mysterious world of Oracle Databases.

Do not worry, this does not hurt. After just few experiments you will notice that the standard System.Data.OracleClient library works correctly in most cases so there are no major technical problems.

What will surely surprise you are the peculiarities or rather fundeamental differences between Oracle and Microsoft databases, which will make you rethink several important solutions you are accustomed to.

Let's then start enumerating these differences:

1. Oracle Database Instance = Microsoft SQL Server Instance

In the world of Oracle the two terms, a server and a database have completely different meaning which is really confusing SQL Server Developers. When you run something you would normally call "a server instance" it is in fact called "the database instance". Inside this "database instance" ("server instance" in SQL Server terminology) you have "schemas" ("databases" in SQL Server terminology). The rest works in the same way - "schemas" ("databases") are colletions of tables.

Yet another confusing issue however, is the equivalence of Oracle terms "schema" and "user". This stems from the fact that each physical user connecting to the Oracle database using his/her own credentials owns his/her set of tables. Everytime you refer to a table in a SQL query you have to provide its full name consisting of the name of table's owner (schema or user whatever you call it) and the table name. You just cannot write:

SELECT * FROM THETABLE

or actually you can but this will always refer to THETABLE in your own schema (your own "database" in SQL Server terminology).

Whenever you plan to use different schemas ("databases" in SQL Server) which is quite common practice, you have to remember that no matter what identity you use to connect to the database, you have to fully qualify the names of objects in different schemas (databases). So you always have:

SELECT * FROM SCHEMANAME.THETABLE

Why this is peculiar? It seems that the typical question:

"How do you allow you application to create different databases and let the user select one database to use in his session"? makes Oracle guys helpless. It seems that the idea of granting users rights to access other databases is rather uncommon. Later I will show you how to grant users access to other schemas (databases) and how to retrieve a list of available schemas (databases).

2. Oracle names cannot exceed 30 characters

This is really annoying one. Having once my SQL Server database with only about 70 tables, I've found about 50 objects (table names, column names, primary and foreign key names) which were longer than 30 characters.

A typical example. How would you call a many-to-many table linking A_THE_TABLE_WITH_NAME and B_THE_ANOTHER_TABLE?

MM_THE_TABLE_WITH_NAME_THE_ANOTHER_TABLE?

Wrong! Oracle will not like you.

The refactorization of your database to meet this peculiar limitation will surely hurt. Be aware that converters (for example the Migration Workbench) usually just trim the names without even making any notes. Later on you'll just find out that the application does not work on the Oracle because you got

MM_THE_TABLE_WITH_NAME_THE_ANO

I hope I've "emulated" the trimming correctly :)

3. Empty literals are stored as NULLs

What would you expect from

INSERT INTO THETABLE (name) VALUES ('')

with [name] as the NOT NULL column?

If you'd expect an empty literal inserted into the table, you are wrong. Oracle magically converts empty literals into NULLs and in the above example you'll just get an error saying that NULLs are not allowed here. There is a long discussion whether this is or is not a bug. Please refer to Google to find out what both sides of the discussion think about it.

I prefer to solve the issue by replacing empty strings with the hard space (\u00A0) but nevertheless, you have to take care of it by yourself.

4. There is no IDENTITY datatype

Accustomed to defining a primary key as the INT IDENTITY (1,1) NOT NULL?

Forget it. There's no auto identity in Oracle. Instead there are sequences. A sequence acts like a table which can be asked for a nextval which makes the sequence increment its current counter. A typical usage is like this:

SELECT SCHEMANAME.SEQNAME.nextval FROM dual;

There are unfortunately different strategies of modelling the SQL Server's IDENTITES in Oracle using sequences. One of the most widely suggested approaches is to define a trigger which modifies the ID field automatically after a record is inserted into the database with just dummy ID value. How are you supposed to get the inserted value in such approach? Well, you can ask for the current value of the sequence but what if someone else managed to retieve few values from the sequence in the meantime?

Frankly, I do not like this approach just because Oracle guys are not able to calm my doubts on the concurrency here.

What I prefer is to:

ask the sequence for a next value
use this value inside the INSERT query

This approach has its own distadvantage - it requires TWO queries to insert anything into the database. The good thing is that the concurrency issues are gone.

5. Structural queries are not transactional

Application usually live long and many times you will upgrade the database structure.

My typical approach is to store the database version numer in one of tables and prepare set of scripts in the application. When the app starts it retrieves the version number from the database and executes scripts in a sequence, one after the other.

Scripts are usually complicated. They create tables, columns, drop objects and manipulate the data. In SQL Server you are safe just when you put the whole script inside a transaction - no matter what you do, all changes are rollbacked in case of an error.

Oracle does not work like this. If a query modifies the structure of the database, it will not be rolled back. When a script fails, every query that modifies the structure remains commited and every query that manipulates the data is rolled back.

What is the state of the database AFTER such rollback? God only knows.

6. There are no different numerical types

In Oracle Database all numeric types are of type NUMBER(X,Y) where X and Y define the precision. In your client code you'll then see every number as System.Decimal even though you'd expect bytes, shorts or ints like are accustomed to.

and my favourite 7. All names of objects are converted to upper case by default

Be aware that

CREATE TABLE MySchema.TheTable ...

will not create TheTable table but THETABLE table. The same rule applies to all other database objects. How are you supposed to refer to objects that use CamelCasing for example?

Well, just remember to use ' around the names:

CREATE TABLE 'MySchema'.'TheTable'.

Again, Oracle guys seem to be rather surprised that you'd expect the database system to keep your own casing rather than converting everyting to uppercase.

'It makes you less error prone' - the say.

'It confuses me' - I say.

Conclusion

If I am wrong in any of above statements, please feel free to correct me. After a year of experiences I think I slowly start to live with all these issues, however the first contact with the Oracle Database was not easy. I wonder if I would feel the same but doing the opposite ie. moving from Oracle to SQL Server.

Object-Oriented Software Development