Friday, June 27, 2008

Complete OLAP infrastructure without Microsoft Analysis Services, part 3

In previous part of this tutorial we've built an analytical database using Microsoft Analysis Services. We've been able to browse the database using SQL Management Studio and execute MDX queries on it.

In this part of the tutorial we will:

  • build an offline static cube (*.cub file) from the relational database using C# and ADOMD.NET
  • query the static cube using Microsoft Excell as static cube browser
  • query the static cube with MDX queries using C# and ADOMD.NET
Building static cubes from relational data using ADOMD.NET

This step occured to be the most difficult one. It seems that building static offline cubes with the PivotTable Services technology I've mentioned before is well known and used. However, the same approach fails with ADOMD.NET since the ADOMD.NET API does not directly correspond do the PivotTable Services API. It turns out that completely different approach must be taken. The approach will use XMLA (Xml for Analysis) together with ADOMD.NET as primary tools.

What you should do is to write an XMLA script which would create the database and execute this script as command on the static offline cube the ADOMD.NET is connected to.

The question is: how to obtain the correct XMLA script?

The answer is: since XMLA is just an XML, you can write the script manually. However, having the Analysis Services elsewhere you can:

  • connect to the Analysis Services using Management Studio
  • select the database in the Object Explorer
  • invoke Script Database/CREATE To action available from the context menu
  • save the script elsewhere

Note, that the XMLA script will be rather unreadable and long. The script generated for the analytical database we've built for our tutorial is 57kB long!

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <ObjectDefinition>
        <Database xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2">
            <ID>CubeDemo</ID>
            <Name>CubeDemo</Name>
            <Language>1033</Language>
            <Collation>Polish_CI_AS</Collation>
            <DataSourceImpersonationInfo>
                <ImpersonationMode>Default</ImpersonationMode>
            </DataSourceImpersonationInfo>
            <Dimensions>
                <Dimension>
                    <ID>PERSON</ID>
                    <Name>PERSON</Name>
                    <Annotations>
                        <Annotation>
                            <Name>http://schemas.microsoft.com/DataWarehouse/Designer/1.0:DiagramLayout</Name>
                        </Annotation>
                        <Annotation>
                            <Name>http://schemas.microsoft.com/DataWarehouse/Designer/1.0:ShowFriendlyNames</Name>
                            <Value>true</Value>
                        </Annotation>
                        <Annotation>
                            <Name>http://schemas.microsoft.com/DataWarehouse/Designer/1.0:ShowRelationshipNames</Name>
                            <Value>false</Value>
                        </Annotation>
                        <Annotation>
... a lot of XMLA code follows

What is really important is one specific section of the script:



... a lot of XMLA stuff above
<DataSources>
      <DataSource xsi:type="RelationalDataSource">
          <ID>Cube Source</ID>
          <Name>Cube Source</Name>
          <ConnectionString>Provider=SQLNCLI.1;Data Source=WZYCHLA\SQL2005;Integrated Security=SSPI;Initial Catalog=CubeSource</ConnectionString>
          <ImpersonationInfo>
              <ImpersonationMode>ImpersonateServiceAccount</ImpersonationMode>
          </ImpersonationInfo>
          <Timeout>PT0S</Timeout>
      </DataSource>
  </DataSources>

This particular section of the script defines the connection to the relational database serving as the source for the analytical database. In real application, you read the script generated in your development environment containing the connection string to your relational datasource and dynamically change the connection to a correct one, valid in production environment! You can event connect to other data providers like Oracle or PostreSQL, just build the analytical database on the Analysis Services and check the corresponding DataSource node in the generated script (scripts for different DBMSes seem to be quite different, it's not that you only replace the connection string!)


This XMLA script, however, is still not enough to correctly create static offline cube. You have to wrap it in a stub script, defining an actual action for the analytical engine.


The wrapper script is presented below:



<Batch xmlns='http://schemas.microsoft.com/analysisservices/2003/engine'>
 
CUBESCRIPTHERE
 
<Parallel>
  <Process xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
    <Object>
      <DatabaseID>CubeDemo</DatabaseID>
    </Object>
    <Type>ProcessFull</Type>
    <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
  </Process>
</Parallel>
</Batch>
 

There's one crucial issue here: note that the XMLA script generated from the Analytical Database contains Create/ObjectDefinition/Database/Name node and the stub contains Parallel/Process/Object/DatabaseID node. Values of these nodes must match! Otherwise you'll get nothing since the actual command will try to create the database for which there'll be no definition available in the script above!


Building the offline static cube using C# and ADOMD.NET

Now, let's switch to C#, assuming that you have installed all the components I've mentioned previously. Specifically, these components are required:



  • Microsoft ADOMD.NET

  • Microsoft Core XML Services (MSXML) 6.0

  • Microsoft SQL Server Native Client

  • Microsoft SQL Server 2005 Analysis Services 9.0 OLE DB Provider (very, very important!)

Create a console application and add a reference to the Microsoft.AnalysisServices.AdomdClient library which is a part of the ADOMD.


Now, we are ready to create a static cube (note that the ResourceManager is my custom class which reads resources embedded in assembiles. You have to replace my ResourceManager to your own code):



AdomdConnection conn = new AdomdConnection();
/* note that the data source points to a FILE */
conn.ConnectionString = 
    string.Format( "Provider=MSOLAP;Data Source={0}", "c:\\000\\cubedemo.cub" );
 
try
{
    conn.Open();
 
    AdomdCommand cmd = new AdomdCommand();
    cmd.Connection = conn;
 
    /* this returns the stub XMLA */
    string xmlastub = ResourceManager.GetString( "xmlastub.txt" );
    /* this return the XMLA which contains scripted database */
    string xmlacube = ResourceManager.GetString( "xmlacube.txt" );
 
    /* replace the CUBESCRIPTHERE with actual script */
    cmd.CommandText = xmlastub.Replace( "CUBESCRIPTHERE", xmlacube );
 
    /* this does the job */
    cmd.Execute();
 
    Console.WriteLine( "ok" );
}
catch ( Exception ex )
{
    Console.WriteLine( ex.Message );
}
finally
{
    conn.Dispose();
}
 
Console.ReadLine();

And this is it! If you've installed all required components, you'll eventually end with a static cube located in the C:\000\cubedemo.cub file!


Using Microsoft Excel to browse static offline cubes

If you have the Microsoft SQL Server 2005 Analysis Services 9.0 OLE DB Provider installed on your client OS, you can browse the static offline cube with Microsoft Excel easily. Just invoke Open command from Excel menu, select All files filter, so that the *.cub file becomes visible and just select it. Excel will happily open the offline cube and you'll be presented with an interface allowing you to build pivot tables:



Using C#, ADOMD.NET and MDX to browse static offline cubes

Using ADOMD.NET you can connect to any analytical database and execute MDX queries on it (in fact, you can also execute XMLA queries with MDX statements embedded inside, the XMLA extends MDX by allowing MDX statements to be included inside). The schema of the code is similar to the previous case of creating cubes, with the only difference that you specify the MDX statement and read the data using AdomdDataReader or fill DataSets using AdomdDataAdapter (just like any ADO.NET provider would do).


Create a C# Windows Forms application, put a Button and a DataGridView on it and in a click event handler of the button paste the following code:



AdomdConnection conn = new AdomdConnection();
 conn.ConnectionString =
     string.Format( "Provider=MSOLAP;Data Source={0}", "c:\\000\\cubedemo.cub" );
 
 try
 {
     conn.Open();
 
     AdomdCommand cmd = new AdomdCommand();
     cmd.Connection = conn;
 
     /* XMLA query used previously in Management Studio */
     cmd.CommandText =
         "select " +
         "[DIVISION].[Name].MEMBERS on ROWS, " +
         "[PERSON].[Name].MEMBERS on COLUMNS " +
         "from [Cube Source] " +
         "where [Measures].[SALE Count] ";
 
     /* ADOMD is just an ADO.NET provider and so 
      * principles are the same
      */
     AdomdDataAdapter da = new AdomdDataAdapter( cmd );
     DataSet ds = new DataSet();
 
     da.Fill( ds );
 
     dataGridView1.DataSource = ds.Tables[0];

This yields following visual effect:


 


Other ADOMD.NET applications

The ADOMD.NET library is not developed to support static cubes only. In fact, it supports any analytical data source - you can also create or browse data in the Microsoft Analysis Services as well.


It then means that you can build an analytical layer on your application using ADOMD.NET and only choose whether to use full Microsoft Analysis Services or free static offline cubes according to project conditions.

Complete OLAP infrastructure without Microsoft Analysis Services, part 2

Over a half of a year ago I've promised to write a tutorial on OLAP which would require no analytical server. Let's then go back to the issue.

In this part of the tutorial we:

  • prepare a source relational database
  • build an analytical database with Analysis Services from the source database
  • browse the analytical database
Building Source Relational Database

The source database is any relational database. I choose SQL Server 2005 for this tutorial, mainly because of SQL Management Studio but, in fact, an analytical database can be built using any relational dbms as the source.

In our source database we'll have three tables: DIVISION, PERSON and SALE:

Ultimately, the PERSON and DIVISION will became dimensions in analytical database and the SALE will became a measure.

Building Analytical Database from Relational Database

Launch Visual Studio 2005 and create new Business Intelligence Projects/Analysis Services Project. There are three steps to follow, easy if you follow the order presented in the Solution Explorer:

  • define the data source so that the analytical database connects to the relational database for the relational data (Data Sources/New Data Source in the Solution Explorer)
  • define the data source view (Data Source Views/New Data Source View) adding all three tables, PERSON, DIVISION and SALE
  • define a cube (Cubes/New Cube) just following the wizard or making any adjustments
  • deploy the analytical database

Browsing the Analytical Database

The analytical database can be browsed in many ways. Below is a screenshot of the cube seen in the Management Studio:

and an example MDX query:

select 
    [PERSON].[Name].MEMBERS on ROWS,
    [DIVISION].[Name].MEMBERS on COLUMNS
    from [Cube Source]
    where [Measures].[SALE Count]

Thursday, June 26, 2008

C# Puzzle No.10 (beginner)

Is it possible that class A has direct access to private members of class B?

No, of course not. This is against encapsulation - this is the most common answer. However, there is at least one case where this actually happens and is even very important feature of the language.

So the question is : in C#, under which circumstances class A has direct access to private members of another class B?

Wednesday, June 25, 2008

C# Puzzle No.9 (beginner)

Consider a delegate declaration which takes a function and returns another function:

delegate Func<int, int> FuncConversionDelegate( Func<int, int> Func );

Your goal is to write a lambda expression defining a converter which would negate values returned by original method:



FuncConversionDelegate Negation = 
    /* 
       replace this with a lambda expression 
       which takes function f and returns function g 
       so that g(x) = -f(x)
    */;

Specifically, you have to replace the commented expression above to a correct definition.


To test the converter, take the identity function:



static Func<int, int> Identity = x => x;

and apply Negation to it:



Negation( Identity )( 5 )
expecting the value of -5 to be returned.

Tuesday, June 24, 2008

C# Puzzle No.8 (beginner)

It is interesting to note what dilemmas the compiler has to actually resolve during the compilation. In the code below there are two possibilities : the compiler will either pick the method from the base class having the same signature or the method from actual class for which the conversion of the parameter is required.

class A
  {
      public void Foo( int n ) 
      {
          Console.WriteLine( "A::Foo" );
      }
  }
 
  class B : A
  {
      /* note that A::Foo and B::Foo are not related at all */
      public void Foo( double n ) 
      {
          Console.WriteLine( "B::Foo" );
      }
  }
 
 
  static void Main( string[] args )
  {
      B b = new B();
      /* which Foo is chosen? */
      b.Foo( 5 );
  }
Could you explain the compiler behaviour without running the code?

Friday, June 20, 2008

C# Puzzle No.7 (intermediate)

After two previous puzzles, you'll surely have no problems to determine the output of following code snippet without actually running it. Specifically, how many times the X will be printed?

List<int> list = new List<int>() { 1, 2, 3 };
 
list.GroupBy ( i => { Console.Write( "X" ); return i; } );
list.ToLookup( i => { Console.Write( "X" ); return i; } );
This Puzzle will let you think about differences (if any) between two seemingly identical methods, GroupBy and ToLookup.

C# Puzzle No.6 (intermediate)

If you have correctly solved the 5th C# puzzle, you should have no problems in predicting the output of following code snippet (without actually running it):

List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
 
list.Where( i => { Console.WriteLine( i ); return i < 5; } );

Thursday, June 19, 2008

C# Puzzle No.5 (intermediate)

Can you predict the output of the code above without actually running it?

List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
 
list.FindAll( i => { Console.WriteLine( i ); return i < 5; } );

C# Puzzle No.4 (intermediate)

It is common to use the lambda expression syntax. Whenever a method is required expecting a parameter and returning something, you can provide

parameter => expression

instead of boring



ReturnType Foo( ParameterType parameter )
{
    expression;
}

and then compiler will interfere types for you for both parameter and expression.


The question is, how do you provide lambda expressions for delegates which expects no parameters?



delegate int FooDelegate();
static void Foo( FooDelegate Delegate )
{
    Console.WriteLine( Delegate() );
}
 
static void TheProblem()
{
   Foo( * );
}

Your goal is to fill the * above with the lambda expression equivalent to following delegate:



int ConvertThisToLambdaExpression()
{
  return 5;
}

C# Lambda Expressions Can Contain Method Bodies

I know a lot of people who love and use the lambda expression syntax provided in C# 3.0 as a shortcut for explicit delegate definitions.

Suprisingly however, quite few of them actually know that lambda expression can not only have a suprisingly variable => expression form like:

x => x < 5

but can also contain method bodies as expressions like



x => { Console.WriteLine( x ); return x < 5; }