Feeds:
Posts
Comments

Archive for the ‘OLAP’ Category

If you follow the Kimball dimensional modeling approach, most dimensions will be star dimensions, some might be snowflake. Sometimes a business process will not fit this model, where one fact is associated with a single dimension member. This is when many to many dimensional modeling can be applied.

The example for this post will be the joint bank account scenario (accounts with multiple owners). I know this has been used over and over again, but hey this gives you the chance to make a direct comparison of the implementation in timeXtender with BIDS.

The joint accounts example

I will look at Jane and John, a couple with both personal and joint accounts

  • Jane has a personal checking account (balance 1000)
  • John has a personal checking account (balance 1000)
  • Jane and John have a joint savings account (balance 1000)

In this example I want the balance of the joint saving account to appear under both Jane and John. The main purpose here is to show how the many to many model affects the aggregation over the dimensions (the total is not equal to simply the sum of all members).

The model can be further refined by i.e. adding a ownership calculation to split the balance, but this will not be he focus of this post.

The following Excel view of the finished cube shows that Jane and John have a balance of 1000 each on their personal checking accounts, and both have access to the 1000 on the joint savings account. At the same time, the total balance of all accounts is still only 3000.

 

M2M Excel Pivot

 

The data model

The data model for the joint accounts can be seen below. In a real life scenario I would use surrogate keys, but to keep the focus on solving the main problem here, I will be using natural keys.

By examining the structure it appears that there is no direct relationship between the fact and the customer dimension tables (fact Account Balance and dim Customers). This relationship can only be established using the many to many join table (factLess CustomerAccount).

 

M2M DW Tables

The OLAP model

I trust that you already know the basics of how to create dimensions and cubes in timeXtender, thus I will only cover the specifics needed to add the many to many relationship.

The OLAP model will consist of the following

  • Account dimension
  • Customer dimension
  • Account Balance cube with fact tables (measure groups)
    • fact Account Balance (for balance measure)
    • factLess CustomerAccount (intermediate fact w/dummy measure)

Prior to adding the many to many relationship, the model will look this in timeXtender

image

 

Please notice that the Customer dimension is not related to the fact Account Balance fact table

M2M Dimension Relations

 

Adding the many to many relationship

The steps to add the many to many relationship are

  1. Right-click the Customer dimension under cube dimensions. On the context menu, select the Add Many To Many Relation
  2. In the dialog that appears, select the fact and factless tables as shown below

M2M Many 2 Many Join

 

The many to many relationship now appears under the cube dimension

 

M2M OLAP Model Complete

Read Full Post »

In this post I will show you how to use prototyping to demonstrate business requirements and get rapid and actionable feedback. How you got the business requirements to demonstrate, is not important for the purpose of this post, but assuming that you have read some of my earlier blog posts, it should be clear that I recommend taking an agile approach.

The goals for the prototype

  • Demonstrate business requirements
  • Deliver fast, using minimum resources
  • Minimize rework

I am going to build a prototype using the dimensional model approach as per best practice. Although not strictly necessary for the purpose of a prototype, I am even using surrogate keys, as I will most likely reuse the prototype for the permanent implementation after one or more very short iterations.

The four easy steps

  • Build the data warehouse star schema
  • Enter sample data in the model
  • Build the dimensional model in OLAP
  • Build a test front end in Excel using Pivot Tables

Step 1 – Build the data warehouse star schema

In a dimensional model, the business requirements can be illustrated as this star schema (also known as a star join schema). For those unfamiliar with dimensional modeling, the yellow table is called a fact table. It is the center of a star schema and represents the numeric measurements of an event and has keys (links) to the dimensions that describe the event. The blue tables are called dimension tables. In this example we can describe an event: A sale and describe it in the context of customer, product sold and the point in time at which the event took place.

Remembering that the purpose is to demonstrate the business requirements as fast as possible and with a minimum of rework, I will implement the dimensional model above with manually entered sample data in the data warehouse database. When I build the dimensional model in the timeXtender meta layer it will look like this. Please note the red fields at the bottom of the fact table. These fields are lookup fields that are used to translate the natural key joins between my dimension and fact tables into surrogate based joins.

Step 2 – Enter sample data in the model

Using the custom data feature in timeXtender, I will enter just enough data to be able to demonstrate the model. As a side note, you would be amazed at the difference it makes to your business stakeholders, if you use recognizable data compared to just more or less random values.

At this point in time, the model including the manually entered demo data only exists in the timeXtender metal model in memory. If I use the deploy and execute features at this point, timeXtender wll generate all necessary SQL code in the physical data warehouse database, load it with demo data and save the project definition to the timeXtender project repository database. The processed fact table will look like this.

  • If you do not need a multidimensional representation for your data (an OLAP cube), the data are at this point already accessible from your front end of choice.

Step 3 – Build the dimensional model in OLAP (Analysis Services)

I will use a drag and drop approach to create a very simple OLAP representation of the star schema. This will include standard dimensions, standard measures and will look like this in timeXtender

Step 4 – Build a test front end in Excel

I will use the most common of all BI front end tools to build a view on the data that I have just built. A Pivot Table is an excellent way to get a feel of the dimensions and measures of a cube.

What is next?

The Excel workbook I just created can now be delivered to my business stakeholders. They can then use this to verify the business requirements and provide instant feedback. The business stakeholders might come back with extensions or modifications to the dimensional model after seeing the first result. This might lead me down one or more additional iterations of the above steps, adjusting the model according to more detailed requirements. This is all fine and exactly why prototyping is such an efficient way to interact between data warehouse developers and stakeholders. The time I spend at this early point in the process will save me from late and costly rework.

Once the model has been verified I can continue to identify data sources needed to load actual data into the dimensional model just built. I will most likely need to make minor changes to the dimensional model on the data warehouse, such as different data types for non system fields. Except for these minor changes, the entire dimensional model including the cube can be reused and put into production after proper testing with real life data.

 

Video link

My colleague Thomas Lørup Duun has made a video on this topic available on YouTube. The video is actually only 15 minutes in real time, so the header in this post is a bit conservative :-)

Read Full Post »

One of the most challenging parts of building cubes is mastering the MDX language, as it really does not come natural to most of us. It is a very powerful language with almost unlimited possibilities when it comes to creating calculations. I normally say that “you can split the atom in MDX, you just need to figure out how to do it”. And that “how to do it” is the hard part, learning to master the language.

I work with a lot of very smart people in our network of partners and customers, people with years of experience in ETL, data warehousing and cubes. Only a very few of those would consider themselves as experts or even proficient when it comes to MDX.

I am no different than most of these people, thus using myself as an example I can divide my experience with MDX calculations into the following

  • Simple calculations like year to date, adding/dividing/subtracting a couple of measures. The syntax is easy, I can write them off the top of my head.
  • A few frequently used more advanced measures, using MDX functions that requires more than a single parameter. The syntax is known/remembered, I can write them off the top of my head.
  • A lot of previously used calculations of diverse complexity, for instance combining multiple MDX functions. I remember part of the syntax, but probably not the exact sequence of parameter and most likely not all the functions being used. I will look at previous projects, copy the calculation and adjust according to my current need.
  • New semi advanced to advanced calculations. I may have a solutions strategy, but will most certainly not remember the syntax for all functions needed and may need functions never used before. I will look in my books and use my friend Google. Using a trial and error approach, I will get the calculation right and be proud :-)

At timeXtender the product vision has always been driven by our commitment to speed in every aspect of building and maintaining data warehouses and cubes. Clearly we needed to find a more efficient way to deal with MDX calculations, to make it easier to reuse a calculation once sweat, blood and tears have been invested in finding the right, well performing formula for a calculation.

So what exactly is a business function?

A business function is best considered as a parameterized wrapper around a MDX statement used as a calculated measure in a SQL Server Analysis Services (SSAS) cube.

Use business functions to

  • Encapsulate complex MDX and expose using parameters
  • Enable easy reuse of calculations within a single project, across multiple projects and multiple people
  • Ensure consistent use of calculations (a reusable single definition)
  • Lower barrier for use of MDX
  • Eliminate broken MDX caused by i.e. renaming a dimension or measure

Create a business function

The basis for any business function, is a working MDX formula that will be entered and parameterized in the business function definition. I will walkthrough using the following example

I have created a calculated measure for a sales cube. This is used to calculate how every member contributes to each parent. In this specific example, I am using a hierarchy on a customer dimension, that shows my customers using a drill down path of all customers –> country –> city –> customer name

It will show i.e. with which percentage each country contributes to the total. When looking at a specific country, it will show how each city in that country contributes to the country total.

The MDX statement for the measure looks like this

IIF(([Measures].[Net Amount], [Customers – Invoice].[Customers by geography].CurrentMember.Parent) = 0,
    1,
    [Measures].[Net Amount]
    /
    ([Measures].[Net Amount], [Customers – Invoice].[Customers by geography].CurrentMember.Parent)
)

As it turns out, this is actually a quite generic calculation that can be used for any measure with any dimension hierarchy. Consequently I will create a reusable business function that can be used every time I need a measure to show how a value contributes to the totals within a dimension.

A business function definition

I will create a business function Relative Dimension Contribution, using the MDX statement above. By examining he statement, I have identified two parameters

  • The measure (value) for which the calculation is done
  • The dimension hierarchy used for the calculation

The parameters was renamed from the concrete statement above, using more generic names for a standard and reusable function. I have also added a description for the business function, making it easier for the next person to understand how to use this function.

Business Function

As you may have noticed, there is a library path at the bottom of the business function definition dialog. This allows me to organize the business functions into a logical structure, again to make it easier to reuse for myself and others.

Using a business function

Now that I have a business function, I will use this as the basis for a new measure which is needed to show another measure Physical Value over the Products by group hierarchy on the Products dimension.

The steps required to add a business function based measure are

  1. Right-click on the measures folder in a cube. On the context menu, select the Add business function option
  2. Find the appropriated business function in the list of measures
  3. Specify a name for the new measure
  4. Map parameters to the business function

Business Function Mapped

Sensitivity in object names

Business functions are utilizing the meta model of timeXtender using a simple markup language, that replaces parameters in the scripts with the values specified for parameters. This does not only make I significantly easier to change a parameter (without having to carefully walk through the code for every small change). It also means that a simple, common operation like renaming a measure, dimension, hierarchy etc. does not brake the MDX code.

In comparison, please consider what would happen to all manually written calculated measures, should someone rename any element used in the mdx statement. A calculated measure is simply a script persisted as text, thus very sensible to every small change done to object names.

Exporting business functions

Please be aware that business functions are stored at the repository level, but consumed at project level, making a business function available to all projects within a repository. It can be moved using the export and import features, that will export into a XML document.

Business function library

At timeXtender we have utilized the power of business functions and created a library of business functions based on our own field experience and valuable input from partners and customers. The library currently contains 28 reusable business functions and can be obtained via the partner portal. Customers with a maintenance agreement can get the library through their timeXtender implementation partner or may request it by email at support@timeXtender.com

The following shows the import dialog for the business function library

Business Function Library

Read Full Post »

%d bloggers like this: