Feeds:
Posts
Comments

Archive for March, 2011

The ability to track historically accurate values for dimension attributes, is a fundamental requirement in ETL and data warehousing. This is commonly referred to as slowly changing dimensions (SCD), of which the  most common variants are type 1 and 2. If you’re not already familiar with the concept of SCD, the books, articles and design tips from the Kimball Group are highly recommendable.

In short, the techniques used in handling changes are

Type 1: Overwrite the attribute
Type 2: Add a new dimension row

In a pure type 1 scenario, you will only have one instance of a record, with the attributes being overwritten as changes are brought in from the source systems.

If you’re not already using surrogate keys for your dimensions, now would be a good time to make that change. Once you enable SCD type 2 attributes, multiple instances of the same logical record will exist over time, thus invalidating the business key to be used as key for dimension members.

How to (manually) implement SCD on SQL Server

There are countless ways of implementing on SQL Server, including T-SQL stored procedures, custom components and built-in SSIS components . The predominating implementations are both SSIS based, using either the built-in Slowly Changing Dimension Transformation or the Kimball Method SCD component. The later is a huge improvement over the first one, especially when it comes to performance. This post will not discuss how to configure SCD using these components, but as the documentation and videos at the links above will show, it is a bit complex and requires some experience in both SSIS and T-SQL.

How to enable SCD type 1 and 2 in timeXtender

The default behavior for any table in timeXtender is what we call snapshot load. The table is truncated and fully loaded from the source(s) at each execution, thus no attempt is made to track historical values. But don’t worry, it will take you less than 5 minutes to enable type 1 and/or type 2 changes to be tracked.

To handle type 1 and 2 changes do the following

  1. Enable Slowly Changing Dimensions under Advanced Settings for the table
  2. Define the business (natural) key
  3. Choose type 1 attributes
  4. Choose type 2 attributes
SCD Table Settings

The configuration of the natural key and SCD attributes is done in a single, easy to use dialog

SCD Attribute Settings
And that’s all you need to do to enable SCD type 1 and 2 on a table in timeXtender. Hit the deploy button and the code to handle the actual data will be generated on SQL Server.

Read Full Post »

One of the most common explanations to why data warehousing and BI projects fail to deliver as expected is that what a user wants at the beginning, rarely matches what he or she really wants in the end. Is this because they don’t know what they really want, because they are incapable of knowing what they want, or is it something else?

In my opinion the root cause is that the process is too often approached by traditional, phased project management methodologies such as the waterfall model. This will inevitable lead to a lengthly process with late deliveries as one of the results. This approach also implies that there is a fixed end result; thus forcing the users into trying to predict all future questions at the beginning of the process.

The agile approach

Agile (or any other iterative approach) works well in the BI area because users can be tighter involved in the process. You will be able to see the reports or whatever form is used for visualization, comment, modify and evolve in short iterations. With a rapid deployment tool to implement the physical data warehouse, iterations can be as short as a couple of days, leading to much faster results. In addition to providing very early actionable  information, this approach will also drive down costs by reducing the amount of rework.

The more you know, the more you know you don’t know

The need to solve business problems and be able to make better, faster and more relevant decisions, is constantly driving changes to the enterprise data warehouse. Once you acknowledge that you won’t be able to predict all future needs up front, it should become clear that you shouldn’t deal with these constant changes using the traditional project management methodologies. Taking an agile approach will make you much more efficient in today’s volatile business environment.

Read Full Post »

%d bloggers like this: