IN2BI, Microsoft Business Intelligence
data, stories & insights
rss | email | twitter

DWH DECK Part 6: Data Vault

Adapted for version 1.1 on 18/04/2011

This is the sixth part of a series I'm writing on the DWH DECK. A tool that will allow you to build and maintain your data warehouse better and faster. This series will act as documentation and will give you some insides into the way the program works.
Overview of this series
 
Many Tables in your data warehouse
Using the "simple" demo database Northwind with only 13 source tables DWH DECK will build 57-110 tables depending on your choices. A breakdown by table type:
13 Staging tables
10 Hub tables
7 Link tables
17-70 Satellite tables
7 Dimension tables
3 Fact tables

At first hand this seems to be overkill. Especially the many relative small tables in the data vault will be met skeptical by some. The reason we do this is that a data warehouse should be built with the future in mind… Your company will make major changes in their erp and crm systems… They will acquire new companies… They will start new initiatives… And management will change (their focus) …And all this will lead to changes in your data warehouse.
 
Utilizing a technique like the data vault with many 'small' tables will give you the flexibility to adapt better and faster to these changes. This is for me the most import reason for using the data vault method. But of course there are many more. Dan Lindstedt the inventor of the data vault describes these advantages in:
http://danlinstedt.com/datavault/about-the-data-vault/
 
The Data Vault in essence consists of three types of tables:
 
Hub Entities (Hubs)
A hub is a unique list of business keys. The business key is the key that business users use in their daily operations to describe an object. (e.g. account number for general ledger accounts, the VIN (Vehicle Identification Number) of cars)
It's not always easy to determine the business key and sometimes you will have to use more than one key.
(e.g. ZIP Code and House number or Last Name and First Name)
 
By default the DWH Deck assigns the primary key of a table as a business key because these are unique. But that's often not the best choice! Speak to the business users and check the profile information of each column. Columns with distinct value percentage of 100% are good candidates for the business key of the table.
You can easily change the business key by checking and unchecking the business key checkbox:
 
 
Link Entities (Links)
Links represent the relationships between hubs and other links. In the DWH Deck links are created based on the relationships in the source system. If these relationships are all well-defined and build in the source system you won't have to do anything. If not there are several ways to create them. (e.g. click the button "New Primary relation" or "New Foreign relation" and selected the relevant table and column to create the relationship.
 
Satellite Entities (Satellites)
Satellites contain descriptive information of the hubs and links. They provide context to the hubs and links. By default all candidate satellite columns are placed in their own satellite table. The DWH Deck uses the Satellite textbox for the satellite's table name. This results in a table for almost every column.
Often it is better to group related satellites like address into the same table. This can be accomplished by using the same name in the Satellite Group textbox. (e.g. use address for street, housenumber, zipcode, city etc)

» Similar Posts

  1. DWH DECK Part 7: Data Mart
  2. DWH DECK Part 3: Importing Source Information
  3. Dimensional modeling and Data Vault – a happy marriage?

» Trackbacks & Pingbacks

    No trackbacks yet.

» Comments

    There are no comments.

Comments are closed