Sunday, 21 July 2019

Data Warehouse: Next phase of development:


I now have a dimension view ready to put into a dimension table. General wisdom states you shouldn’t go straight into the dimension table, equally I don’t have a requirement of SCD (slowly changing dimension) for this DIM. So in this instance I am going to go into a staging table and then into a DIM table that will be its final resting place.

This seems simples however as I want to create a generic system I am going to put some processes in place to cater for different scenarios without developing complex code for lots of different things. I have seen several different approaches to this in my career and the easiest to maintain is the second of the 2 options below, it requires far more set up but is well worth it.

1) Create long insert statements for each table. These are in a package and all get called as needed to get the end result.

2) Use a table of meta data to call views as insert statements into the required tables. Lots of set up needed and templates etc. but overall maintenance is much simpler.

3) Use something like ODI or SSIS. Whilst these are very good they would not be the custom build approach I am try to achieve. Plus I have interviewed people who have experience with these tools but it does nothing to show they can actually code in SQL / T-SQL / PL-SQL.

So for my home data warehouse project I am going to go with the implementation plan of 2 above. For this I need to design a template for the meta-data tables. To start with I will go basic and they will probably increase in complexity as I go.
I guess my next post will be the structure and code to create the meta data tables. Once I have these in place I will start to work on a control process, actually 2 control process.

1) Control the numbers – PL/SQL package
2) Control the processes and their flows – Hopefully a fun piece of python, though this might take a while to develop as I am not sure a whiz with python as SQL.

No comments:

Post a Comment