In the previous blog post, Introduction to DataWarehouse & DataMart, we took a first look at the concepts of DataWarehouse and DataMart, and I gave a basic definition of these two fundamental components of the architecture of any BI project. As you will remember, I closed by explaining that there are different approaches to their characteristics and functions when implementing them, the most widespread being those of Bill Inmon and Ralph Kimball.
To explore their main differences step by step and determine which option is most appropriate for our projects, in this post I will present the most important features of the Inmon approach.
For Inmon, a DataWarehouse has to be understood as a single, global data repository for the entire company: one that centralizes the data from the organization's different operational systems so that they are validated and integrated into a single database.
In this model, the premise is that the information is stored at the highest level of detail (which guarantees that the data can be explored in the future) and is non-volatile: changes that the data undergo over time are recorded as new entries, and existing records cannot be modified or deleted.
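To make the idea of non-volatile storage more concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table `customer_history`, its columns, and the helper `record_change` are hypothetical names chosen for illustration, and the use of triggers to reject updates and deletes is just one possible way to enforce the rule, not something Inmon prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical detail-level table: one row per version of a customer record.
conn.execute("""
    CREATE TABLE customer_history (
        customer_id INTEGER NOT NULL,
        city        TEXT    NOT NULL,
        valid_from  TEXT    NOT NULL,
        PRIMARY KEY (customer_id, valid_from)
    )
""")

# Assumption: non-volatility is enforced with triggers that abort
# any UPDATE or DELETE on the table.
conn.execute("""
    CREATE TRIGGER no_update BEFORE UPDATE ON customer_history
    BEGIN SELECT RAISE(ABORT, 'warehouse rows are non-volatile'); END
""")
conn.execute("""
    CREATE TRIGGER no_delete BEFORE DELETE ON customer_history
    BEGIN SELECT RAISE(ABORT, 'warehouse rows are non-volatile'); END
""")


def record_change(customer_id, city, valid_from):
    """Record a change as a new row instead of overwriting the old one."""
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, city, valid_from),
    )


def update_rejected():
    """Return True if an in-place UPDATE is refused by the database."""
    try:
        conn.execute(
            "UPDATE customer_history SET city = 'Sevilla' WHERE customer_id = 1"
        )
    except sqlite3.DatabaseError:
        return True
    return False


# The customer moves: both versions stay in the warehouse.
record_change(1, "Madrid", "2020-01-01")
record_change(1, "Barcelona", "2021-06-15")

history = conn.execute(
    "SELECT customer_id, city, valid_from FROM customer_history "
    "ORDER BY valid_from"
).fetchall()
```

After the two calls, `history` holds both versions of the customer, so the change is visible over time instead of being overwritten.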
These are the cornerstones of the architecture advocated by Inmon, known as the ‘Corporate Information Factory (CIF)’, in which the DataWarehouse centralizes all of the company’s data and then feeds small, subject-specific DataMarts, which serve as the access points for reporting tools. Each department thus has its own DataMart, supplied with data from the DataWarehouse and ready for analysis and exploitation.
Inmon's approach is often described as a “Top-Down” methodology, since it starts from a global view of the company and then breaks it down into small departmental data sets. With this architecture, every DataMart in the organization is fed from the DataWarehouse, which avoids inconsistencies and anomalies when data are compared across departments.
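The top-down flow described above can be illustrated with a small Python sketch. The rows in `WAREHOUSE`, the function `build_mart`, and the two example marts are hypothetical; the point is only that every mart is derived from the same detail-level data, so their totals cannot drift apart.

```python
from collections import defaultdict

# Hypothetical detail-level rows held in the central DataWarehouse.
WAREHOUSE = [
    {"order_id": 1, "region": "North", "product": "A", "amount": 100.0},
    {"order_id": 2, "region": "North", "product": "B", "amount": 50.0},
    {"order_id": 3, "region": "South", "product": "A", "amount": 70.0},
]


def build_mart(rows, key):
    """Derive a small departmental data set by summing amounts per key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# Two departmental marts, both fed from the same warehouse rows.
sales_mart = build_mart(WAREHOUSE, "region")      # view for the sales team
product_mart = build_mart(WAREHOUSE, "product")   # view for product managers

# Because both marts share one source, their grand totals agree.
totals_match = sum(sales_mart.values()) == sum(product_mart.values())
```

If each department loaded its mart directly from different operational systems instead, nothing would guarantee that `totals_match` holds.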
The structure of the DataWarehouse
As for the internal structure of the DataWarehouse, Inmon's priority is that the data model is built in third normal form (3NF). As a brief explanation of what this means, normalization consists of applying a series of rules when establishing the relationships between the different objects in the database. Normalization brings many benefits, such as avoiding data redundancy, maintaining referential integrity, making the tables easier to maintain and reducing the size of the database. However, compared with a denormalized DataWarehouse, retrieving information requires much more complex queries, typically with many joins, which makes it harder to analyse the information directly and to use reporting tools. Hence the need to build the DataMarts, which are based on dimensional star or snowflake models, designs that these data analysis tools can exploit easily.
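The contrast between the normalized warehouse and a dimensional DataMart can be sketched with SQLite from Python. All table and column names below (`country`, `city`, `customer`, `sale`, `dim_customer`, `fact_sales`) are hypothetical, and the model is deliberately tiny; a real 3NF warehouse would have many more tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Warehouse layer in 3NF: each fact is stored once, linked by keys.
    CREATE TABLE country  (country_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE city     (city_id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           country_id INTEGER NOT NULL REFERENCES country(country_id));
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           city_id INTEGER NOT NULL REFERENCES city(city_id));
    CREATE TABLE sale     (sale_id INTEGER PRIMARY KEY,
                           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
                           amount REAL NOT NULL);

    INSERT INTO country  VALUES (1, 'Spain'), (2, 'France');
    INSERT INTO city     VALUES (1, 'Madrid', 1), (2, 'Lyon', 2);
    INSERT INTO customer VALUES (1, 'Acme', 1), (2, 'Globex', 2);
    INSERT INTO sale     VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0);

    -- Mart layer: a star schema with one denormalized dimension
    -- and one fact table, both derived from the warehouse.
    CREATE TABLE dim_customer AS
        SELECT cu.customer_id, cu.name AS customer,
               ci.name AS city, co.name AS country
        FROM customer cu
        JOIN city ci    ON ci.city_id = cu.city_id
        JOIN country co ON co.country_id = ci.country_id;
    CREATE TABLE fact_sales AS
        SELECT sale_id, customer_id, amount FROM sale;
""")

# Sales per country against the 3NF model: three joins.
normalized_query = """
    SELECT co.name, SUM(s.amount)
    FROM sale s
    JOIN customer cu ON cu.customer_id = s.customer_id
    JOIN city ci     ON ci.city_id = cu.city_id
    JOIN country co  ON co.country_id = ci.country_id
    GROUP BY co.name ORDER BY co.name
"""

# The same question against the star schema: a single join.
star_query = """
    SELECT d.country, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer d ON d.customer_id = f.customer_id
    GROUP BY d.country ORDER BY d.country
"""

from_warehouse = conn.execute(normalized_query).fetchall()
from_mart = conn.execute(star_query).fetchall()
```

Both queries return the same totals per country, but the star version needs one join instead of three, which is the kind of simplification that makes the DataMart easier for reporting tools to work with.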
In the next post I will present Ralph Kimball’s approach, so that we can then compare the highlights of both visions and establish a basis for deciding which scheme best suits our needs when implementing a Business Intelligence project.