Knowing what a DataWarehouse and a DataMart are and, above all, understanding their purpose and why organizations increasingly need them is essential to grasp, from a global point of view, what Business Intelligence is and to be able to undertake a project of this kind.
If we look back through our blog, we will see that both terms were already mentioned when we talked about the components involved in the architecture of a Business Intelligence project. As we will see, there is considerable debate here, since there are different types of BI architecture depending on the purpose and scope given to the DataWarehouse and, consequently, to the DataMart, with substantial differences in how both are defined and implemented.
The concept of the DataWarehouse was born in the 1980s from the need for a data storage system that would guarantee the fluidity, order and easy handling of data while also saving companies time and money compared with the systems used until then.
A DataWarehouse is, therefore, a container that stores data from the different sources that may exist in an organization, integrated, cleansed and organized in a single centralized database. The data is kept in this warehouse for as long as each organization needs.
With this system, companies have all the data from their different business processes integrated in a single container, ready to be explored with analysis and reporting tools.
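As a rough illustration of the integration and cleansing just described, here is a minimal Python sketch that uses the standard sqlite3 module as a stand-in for the centralized database. The two source extracts, the customers table and the helpers cleanse and load_warehouse are all invented for this example; a real warehouse load would be far more elaborate.

```python
import sqlite3

# Hypothetical extracts from two operational systems (invented for illustration).
crm_rows = [
    {"customer": " Alice ", "country": "es"},
    {"customer": "Bob", "country": "FR"},
]
billing_rows = [
    {"customer": "alice", "country": "ES"},
    {"customer": "Carol", "country": "de"},
]


def cleanse(row, source):
    """Normalise formatting so records from different sources are comparable."""
    return (row["customer"].strip().title(), row["country"].strip().upper(), source)


def load_warehouse(conn, sources):
    """Integrate and cleanse the records of every source into one central table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "name TEXT, country TEXT, first_source TEXT, "
        "PRIMARY KEY (name, country))"
    )
    cleaned = [cleanse(row, name) for name, rows in sources for row in rows]
    # INSERT OR IGNORE keeps the first record seen for each (name, country) pair.
    conn.executemany("INSERT OR IGNORE INTO customers VALUES (?, ?, ?)", cleaned)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_warehouse(conn, [("crm", crm_rows), ("billing", billing_rows)])

# The integrated, de-duplicated data can now be queried from a single place.
warehouse_rows = conn.execute(
    "SELECT name, country, first_source FROM customers ORDER BY name"
).fetchall()
```

The point of the sketch is only that records arriving in different shapes end up as one consistent, queryable table; the cleansing rules shown (trimming, capitalisation, de-duplication) are assumptions chosen for brevity.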
But let’s not forget the DataMart, whose definition is quite similar to that of the DataWarehouse; the main difference between these two types of database is their scope.
While a DataWarehouse contains all the data of an organization, a DataMart collects only a subset of it, focusing on a specific area of the business. Its objective is to cover the needs of a specific department within the organization, so it could be defined as a departmental data warehouse.
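The difference in scope can be sketched in a few lines of Python with the standard sqlite3 module. The warehouse_facts table, its sample rows and the build_datamart helper are hypothetical names made up for this illustration, not part of any particular BI product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding data for every area of the organization.
conn.execute("CREATE TABLE warehouse_facts (area TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 1200.0),
        ("sales", "orders", 35.0),
        ("hr", "headcount", 48.0),
        ("finance", "costs", 800.0),
    ],
)


def build_datamart(conn, area):
    """Copy the rows of one business area into its own departmental table."""
    # The area becomes part of a table name, so only plain identifiers are accepted.
    if not area.isidentifier():
        raise ValueError(f"unsafe area name: {area!r}")
    table = f"mart_{area}"
    conn.execute(f"CREATE TABLE {table} (metric TEXT, value REAL)")
    conn.execute(
        f"INSERT INTO {table} SELECT metric, value FROM warehouse_facts WHERE area = ?",
        (area,),
    )
    return table


sales_mart = build_datamart(conn, "sales")
sales_rows = conn.execute(
    f"SELECT metric, value FROM {sales_mart} ORDER BY metric"
).fetchall()
```

Here the sales department only ever sees its own two rows, while the warehouse keeps the data of every area. Deriving the mart from the warehouse is just one possible arrangement, assumed here for simplicity.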
The DataMart is a query-oriented system whose internal data layout is well established: it is structured in dimensional star or snowflake models. I cannot say the same about the DataWarehouse, for which there are different approaches to its features and functions. In this sense, and going back to the beginning of this post, where I mentioned that there are different types of architecture, a debate about the foundations of the DataWarehouse has been open since the 1990s.
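To make the star model mentioned above a little more concrete, the following sketch builds a tiny one in Python with the standard sqlite3 module: a central fact table surrounded by dimension tables that describe each measurement. All table names, column names and figures are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A minimal star: one fact table that references two dimension tables.
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY, name TEXT, category TEXT
    );
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER
    );
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Hardware'), (2, 'Antivirus', 'Software');
    INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 20240101, 900.0), (2, 20240101, 50.0), (1, 20240201, 1100.0);
    """
)

# A typical analytical query: join the facts to their dimensions and aggregate.
query = """
    SELECT p.category, d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
"""
sales_by_category_month = conn.execute(query).fetchall()
```

In a snowflake model the dimensions themselves would be normalised further; for instance, category would move out of dim_product into its own table, and the query would need one more join.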
There are several approaches to the internal structure and construction of the DataWarehouse, but the most important are those of Bill Inmon and Ralph Kimball.
In the next blog post I will talk about their views on developing a DataWarehouse and the type of architecture each one defends, so that we can choose the most appropriate approach in each case when we face different BI projects.