Data mart

Data marts are designated to fulfill the role of strategic decision support for managers responsible for a specific business area.

Data warehouse operates on an enterprise level and contains all data used for reporting and analysis, while data mart is used by a specific business department and are focused on a specific subject (business area).

A scheduled ETL process populates data marts within the subject specific data warehouse information.

The typical approach for maintaining a data warehouse environment with data marts is to have one Enterprise Data Warehouse which comprises divisional and regional data warehouse instances together with a set of dependent data marts which derive the information directly from the data warehouse.

It is crucial to keep data marts consistent with the enterprise-wide data warehouse system as this will ensure that they are properly defined, constituted and managed. Otherwise the DW environment mission of being "the single version of the truth" becomes a myth. However, in data warehouse systems there are cases where developing an independent data mart is the only way to get the required figures out of the DW environment. Developing independent data marts, which are not 100% reconciled with the data warehouse environment and in most cases includes a supplementary source of data, must be clearly understood and all the associated risks must be identified.

Data marts are usually maintained and made available in the same environment as the data warehouse (systems like Oracle, Teradata, MS SQL Server, SAS) and are smaller in size than the enterprise data warehouse.
There are also many cases when data marts are created and refreshed on a server and distributed to the end users using shared drives or email and stored locally. This approach generates high maintenance costs, however makes it possible to keep data marts available offline.

There are two approaches to organize data in data marts:

  • Database datamart tables or its extracts represented by text files - one-dimensional, not aggregated data set; in most cases the data is processed and summarized many times by the reporting application.
  • Multidimensional database (MDDB) - aggregated data organized in multidimensional structure. The data is aggregated only once and is ready for business analysis right away.

    In the next stage, the data from data marts is usually gathered by a reporting or analytic processing (OLAP) tool, such as Cognos, Business Objects, Hyperion, Pentaho BI, Microsoft Excel and made available for business analysis.

    Usually, a company maintains multiple data marts serving the needs of finance, marketing, sales, operations, IT and other departments upon needs.

    Sample use of data marts in an organization: CRM reporting, customer migration analysis, production planning, monitoring of marketing campaigns, performance indicators, internal ratings and scoring, risk management, integration with other systems (systems which use the processed DW data) and more uses specific to the individual business.