Data allocation

Goal

Populate data for a daily sales report which indicates a profit margin for each invoice in the data warehouse.
This means that it will be feasible to get an information on how much revenue is generated by each invoice line.

Financial background

In absolute terms the profit margin can be illustrated with the following expression:
Profit margin = sales amount - costs - sales deductions (discounts) - rebates
Net profit margin = profit margin - taxes
Sales amount is the gross total sales figure listed on an invoice and paid by a customer
Sales deductions is a discount given during a sales transaction (listed on an invoice)
Costs include variable and fixed costs (provided on a monthly and yearly basis)
Rebates and customer bonus are usually given to a customer and calculated on a monthly, quarterly and yearly basis.

Data allocation concept


Data allocation (technique also referred to as filling gaps) is useful when dealing with data which has a different level of detail (granularity) and there are gaps for some measures.
In data warehousing systems, the allocation technique is in many cases compulsory and used widely in order to get a consistent and complete set of data.

The concept of data allocation is closely related to the granularity of the data. In data warehousing, data granularity refers to the level of detail in a given fact table. The tables below illustrate various levels of granularity.

Coarse-grained data (low granularity)
Date	value
2007 	1000
2008	2000
2009	1500
...

Fine-grained (high granularity)
Date		value
20080101		 8	
20080102		15
20080103		12
20080107		14
20080109		11
...


Sample measures that are very often allocated in a data warehouse are: costs, operational forecasts, sales plans, customer rebates and bonuses, etc.

There are two approaches for data allocation:
  • Dynamic allocation (weighted or proportional allocation) - values are allocated using calculated subtotals of another value. The weighted type of allocation is often used in real-life data warehouses environments. Sample uses of dynamic allocation: designate portions of a budget pool, allocate manufacturing costs to products, etc.
  • Fixed allocation - which means that there is a constant value assigned to all records included in the allocation group. Be aware that this approach might be risky and confusing as those values cannot be summarized. Sample uses of fixed allocation are: storing values that do not change often (for example credit card limit) or cannot be allocated dynamically.

    It is also important to keep in mind that in some business cases allocation is unsuitable. Prior to using allocation it is necessary to analyze the data thoroughly and make sure it fits into a business logic.

    Scenario details

    The company's data warehouse stores the sales data down to the invoice line level of detail and the costs which are calculated on a monthly (variable costs) and quarterly basis (fixed costs).

    - The variable costs total value is assigned per year, month and product group.
    - The fixed costs figure is a grand total assigned per year and month.

    The aim is to compare revenue to fixed and variable costs in all time dimension levels available.

    The source data has the following table structure:
    Date_id;Invc_head;invc_line;prod_id;prod_grp;cust_id;quantity;price;sales_amount

    Solution outline

    The data allocation ETL process will be realized in a few steps:

    1. Load technical table invoices - the table contains all data related to the invoices, including gross sales and net sales
    2. Load updated monthly and yearly costs into a separate costs table
    3. Create another technical table which assigned importance levels and the following figures which are populated using a fixed allocation mechanism mentioned above for groups of data records: variable costs, fixed costs, sales invoice total
    4. Load the DW invoices table - with costs figures allocated accordingly and calculated profit margin

    Solutions and sample implementations

  • Data allocation in Pentaho Data Integration - sample ETL processing in PDI based on a production/manufacturing data
  • For further analysis please also refer to the Cognos measure allocation example. Cognos Business Intelligence applications provide an automated, built-in mechanism to implement the data allocation technique.



    Back to the Data Warehousing tutorial home