Data allocation
Goal
Populate data for a daily sales report which indicates a profit margin for each invoice in the data warehouse.
This means that it will be feasible to get an information on how much revenue is generated by each invoice line.
Financial background
In absolute terms the profit margin can be illustrated with the following expression:
Profit margin = sales amount - costs - sales deductions (discounts) - rebates
Net profit margin = profit margin - taxes
Sales amount is the gross total sales figure listed on an invoice and paid by a customer
Sales deductions is a discount given during a sales transaction (listed on an invoice)
Costs include variable and fixed costs (provided on a monthly and yearly basis)
Rebates and customer bonus are usually given to a customer and calculated on a monthly, quarterly and yearly basis.
Data allocation concept
Data allocation (technique also referred to as filling gaps) is useful when dealing with data which has a different level of detail (granularity) and there are gaps for some measures.
In data warehousing systems, the allocation technique is in many cases compulsory and used widely in order to get a consistent and complete set of data.
The concept of data allocation is closely related to the granularity of the data. In data warehousing, data granularity refers to the level of detail in a given fact table. The tables below illustrate various levels of granularity.
Coarse-grained data (low granularity) Date value 2007 1000 2008 2000 2009 1500 ... Fine-grained (high granularity) Date value 20080101 8 20080102 15 20080103 12 20080107 14 20080109 11 ...
Sample measures that are very often allocated in a data warehouse are: costs, operational forecasts, sales plans, customer rebates and bonuses, etc.
There are two approaches for data allocation:
It is also important to keep in mind that in some business cases allocation is unsuitable. Prior to using allocation it is necessary to analyze the data thoroughly and make sure it fits into a business logic.
Scenario details
The company's data warehouse stores the sales data down to the invoice line level of detail and the costs which are calculated on a monthly (variable costs) and quarterly basis (fixed costs).
- The variable costs total value is assigned per year, month and product group.
- The fixed costs figure is a grand total assigned per year and month.
The aim is to compare revenue to fixed and variable costs in all time dimension levels available.
The source data has the following table structure:
Date_id;Invc_head;invc_line;prod_id;prod_grp;cust_id;quantity;price;sales_amount
Solution outline
The data allocation ETL process will be realized in a few steps:
- Load technical table invoices - the table contains all data related to the invoices, including gross sales and net sales
- Load updated monthly and yearly costs into a separate costs table
- Create another technical table which assigned importance levels and the following figures which are populated using a fixed allocation mechanism mentioned above for groups of data records: variable costs, fixed costs, sales invoice total
- Load the DW invoices table - with costs figures allocated accordingly and calculated profit margin