Processing a header and trailer textfile in Datastage

Example case

The input file in our scenario is a stream of records representing invoices. It is a text file in a header-trailer format and has the following structure:
- A header starts with a letter H which is followed by an invoice number, a customer number, a date and invoice currency
- Every item starts with a letter I which is followed by product ID, product age, quantity and net value
- And the trailer starts with a T and contains two values which fullfill the role of a checksum: the total number of invoice lines and total net value.

Header and trailer input textfile in Datastage:

Solution

Datastage job design which solves the problem of loading an extract structured into headers and items:

Detailed illustration of each component of the job

trsfGetKinds transformer - a type of record is extracted from each input line and assigned and written into a kind column

The transformer reads first character of each line to mark records and divide them into kinds:

trsfAssignHeaders transformer assigns invoice headers to each line of the input flow. Invoice header is a 8 character key which is extracted from the dataline using the following formula: L02.data_line[1,8]

A datastage transformer which assigns headers using a stage variable:

trsfReformat transformer splits the data flow into invoice headers and invoice lines. A InvoiceLineID stage variable is used to populate a sequence number for each invoice line. In order to correctly load net value, the ereplace function replaces commas with dots in this numeric field.

The following transformer splits the flow into headers and items and reformats records into a corresponding structure:

Target oracle tables with correctly loaded invoice headers and lines:

Header and trailer processing example main page

Back to the Data Warehousing tutorial home