Creating DataStage Jobs

Types of jobs supported by DataStage

DataStage supports three types of jobs:

Server jobs - basic jobs built on an Informix framework; they use DataStage hashed files and DataStage BASIC routines.
Parallel jobs - executable DataStage programs that are created with the graphical user interface of DataStage Designer and then scheduled, executed and monitored in DataStage Director.
DataStage parallel jobs compile into Orchestrate shell language (OSH) and object code generated from C++. They scale well because they implement partitioning (splitting the data across processing nodes) and pipelining (passing rows to the next stage as soon as they are ready).
Parallel jobs offer more stages and many more settings than server jobs.
Job sequences - jobs that control the execution of other jobs. A sequence can include both server and parallel jobs.
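
The partitioning and pipelining mentioned for parallel jobs can be illustrated with a small Python sketch. This is only a conceptual model, not how the parallel engine is implemented: the function names, the sample rows and the modulus-based partitioner are all assumptions made for the example, and the partitions are processed one after another here, whereas a real parallel job runs them concurrently.

```python
from typing import Dict, Iterable, Iterator, List


def partition_by_key(rows: Iterable[dict], key: str, n_partitions: int) -> Dict[int, List[dict]]:
    """Partitioning: rows with the same key value always land in the same partition."""
    partitions: Dict[int, List[dict]] = {p: [] for p in range(n_partitions)}
    for row in rows:
        # Hypothetical partitioner: integer key modulo the number of partitions.
        partitions[row[key] % n_partitions].append(row)
    return partitions


def extract(rows: Iterable[dict]) -> Iterator[dict]:
    """Source stage: hands rows downstream one at a time."""
    for row in rows:
        yield row


def transform(rows: Iterable[dict]) -> Iterator[dict]:
    """Transformer stage: works on each row as soon as the source yields it."""
    for row in rows:
        yield {**row, "amount_doubled": row["amount"] * 2}


def run_partitioned(rows: Iterable[dict], key: str, n_partitions: int) -> Dict[int, List[dict]]:
    """Run the extract -> transform pipeline separately on every partition."""
    partitions = partition_by_key(rows, key, n_partitions)
    return {p: list(transform(extract(part))) for p, part in partitions.items()}


# Hypothetical sample data, for illustration only.
SAMPLE_ROWS = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 20},
    {"customer_id": 3, "amount": 30},
    {"customer_id": 4, "amount": 40},
]

if __name__ == "__main__":
    for partition, rows in run_partitioned(SAMPLE_ROWS, "customer_id", 2).items():
        print(partition, rows)
```

Because the stages are generators, the transform step never waits for the whole input to be read, which is the idea behind pipelining; the dictionary of partitions stands in for the separate processing nodes.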

DataStage job development process

Import the technical metadata that defines all sources and targets; the import tool available in DataStage Designer can be used for this.
Add the stages that define data extraction and data loading (sequential file stages, data sets, file sets, database connection stages). Rename the stages so that they match the development naming standards.
Add the data transformation stages (transformers, lookups, aggregators, sorts, joins, etc.).
Define the data flow from sources to targets by adding links.
Save and compile the job.
Run the job in Designer or Director.
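
Besides Designer and Director, a compiled job can also be started from the command line with the dsjob client. The sketch below only assembles such a command; it assumes that dsjob is installed and that the `-run`, `-jobstatus` and `-param` options behave as in the IBM documentation for your version. The project, job and parameter names are hypothetical.

```python
from typing import Dict, List, Optional


def build_dsjob_run_command(
    project: str,
    job: str,
    params: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Assemble a 'dsjob -run' command line as an argument list.

    -jobstatus asks dsjob to wait for the job and report its finishing status;
    each -param name=value pair supplies one job parameter value.
    """
    command = ["dsjob", "-run", "-jobstatus"]
    for name, value in (params or {}).items():
        command += ["-param", f"{name}={value}"]
    # The project name and the job name come last.
    command += [project, job]
    return command


if __name__ == "__main__":
    # Hypothetical project, job and parameter names.
    print(" ".join(build_dsjob_run_command("dev_project", "load_customers", {"SRC_DIR": "/data/in"})))
```

On a machine where the DataStage client is available, the returned list could be passed to `subprocess.run`; the command is built as a list rather than a single string so that values containing spaces are not split by a shell.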

Defining job parameters

Job parameters make DataStage jobs more flexible and easier to reuse. DataStage 8 also implements job parameter sets, which let users group DataStage and QualityStage job parameters and store default values in files.

Job parameters are defined in the Job Properties window.

Parameters can be used in directory and file names, in property values, and in constraints and derivations.

Parameter values are supplied at run time; the defaults set in the job properties can be overridden for each run.

Surround a parameter name with pound signs (#), for example #ParamName#, to use it in file names and properties.

Job parameters can reference system environment variables.
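
The way #-delimited parameters are replaced with run-time values can be sketched in a few lines of Python. This is a simplified model rather than DataStage's own resolution logic: the function name and the sample parameter names are hypothetical, and the handling of names starting with `$` as environment variables is an assumption modelled on the `$` prefix that DataStage uses for environment variable parameters.

```python
import os
import re
from typing import Mapping, Optional

# A parameter reference is a name enclosed in pound signs, e.g. #SRC_DIR#.
_PARAM_TOKEN = re.compile(r"#([^#\s]+)#")


def resolve_parameters(
    text: str,
    params: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Replace every #NAME# token in text with its run-time value.

    Values are taken from params first. A name that starts with '$' and has
    no explicit value falls back to the environment variable of that name
    (an assumption made for this sketch).
    """
    environment = os.environ if env is None else env

    def lookup(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in params:
            return params[name]
        if name.startswith("$") and name[1:] in environment:
            return environment[name[1:]]
        raise KeyError(f"No value supplied for job parameter {name!r}")

    return _PARAM_TOKEN.sub(lookup, text)


if __name__ == "__main__":
    # Hypothetical parameter names and values.
    run_values = {"SRC_DIR": "/data/in", "FILE_NAME": "customers"}
    print(resolve_parameters("#SRC_DIR#/#FILE_NAME#.txt", run_values))
```

Raising an error for an unknown name, instead of leaving the token in place, makes a missing parameter visible before a job would try to open a file literally named `#SRC_DIR#`.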
