Creating Datastage Jobs
Types of jobs supported by Datastage
Datastage supports a few types of jobs:
- Server jobs - basic jobs built on an Informix framework; they use Datastage hashed files and Datastage BASIC routines.
- Parallel jobs - A parallel job is an executable Datastage program, created using a graphical user interface in Datastage Designer and scheduled, executed and monitored in Datastage Director.
Datastage parallel jobs compile into Orchestrate Shell (OSH) scripts and object code generated from C++. Because parallel jobs implement partitioning and pipelining, they are fully scalable.
Parallel jobs offer more stage types and many more settings than server jobs.
- Job sequences - jobs that control the execution of other jobs. A sequence can include both server and parallel jobs.
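
The two ideas behind parallel jobs, partitioning and pipelining, can be illustrated outside Datastage. The sketch below is only a conceptual model in plain Python, not how the parallel engine is implemented: the function names, the sample rows and the choice of a modulus on an integer key are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, int]


def modulus_partition(rows: Iterable[Row], key: str, partitions: int) -> List[List[Row]]:
    """Partitioning: send each row to a partition chosen by key % partitions."""
    buckets: List[List[Row]] = [[] for _ in range(partitions)]
    for row in rows:
        buckets[row[key] % partitions].append(row)
    return buckets


def extract(rows: Iterable[Row]) -> Iterator[Row]:
    """Source stage: hands rows downstream one at a time."""
    for row in rows:
        yield row


def transform(rows: Iterable[Row]) -> Iterator[Row]:
    """Transformation stage: doubles the amount of each row as it arrives."""
    for row in rows:
        yield {**row, "amount": row["amount"] * 2}


def load(rows: Iterable[Row]) -> List[Row]:
    """Target stage: collects the rows that reach the end of the flow."""
    return list(rows)


# Hypothetical input: six rows with an integer key.
source_rows = [{"id": i, "amount": i * 10} for i in range(6)]

# Partitioning: split the data so each partition can be processed independently.
parts = modulus_partition(source_rows, "id", 2)

# Pipelining: generators pass each row through all stages without waiting
# for the previous stage to finish the whole partition.
results = [load(transform(extract(part))) for part in parts]
```

Here each partition is processed one after the other; in a real parallel job the partitions would run at the same time on separate processing nodes.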
Datastage job development process
- Import the technical metadata that defines all sources and targets; the import tool available in Datastage Designer can be used for this.
- Add stages defining data extractions and data loading (sequential file stages, datasets, filesets, database connection stages). Rename the stages so they match the development naming standards.
- Add data transformation stages (transformers, lookups, aggregators, sorts, joins, etc.).
- Define the data flow from sources to targets by adding links.
- Save and compile the job.
- Run the job in Designer or Director.
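
Besides Designer and Director, a compiled job can also be started from the command line with the dsjob client. The sketch below only builds the argument list for such a call and does not execute anything. It assumes dsjob is installed and on the PATH; the project name, job name and parameter are hypothetical, and build_dsjob_command is a helper invented for this example.

```python
from typing import Dict, List


def build_dsjob_command(project: str, job: str, params: Dict[str, str]) -> List[str]:
    """Build a dsjob call that runs a job and waits for its finishing status."""
    command = ["dsjob", "-run", "-jobstatus"]
    for name, value in params.items():
        # Each job parameter is passed as its own -param name=value pair.
        command += ["-param", f"{name}={value}"]
    # The project and job names come last.
    command += [project, job]
    return command


# Hypothetical project, job and parameter names.
command = build_dsjob_command("dev_project", "load_customers", {"SourceDir": "/data/in"})
print(" ".join(command))
```

The resulting list can be handed to subprocess.run on a machine where the Datastage client tools are available.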
Defining job parameters
Job parameters make Datastage jobs more flexible and easier to reuse. Datastage 8 also implements parameter sets, which let users group DataStage and QualityStage job parameters and store default values in files.
- Job parameters are defined in the job properties window.
- Parameters can be used in directory and file names, in property values, and in constraints and derivations.
- Parameter values are supplied at runtime.
- To use a parameter in a file name or property, surround its name with pound signs (#), for example #SourceDir#.
- Job parameters can reference system environment variables.
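
The effect of the pound-sign notation can be modelled in a few lines of Python. This is a simplified sketch of the substitution idea, not Datastage's own resolution logic: resolve_parameters, the parameter names and the rule that names starting with $ are looked up in the environment are assumptions made for the example.

```python
import os
import re
from typing import Mapping, Optional

# Matches #Name# or #$NAME#; the name is captured without the pound signs.
PARAM_PATTERN = re.compile(r"#(\$?\w+)#")


def resolve_parameters(
    template: str,
    params: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Replace every #Name# reference with its runtime value."""
    environment = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name.startswith("$"):
            # Assumption: a leading $ marks an environment variable.
            return environment[name[1:]]
        if name not in params:
            raise KeyError(f"undefined job parameter: {name}")
        return params[name]

    return PARAM_PATTERN.sub(replace, template)


# Hypothetical parameter values supplied at runtime.
runtime_values = {"SourceDir": "/data/in", "FileName": "customers.txt"}
source_path = resolve_parameters("#SourceDir#/#FileName#", runtime_values)
temp_path = resolve_parameters("#$TMPDIR#/work.ds", {}, env={"TMPDIR": "/tmp"})
```

With these values, source_path becomes /data/in/customers.txt and temp_path becomes /tmp/work.ds. Raising an error for an undefined parameter is a design choice of the sketch, so that a mistyped name does not silently end up in a file path.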