DataStage EE environment variables
The default environment variable settings are established during the DataStage installation and are common to all users.
Users can override these defaults from the DataStage client applications in several ways:
With DataStage Administrator - project-wide defaults for general environment variables, set per project in the Projects tab under Properties -> General tab -> Environment...
With DataStage Designer - settings at the job level, in Job Properties
With DataStage Director - settings per run; these override all other settings and are very useful for testing and debugging.
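The override order described above can be modelled as a layered lookup. The following is a minimal sketch in Python, not DataStage code: the function name resolve_environment and all sample values are hypothetical, and it assumes that job-level settings take precedence over project defaults, with the per-run settings from Director applied last.

```python
def resolve_environment(install_defaults, project=None, job=None, run=None):
    """Merge the settings layer by layer; a later layer wins over an earlier one."""
    effective = {}
    for layer in (install_defaults, project, job, run):
        if layer:
            effective.update(layer)
    return effective


# Hypothetical values, for illustration only.
install_defaults = {"APT_DUMP_SCORE": "False", "TMPDIR": "/tmp"}
project = {"TMPDIR": "/data/project_tmp"}      # set in Administrator
job = {"APT_DISABLE_COMBINATION": "True"}      # set in Designer (Job Properties)
run = {"APT_DUMP_SCORE": "True"}               # set in Director for one run

effective = resolve_environment(install_defaults, project, job, run)
for name in sorted(effective):
    print(f"{name}={effective[name]}")
```

In this model a variable that is not overridden at any level keeps its installation default, which matches the usual advice to change only what a given job or test run actually needs.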
The DataStage environment variables are grouped, and each variable falls into one of the categories.
The default values set during installation are generally reasonable, and in most cases there is no need to modify them.
Setting environment variables for parallel execution in DataStage Administrator
Environment variables overview
Only the environment variables that are likely candidates for adjustment in real-life project deployments are listed below. Please refer to the DataStage help for details on variables not listed here.
General variables
LD_LIBRARY_PATH - specifies the location of dynamic libraries on Unix
PATH - Unix shell search path
TMPDIR - temporary directory
Parallel properties
APT_CONFIG_FILE - the parallel job configuration file. It points to the active configuration file on the server. Please refer to the DataStage EE configuration guide for more details on creating a configuration file.
APT_DISABLE_COMBINATION - prevents operators (stages) from being combined into one process. Used mainly for benchmarks.
APT_ORCHHOME - home path for parallel content.
APT_STRING_PADCHAR - defines the pad character used when a varchar is converted to a fixed-length string
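To illustrate what a pad character does, here is a sketch of padding a variable-length value to a fixed width. It is a plain Python model of the idea, not the way the parallel engine implements the conversion: the function name pad_to_fixed is hypothetical, the space default is an arbitrary choice for the sketch and does not reflect the product default, and raising an error for over-long values is likewise an assumption.

```python
def pad_to_fixed(value: str, length: int, padchar: str = " ") -> str:
    """Right-pad value with padchar until it is exactly length characters."""
    if len(padchar) != 1:
        raise ValueError("padchar must be a single character")
    if len(value) > length:
        raise ValueError("value is longer than the fixed length")
    return value.ljust(length, padchar)


# Same input, two different pad characters.
print(repr(pad_to_fixed("abc", 6)))          # 'abc   '
print(repr(pad_to_fixed("abc", 6, "\x00")))  # 'abc\x00\x00\x00'
```

The two results look alike in many viewers but compare differently, which is why the pad character matters when fixed-length strings are later joined or compared against database values.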
Operator specific
The operator-specific variables under parallel properties are stage-specific settings and are usually set during installation. They apply to the supported parallel database engines (DB2, Oracle, SAS and Teradata).
APT_DBNAME - default DB2 database name to use
APT_RDBMS_COMMIT_ROWS - RDBMS commit interval
Reporting
The reporting variables control logging options and take True/False values only.
APT_DUMP_SCORE - shows operators, datasets, nodes, partitions, combinations and processes used in a job.
APT_RECORD_COUNTS - helps detect and analyze load imbalance. It prints the number of records consumed by getRecord() and produced by putRecord().
OSH_PRINT_SCHEMAS - shows unformatted metadata for all stages (interface schema) and datasets (record schema). Set this variable to verify that runtime schemas match the job design column definitions (especially with Oracle sources).
OSH_DUMP - shows the OSH script and produces a verbose description of a step before executing it
APT_NO_JOBMON - disables performance statistics and process metadata reporting in Designer.
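As a rough illustration of the load-imbalance analysis that per-partition record counts make possible, the sketch below compares the busiest partition with the average. The counts are made-up sample data and the function name skew_ratio is hypothetical; the sketch does not parse actual job log output, whose format is not described here.

```python
def skew_ratio(counts):
    """Return the largest partition's record count divided by the mean count."""
    if not counts:
        raise ValueError("counts must not be empty")
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 1.0  # no records anywhere, so nothing is imbalanced
    return max(counts) / mean


# Made-up record counts for a job running on four partitions.
balanced = [2500, 2500, 2500, 2500]
skewed = [7000, 1000, 1000, 1000]

print(skew_ratio(balanced))  # 1.0
print(skew_ratio(skewed))    # 2.8
```

A ratio close to 1 suggests evenly distributed data, while a clearly higher value points to one partition doing most of the work, which is usually a hint to review the partitioning key.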
Compiler
APT_COMPILER - path to the C++ compiler needed to compile transformer stages