Workflow

Purpose

A workflow, designed in the Workflow Manager, is a collection of tasks that describe runtime ETL processes. In the terminology of other ETL tools, workflows correspond to Job Sequences in IBM InfoSphere DataStage, Flows in Ab Initio, and Jobs in Pentaho Data Integration.

Examples / useful tips

  • Use a parameter file to define the values for parameters and variables used in a workflow, worklet, mapping, or session. A parameter file is a plain text file and can be created with any text editor, such as Notepad or vi.
  • When developing a sequential workflow, it is a good idea to use the Workflow Wizard to create the sessions in sequence. You can also create dependencies between the sessions.
  • Session parameters must be defined in a parameter file. Session parameters do not have default values, so when the Integration Service cannot locate the value of a session parameter in the parameter file, session initialization fails.
  • On under-utilized hardware systems, it may be possible to improve performance by processing partitioned data sets in parallel, in multiple threads of the same session instance running on the Integration Service node. However, parallel execution may impair performance on over-utilized systems or on systems with limited I/O capacity.
  • Incremental aggregation is useful for applying captured changes in the source to aggregate calculations in a session.
  • From the Workflow Manager Tools menu, select Options and select the option to 'Show full names of task'. This will show the entire name of all tasks in the workflow.
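
The parameter file tips above can be illustrated with a small pre-flight check. The sketch below is only an illustration in Python, not part of the product. It assumes the usual layout of a bracketed section heading followed by name=value lines. The folder, workflow, session, and parameter names are hypothetical, and the check looks at a single section only, which is simpler than the lookup the Integration Service performs.

```python
# Hypothetical parameter file content; every name in it is made up for illustration.
PARAM_FILE_TEXT = """\
[SalesFolder.WF:wf_load_sales.ST:s_m_load_orders]
$DBConnection_Source=ORA_SALES
$InputFile_Orders=/data/in/orders.csv
$$LoadDate=2024-01-31
"""


def parse_parameter_file(text):
    """Return {section_heading: {name: value}} for parameter file text.

    Lines that are neither a [heading] nor a name=value pair are skipped
    in this sketch.
    """
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            sections[current][name.strip()] = value.strip()
    return sections


def missing_session_parameters(sections, heading, required):
    """List the required parameter names that the given section does not define."""
    defined = sections.get(heading, {})
    return [name for name in required if name not in defined]


if __name__ == "__main__":
    heading = "SalesFolder.WF:wf_load_sales.ST:s_m_load_orders"
    sections = parse_parameter_file(PARAM_FILE_TEXT)
    # $BadFile_Orders is deliberately absent to show what the check reports.
    required = ["$DBConnection_Source", "$InputFile_Orders", "$BadFile_Orders"]
    print(missing_session_parameters(sections, heading, required))
```

Running a check like this before starting a workflow can surface an undefined session parameter early, rather than at session initialization.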
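
The idea behind the incremental aggregation tip can be shown with a conceptual Python sketch. It is not how the Integration Service implements the feature. It only illustrates folding newly captured source rows into previously stored group totals instead of recomputing the aggregate from the full source. The function name and the sample data are hypothetical.

```python
def apply_incremental_changes(stored_totals, changed_rows):
    """Fold newly captured (group_key, amount) rows into stored group totals.

    Returns a new dict; the stored totals passed in are left unchanged.
    """
    updated = dict(stored_totals)
    for group_key, amount in changed_rows:
        updated[group_key] = updated.get(group_key, 0) + amount
    return updated


if __name__ == "__main__":
    # Totals kept from the previous run (hypothetical data).
    previous_totals = {"EAST": 100, "WEST": 50}
    # Only the rows captured since the previous run are processed.
    captured_changes = [("EAST", 20), ("NORTH", 5)]
    print(apply_incremental_changes(previous_totals, captured_changes))
```

The work done per run grows with the number of changed rows rather than with the size of the whole source, which is why the technique suits sources where only a small share of rows changes between runs.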