DataStage overview

DataStage is one of the leading ETL products on the BI market. The tool integrates data across multiple systems and processes high volumes of data. DataStage has a user-friendly graphical front end for designing jobs that collect, transform, validate and load data from multiple sources, such as enterprise applications like Oracle, SAP and PeopleSoft, and mainframes, into data warehouse systems.
The application can integrate metadata across the data environment to maintain consistent analytic interpretations. DataStage supports the data quality and reliability needed for accurate business analysis and reporting.
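
The collect, transform, validate and load steps described above can be illustrated outside the tool. The following is only a minimal Python sketch of that general ETL flow, not DataStage code or its API: the sample rows, the function names and the in-memory "warehouse" list are all hypothetical stand-ins chosen for illustration.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical raw rows standing in for data collected from several source systems.
SOURCE_ROWS = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "Bob", "amount": "not-a-number"},
    {"customer": "", "amount": "15"},
]


def extract() -> Iterable[dict]:
    """Collect raw rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(row: dict) -> dict:
    """Trim the customer name and convert the amount to a float where possible."""
    amount: Optional[float]
    try:
        amount = float(row["amount"])
    except ValueError:
        amount = None  # leave the decision about bad values to validate()
    return {"customer": row["customer"].strip(), "amount": amount}


def validate(row: dict) -> bool:
    """Accept only rows with a non-empty customer and a numeric amount."""
    return bool(row["customer"]) and row["amount"] is not None


def load(rows: List[dict], target: List[dict]) -> None:
    """Append the accepted rows to the target (here, just a list)."""
    target.extend(rows)


def run_job() -> Tuple[List[dict], List[dict]]:
    """Run the four steps in order and return (loaded rows, rejected rows)."""
    warehouse: List[dict] = []
    accepted: List[dict] = []
    rejects: List[dict] = []
    for raw in extract():
        row = transform(raw)
        if validate(row):
            accepted.append(row)
        else:
            rejects.append(row)
    load(accepted, warehouse)
    return warehouse, rejects
```

Calling run_job() on the sample rows loads the one clean row and sets the other two aside as rejects. In a graphical ETL tool each of these functions would roughly correspond to a stage on the job canvas, with rejected rows routed to a separate output.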

DataStage history

DataStage was formerly known as Ardent DataStage and later as Ascential DataStage. In 2005 it was acquired by IBM and added to the WebSphere family. From 2006 its official name was IBM WebSphere DataStage, and in 2008 it was renamed IBM InfoSphere DataStage.

DataStage versions

DataStage is available and fully supported on Windows and Unix environments.

DataStage editions:

  • Server Edition - contains and supports server jobs and job sequences. Jobs are compiled into BASIC.
  • DataStage Enterprise Edition - includes parallel jobs, server jobs and job sequences. Jobs are compiled into OSH, and the application is much more scalable than the Server Edition. The following product names also apply to this version of DataStage: IBM WebSphere DataStage, IBM WebSphere Information Server, IBM InfoSphere Information Server, IBM InfoSphere DataStage.
  • MVS Edition - for mainframe systems. Jobs are developed on a Windows or Unix platform, compiled into COBOL, transferred to the mainframe and executed outside of DataStage.
  • DataStage for PeopleSoft - a server edition with prebuilt PeopleSoft EPM jobs.

DataStage components

The core DataStage client applications are common to all versions of DataStage:

  • Administrator - administers DataStage projects, manages global settings and interacts with the system.
  • Designer - used to create DataStage jobs and job sequences, which are compiled into executable programs. It is the main module for developers.
  • Director - used to run and monitor DataStage jobs. It is mainly used by operators and testers.
  • Manager - for managing, browsing and editing the data warehouse metadata repository.

InfoSphere DataStage 8 tutorial and certification study guides

  • DataStage Enterprise Edition tutorial - DataStage and QualityStage tutorial based on Information Server 8.1 and DataStage 7.5 EE
  • InfoSphere certification - DataStage, QualityStage and Information Analyzer certification materials and study guides