InfoSphere DataStage Enterprise Edition installation
This IBM InfoSphere DataStage Enterprise Edition (EE) installation guide is based on a DataStage 7.5.1 EE setup on Red Hat Enterprise Linux and a DataStage 8.x installation on Windows 2003 Server.
Installing Datastage EE 7.5.x (on Red Hat Enterprise Linux)
A few things need to be taken into consideration and analyzed when starting the Datastage EE installation process:
- Operating system analysis:
- Check whether the DataStage installation supports the operating system and its version. DataStage EE is compatible with Linux, AIX, Solaris, Windows 2003 Server (DataStage ver. 7.5x2 or higher), HP-UX, Tru64 and USS (z/OS).
- If necessary, apply any operating system level patches or updates before the installation
- Make sure an appropriate C++ Compiler is available. On Microsoft Windows 2003 Server, installing Microsoft Visual Studio will be necessary.
- For DataStage 8, prepare a database which will serve as the metadata repository. The DataStage 8 installation comes with a dedicated DB2 engine; however, Oracle and Microsoft SQL Server are also supported.
- Check the disk space availability and disk performance
- Install and configure the database specific client libraries if needed.
- Before-installation checklist:
- Obtain superuser (root / administrator) access to the server. Note that starting from DataStage 7.0, the installation process does not have to be started as root.
- Check disk space and available partitions
- Prepare the licensing information (check the operating system, number of CPUs and expiration date)
- Prepare the database connection information (connection strings, methods, name of the instances, etc.)
- Create the administration user (dsadm by default). This user must exist before installing or upgrading DataStage. An operating system group (dstage by default) must also be created; it must be the primary group of the administration user, and all DataStage users must be added to it.
- Unpack the DataStage installation image to a temporary location (/tmp, for instance). Note that approx. 700 MB of disk space is required to unpack the DataStage 7.5 installation image. For DataStage 8.1, the installation files occupy over 3 GB of disk space.
- Install the DataStage server. The server software is installed by executing the install.sh program. The process is quite straightforward, and in most cases following the on-screen instructions will lead to a successful DataStage EE installation.
- Configure the DataStage EE configuration file. This is the file pointed to by the $APT_CONFIG_FILE environment variable; it defines the parallel system resources and architecture. Sample configurations are shown in the next tutorial lesson.
- Install the DataStage clients (Administrator, Manager, Director and Designer) on the client Windows machines.
- Verify the installation by checking whether the dsrpcd Unix background process (the DataStage RPC daemon) is running (ps aux | grep dsrpcd).
- Try to log on to the server with Datastage Administrator
- Log on to DataStage Designer, create a simple EE job (for instance, a job containing only a Row Generator stage and a Peek stage) and run it in Director
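The process check from the verification steps above can also be scripted. The following is a minimal Python sketch, assuming a Unix-like server where the ps command is available. The function names are illustrative and not part of DataStage; the daemon name defaults to dsrpcd (the DataStage RPC daemon) and can be overridden.

```python
import os
import subprocess


def process_listed(ps_output, name):
    """Return True if `name` appears as a command name in `ps` output.

    Expects one command per line, as produced by `ps -e -o comm=`.
    Full paths such as /opt/.../bin/dsrpcd are reduced to their basename.
    """
    names = {
        os.path.basename(line.strip())
        for line in ps_output.splitlines()
        if line.strip()
    }
    return name in names


def is_running(name="dsrpcd"):
    """Run `ps` on the local machine and look for the given process name."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return process_listed(result.stdout, name)
```

Calling is_running() on the server performs the live check. Because only command names are listed and no grep is involved, the check cannot accidentally match its own search command.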
For more information, please refer to the DataStage Install and Upgrade Guide provided on the DataStage client CD.
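For orientation, the parallel configuration file referenced by $APT_CONFIG_FILE is a plain-text list of node definitions. The sketch below renders a minimal single-node file in Python. The node name, host name (etlserver) and directory paths are placeholders chosen for illustration and must be replaced with values from the actual server; the helper functions are hypothetical and not part of DataStage.

```python
def render_node(name, fastname, disk, scratchdisk):
    """Render one node entry of a parallel configuration file."""
    return (
        f'  node "{name}"\n'
        "  {\n"
        f'    fastname "{fastname}"\n'
        '    pools ""\n'
        f'    resource disk "{disk}" {{pools ""}}\n'
        f'    resource scratchdisk "{scratchdisk}" {{pools ""}}\n'
        "  }\n"
    )


def render_config(nodes):
    """Wrap the rendered node entries in the outer braces of the file."""
    body = "".join(render_node(**node) for node in nodes)
    return "{\n" + body + "}\n"


# Placeholder values: replace with the real host name and directories.
config_text = render_config([
    {
        "name": "node1",
        "fastname": "etlserver",
        "disk": "/data/datasets",
        "scratchdisk": "/data/scratch",
    },
])
print(config_text)
```

The rendered text can be saved to a file, with $APT_CONFIG_FILE set to its path. Adding further dictionaries to the list produces a multi-node configuration.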
IBM Information Server 8.1 with Oracle 10g on Windows 2003 server
The guide below gives step-by-step instructions on how to install IBM Information Server 8.1 (with InfoSphere DataStage, QualityStage and Information Analyzer) on a Windows 2003 Server machine. Read the installation steps for Red Hat Enterprise Linux above first, as most of the tips also apply to the installation of IBM IS 8.1.
- Install and configure an Oracle database. Oracle 10g was used in this guide.
- Create a dedicated Oracle user that owns the schema for the DataStage repository (the xmeta user was used in this guide) by running the following script from the installation media:
...\is-ia-suite\DatabaseSupport\Windows\MetadataRepository\Oracle10g\create_xmeta_db.cmd
- Usage: create_xmeta_db OracleSystemUser OracleSystemPassword ServiceName XmetaUserName XmetaUserPassword XmetaTableSpaceName XmetaDatafilePath
- Example: create_xmeta_db SYSTEM ETLTOOLSINFO ORCL xmeta xmeta xmetaspace C:\oracle\product\10.2.0\oradata\orcl
- If Information Analyzer will be used, a similar script needs to be run to configure the database:
...\is-ia-suite\DatabaseSupport\Windows\InformationAnalyzer\Oracle10g\create_ia_db.cmd
- Usage: create_ia_db OracleSystemUser OracleSystemPassword ServiceName IAUserName IAUserPassword IATableSpaceName IADatafilePath
- Example: create_ia_db SYSTEM ETLTOOLSINFO ORCL ia ia iaspace C:\oracle\product\10.2.0\oradata\orcl
- Run the IBM InfoSphere installer (install.exe)
- It is good practice to disable the Windows firewall for the DataStage installation. It can be re-enabled after a successful installation.
- Choose the installation directory. The default is C:\IBM\InformationServer.
- Select the Information Server applications (tiers) for the installation. The choices are the following:
- Client applications - Datastage and QualityStage Designer, Datastage and QualityStage Director, Datastage and QualityStage Administrator
- Engine - Datastage and QualityStage runtime components. One engine can be installed on a Microsoft Windows Server system
- Services - common agents and services which run in the WebSphere application server
- Metadata repository - stores all the Information Server metadata. If this option is selected, the installer installs a DB2 database on the system. Uncheck it to use Oracle (as in this guide, with the database prepared earlier) or any database other than DB2 for the repository
- Documentation - a bundle of PDF documents about Information Server
- Point to the Information Server license file (an XML file stored on a local disk)
- Select products and components to include in the installation. The available products are:
- Business Glossary (a web based tool to author and manage business metadata)
- DataStage and QualityStage (ETL, Data Integration and Data Quality Management tools)
- Documentation
- Information Server FastTrack (spreadsheet-like interface to capture source to target mappings)
- Information Analyzer (data profiling tool which helps to understand source data)
- Information Services Director (publishes information access services and helps implement SOA)
- Information Server Manager (enhances the Datastage import and export utilities)
- Metadata Workbench (provides a web-based console for exploration of Information Server's metadata)
- Metadata repository (DB2 database which hosts the metadata repository; not used in this tutorial)
- Metadata server (metadata management processes)
- Choose a typical or custom installation. We recommend the custom option for better control and understanding of the installation process.
- Provide details for the Oracle metadata repository connection
- If available, configure an existing instance of WebSphere Application Server. Otherwise a new instance will be installed.
- Specify user credentials for the WebSphere administrator
- Specify user credentials for Information Server administrator (isadmin in this tutorial)
- Create a new Datastage project (etltoolsinfo datastage project was used in this tutorial)
- Specify database logon information for the Information Analyzer user (ia in our tutorial)
- Configure the agent, Job Monitor and Resource Tracker settings. We don't recommend making any changes here
- Select NLS support if non-English data will be used in the data integration processes.
- Choose to install the MQ plug-in if required. It is provided for backward compatibility; for new jobs, the WebSphere MQ Connector can be used.
- Review the installation summary and run the installation process. The installation may take up to a few hours to complete, so this is a good moment for a cup of coffee or a lunch break.
- Restart the server
- Go to the Information Server Web console, log on with the IS Administrator credentials and go to Administration -> Domain management -> Datastage credentials. Specify the default DataStage and QualityStage credentials and provide an operating system DataStage user (dsadm by default).
- Run the Administrator client and try to log on to the newly created project.
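The repository scripts from the database preparation steps take positional arguments, so a swapped value is easy to miss. As a minimal sketch, the Python helper below assembles the create_xmeta_db arguments in the order given by the usage line. The helper name xmeta_argv is illustrative and not part of Information Server, and the full script path is left to the caller because the media location is not spelled out in this guide.

```python
def xmeta_argv(system_user, system_password, service_name,
               xmeta_user, xmeta_password, tablespace, datafile_path,
               script="create_xmeta_db"):
    """Build the argument list for create_xmeta_db in usage-line order.

    Raises ValueError if any argument is empty, since the script relies on
    argument position and an empty value would shift the remaining ones.
    """
    values = [
        system_user, system_password, service_name,
        xmeta_user, xmeta_password, tablespace, datafile_path,
    ]
    if any(not value for value in values):
        raise ValueError("all seven create_xmeta_db arguments are required")
    return [script, *values]


# Values taken from the example in this guide.
argv = xmeta_argv(
    system_user="SYSTEM",
    system_password="ETLTOOLSINFO",
    service_name="ORCL",
    xmeta_user="xmeta",
    xmeta_password="xmeta",
    tablespace="xmetaspace",
    datafile_path=r"C:\oracle\product\10.2.0\oradata\orcl",
)
print(" ".join(argv))
```

Using keyword arguments makes each value's role explicit at the call site. The same pattern applies to create_ia_db, which takes its seven arguments in the analogous order.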