Compute resource options for DataStage in projects
To run DataStage jobs, you need to select an environment template for the DataStage parallel (PX) engine. The environment template specifies the configuration, number of virtual cores (vCPU), and memory that are used to run the job. You can select a different environment template for each job.
- Environment types and locations
- Default environment templates
- Compute usage in projects
- Runtime scope
- Changing the runtime
Environment types and locations
DataStage has one runtime, the DataStage PX engine. The DataStage PX engine can run on either an SMP or MPP environment type. Both environment types have S (Small), M (Medium), and L (Large) configurations.
You can use these types of environments with DataStage:
- SMP environment (one conductor node)
- MPP environment (one conductor node and one or more compute nodes)
Default environment templates
To help you get started quickly, your project includes preloaded environment templates for the DataStage PX engine runtime. You can select one of these environment templates to run your job in IBM Cloud.
Compute usage is tracked in capacity unit hours (CUH). Different environments consume capacity units at different hourly rates.
Name | Hardware configuration | Capacity units per hour |
---|---|---|
Default DataStage PX S | 1 conductor: 2 vCPU and 8 GB RAM | 2 |
Default DataStage PX M | 1 conductor: 4 vCPU and 16 GB RAM | 4 |
Default DataStage PX L | 1 conductor: 8 vCPU and 32 GB RAM | 8 |
The number of capacity unit hours that are used for a DataStage job is based on the capacity unit rating of the environment and the number of seconds that the runtime was active.
The runtimes for DataStage stop automatically when processing is complete.
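As a rough illustration of how the rate and the active time combine, the following minimal sketch estimates CUH for a single run. It assumes simple linear proration by the second, with no rounding or minimum charge, which this topic does not confirm. The function name `estimate_cuh` and the `CU_PER_HOUR` dictionary are hypothetical helpers for this example and are not part of any DataStage API.

```python
# Capacity units per hour for the default templates, copied from the table above.
CU_PER_HOUR = {
    "Default DataStage PX S": 2,
    "Default DataStage PX M": 4,
    "Default DataStage PX L": 8,
}


def estimate_cuh(template: str, active_seconds: float) -> float:
    """Estimate the capacity unit hours (CUH) consumed by one job run.

    Assumption: usage is prorated linearly by the second, with no rounding
    or minimum charge. Actual metering may differ.
    """
    if template not in CU_PER_HOUR:
        raise KeyError(f"Unknown environment template: {template}")
    if active_seconds < 0:
        raise ValueError("active_seconds must not be negative")
    # Convert seconds to hours, then multiply by the hourly rate.
    return CU_PER_HOUR[template] * active_seconds / 3600


if __name__ == "__main__":
    # A 15-minute run on the M template: 4 units/hour * 0.25 hours.
    print(estimate_cuh("Default DataStage PX M", 15 * 60))  # 1.0
```

Under these assumptions, the same 15-minute run would be estimated at 0.5 CUH on the S template and 2 CUH on the L template, so the choice of template scales the cost of a run in proportion to its rate.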
Compute usage in projects
You can monitor the total monthly CUH consumption for the DataStage service on the Resource usage page on the Manage tab of your project.
Data quality rules run as DataStage flows and consume CUH. See Data quality rules.
Runtime scope
The runtime environment template that you select is specific to the job that you select it for, and the compute resources in that environment are not shared with any other DataStage jobs. You can also run the same job on different environment configurations by changing the environment that is set for that job.
To update the environment that you want to use:
- On the flow canvas, select the run settings icon and select the environment that you want to use.
- Select a job, edit the job configuration, and on the run settings tab, change the environment.
Changing the runtime
You can change the runtime for a DataStage job by editing the job definition. See Creating jobs in DataStage.
Learn more
- DataStage
- Jobs
- DataStage offering plans
- Monitoring account resource usage
- DataStage command-line tools
Parent topic: Choosing compute resources for tools