Migrating and constructing pipeline flows for DataStage
Last updated: Dec 09, 2024

The following steps and limitations apply to migrated Sequence Jobs and flows that are constructed directly with the pipeline canvas.

Migrated flows

For more information on each component, see Pipeline components for DataStage.

Wait for file
Manually reselect or configure the file path. When this node is created as a helper for cross-loop links, the default timeout value is 23:59:59. Manually update the value, or set it to 00:00:00 for no timeout.
Wait for all
Replaces Sequencer (all) and Nested condition.
Wait for any
Replaces Sequencer (any).
Terminate pipeline
Replaces Terminator.
Final message text is not supported.
Loop in sequence
Replaces Start/end loop.
Run DataStage job
Replaces Job activity for parallel jobs.
The List type is mapped to Enum. Path is mapped to the File type. For more information, see Configuring global objects for Orchestration Pipelines.
Run Pipelines job
Replaces Job activity for sequence jobs. For more information, see Run Pipelines job.
Run Bash script
You must replace single quotes around environment variables with double quotes so that the variables are expanded rather than treated as string literals.
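
For example, here is a minimal sketch of the difference, assuming a hypothetical environment variable named INPUT_DIR that is defined on the node:

    # INPUT_DIR is a hypothetical environment variable defined on the node.
    # Double quotes: the shell expands $INPUT_DIR to its value.
    ls "$INPUT_DIR"
    # Single quotes: the literal text $INPUT_DIR is passed, so this command
    # fails because no directory literally named $INPUT_DIR exists.
    ls '$INPUT_DIR'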

In DataStage, mounting volumes to copy scripts or files for use in the pipeline's Bash node is not supported. To reference files in the Bash node, see referencing_files_in_bashnode.html.

Set user variables
Replaces User variable. User variables are defined at the global level.
Error handling
Replaces Exception handler.
Use error.status and error.status_message to get the failed node's information. Use ds.GetErrorSource() and ds.GetErrorNumber() to get the error source and error number.
General pipeline issues
  • Unsupported functions return "1" or "unsupported."
  • When nodes outside a loop or exception handler are accessed from inside it, migration adds an extra Set user variables node.
  • When the main pipeline has an Exception handler node, or when a Loop node has links that point out to the main pipeline, migration creates extra nodes and a local parameter, MigrationTempFolder. A Run Bash script node is created inside the Loop or Exception handler; this node creates a linkage file under the mounted path provided in MigrationTempFolder, which must end with a /. Outside of the Loop or Exception handler, a Wait for file node is created that waits for the linkage file to be created. If the mounted path is incorrect, the node waits 24 hours. Another Run Bash script node deletes the linkage file afterward. See the sketch after this list.

  • When Automatically handle activities that fail is selected, all migrated run nodes are set to Fail on pipeline error, unless a condition has been defined on a link for when an error is thrown. If a condition has been defined, or Automatically handle activities that fail is not selected, nodes are set to Continue pipeline on error.
  • During migration, all parallel and sequence jobs, along with all their dependencies, must be included in the ISX file. If one sequence job depends on another sequence job that is not included in the migrated ISX file, migration marks the missing sequence job as a Run DataStage job node instead of a Run Pipelines job node. Migration also creates extra nodes for any missing parameters.
  • If the option Add checkpoints so sequence is restartable on failure is set at the sequence job level, the job migrates with the Enable caching for specific nodes caching method in the node properties panel. In the cache usage section, migration also sets Use cache when all selected conditions are met as the default option, with both Retrying from a previous failed run and Pipeline version is unchanged from previous run selected. If you do not enable checkpoint run for your job at the node level, migration creates a data cache in the selected node. For more information on node caches, see Manage default settings.
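
The linkage-file mechanism described in the list above can be pictured with the following minimal sketch. It is an illustration only, not the exact script that migration generates; the file name linkage_file is hypothetical, and the sketch assumes that the MigrationTempFolder parameter is available to the script as an environment variable and ends with a /.

    # Inside the Loop or Exception handler, a Run Bash script node creates
    # the linkage file under the mounted path from MigrationTempFolder.
    touch "${MigrationTempFolder}linkage_file"

    # Outside the Loop or Exception handler, a Wait for file node waits for
    # that file to appear; if the mounted path is wrong, it waits 24 hours.

    # A second Run Bash script node deletes the linkage file afterward.
    rm -f "${MigrationTempFolder}linkage_file"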

Set and get user status

To set the user status in a DataStage job, you can call the built-in function SetUserStatus from the Expression builder under Triggers in the Transformer stage. SetUserStatus cannot be used in input column derivations.

To get the status in a pipeline that calls the DataStage job with a Run DataStage job node, you can use the built-in function ds.GetUserStatus(tasks.<node name>) with the name of the Run DataStage job node. You can also access it in the job results with tasks.<node name>.user_status. To set user status in a pipeline, you must add it as a variable with the Set user variables node and select Make user variable value available as a pipeline result, which makes it an output parameter that other pipelines can access. Another pipeline can use a Run pipeline job node to call the pipeline that set the user status, and then get the user status using tasks.<node name>.results.output_parameters.<user status parameter name>.

If SetUserStatus is called in a child pipeline, migration creates a global user variable named user_status and selects the option Make user variable value available as a pipeline result. In the parent pipeline, it also replaces the expression that gets the status of the child pipeline, <node name>.$UserStatus, with tasks.<node name>.results.output_parameters.user_status.

Constructed flows

Run DataStage job
To select the runtime environment for a specific job run, set the DSJobRunEnvironmentName variable in the Run DataStage job node under the Environment Variables section. The variable overrides the default runtime environment that is specified at the project level for all jobs. For example, to change the default px-large runtime environment for a specific job, set the value to px-small in the Input tab.
Run Bash script
Echo statements must use double quotes to access the value of a variable. For example, echo "$variablename" prints the value of the variable, while echo '$variablename' prints the literal text $variablename.
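
A minimal sketch, using a hypothetical variable named variablename:

    variablename="hello"
    echo "$variablename"   # prints: hello
    echo '$variablename'   # prints: $variablename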