Tail stage in DataStage
The Tail stage selects the last N records from each partition of an input data set and copies the selected records to an output data set.
The Tail stage is a development and debug stage that helps you sample data. It can have a single input link and a single output link. It is one of several stages that IBM® DataStage® provides to help you sample data. See also:
- Head stage
- Sample stage
- Peek stage
You determine which records the Tail stage copies by setting properties that specify:
- The number of records to copy
- The partition from which the records are copied
This stage is helpful in testing and debugging applications with large data sets. For example, the Partition property lets you see data from a single partition to check whether the data is being partitioned as you intend. The Skip property lets you access a certain portion of a data set.
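The selection behavior described above can be modeled outside DataStage with a short Python sketch. This is only an illustration of the semantics, not DataStage code: the function name `tail_per_partition` and its parameters `n` and `partition` are hypothetical stand-ins for the number-of-records and partition settings, and each partition is assumed to be an ordered sequence of records.

```python
from collections import deque
from typing import Any, Iterable, List, Optional, Sequence


def tail_per_partition(
    partitions: Sequence[Iterable[Any]],
    n: int,
    partition: Optional[int] = None,
) -> List[List[Any]]:
    """Return the last n records of every partition, or of one chosen partition.

    Hypothetical helper: `n` and `partition` only mimic the stage settings.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    selected = []
    for index, records in enumerate(partitions):
        if partition is not None and index != partition:
            continue  # ignore partitions that were not requested
        # A bounded deque keeps only the most recent n records while streaming.
        selected.append(list(deque(records, maxlen=n)))
    return selected


# Two partitions of an illustrative data set.
partitions = [
    [{"id": 1}, {"id": 2}, {"id": 3}],
    [{"id": 4}, {"id": 5}],
]
print(tail_per_partition(partitions, n=2))
# [[{'id': 2}, {'id': 3}], [{'id': 4}, {'id': 5}]]
print(tail_per_partition(partitions, n=2, partition=1))
# [[{'id': 4}, {'id': 5}]]
```

Restricting the output to one partition, as in the second call, mirrors the debugging use described above: you inspect what landed in a single partition to judge whether the partitioning is what you expected.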
When you double-click the Tail stage, the properties panel opens. The properties panel has three tabs:
- Stage. This is always present and is used to specify general information about the stage.
- Input. This is where you specify the details about the single input data set from which you are selecting records.
- Output. This is where you specify details about the processed data being output from the stage.
Input tab
The Columns section specifies the column definitions of incoming data. The Advanced section allows you to change the default buffering settings for the input link.
Output tab
The Tail stage can have only one output link.
The Columns section specifies the column definitions of the data being output. Click Edit at the bottom of the Columns section to specify mapping information in the Map from column input column section. Mapping specifies the relationship between the columns being input to the Tail stage and the output columns. The Advanced section allows you to change the default buffering settings for the output link.