0 / 0
Configuring the data source in the Excel stage of DataStage

Configuring the Excel stage as a source (DataStage)

You can use Excel stage to design jobs that read Excel data from Microsoft Excel files.

About this task

Configure the Excel stage as a source to extract data and generate the output in the data type that you want.

Procedure

  1. On the canvas, double-click the Excel stage. On the Stage tab, open the Properties section.
  2. Specify the file name details in the Data source section:
    1. In the File name field, specify the name of the file from which you want to read the data.
      Note: If you added the Excel stage to the canvas by using the asset browser, the file name is automatically added to the field.
    2. Optional: If you specify wildcard characters in the file name, select Use template file for design time and specify a template file name.
      Template file is used for subsequent configuration steps, and not used at runtime.
  3. Specify the data range details to read from the Microsoft Excel file in the Range option section.
    1. Optional: Specify a value for the Read option. If you select Specify the start row, you need to specify only the first row. Excel stage then identifies the last row of the data range at a run time. If you select Specify the entire data range, you need to specify both the first row and last row.
    2. Specify Range expression.Excel stage searches the entire document and lists the candidates of data range in the Template data range list box in the Map section. The Range expression property is set with the appropriate value when you click Import in the Import pane.
    3. Optional: If you want to skip any sheet names from range expression, then specify the name in the Sheet name to skip field. Use this field when the sheet names are omitted from the range expression.
    4. Optional: Specify the First row of data ranges. If you select None at design time, Microsoft Excel column names are expressed in the format: COLUMN_LETTER (First value), for example A (EMP NO) in the Map section. If you select First row is header, then the first row value is displayed in the Map pane. If you select None at run time, the first row is extracted. If you select First row is header, the first row is skipped.
  4. Specify data range details to import.
    1. In the Map section, click Edit.
    2. If the file already exists as a project asset, in the Map page, click Import columns.

      Changing the file name or opening the stage retrieves the information about the file if the file already exists as a project asset. This information is then used under the Map section on the Stage tab.

      If the file does not exist as a project asset, but is available at /ds-storage/PXRuntime/Projects/<projectid>, you can use the file to run a job. To use the file, manually add columns that are in the specified data range.

    3. Select one data range from Template data range.
    4. Optional: If you want to extract more Microsoft Excel items such as document properties, use the Properties drop-down list to select the items.
    5. Optional: If you want to extract the value of a particular cell, click a cell in the Excel item column to add Custom property or Custom cell properties. Enter a value or name in the Excel item column of the row.
    6. Select the columns to import in the Excel columns section.
    7. Click Import.
      The Map page opens.
  5. Define the mapping between the DataStage® columns and the Microsoft Excel items in the Map page.
    1. In the DataStage column name column, specify the DataStage column name for each Microsoft Excel item. In the Excel item, select the item that you want to map to the DataStage column.
    2. Click Apply and return.
  6. Click Save.
Generative AI search and answer
These answers are generated by a large language model in watsonx.ai based on content from the product documentation. Learn more