Import methods for governance artifacts
You can import governance artifacts with a file. You can import one type of governance artifact at a time, or import all governance artifacts from another IBM Knowledge Catalog instance.
- Compatibility between deployment environments
- Comparison of import methods
- Governance artifacts that you can import
- Methods for merging imported and existing artifacts
- Security considerations
Compatibility between deployment environments
You can export and then import governance artifacts between IBM Knowledge Catalog instances on the following deployment environments:
- Cloud Pak for Data 3.5
- Cloud Pak for Data 4.x
- Cloud Pak for Data as a Service
The values of Stewards are not compatible between IBM Knowledge Catalog instances on Cloud Pak for Data as a Service and instances on Cloud Pak for Data 3.5 or 4.x.
You can import governance artifacts from IBM InfoSphere Information Governance Catalog to IBM Knowledge Catalog instances on Cloud Pak for Data 3.5 and 4.x. To import governance artifacts from IBM InfoSphere Information Governance Catalog to IBM Knowledge Catalog instances on Cloud Pak for Data as a Service, you must edit each CSV file to conform to the format of the IBM Knowledge Catalog artifact CSV files. For example, you might need to make the following types of edits:
- Remove unsupported columns
- Separate different artifact types into multiple CSV files
- Modify supported columns
- Add required columns
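The kinds of edits listed above can be scripted. The following is a minimal sketch, not an IBM-provided tool: it splits one CSV file into one CSV text per artifact type, drops columns that are not in a supported list, and adds any missing supported columns with empty values. The function name, the type column name, and the column list are assumptions for illustration; take the real column names from the CSV file format documentation for your target instance.

```python
import csv
import io
from collections import defaultdict


def split_by_artifact_type(csv_text, type_column, supported_columns):
    """Split one CSV into per-type CSV texts, keeping only supported columns.

    type_column and supported_columns are supplied by the caller; the
    values used in any example are hypothetical.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows_by_type = defaultdict(list)
    for row in reader:
        rows_by_type[row[type_column]].append(row)

    outputs = {}
    for artifact_type, rows in rows_by_type.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=supported_columns,
            extrasaction="ignore",  # drop unsupported columns
            restval="",             # add missing columns as empty values
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
        outputs[artifact_type] = buffer.getvalue()
    return outputs
```

Each returned text can then be saved as a separate file and reviewed by hand before you import it.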
Comparison of import methods
Choose the appropriate import method for your goals and circumstances.
Import a single type of artifact
You can import a single type of governance artifact at a time with a CSV file.
This method is useful in the following types of circumstances:
- You want the imported artifacts to be subject to workflow.
- You want to add values for a property to one type of governance artifact. Export that artifact type as a CSV file, edit the CSV file, and then import it. For example, you can use this method to add a custom attribute to your business terms.
- You want to define artifacts in another program. Create CSV files for each artifact type. For example, you can use this method to define artifacts in a spreadsheet program and then import them.
See Importing governance artifacts by type with CSV files and CSV file format for importing governance artifacts.
Import multiple types of artifacts
You can import multiple types of governance artifacts with a ZIP file that you created by exporting multiple types of existing governance artifacts from an IBM Knowledge Catalog instance. The ZIP file contains CSV files for categories and every exported artifact type. The CSV files match the format for the CSV import file, except for the following differences:
- An extra Artifact ID column contains identifiers for artifacts, instead of identifying artifacts by name and category path.
- Related artifacts are defined with artifact IDs instead of context and name.
This method is useful in the following types of circumstances:
- You want to move all governance artifacts from one IBM Knowledge Catalog instance to another.
See Importing multiple types of governance artifacts from an instance with a ZIP file.
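Before you import a ZIP file, it can help to confirm that it has the layout described above. The following is a minimal sketch that lists the header row of each CSV file in an archive and reports the files without an Artifact ID column. The function names are hypothetical, and the UTF-8 encoding is an assumption about the export.

```python
import csv
import io
import zipfile


def csv_headers_in_zip(zip_source):
    """Return {member name: header row} for each CSV file in the archive."""
    headers = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Assumption: files are UTF-8, possibly with a byte order mark.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                headers[name] = next(csv.reader(text), [])
    return headers


def files_missing_artifact_id(headers):
    """List CSV members whose header row lacks the Artifact ID column."""
    return [name for name, cols in headers.items() if "Artifact ID" not in cols]
```

`zip_source` can be a path or a binary file object, because `zipfile.ZipFile` accepts both.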
The following table summarizes the differences between importing artifacts with CSV files or a ZIP file.
Characteristics | CSV file | ZIP file |
---|---|---|
File creation | • Export one type of existing artifacts • Create a file in a spreadsheet program • Export artifacts from IBM InfoSphere Information Governance Catalog and adjust the format | Export multiple types of artifacts from an instance |
Number of artifact types | Categories or one artifact type per file. | Multiple types of artifacts, with categories and each type of artifact in a separate CSV file. |
Import methods | • Through the UI • API request | API request |
Workflow | All artifacts are imported as draft and are subject to workflow. Categories are published immediately because they are not subject to workflow. | All artifacts and categories are published immediately. |
Required permissions | Permissions to create or edit categories. You must be at least an Editor in the category that you are importing to. For details, see Required permissions. | The Manage glossary permission |
Governance artifacts that you can import
With both import methods, you can import categories and the following types of governance artifacts:
Restrictions:
- You can import values for all properties of these types of governance artifacts, including relationships with other artifacts. However, relationships are imported only when the related artifact exists or is defined in the same import process. To add relationships that the import process skipped, first publish all imported draft artifacts and then run the import process again.
- You can't use CSV files to move governance artifacts and their relationships between Cloud Pak for Data instances. For example, if you export data classes that use the matching method Match to reference data to a CSV file and then try to import that file into another Cloud Pak for Data instance, the import fails because Artifact ID is not included in CSV imports and exports. Use a ZIP import instead.
- When importing a reference data set from a CSV file, the reference data values from that set are not imported. You must use a separate CSV to import the values into the data set. Alternatively, you can use a ZIP import to import both the reference data set and its reference data values. For more information, see Importing files for reference data sets.
- You can't import data protection rules or data location rules.
Methods for merging imported and existing artifacts
Whether you import artifacts with CSV files or a ZIP file, you must choose what happens when you import governance artifacts that already exist and the values of the properties are different. The following table summarizes the three merge methods.
Merge method | API | Effect on original values | Effect on imported values |
---|---|---|---|
Replace all values | merge_option=all | Discard all original values. | Accept all imported values, even empty values. |
Replace with defined values | merge_option=specified | Retain original values if imported values are empty. | Accept all imported values, except empty values. |
Replace empty values | merge_option=empty | Retain original values, except empty values. | Accept only imported values that replace empty values. |
For new artifacts, each of these methods produces the same results.
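The three merge methods can be modeled as a small function over the property values of one artifact. This is an illustrative sketch of the rules in the table, not the product's implementation. The function name and the use of an empty string for an empty value are assumptions.

```python
def merge_values(original, imported, merge_option):
    """Apply one merge method to the property values of a single artifact.

    original and imported map property names to values; an empty string
    stands for an empty value (an assumption of this sketch).
    """
    if merge_option not in ("all", "specified", "empty"):
        raise ValueError(f"unknown merge_option: {merge_option!r}")

    result = {}
    for prop in {**original, **imported}:
        old = original.get(prop, "")
        new = imported.get(prop, "")
        if merge_option == "all":
            # Replace all values: imported values win, even empty ones.
            result[prop] = new
        elif merge_option == "specified":
            # Replace with defined values: imported wins unless it is empty.
            result[prop] = new if new else old
        else:
            # Replace empty values: original wins unless it is empty.
            result[prop] = old if old else new
    return result
```

With an empty `original`, which corresponds to a new artifact, all three options return the imported values.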
Replace all values
All the original values of the artifact are discarded and replaced by the values of the imported artifact. If the value of a property for the imported artifact is empty, any original values for that property are removed.
For example, suppose you have a published business term that is named release and you import a CSV file to modify it. The following table shows the effect of the Replace all values option:
Property | Original values | Values in the CSV file | Resulting values |
---|---|---|---|
Name | release | release | release |
Artifact type | glossary_term | glossary_term | glossary_term |
Category | marketing | marketing | marketing |
Description | example term | example term edited | example term edited |
Tags | | beta | beta |
Related terms | marketing>>version | marketing>>date | marketing>>date |
Classifications |
|
The resulting draft artifact has these changes to the original values:
- The original description is replaced by a new description.
- The original empty value for tags is replaced by a value.
- The original related term is replaced by a new related term.
- The original classification value is replaced by an empty value.
When you use the all merge option, you must ensure that all CSV content is consistent regarding relationships between artifacts. For example, if the ZIP import file contains both a term and a data class that are connected by a relationship, then this relationship must be present in both the data classes CSV file and the terms CSV file. Otherwise, the import behavior is unpredictable: the relationship might or might not be imported.
When importing ZIP files that contain reference data values, you must always use merge_option=all in the API call.
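A client script can enforce this rule before it sends the request. The following is a minimal sketch that only builds the request URL. The base URL and import path are placeholders that the caller supplies, because this topic does not state the endpoint. The function name and the `has_reference_data_values` flag are hypothetical.

```python
from urllib.parse import urlencode

MERGE_OPTIONS = ("all", "specified", "empty")


def build_import_url(base_url, import_path, merge_option,
                     has_reference_data_values=False):
    """Build an import URL with a validated merge_option query parameter.

    base_url and import_path are placeholders; look up the real endpoint
    in the API documentation for your instance.
    """
    if merge_option not in MERGE_OPTIONS:
        raise ValueError(f"unknown merge_option: {merge_option!r}")
    if has_reference_data_values and merge_option != "all":
        # ZIP files with reference data values require merge_option=all.
        raise ValueError("reference data values require merge_option=all")
    query = urlencode({"merge_option": merge_option})
    return f"{base_url.rstrip('/')}/{import_path.lstrip('/')}?{query}"
```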
Replace with defined values
Original and empty values of the artifact are replaced by the supplied values of the imported artifact. If the value of a property for the imported artifact is empty, any original values for that property are retained.
For example, suppose you have a published business term that is named release and you import a CSV file to modify it. The following table shows the effect of the Replace with defined values option:
Property | Original values | Values in the CSV file | Resulting values |
---|---|---|---|
Name | release | release | release |
Artifact type | glossary_term | glossary_term | glossary_term |
Category | marketing | marketing | marketing |
Description | example term | example term edited | example term edited |
Tags | | beta | beta |
Related terms | marketing>>version | marketing>>date | marketing>>date |
Classifications |
|
|
The resulting draft artifact has these changes to the original values:
- The original description is replaced by a new description.
- The original empty value for tags is replaced by a value.
- The original related term is replaced by a new related term.
Replace empty values
Empty values of the original artifact are replaced by the supplied values of the imported artifact.
For example, suppose you have a published business term that is named release and you import a CSV file to modify it. The following table shows the effect of the Replace empty values option:
Property | Original values | Values in the CSV file | Resulting values |
---|---|---|---|
Name | release | release | release |
Artifact type | glossary_term | glossary_term | glossary_term |
Category | marketing | marketing | marketing |
Description | example term | example term edited | example term |
Tags | | beta | beta |
Related terms | marketing>>version | marketing>>date | marketing>>version |
Classifications |
|
|
The resulting draft artifact has this change to the original values:
- The original empty value for tags is replaced by a value.
Security considerations
Governance data that is exported to CSV files is sanitized against known CSV injection attacks so that it is safe to open in spreadsheet programs that automatically interpret CSV data. As a result, any text values that start with one of the following characters:
- equals (=)
- plus (+)
- minus (-)
- at (@)
are prefixed by a single quote character ('). To make the functionality consistent, imported CSV files are additionally parsed to automatically remove the single quote character ('). Sanitizing also applies when importing and exporting governance artifacts to ZIP files, as they contain CSV files.
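The sanitizing behavior described above can be sketched as a pair of functions. This is an illustration, not the Glossary Service code. The function names are hypothetical, and removing the quote on import only when a formula character follows it is an assumption of this sketch.

```python
FORMULA_PREFIXES = ("=", "+", "-", "@")


def escape_for_export(value):
    """Prefix a single quote if the value starts with a formula character."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def unescape_on_import(value):
    """Remove the single quote that escape_for_export added.

    Assumption: the quote is removed only when a formula character
    follows it, so other quoted text is left unchanged.
    """
    if value.startswith("'") and value[1:].startswith(FORMULA_PREFIXES):
        return value[1:]
    return value
```

A value that is exported and then imported returns to its original form, which is what keeps the two operations consistent.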
To disable this functionality:
- Edit the IBM Knowledge Catalog Glossary Service deployment:
oc edit deployment wkc-glossary-service
- Set the environment variable ESCAPE_FORMULAS_IN_CSV_FILES to the value false.
For more information, see CSV Injection.
Learn more
- Importing governance artifacts by type with CSV files
- CSV file format for importing governance artifacts
- Importing all governance artifacts from an instance with a ZIP file