You create a governance framework to govern and enrich your data by implementing governance artifacts in collaborative workspaces called categories. Some types of governance artifacts act as metadata to enrich data assets. Other types of governance artifacts control access to data assets or to other artifacts.
- Required service: IBM Knowledge Catalog
You use governance artifacts for these purposes:
- Enrichment: Artifacts can add knowledge and meaning to assets.
- Control access: Artifacts can control who sees what data or which artifacts.
- Identification: Artifacts can act as criteria to identify assets or data for other artifacts.
- Quality: Artifacts can be used to monitor data quality.
You can use categories and governance artifacts from any or all of these sources:
- Predefined governance artifacts that are provided with IBM Knowledge Catalog
- Industry-specific Knowledge Accelerators
- Custom governance artifacts that your governance team creates
The following table briefly describes categories and each type of governance artifact and indicates whether any of the items are predefined or available in Knowledge Accelerators.
Governance item | Description | Predefined items? | Provided by Knowledge Accelerators? |
---|---|---|---|
Categories | Categories organize governance artifacts in a hierarchical structure similar to folders. You can use category roles to define ownership of artifacts, control their authoring, and restrict their visibility. Examples: Business Performance Indicators, Business Scopes | The [uncategorized] category, which contains the predefined data classes and classifications. The Locations category, which contains the predefined reference data sets. Limited: The Knowledge Accelerator Sample Personal Data category, which contains predefined business terms. | Each Knowledge Accelerator provides many categories. |
Business terms | Business terms implement a common enterprise vocabulary to describe the meaning of data. You create business terms to ensure clarity and compatibility among departments, projects, or products. Business terms are the core of your governance framework and typically form the bulk of your governance artifacts. You can manually assign business terms to data columns, tables, or files or automatically assign them during metadata enrichment. You can use business terms in governance rules and enforceable rules to identify the affected data. Examples: Customer lifetime value, Work phone number | Limited: Predefined business terms and the Knowledge Accelerator Sample Personal Data category that includes them are available only if you create a Watson Knowledge Catalog service instance with a Lite or Standard plan after 7 October 2022. For more information, see Predefined business terms. | Each Knowledge Accelerator provides many business terms. |
Data classes | Data classes classify data based on the structure, format, and range of values of the data. Data classes are automatically assigned to matching data columns during profiling and metadata enrichment. You can create data classes by defining matching criteria with an expression or a reference data set. You can create relationships between data classes and business terms to link data format with business meaning. Related business terms are automatically assigned to data along with their related data classes. How well columns conform to their data class criteria contributes to data quality analysis. Before you have a robust set of business terms, you can use data classes in enforceable rules to identify the affected data. Examples: Phone number, Email address | Over 150 predefined data classes in the [uncategorized] category. | Each Knowledge Accelerator provides data classes. |
Reference data sets | Reference data sets define standard values for specific types of data to classify data and measure consistency. Reference data sets act as lookup tables that map codes and values. You can include a reference data set in the definition of a data class as part of the data matching criteria. Some reference data sets are standardized by organizations, such as the International Organization for Standardization (ISO). Reference data can be hierarchical or mapped across related sets. Example: Country codes | The Physical locations and Sovereign locations predefined reference data sets in the Locations category. | Each Knowledge Accelerator provides many reference data sets. |
Classifications | Classifications describe specific characteristics of the meaning of data. Predefined classifications describe the sensitivity of the data. You can create classifications to describe other characteristics of data or other governance items. For example, Knowledge Accelerators use classifications to classify business terms. You can use classifications to construct governance policies and rules. Typically, you relate multiple business terms to each classification and then data is indirectly classified through its assigned business terms. You can also manually assign a classification to a data asset. Example: Sensitive Personal Information | Several predefined classifications in the [uncategorized] category. | Each Knowledge Accelerator provides classifications. |
Policies | Policies describe how to manage and protect data assets. You create policies by combining rules and subpolicies. You can include data protection rules and data location rules in policies to control and manage data. However, policies do not affect the enforcement of data protection rules and data location rules. You can include governance rules in policies to document standards and procedures. Example: Data sharing agreement | None | None |
Governance rules | Governance rules describe how to apply a policy. Governance rules provide a natural-language description of the criteria that are used to determine whether data assets are compliant with business objectives. Governance rules are not enforced by IBM Knowledge Catalog. However, you can relate governance rules to enforceable rules, such as data protection rules and data quality rules. Example: Customer name must not be null. | None | None |
Data protection rules | Data protection rules define how to control access to data based on users and asset properties and assigned governance artifacts. Data protection rules define who can see what data. Within data protection rules, you can include classifications, data classes, business terms, or tags to identify the data to control. You specify whether to deny access to data or to mask sensitive data values. Data protection rules are automatically enforced in governed catalogs only. Data protection rules are not organized or controlled by categories. Example: Mask columns that are assigned the Passport Identifier business term. | None | None |
Data location rules (experimental) | Data location rules control access to data based on their physical and sovereign locations, on users and asset properties, and on assigned governance artifacts. Data location rules control who can see what data. Within data location rules, you can specify the direction of data movement, that is, whether the data is leaving from or coming to a physical or sovereign location. You can also include classifications, data classes, business terms, or tags to identify the data to control. You specify whether to allow access to data or to mask sensitive data values. Data location rules are automatically enforced in all governed catalogs. Data location rules are not organized or controlled by categories. Example: Mask columns that are assigned the Personal Identifiable Information business term in a data asset leaving Germany and accessed in other countries. | None | None |
Data quality SLA rules | Data quality SLA rules monitor the data quality of critical data elements for compliance with certain quality criteria and can trigger remediation workflows in case of violations. You select the data assets and columns that you want to monitor by name or by assigned business terms. SLA compliance and violations are reported on a data asset's Data quality page. Data quality SLA rules are not organized or controlled by categories. Example: Report a violation if the completeness dimension score for column ACCOUNT_ID in data asset BANK_ACCOUNT falls below 99% and trigger the default remediation workflow. | None | None |
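The table describes how data classes, business terms, and data protection rules build on one another: a data class matches column values, its related business term is assigned along with it, and a data protection rule masks columns that carry that term. The following Python sketch models that chain in memory to make the relationships concrete. It is a simplified illustration, not IBM Knowledge Catalog code: the class names, the regular expression, the 80% match threshold, and the fixed mask string are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical, simplified stand-ins for governance artifacts.
# None of these names come from the IBM Knowledge Catalog APIs.


@dataclass
class DataClass:
    name: str
    pattern: str            # matching criteria, expressed here as a regular expression
    related_term: str       # business term that is related to this data class
    threshold: float = 0.8  # assumed share of non-null values that must match

    def matches(self, values):
        """Return True if enough non-null values conform to the pattern."""
        non_null = [v for v in values if v is not None]
        if not non_null:
            return False
        hits = sum(1 for v in non_null if re.fullmatch(self.pattern, v))
        return hits / len(non_null) >= self.threshold


@dataclass
class MaskingRule:
    name: str
    business_term: str  # columns that are assigned this term are masked

    def apply(self, column_terms, row):
        """Return a copy of the row with values masked in governed columns."""
        return {
            col: "XXXXXXXX" if self.business_term in column_terms.get(col, set()) else val
            for col, val in row.items()
        }


def assign_terms(columns, data_classes):
    """Assign each column the business terms related to its matching data classes."""
    return {
        col: {dc.related_term for dc in data_classes if dc.matches(values)}
        for col, values in columns.items()
    }


# Sample column values, as profiling might see them (made-up data).
columns = {
    "PHONE": ["+1 512-555-0100", "030 5550 1234", None],
    "CITY": ["Berlin", "Austin", "Lyon"],
}
phone_class = DataClass("Phone number", r"\+?[0-9][0-9 \-]{6,14}", "Work phone number")
rule = MaskingRule("Mask work phone numbers", "Work phone number")

column_terms = assign_terms(columns, [phone_class])
masked_row = rule.apply(column_terms, {"PHONE": "+1 512-555-0100", "CITY": "Berlin"})

print(column_terms)
print(masked_row)
```

In this sketch, the PHONE column conforms to the data class, so it receives the Work phone number term and its value is masked, while the CITY column receives no term and is returned unchanged.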
Governance artifacts are scoped to IBM Knowledge Catalog catalogs in the same IBM Cloud account.
You must have the specific Cloud Pak for Data service permissions to work with governance artifacts. See Required permissions.
Some IBM Knowledge Catalog plans have limits on the number of governance artifacts of a specific type that you can create.
Learn more
- Planning to govern data
- Find and view governance artifacts
- Governance artifact properties and relationships
- Managing artifacts
- Workflow process for governance artifacts
- Knowledge Accelerators
- IBM Knowledge Catalog plans
- IBM Knowledge Catalog APIs