IBM Cloud Data Engine connection

Last updated: Nov 27, 2024

To access your data in IBM Cloud Data Engine, create a connection asset for it.

Important:

The IBM Cloud Data Engine connector is deprecated and will be discontinued in a future release. For more information, see Deprecation of Data Engine.

IBM Cloud Data Engine is a service on IBM Cloud that you use to build, manage, and consume data lakes and their table assets in IBM Cloud Object Storage (COS). IBM Cloud Data Engine provides functions to load, prepare, and query big data that is stored in various formats. It also includes a metastore with table definitions. IBM Cloud Data Engine was formerly named "IBM Cloud SQL Query."

Prerequisites

An IBM Cloud Data Engine Standard plan instance is required to create tables or views.
Before you can run SQL queries, you need one or more Cloud Object Storage buckets to hold the data to be analyzed and the query results. You have two choices for Cloud Object Storage:
- Use the Cloud Object Storage instance that you associated with your projects, deployment spaces, or catalogs in Cloud Pak for Data as a Service. For information, see Cloud Object Storage on Cloud Pak for Data as a Service.
- Provision a new instance of Cloud Object Storage. For instructions, see Provisioning storage.

Create a connection to IBM Cloud Data Engine

If you have set up an integrated cloud service, select the service instance to automatically fill in the fields in the connection form. Confirm that all the fields are complete.

To create the connection asset, you need these connection details:

Cloud Resource Name (CRN): The CRN of the IBM Cloud Data Engine instance. Go to the IBM Cloud Data Engine service instance in the resources list in your IBM Cloud dashboard and copy the value of the CRN from the deployment details.
Target Cloud Object Storage: A default location where IBM Cloud Data Engine stores query results. You can specify any Cloud Object Storage bucket that you have access to. You can also select the default Cloud Object Storage bucket that is created when you open the IBM Cloud Data Engine web console for the first time from the IBM Cloud dashboard. See the Target location field in the IBM Cloud Data Engine web console.
IBM Cloud API key: An API key for a user or service ID that has access to your IBM Cloud Data Engine and Cloud Object Storage services (for both the Cloud Object Storage data that you want to query and the default target Cloud Object Storage location).
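
Before you paste the three values into the connection form, a quick local sanity check can catch copy-and-paste mistakes. The following Python sketch is illustrative only and is not part of the connector: the names DataEngineConnectionDetails and check_details are hypothetical, the CRN check relies on the general IBM Cloud CRN layout (ten colon-separated segments that start with crn:v1), and the target location is assumed to be written as cos://<region or endpoint>/<bucket>/<optional prefix>.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataEngineConnectionDetails:
    """Hypothetical container for the three values the connection form asks for."""
    crn: str
    target_cos_url: str
    api_key: str


def check_details(details: DataEngineConnectionDetails) -> List[str]:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []

    # A CRN is colon-separated, starts with "crn:v1", and has 10 segments
    # (trailing segments may be empty, so the string often ends with "::").
    segments = details.crn.split(":")
    if len(segments) != 10 or segments[:2] != ["crn", "v1"]:
        problems.append("CRN does not look like a 10-segment 'crn:v1:...' value")

    # Assumption: the target is written as cos://<region-or-endpoint>/<bucket>/...
    prefix = "cos://"
    if details.target_cos_url.startswith(prefix):
        parts = details.target_cos_url[len(prefix):].split("/")
        if len(parts) < 2 or not parts[0] or not parts[1]:
            problems.append("Target location is missing an endpoint or a bucket")
    else:
        problems.append("Target location does not start with 'cos://'")

    if not details.api_key.strip():
        problems.append("API key is empty")

    return problems


if __name__ == "__main__":
    # All values below are placeholders, not real identifiers or secrets.
    example = DataEngineConnectionDetails(
        crn="crn:v1:bluemix:public:sql-query:us-south:a/0123456789abcdef:"
            "11111111-2222-3333-4444-555555555555::",
        target_cos_url="cos://us-south/my-results-bucket/results",
        api_key="placeholder-api-key",
    )
    print(check_details(example))
```

The check is purely syntactic. It cannot tell whether the instance exists or whether the API key has the required access.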

You can create a new API key for your own user:

In the IBM Cloud console, go to Manage > Access (IAM).
In the left navigation, select API keys.
Select Create an IBM Cloud API Key.

Credentials

IBM Cloud Data Engine uses SSO credentials that are specified as a single API key. The API key authenticates a user or service ID.
The API key must have the following permissions:

Manage permission for the IBM Cloud Data Engine instance
Read access to all Cloud Object Storage locations that you want to read from
Write access to the default Cloud Object Storage target location
Write access to the IBM Cloud Data Engine instance
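
To confirm that an API key is at least valid before you use it in a connection, you can exchange it for an IAM access token. The following Python sketch shows one way to build that request with the standard library only. The function names build_token_request and fetch_access_token are hypothetical, and the sketch assumes the public IBM Cloud IAM token endpoint. The connector performs its own authentication, so this is a troubleshooting aid and not a description of how the connector works.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public IBM Cloud IAM token endpoint.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the form-encoded POST that trades an API key for a token."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def fetch_access_token(api_key: str) -> str:
    """Send the request and return the access token; raises HTTPError if the key is rejected."""
    with urllib.request.urlopen(build_token_request(api_key)) as response:
        return json.load(response)["access_token"]
```

A successful token exchange shows only that the key is valid. It does not prove that the user or service ID has the access to IBM Cloud Data Engine and Cloud Object Storage that is listed above, so check those assignments separately in Manage > Access (IAM).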

Choose the method for creating a connection based on where you are in the platform

In a project: Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
In a catalog: Click Add to catalog > Connection. See Adding a connection asset to a catalog.
In a deployment space: Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
In the Platform assets catalog: Click New connection. See Adding platform connections.

Next step: Add data assets from the connection

Where you can use this connection

You can use IBM Cloud Data Engine connections in the following workspaces and tools:

Projects

Data Refinery (watsonx.ai Studio or IBM Knowledge Catalog)
DataStage (DataStage service). See Connecting to a data source in DataStage.
Metadata enrichment (IBM Knowledge Catalog)
Metadata import (IBM Knowledge Catalog)
Notebooks (watsonx.ai Studio). See the Notebook tutorial for using the IBM Cloud Data Engine (SQL Query) API to run SQL statements.
SPSS Modeler (watsonx.ai Studio)

Catalogs

Platform assets catalog
Other catalogs (IBM Knowledge Catalog)

Note:
Preview, profile, and masking are not certified for this connection in IBM Knowledge Catalog.

Restrictions

You can use this connection only for source data. You cannot write to data or export data with this connection.

IBM Cloud Data Engine setup

To set up IBM Cloud Data Engine on IBM Cloud Object Storage, see Getting started with IBM Cloud Data Engine.

Supported encryption

By default, all objects that are stored in IBM Cloud Object Storage are encrypted by using randomly generated keys and an all-or-nothing-transform (AONT). For details, see Encrypting your data. Additionally, you can use managed keys to encrypt the SQL query texts and error messages that are stored in the job information. See Encrypting SQL queries with Key Protect.

Running SQL statements

Video: Learn how to get started with running a basic query.
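
As a text companion to the video, the following Python sketch assembles a basic query of the kind that IBM Cloud Data Engine runs against objects in Cloud Object Storage. It only builds the SQL text and does not submit it. The helper name build_basic_query is hypothetical, the object location in the usage example is a placeholder, and the list of formats is an assumption based on the formats that the service is generally documented to read.

```python
# Assumption: input formats that can be named in a STORED AS clause.
SUPPORTED_FORMATS = {"CSV", "JSON", "PARQUET", "ORC", "AVRO"}


def build_basic_query(source_url: str, source_format: str, limit: int) -> str:
    """Return a SELECT statement that reads the first rows of one object in Cloud Object Storage."""
    fmt = source_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {source_format}")
    if not source_url.startswith("cos://"):
        raise ValueError("The source must be a cos:// location")
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # The data location takes the place of a table name, followed by its format.
    return f"SELECT * FROM {source_url} STORED AS {fmt} LIMIT {limit}"


if __name__ == "__main__":
    # Placeholder location; replace it with an object that your API key can read.
    print(build_basic_query("cos://us-south/my-bucket/employees.parquet", "parquet", 10))
```

Because the connection is for source data only, the sketch builds a read-only SELECT statement. Results are stored in the target Cloud Object Storage location that is configured for the connection.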

Learn more

Parent topic: Supported connections
