Satellite locations overview

Cloud Pak for Data as a Service with IBM Cloud Satellite extends your managed data and AI workloads onto the compute infrastructure of third-party cloud data providers. Cloud Pak for Data as a Service with IBM Cloud Satellite eliminates the need to move or copy data from other public clouds to run notebooks. You can build machine learning models in the cloud and train these models in the same cloud region where your data is stored. By bringing AI workloads to the data, you can improve performance, satisfy data residency requirements, and lower data egress costs.

Cloud Pak for Data as a Service with IBM Cloud Satellite supports the Watson Studio runtime environments for Python and R notebooks and notebook jobs on prebuilt locations that are configured and managed by IBM. Environments for Satellite locations are available to users of paid Watson Studio Cloud plans in the Dallas region.

The advantages of using a Satellite location include:

  • Workloads run where your data is located, so you do not need to move your data to Cloud Pak for Data as a Service. This colocation reduces latency, especially for large data volumes, and helps to ensure data protection and confidentiality.
  • Few data egress charges are incurred because the data is not moved out of the cloud where it is stored. For an example of egress charges for AWS data, see Amazon S3 pricing.

Cloud Pak for Data as a Service provides a prebuilt Satellite location on the Amazon Web Services (AWS) us-east-1 Region. A prebuilt location consists of a dedicated Virtual Private Cloud (VPC) owned and managed by IBM in the AWS Cloud Region. All required runtime images are installed and maintained by IBM. You only need to have your data in AWS and configure an access point. The prebuilt Satellite location provides access to data stored in:

  • Amazon Simple Storage Service (S3) object storage
  • Other data sources on AWS

To access the data from a notebook, you configure an S3 access point on AWS and include the access point information in your notebook. See Accessing data in AWS through access points for access point examples.
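As a rough illustration of what "access point information in your notebook" can look like, the following sketch builds an S3 access point ARN and reads one object through it. It assumes the boto3 library and valid AWS credentials are available in the notebook runtime, which this page does not confirm. The function names, account ID, and access point name are hypothetical. The linked topic remains the authoritative source for the supported approach.

```python
def access_point_arn(region: str, account_id: str, name: str) -> str:
    """Build an S3 access point ARN from its parts (helper name is hypothetical)."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def read_object(arn: str, key: str) -> bytes:
    """Read one object through the access point.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported lazily so the ARN helper works without boto3

    s3 = boto3.client("s3")
    # boto3 accepts an access point ARN wherever a bucket name is expected.
    response = s3.get_object(Bucket=arn, Key=key)
    return response["Body"].read()
```

For example, `access_point_arn("us-east-1", "123456789012", "my-access-point")` produces the ARN that would be passed to `read_object` together with an object key.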

Watch this video for an overview of using Satellite locations to access data stored in Amazon Web Services through a Jupyter notebook.

This video provides a visual method as an alternative to following the written steps in this documentation.

Running workloads in Satellite locations

The following diagram illustrates where the workloads for a Jupyter notebook reside in a prebuilt Satellite location. The notebook code resides within Watson Studio on IBM Cloud, but the compute resources for the Watson Studio environment and the database reside in an AWS region. You get the full benefit of the prebuilt Satellite location when your AWS data resides in the same region as the location.

The Watson Studio runtimes reside in a Virtual Private Cloud (VPC) on AWS that is owned by IBM. A VPC created by a customer will remain as a separate, customer-owned VPC.

Figure: Watson Studio runtimes for a prebuilt location
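The same-region recommendation above can be checked before you run a workload. The sketch below compares a bucket's region with the region of the prebuilt location (us-east-1, as stated on this page). It assumes boto3 and AWS credentials with permission to call GetBucketLocation. The function names are hypothetical and are not part of any IBM tooling.

```python
SATELLITE_REGION = "us-east-1"  # region of the prebuilt location named on this page


def normalize_location(constraint):
    """Map the LocationConstraint from S3 GetBucketLocation to a region name."""
    # S3 reports buckets in us-east-1 with an empty (None) LocationConstraint.
    return constraint or "us-east-1"


def is_colocated(bucket_location, satellite_region=SATELLITE_REGION):
    """Return True when the bucket is in the same region as the Satellite location."""
    return normalize_location(bucket_location) == satellite_region


def bucket_location(bucket_name):
    """Look up a bucket's LocationConstraint (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so the pure helpers work without boto3

    s3 = boto3.client("s3")
    return s3.get_bucket_location(Bucket=bucket_name)["LocationConstraint"]
```

A call such as `is_colocated(bucket_location("my-bucket"))` would return `False` for a bucket in another region. In that case, data still crosses regions and the latency and egress benefits are reduced.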

Security considerations

Watson Studio access is controlled by the IBM Cloud account and the user’s project roles.

IBM strictly manages account and resource access to Satellite locations by using access control policies, procedures, and systems. These systems allow access only to individuals with explicit access permission.

IBM does not move data out of Satellite locations. Although the IBM control plane might reside elsewhere, the data plane (where data is processed) is located in the IBM-owned AWS account and data never leaves that location.

The customer is responsible for configuring and managing security for the customer-owned VPC that contains the database on AWS.

Prebuilt Satellite locations are GDPR and Privacy Shield compliant.

The IBM VPC for the Satellite location on AWS is not considered HIPAA ready.
