High availability and disaster recovery in DataStage

Last updated: Nov 07, 2024

IBM® DataStage® on Cloud is highly available within multiple IBM Cloud locations, such as Dallas, Washington, DC, and Frankfurt, Germany. However, recovering from potential disasters that affect an entire location requires planning and preparation.

You are responsible for understanding your configuration, customization, and usage of the service. You are also responsible for being ready to re-create an instance of the service in a new location and to restore your data in any location. See How do I ensure zero downtime? for more information.

High availability

IBM DataStage supports high availability with no single point of failure. The service achieves high availability automatically and transparently by using the multi-zone region (MZR) feature provided by IBM Cloud.

IBM Cloud enables multiple zones that do not share a single point of failure within a single location. It also provides automatic load balancing across the zones within a region.

Disaster recovery

Disaster recovery can become an issue if an IBM Cloud location experiences a significant failure that includes the potential loss of data. Because MZR is not available across locations, you must wait for IBM to bring a location back online if it becomes unavailable. If underlying data services are compromised by the failure, you must also wait for IBM to restore those data services.

If a catastrophic failure occurs, IBM might not be able to recover data from database backups. In this case, you need to restore your data to return your service instance to its most recent state. You can restore the data to the same or to a different location.

Your disaster recovery plan includes knowing, preserving, and being prepared to restore all data that is maintained on IBM Cloud. This stored data includes your DataStage flows, data assets in your projects, and connections to various data sources used by your DataStage flows.

Disaster recovery for DataStage flows

Review your DataStage flows and determine which data can be backed up, then back up that data. In the event of a disaster, restore the flows that can be restored from the backup, and re-create any flows and artifacts that cannot be restored. Compile the rebuilt flows and run them to re-create the job that is associated with each flow.

Backing up data

DataStage flows and their associated data assets are stored in projects. Back up your projects by following the instructions in Exporting a project. Verify that your export preferences include all of the artifacts that you need, such as DataStage flows and their associated data assets (for example, files and connections).
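
Once a project export has been downloaded, it can help to keep dated copies outside the affected location. The following Python sketch is one possible way to do that, not an IBM-provided tool. It assumes that the export ZIP file is already on the local file system. The function names `sha256_of` and `archive_export` and the backup directory layout are hypothetical and chosen only for illustration. The sketch copies the ZIP file to a backup directory under a timestamped name and writes a SHA-256 checksum file next to it.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_export(export_zip: Path, backup_dir: Path) -> Path:
    """Copy an exported project ZIP into backup_dir under a timestamped name.

    Hypothetical helper: the naming scheme is an assumption, not an IBM convention.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    # UTC timestamp keeps names sortable and unambiguous across locations.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{export_zip.stem}-{stamp}{export_zip.suffix}"
    shutil.copy2(export_zip, target)
    # Record the checksum so the copy can be verified before a later import.
    checksum_file = target.with_name(target.name + ".sha256")
    checksum_file.write_text(f"{sha256_of(target)}  {target.name}\n")
    return target
```

For example, `archive_export(Path("my-project.zip"), Path("backups"))` would place a timestamped copy and its `.sha256` file in a local `backups` directory. Where the backup directory should be located, such as storage in a different region, depends on your own disaster recovery plan.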

Recreating data

Import the project from a local ZIP file that was previously exported. For more information, see Importing a project.
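
Before you import a previously exported ZIP file, you might want to confirm that the local file is still readable. The following Python sketch uses only the standard `zipfile` module and makes no assumptions about the internal layout of a project export. The function name `check_export_zip` is hypothetical. The check confirms only that the file is an intact ZIP archive, not that it contains every artifact that you need.

```python
import zipfile
from pathlib import Path


def check_export_zip(path: Path) -> list[str]:
    """Return the entry names of a ZIP archive, or raise ValueError if it is damaged.

    Hypothetical helper: it validates the archive container only and does not
    interpret the contents of a project export.
    """
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a ZIP archive")
    with zipfile.ZipFile(path) as archive:
        # testzip() reads every entry and returns the first one whose CRC fails.
        bad_entry = archive.testzip()
        if bad_entry is not None:
            raise ValueError(f"corrupt entry in {path}: {bad_entry}")
        return archive.namelist()
```

Reviewing the returned entry names against your own record of what was exported is one way to notice a missing artifact before you start the import.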