Managing hardware specifications for deployments
When you deploy certain assets in watsonx.ai Runtime, you can choose the type, size, and power of the hardware configuration that matches your computing needs.
Creating hardware specifications for deployments
You can create hardware specifications for your deployments in the following ways:
- Python client library: Use the hardware_specifications.store function from the Python client library. For more information, see Python client library reference.
- Data and AI Common Core API: Use POST /v2/hardware_specifications from the Environments list in the Data and AI Common Core API to create a hardware specification. For more information, see Environments API reference.
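As a rough illustration of what a creation request might carry, the following sketch builds a hardware specification body as a plain Python dictionary. The helper name build_hardware_spec_payload and the field layout (name, description, nodes.cpu.units, nodes.mem.size) are assumptions made for this example and are not confirmed by this page. Check the Environments API reference or the Python client library reference for the exact schema before you send a request.

```python
import json


def build_hardware_spec_payload(name, cpu_units, memory_gib, description=""):
    """Build a request body for a custom hardware specification.

    The field names used here are an assumed layout for illustration only.
    """
    if cpu_units <= 0 or memory_gib <= 0:
        raise ValueError("cpu_units and memory_gib must be positive")
    return {
        "name": name,
        "description": description,
        "nodes": {
            "cpu": {"units": str(cpu_units)},      # assumed: vCPU count as a string
            "mem": {"size": f"{memory_gib}Gi"},    # assumed: memory size with a Gi suffix
        },
    }


# Example: a custom specification equivalent to 2 vCPU and 8 GB RAM.
payload = build_hardware_spec_payload("custom-2x8", 2, 8, "2 vCPU and 8 GB RAM")
print(json.dumps(payload, indent=2))
```

A dictionary like this could then be passed to hardware_specifications.store or sent as the body of POST /v2/hardware_specifications, provided that the real schema matches the assumed field names.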
Deployment types that require hardware specifications
You can select a hardware specification for all batch deployment types. For online deployments, you can select a specific hardware specification if you are deploying one of the following:
- Python functions
- TensorFlow models
- Models with custom software specifications
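The rule above can be sketched as a small check. This is only an illustration: the function name and the asset kind labels are hypothetical, and the sketch reads the online list as exhaustive, which is an assumption rather than something this page states outright.

```python
# Hypothetical labels for the online asset kinds listed above.
ONLINE_SELECTABLE = {
    "python_function",
    "tensorflow_model",
    "custom_software_spec_model",
}


def can_select_hardware_spec(deployment_type: str, asset_kind: str) -> bool:
    """Return True if a hardware specification can be selected for the deployment."""
    if deployment_type == "batch":
        # All batch deployment types allow a hardware specification.
        return True
    if deployment_type == "online":
        # Online deployments allow it only for the listed asset kinds (assumed exhaustive).
        return asset_kind in ONLINE_SELECTABLE
    raise ValueError(f"unknown deployment type: {deployment_type!r}")


print(can_select_hardware_spec("batch", "r_script"))
print(can_select_hardware_spec("online", "tensorflow_model"))
```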
Hardware configurations available for deploying assets
- XS: 1x4 = 1 vCPU and 4 GB RAM
- S: 2x8 = 2 vCPU and 8 GB RAM
- M: 4x16 = 4 vCPU and 16 GB RAM
- L: 8x32 = 8 vCPU and 32 GB RAM
- XL: 16x64 = 16 vCPU and 64 GB RAM
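To make the size table easier to work with in scripts, the following sketch encodes the five configurations and picks the smallest one that covers a given requirement. The names HardwareSize, HARDWARE_SIZES, and smallest_size_for are hypothetical helpers written for this example. They are not part of the Python client library.

```python
from typing import NamedTuple, Optional


class HardwareSize(NamedTuple):
    vcpu: int
    ram_gb: int


# Sizes as listed above, ordered from smallest to largest.
HARDWARE_SIZES = {
    "XS": HardwareSize(1, 4),
    "S": HardwareSize(2, 8),
    "M": HardwareSize(4, 16),
    "L": HardwareSize(8, 32),
    "XL": HardwareSize(16, 64),
}


def smallest_size_for(vcpu_needed: int, ram_gb_needed: int) -> Optional[str]:
    """Return the name of the smallest size that meets both requirements, or None."""
    # Dictionaries preserve insertion order, so this scans smallest first.
    for name, size in HARDWARE_SIZES.items():
        if size.vcpu >= vcpu_needed and size.ram_gb >= ram_gb_needed:
            return name
    return None


print(smallest_size_for(3, 10))  # M
```

Note that not every size is available for every asset type, so a result from this helper still needs to be checked against the restrictions described on this page.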
You can use the XS configuration to deploy:
- Python functions
- Python scripts
- R scripts
- Models based on custom libraries and custom images
For Decision Optimization deployments, you can use these hardware specifications:
- S
- M
- L
- XL
Hardware specifications for GPU inferencing
Beginning with Cloud Pak for Data version 4.8.5, you can select GPU hardware specifications for CUDA software specifications from the user interface on the x86 platform when you create a deployment. For more information, see Customizing a runtime to use a MIG-enabled profile.
Use the following predefined hardware specifications for GPU inferencing: