Deploying models converted to ONNX format

Last updated: Feb 21, 2025

You can deploy and inference machine learning models from PyTorch or TensorFlow that are saved in different formats and converted to the Open Neural Network Exchange (ONNX) format. ONNX is an open-source format for representing deep learning models. Developers can use the ONNX format to train their models in one framework, such as PyTorch or TensorFlow, and then export it to run in another environment with different performance characteristics. The ONNX format provides a powerful solution for converting a machine learning model to ONNX and perform inferencing by using the ONNX runtime.

Benefits of converting models to `ONNX` runtime

Converting a model to ONNX runtime offers several benefits, especially in the context of machine learning and deep learning applications. Some of the advantages of converting models to ONNX runtime are as follows:

Cross-platform compatibility: ONNX provides a standard format for representing machine learning models, which makes it easier to deploy models across different frameworks such as PyTorch or Tensorflow. You can train models in one frameworks and deploy them in another framework that supports ONNX runtime.
Improved performance: ONNX runtime optimizes models for inferencing by applying various hardware and software-specific optimizaitons, such as graph optimizations. Also, it supports execution on diverse hardware, such as CPUs and GPUs, ensuring efficient utilization of resources.
Interoperability: ONNX provides a way to train models, such as PyTorch, TensorFlow, and scikit-learn in one framework and then export them to run in another environment, which streamlines the workflows. It breaks down the barriers between different deep learning frameworks, allowing developers to leverage the strengths of different libraries without getting locked into a single ecosystem.

Supported frameworks for conversion

You can convert machine learning models that use the following frameworks to ONNX format:

PyTorch
TensorFlow
CatBoost
LightGBM
XGBoost
Scikit-learn

Converting `PyTorch` models to `ONNX` format

Follow this process to convert your trained model in PyTorch to the ONNX format:

Import libraries: Start by importing the essential libraries, such as onnxruntime for running the model, torch for PyTorch functionalities, and other libraries required for your application.
Create or download PyTorch model: You can create a PyTorch model by using your own data set or use models provided by external open source model repositories like Hugging Face.
Convert PyTorch model to ONNX format: To convert the PyTorch model to ONNX format:

a. Prepare the model: Ensure that your PyTorch model is in evaluation mode by using model.eval() function. You may need a dummy input tensor to match the shape of the model.

b. Export the model: Use the torch.onnx.export function to convert the model to ONNX format.
Verify the conversion: After converting the model, verify that the model is functioning as expected by using the onnx library.

Converting `TensorFlow` models to `ONNX` format

Follow this process to convert your model TensorFlow to the ONNX format:

Import libraries: Start by importing the essential libraries, such as tf2onnx to facilitate conversion of TensorFlow models to ONNX, and other libraries required for your application.
Download TensorFlow model: You must download the externally created TensorFlow model and the data that is used for training the model.
Convert TensorFlow model to ONNX format: Use the tf2onnx.convert command to convert your TensorFlow model that is created in the SavedModel format to ONNX format. If you want to convert a TensorFlow Lite model, use the --tflite flag instead of the --saved-model flag.

Note:

Keras models and tf functions can be directly converted within Python by using the tf2onnx.convert.from_keras or tf2onnx.convert.from_function functions.

Verify the conversion: After converting the model, verify that the model is functioning as expected by using the onnx library.

Converting `CatBoost` models to ONNX format

Follow this process to convert your trained model in CatBoost to the ONNX format:

Import libraries: Start by importing the essential libraries, such as onnxruntime for running the model, catboost for CatBoost functionalities, and other libraries required for your application.
Create or download CatBoost model: You can create a CatBoost model by using your own data set or use models provided by external open source model repositories such as Hugging Face.

Convert CatBoost model to ONNX format: To convert the CatBoost model to ONNX format:

a. Load the CatBoost model: You can load the CatBoost model by using libraries such as pickle:

catboost_model = pickle.load(file)

b. Export the model: Use the catboost_model.save_model function with format parameter set to onnx to convert the model to ONNX format.

catboost_model.save_model(
    onnx_model_name,
    format="onnx",
    export_parameters={
        'onnx_domain': 'ai.catboost',
        'onnx_model_version': 1,
        'onnx_doc_string': 'test model for Regressor',
        'onnx_graph_name': 'CatBoostModel_for_Regression'
    }
)

Verify the conversion: After converting the model, verify that the model is functioning as expected by using the onnx library.

Converting `LightGBM` models to ONNX format

Follow this process to convert your trained model in LightGBM to the ONNX format:

Import libraries: Start by importing the essential libraries, such as onnxruntime for running the model, lightgbm for LightGBM functionalities, and onnxmltools for conversion and other libraries required for your application.
Create or download LightGBM model: You can create a LightGBM model by using your own data set or use models provided by external open source model repositories like Hugging Face.

Convert LightGBM model to ONNX format: To convert the LightGBM model to ONNX format:

a. Load the LightGBM model: You can load the LightGBM model by using libraries such as pickle:

lgbm_model = pickle.load(file)

b. Export the model: Use the convert_lightgbm function to convert the model to ONNX format.

from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType
from onnxmltools.utils import save_model      

initial_types = [("float_input", FloatTensorType([None, X.shape[1]]))]
onnx_model = convert_lightgbm(model=lgbm_model, initial_types=initial_types)
onnx_model_filename = "lgbm_model.onnx"
save_model(onnx_model, onnx_model_filename)

Tip:

For troubleshooting problems with converting LightGBM models to ONNX format, see Troubleshooting watsonx.ai Runtime.

Verify the conversion: After converting the model, verify that the model is functioning as expected by using the onnx library.

Converting `XGBoost` models to `ONNX` format

Follow this process to convert your trained model in XGBoost to the ONNX format:

Import libraries: Start by importing the essential libraries, such as onnxruntime for running the model, xgboost for XGBoost functionalities, and other libraries required for your application.
Create or download XGBoost model: You can create a XGBoost model by using your own data set or use models provided by external open source model repositories like Hugging Face.

Convert XGBoost model to ONNX format: To convert the XGBoost model to ONNX format:

a. Load the XGboost model: You can load the XGBoost model by using libraries such as pickle:

xgboost_model = pickle.load(file)

b. Export the model: Use the convert_xgboost function to convert the model to ONNX format.

from onnxmltools import convert_xgboost
from onnxconverter_common.data_types import FloatTensorType
from onnxmltools.utils import save_model

initial_types = [("float_input", FloatTensorType([None, X.shape[1]]))]
onnx_model = convert_xgboost(xgboost_model, initial_types=initial_types)
onnx_model_filename = "xgboost_onnx_model.onnx"
save_model(onnx_model, onnx_model_filename)

Verify the conversion: After converting the model, verify that the model is functioning as expected by using the onnx library.

Converting `scikit-learn` models to `ONNX` format

Follow this process to convert your trained model in scikit-learn to the ONNX format:

Import libraries: Start by importing the essential libraries, such as onnxruntime for running the model, sklearn for scikit-learn functionalities, skl2onnx for conversion and other libraries required for your application.
Create or download scikit-learn model: You can create a scikit-learn model by using your own data set or use models provided by external open source model repositories like Hugging Face.
Convert scikit-learn model to ONNX format: To convert the scikit-learn model to ONNX format:

a. Load the scikit-learn model: You can load the scikit-learn model by using libraries such as pickle:
```
sklearn_model = pickle.load(file)
```
b. Export the model: Use the to_onnx function to convert the model to ONNX format.
```
from skl2onnx import to_onnx
onnx_model = to_onnx(sklearn_model, X, target_opset=19)

with open("sklearn_model.onnx", "wb") as f:
   f.write(onnx_model.SerializeToString())
```
Verify the conversion: After converting the model, verify that the model is functioning as expected by using the onnx library.

Additional considerations

Here are some additional considerations for converting your models to ONNX format:

Dynamic axes: Dynamic axes can be used by a model to handle variable input shapes, such as dynamic batch sizes or sequence lengths, which is useful for models deployed in application where the input dimensions may vary. Use dynamic axes if your model handles variable input sizes, such as dynamic batch size or sequence length.

Dynamic axes also reduce memory overhead as they can be used with multiple inputs and outputs to adapt dynamically without re-exporting the model. You can specify the dynamic axes during model export in PyTorch or TensorFlow.
Opset version: The opset version in ONNX determines the set of operations and their specifications that are supported by the model. It is a critical factor during model conversion and deployment.

Different ONNX runtimes and frameworks support specific opset versions. Older opset versions may lack features or optimizations present in newer versions. Incompatibility between a model's opset version and the ONNX runtime can cause errors during inferencing. You must ensure that the ONNX opset version that you choose is supported by your target runtime.

Deploying models converted to `ONNX` format

Use the onnxruntime_opset_19 software specification to deploy your machine learning model converted to ONNX format. You must specify the software specification and model type when you store the model to the watsonx.ai Runtime repository. For more information, see Supported software specifications.

To deploy models converted to ONNX format from the user interface, follow these steps:

In your deployment space, go to the Assets tab.
Find your model in the asset list, click the Menu icon Menu icon, and select Deploy.
Select the deployment type for your model. Choose between online and batch deployment options.
Enter a name for your deployment and optionally enter a serving name, description, and tags.
Note:
- Use the Serving name field to specify a name for your deployment instead of deployment ID.
- The serving name must be unique within the namespace.
- The serving name must contain only these characters: [a-z,0-9,_] and must be a maximum 36 characters long.
- In workflows where your custom foundation model is used periodically, consider assigning your model the same serving name each time you deploy it. This way, after you delete and then re-deploy the model, you can keep using the same endpoint in your code.
Select a hardware specification for your model.
Select a configuration and a software specification for your model.
Click Create.

Testing the model

Follow these steps to test your deployed models converted to ONNX format:

In your deployment space, open the Deployments tab and click the deployment name.
Click the Test tab to input prompt text and get a response from the deployed asset.
Enter test data in one of the following formats, depending on the type of asset that you deployed:
- Text: Enter text input data to generate a block of text as output.
- JSON: Enter JSON input data to generate output in JSON format.
Click Generate to get results that are based on your prompt.

Sample notebooks

The following sample notebooks demonstrate how to deploy machine learning models converted from PyTorch or TensorFlow to the ONNX format by using the Python client library:

Sample notebooks
Notebook	Framework	Description
Convert ONNX neural network from fixed axes to dynamic axes and use it with ibm-watsonx-ai	ONNX	Set up the environment Create and export basic ONNX model Convert model from fixed axes to dynamic axes Persist converted ONNX model Deploy and score ONNX model Clean up Summary and next steps
Use ONNX model converted from PyTorch with ibm-watsonx-ai	PyTorch, ONNX	Create PyTorch model with dataset. Convert PyTorch model to ONNX format Persist converted model in Watson Machine Learning repository. Deploy model for online scoring using client library. Score sample records using client library.
Use ONNX model converted from TensorFlow to recognize hand-written digits with ibm-watsonx-ai	Tensorflow, ONNX	Download an externally trained TensorFlow model with dataset. Convert TensorFlow model to ONNX format Persist converted model in Watson Machine Learning repository. Deploy model for online scoring using client library. Score sample records using client library.
Use ONNX model converted from CatBoost	CatBoost, ONNX	Trained CatBoost model. Convert CatBoost model to ONNX format. Persist converted model in Watson Machine Learning repository. Deploy model for online scoring using client library. Score sample records using client library.
Use ONNX model converted from LightGBM	LightGBM, ONNX	Train a LightGBM model Convert the LightGBM model to ONNX format Persist the converted model in the watsonx.ai Runtime repository Deploy the model for online scoring using client library Score sample records using the client library
Use ONNX model converted from XGBoost with ibm-watsonx-ai	XGBoost, ONNX	Train an XGBoost model. Convert the XGBoost model to the ONNX format. Persist the converted model in the watsonx.ai repository. Deploy model for online scoring using the APIClient instance. Score sample records using the APIClient instance.
Use ONNX model converted from scikit-learn with ibm-watsonx-ai	Scikit-learn, ONNX	Train a scikit-learn model Convert the native scikit-learn model to ONNX format Perform conversion for a custom scikit-learn model wrapped in a sklearn pipeline Persist the converted model in the watsonx.ai Runtime repository Deploy the model for online scoring using client library Score sample records using the client library