Migrating from project-lib for Python to ibm-watson-studio-lib
The ibm-watson-studio-lib library is the successor of the project-lib library. Although you can continue using the project-lib API in your notebooks, you should consider migrating existing notebooks to the ibm-watson-studio-lib library.
Advantages of using ibm-watson-studio-lib include:
- The asset browsing API provides read-only access to all types of assets, not only those explicitly supported by the library.
- ibm-watson-studio-lib uses a consistent API naming convention that structures available functions according to their area of application.
The following sections describe the changes you need to make in existing Python notebooks to start using the ibm-watson-studio-lib library.
Set up the library
You need to make the following changes in existing notebooks to start using ibm-watson-studio-lib:
In code using project-lib, change:
from project_lib import Project
project = Project("<ProjectId>","<ProjectToken>")
To the following using ibm-watson-studio-lib:
from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space({"token":"<ProjectToken>"})
Set up the library in Spark environments
You need to make the following changes in existing notebooks to start using ibm-watson-studio-lib in Spark environments.
In code using project-lib, change:
from project_lib import Project
project = Project(sc,"<ProjectId>","<ProjectToken>")
To the following using ibm-watson-studio-lib:
from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space({"token":"<ProjectToken>"})
wslib.spark.provide_spark_context(sc)
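The two setup variants above differ only in the extra Spark step, so they can be folded into one helper. The following is a minimal sketch, not part of either library: the function name connect and its access parameter are hypothetical. The access parameter exists only so that the helper can be exercised outside Watson Studio, where ibm_watson_studio_lib is not installed.

```python
def connect(token, spark_context=None, access=None):
    """Return a wslib handle and optionally register a SparkContext.

    `access` is a hypothetical injection point (not part of the library);
    when it is omitted, the real access_project_or_space is used.
    """
    if access is None:
        # Imported lazily so this module also loads outside Watson Studio.
        from ibm_watson_studio_lib import access_project_or_space
        access = access_project_or_space

    wslib = access({"token": token})

    # Spark environments need the SparkContext in a separate step.
    if spark_context is not None:
        wslib.spark.provide_spark_context(spark_context)
    return wslib
```

In a notebook, this would be called as connect("<ProjectToken>") or, in a Spark environment, as connect("<ProjectToken>", spark_context=sc).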
Library usage
The following sections describe the code changes that you need to make in your notebooks when migrating functions in project-lib to the corresponding functions in ibm-watson-studio-lib.
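Before editing a notebook by hand, it can help to list which project-lib calls it contains. The sketch below is a hypothetical helper, not a tool shipped with either library. It assumes the Project object is bound to a variable named project, as in the examples in this topic, and it suggests only the default replacement for each function. Calls that pass direct_storage=True or similar flags map to the wslib.storage functions instead and need manual review.

```python
import re

# Renames taken from this topic; illustrative, not exhaustive.
RENAMES = {
    "get_connections": "wslib.list_connections",
    "get_files": "wslib.list_stored_data",
    "get_name": "wslib.here.get_name",
    "get_description": "wslib.here.get_description",
    "get_storage_metadata": "wslib.here.get_storage",
    "get_file": "wslib.load_data",
    "save_data": "wslib.save_data",
    "get_connection": "wslib.get_connection",
    "get_connected_data": "wslib.get_connected_data",
    "get_assets": "wslib.assets.list_assets",
    "get_file_url": "wslib.spark.get_data_url",
}

# Matches calls such as `project.get_files(`; assumes the variable name `project`.
CALL = re.compile(r"\bproject\.(\w+)\s*\(")


def find_project_lib_calls(source):
    """Return (line_number, old_name, suggestion) for each project.<name>( call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL.finditer(line):
            name = match.group(1)
            suggestion = RENAMES.get(name, "no direct replacement listed")
            hits.append((lineno, name, suggestion))
    return hits
```

Calling find_project_lib_calls on the text of a notebook cell returns one tuple per call found, which can serve as a checklist for the sections that follow.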
Get project information
To fetch project-related information programmatically, you need to change the following functions:
List data connections
In code using project-lib, change:
project.get_connections()
To the following using ibm-watson-studio-lib:
assets = wslib.list_connections()
wslib.show(assets)
Alternatively, with ibm-watson-studio-lib, you can list connected data assets:
assets = wslib.list_connected_data()
wslib.show(assets)
List data files
This function returns the list of the data files in your project.
In code using project-lib, change:
project.get_files()
To the following using ibm-watson-studio-lib:
assets = wslib.list_stored_data()
wslib.show(assets)
Get name or description
In ibm-watson-studio-lib, you can retrieve metadata about the project, for example the name of a project or its description, via the entry point wslib.here.
In code using project-lib, change:
name = project.get_name()
desc = project.get_description()
To the following using ibm-watson-studio-lib:
name = wslib.here.get_name()
desc = wslib.here.get_description()
Get metadata
There is no direct replacement for the project-lib function get_metadata:
project.get_metadata()
The entry point wslib.here in ibm-watson-studio-lib exposes parts of this information. To see what project metadata information is available, use:
help(wslib.here.API)
For example:
- wslib.here.get_name(): Returns the project name
- wslib.here.get_description(): Returns the project description
- wslib.here.get_ID(): Returns the project ID
- wslib.here.get_storage(): Returns the storage metadata
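If existing code relied on get_metadata() returning everything in one object, the getters listed above can be gathered into a dictionary. This is a minimal sketch: the function name project_summary and the dictionary keys are hypothetical, and the result is not equivalent to the structure that project.get_metadata() returned.

```python
def project_summary(wslib):
    """Collect the project metadata exposed by wslib.here into one dict.

    The key names are chosen for illustration only.
    """
    here = wslib.here
    return {
        "name": here.get_name(),
        "description": here.get_description(),
        "id": here.get_ID(),
        "storage": here.get_storage(),
    }
```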
Get storage metadata
In code using project-lib, change:
project.get_storage_metadata()
To the following using ibm-watson-studio-lib:
wslib.here.get_storage()
Fetch data
To access data in a file, you need to change the following functions.
In code using project-lib, change:
buffer = project.get_file("MyAssetName.csv")
# or, without direct storage access:
buffer = project.get_file("MyAssetName.csv", direct_storage=False)
# or:
buffer = project.get_file("MyAssetName.csv", direct_os_retrieval=False)
To the following using ibm-watson-studio-lib:
buffer = wslib.load_data("MyAssetName.csv")
Additionally, ibm-watson-studio-lib offers a function to download a data asset and store it in the local file system:
info = wslib.download_file("MyAssetName.csv", "MyLocalFile.csv")
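A common next step after fetching a CSV asset is to parse the returned buffer. The sketch below assumes that wslib.load_data returns a binary file-like object such as io.BytesIO. Verify this against the library reference for your version. The helper name load_csv_rows is hypothetical.

```python
import csv
import io


def load_csv_rows(wslib, asset_name, encoding="utf-8"):
    """Load a CSV data asset and return its rows as a list of dicts.

    Assumes load_data returns a seekable binary buffer (for example BytesIO).
    """
    buffer = wslib.load_data(asset_name)
    buffer.seek(0)  # make sure reading starts at the beginning
    text = io.TextIOWrapper(buffer, encoding=encoding, newline="")
    return list(csv.DictReader(text))
```

In a notebook, rows = load_csv_rows(wslib, "MyAssetName.csv") would give one dictionary per data row, keyed by the column headers.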
Save data
To save data to a file, you need to change the following functions.
In code using project-lib, change (including all variations of direct_storage=False and set_project_asset=True):
project.save_data("NewAssetName.csv", data)
project.save_data("MyAssetName.csv", data, overwrite=True)
To the following using ibm-watson-studio-lib:
asset = wslib.save_data("NewAssetName.csv", data)
wslib.show(asset)
asset = wslib.save_data("MyAssetName.csv", data, overwrite=True)
wslib.show(asset)
Additionally, ibm-watson-studio-lib offers a function to upload a local file to the project storage and create a data asset:
asset = wslib.upload_file("MyLocalFile.csv", "MyAssetName.csv")
wslib.show(asset)
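The save examples above take a data argument without showing how it is built. The following sketch serializes rows to CSV and passes the result as UTF-8 bytes. That wslib.save_data accepts a bytes object is an assumption here, and the helper name save_rows_as_csv is hypothetical.

```python
import csv
import io


def save_rows_as_csv(wslib, asset_name, rows, fieldnames, overwrite=False):
    """Serialize rows (a list of dicts) to CSV and save them as a data asset.

    Assumes save_data accepts the payload as bytes.
    """
    text = io.StringIO(newline="")
    writer = csv.DictWriter(text, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    data = text.getvalue().encode("utf-8")

    # Pass overwrite only when requested, mirroring the two calls above.
    if overwrite:
        return wslib.save_data(asset_name, data, overwrite=True)
    return wslib.save_data(asset_name, data)
```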
Get connection information
To return the metadata associated with a connection, you need to change the following functions.
In code using project-lib, change:
connprops = project.get_connection(name="MyConnection")
To the following using ibm-watson-studio-lib:
connprops = wslib.get_connection("MyConnection")
Get connected data information
To return the metadata associated with a connected data asset, you need to change the following functions.
In code using project-lib, change:
dataprops = project.get_connected_data(name="MyConnectedData")
To the following using ibm-watson-studio-lib:
dataprops = wslib.get_connected_data("MyConnectedData")
Access asset by ID instead of name
You can return the metadata of a connection or connected data asset by accessing the asset by ID instead of by name.
In code using project-lib, change:
connprops = project.get_connection(id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
# or:
connprops = project.get_connection("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
# or:
dataprops = project.get_connected_data(id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
# or:
dataprops = project.get_connected_data("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
To the following using ibm-watson-studio-lib:
connprops = wslib.by_id.get_connection("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
dataprops = wslib.by_id.get_connected_data("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
In project-lib, it is not possible to access files (stored data assets) by ID. You can only do this by name. The ibm-watson-studio-lib library supports accessing files by ID. See Using ibm-watson-studio-lib.
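In project-lib, the same get_connection call accepted either a name or an ID as its first argument, whereas ibm-watson-studio-lib separates the two entry points. Code that passes a value of unknown kind can dispatch explicitly. The sketch below assumes that asset IDs have the 8-4-4-4-12 hexadecimal form suggested by the placeholder above. The names ASSET_ID and get_connection_by_ref are hypothetical.

```python
import re

# Assumed ID shape: 8-4-4-4-12 hexadecimal digits, as in the placeholder above.
ASSET_ID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def get_connection_by_ref(wslib, ref):
    """Look up a connection by ID if ref looks like an ID, else by name."""
    if ASSET_ID.fullmatch(ref):
        return wslib.by_id.get_connection(ref)
    return wslib.get_connection(ref)
```

A connection whose name happens to match the ID pattern would be looked up by ID, so explicit calls remain the safer choice when the kind of reference is known.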
Fetch assets by asset type
When you retrieve the list of all project assets, you can pass the optional parameter asset_type to the function get_assets, which allows you to filter assets by type. The accepted values for this parameter in project-lib are data_asset, connection, and asset.
In code using project-lib, change:
project.get_assets()
# Or, for a supported asset type:
project.get_assets("<asset_type>")
# Or:
project.get_assets(asset_type="<asset_type>")
To the following using ibm-watson-studio-lib:
assets = wslib.assets.list_assets("asset")
wslib.show(assets)
# Or, for a specific asset type:
assets = wslib.assets.list_assets("<asset_type>")
# Example, list all notebooks:
notebook_assets = wslib.assets.list_assets("notebook")
wslib.show(notebook_assets)
To list the available asset types, use:
assettypes = wslib.assets.list_asset_types()
wslib.show(assettypes)
Spark support
To work with Spark, you need to change the functions that enable Spark support and that retrieve the URL to a file.
Set up Spark support
To set up Spark support:
In code using project-lib, change:
# Provide SparkContext during setup
from project_lib import Project
project = Project(sc,"<ProjectId>","<ProjectToken>")
To the following using ibm-watson-studio-lib:
from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space({'token':'<ProjectToken>'})
# Provide SparkContext in a subsequent step
wslib.spark.provide_spark_context(sc)
Retrieve URL to access a file from Spark
To retrieve a URL to access a file referenced by an asset from Spark via Hadoop:
In code using project-lib, change:
url = project.get_file_url("MyAssetName.csv")
# or
url = project.get_file_url("MyAssetName.csv", direct_storage=False)
# or
url = project.get_file_url("MyAssetName.csv", direct_os_retrieval=False)
To the following using ibm-watson-studio-lib:
url = wslib.spark.get_data_url("MyAssetName.csv")
Get file URL for usage with Spark
Retrieve a URL to access a file referenced by an asset from Spark via Hadoop.
In code using project-lib, change:
project.get_file_url("MyFileName.csv", direct_storage=True)
# or
project.get_file_url("MyFileName.csv", direct_os_retrieval=True)
To the following using ibm-watson-studio-lib:
wslib.spark.storage.get_data_url("MyFileName.csv")
Access project storage directly
You can fetch data from the project storage or save data to the project storage without synchronising the project assets.
Fetch data
To fetch data from the project storage:
In code using project-lib, change:
project.get_file("MyFileName.csv", direct_storage=True)
# Or:
project.get_file("MyFileName.csv", direct_os_retrieval=True)
To the following using ibm-watson-studio-lib:
wslib.storage.fetch_data("MyFileName.csv")
Save data
To save data to a file in the project storage:
In code using project-lib, change:
# Save and do not create an asset in a project
project.save_data("NewFileName.csv", data, direct_storage=True)
# Or:
project.save_data("NewFileName.csv", data, set_project_asset=False)
To the following using ibm-watson-studio-lib:
wslib.storage.store_data("NewFileName.csv", data)
In code using project-lib, change:
# Save (and overwrite if file exists) and do not create an asset in the project
project.save_data("MyFileName.csv", data, direct_storage=True, overwrite=True)
# Or:
project.save_data("MyFileName.csv", data, set_project_asset=False, overwrite=True)
To the following using ibm-watson-studio-lib:
wslib.storage.store_data("MyFileName.csv", data, overwrite=True)
Additionally, ibm-watson-studio-lib provides a function to download a file from the project storage to the local file system:
wslib.storage.download_file("MyStorageFile.csv", "MyLocalFile.csv")
You can also register a file in the project storage as a data asset using:
wslib.storage.register_asset("MyStorageFile.csv", "MyAssetName.csv")
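Code that previously saved a file with set_project_asset=True in a single call can reproduce that behavior in two steps with the storage functions shown above. This is a sketch only: the helper name store_and_register is hypothetical, and the return value of register_asset is passed through unchanged because this topic does not describe it.

```python
def store_and_register(wslib, file_name, data, asset_name=None):
    """Write data to project storage, then register the file as a data asset.

    asset_name defaults to file_name; both calls appear in this topic.
    """
    wslib.storage.store_data(file_name, data, overwrite=True)
    return wslib.storage.register_asset(file_name, asset_name or file_name)
```

For most cases wslib.save_data does both steps in one call. The two-step form is mainly useful when the file already exists in storage or is written by another process.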
Learn more
To use the ibm-watson-studio-lib library for Python in notebooks, see ibm-watson-studio-lib for Python.