Last updated: Jan 17, 2024
With the Extension Export node, you can run R or Python for Spark scripts to export data.
Python for Spark example
import modeler.api
# Create an Extension Export node in the current stream and select Python for Spark.
stream = modeler.script.stream()
node = stream.create("extension_export", "extension_export")
node.setPropertyValue("syntax_type", "Python")
# Python for Spark script that the node runs.
python_script = """import spss.pyspark.runtime
from pyspark.sql import SQLContext
from pyspark.sql.types import *
cxt = spss.pyspark.runtime.getContext()
df = cxt.getSparkInputData()
print(df.dtypes[:])
# Keep only the Age and Drug fields, then write them out as JSON.
_newDF = df.select("Age", "Drug")
print(_newDF.dtypes[:])
_newDF.write.save("/opt/IBM/SPSS/ModelerServer/Cloud/demos/Drug.json", format="json")
"""
node.setPropertyValue("python_syntax", python_script)
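
When the exported fields or the target path vary between streams, the script text can be generated instead of edited by hand. The following is a minimal sketch under that assumption. `build_export_script` is a hypothetical helper, not part of the Modeler API; it only assembles the string that would be passed to `python_syntax`, and it assumes plain ASCII field names and paths.

```python
import json


def build_export_script(fields, path, fmt="json"):
    """Return Python for Spark source that writes the given fields to path.

    Hypothetical helper for illustration; not part of the Modeler API.
    """
    if not fields:
        raise ValueError("at least one field name is required")
    # json.dumps yields double-quoted, escaped literals; for plain ASCII
    # values these are also valid Python string literals.
    field_args = ", ".join(json.dumps(name) for name in fields)
    lines = [
        "import spss.pyspark.runtime",
        "cxt = spss.pyspark.runtime.getContext()",
        "df = cxt.getSparkInputData()",
        "df.select(%s).write.save(%s, format=%s)"
        % (field_args, json.dumps(path), json.dumps(fmt)),
    ]
    return "\n".join(lines) + "\n"


script = build_export_script(
    ["Age", "Drug"],
    "/opt/IBM/SPSS/ModelerServer/Cloud/demos/Drug.json",
)
```

Inside a Modeler script, the result would then be passed to the node with `node.setPropertyValue("python_syntax", script)`.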
R example
node.setPropertyValue("syntax_type", "R")
node.setPropertyValue("r_syntax", """write.csv(modelerData, "/opt/IBM/SPSS/ModelerServer/Cloud/demos/export.csv")""")
extensionexportnode properties | Data type | Property description |
---|---|---|
syntax_type | R Python | Specify which script runs: R or Python (R is the default). |
r_syntax | string | The R scripting syntax to run. |
python_syntax | string | The Python scripting syntax to run. |
convert_flags | | Option to convert flag fields. |
convert_missing | flag | Option to convert missing values to the R NA value. |
convert_datetime | flag | Option to convert variables with date or datetime formats to R date/time formats. |
convert_datetime_class | | Options to specify to what format variables with date or datetime formats are converted. |
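
Several of the properties in the table can be set together from a dictionary. The sketch below relies only on what the table states (`syntax_type` is R or Python with R as the default, and `convert_missing` and `convert_datetime` are flags) and assumes that flag values are passed as Python booleans. `apply_export_properties` and `StubNode` are hypothetical names; the stub stands in for a real node so that the snippet runs outside Modeler.

```python
ALLOWED_SYNTAX_TYPES = ("R", "Python")
FLAG_PROPERTIES = ("convert_missing", "convert_datetime")


def apply_export_properties(node, properties):
    """Validate and set properties on any object exposing setPropertyValue.

    Hypothetical helper for illustration; not part of the Modeler API.
    """
    syntax_type = properties.get("syntax_type", "R")  # R is the documented default
    if syntax_type not in ALLOWED_SYNTAX_TYPES:
        raise ValueError("syntax_type must be 'R' or 'Python'")
    for name in FLAG_PROPERTIES:
        if name in properties and not isinstance(properties[name], bool):
            raise TypeError("%s must be True or False" % name)
    for name, value in properties.items():
        node.setPropertyValue(name, value)


class StubNode(object):
    """Stand-in for an Extension Export node; it only records values."""

    def __init__(self):
        self.values = {}

    def setPropertyValue(self, name, value):
        self.values[name] = value


node = StubNode()
apply_export_properties(node, {
    "syntax_type": "R",
    "r_syntax": 'write.csv(modelerData, "/opt/IBM/SPSS/ModelerServer/Cloud/demos/export.csv")',
    "convert_missing": True,
})
```

In a real stream, `node` would instead be the object returned by `stream.create("extension_export", "extension_export")`.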