cartnode properties | IBM Cloud Pak for Data as a Service

cartnode properties

The Classification and Regression (C&R) Tree node generates a decision tree that you can use to predict or classify future observations. The method uses recursive partitioning to split the training records into segments, minimizing the impurity at each step. A node in the tree is considered "pure" if 100% of its cases fall into a single category of the target field. Target and input fields can be numeric ranges or categorical (nominal, ordinal, or flags). All splits are binary, producing exactly two subgroups.
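The idea of choosing a binary split by minimizing impurity can be illustrated with a minimal sketch. It is not the node's actual implementation: the helper names `gini` and `best_binary_split` are hypothetical, the Gini measure is used only because it is one of the `impurity_measure` values, and the six records are invented toy data that reuse the `Age` and `Drug` field names from the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure node, larger for more mixed nodes."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / float(n)) ** 2 for c in counts.values())


def best_binary_split(values, labels):
    """Return (threshold, weighted impurity) for the best split `value <= threshold`."""
    n = len(values)
    best = (None, gini(labels))  # start from the unsplit node
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / float(n)
        if score < best[1]:
            best = (t, score)
    return best


# Invented toy data, for illustration only.
ages = [23, 31, 35, 47, 52, 60]
drugs = ["drugY", "drugY", "drugY", "drugX", "drugX", "drugX"]

threshold, impurity = best_binary_split(ages, drugs)
print(threshold, impurity)
```

In this toy case the split `Age <= 35` leaves both child segments pure, so the weighted impurity drops from 0.5 to 0.0. A real tree repeats the same search recursively inside each child segment until a stopping rule applies.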

Example

stream = modeler.script.stream()  # the stream that this script runs in
node = stream.createAt("cart", "My node", 200, 100)
# "Fields" tab
node.setPropertyValue("custom_fields", True)
node.setPropertyValue("target", "Drug")
node.setPropertyValue("inputs", ["Age", "BP", "Cholesterol"])
# "Build Options" tab, "Objective" panel
node.setPropertyValue("model_output_type", "InteractiveBuilder")
node.setPropertyValue("use_tree_directives", True)
node.setPropertyValue("tree_directives", """Grow Node Index 0 Children 1 2
Grow Node Index 2 Children 3 4""")
# "Build Options" tab, "Basics" panel
node.setPropertyValue("prune_tree", False)
node.setPropertyValue("use_std_err_rule", True)
node.setPropertyValue("std_err_multiplier", 3.0)
node.setPropertyValue("max_surrogates", 7)
# "Build Options" tab, "Stopping Rules" panel
node.setPropertyValue("use_percentage", True)
node.setPropertyValue("min_parent_records_pc", 5)
node.setPropertyValue("min_child_records_pc", 3)
# "Build Options" tab, "Advanced" panel
node.setPropertyValue("min_impurity", 0.0003)
node.setPropertyValue("impurity_measure", "Twoing")
# "Model Options" tab
node.setPropertyValue("use_model_name", True)
node.setPropertyValue("model_name", "Cart_Drug")

Table 1. cartnode properties
`cartnode` Properties	Values	Property description
`target`	field	C&R Tree models require a single target and one or more input fields. A frequency field can also be specified. See the topic Common modeling node properties for more information.
`continue_training_existing_model`	flag
`objective`	`Standard` `Boosting` `Bagging` `psm`	`psm` is used for very large datasets, and requires a Server connection.
`model_output_type`	`Single` `InteractiveBuilder`
`use_tree_directives`	flag
`tree_directives`	string	Specify directives for growing the tree. Directives can be wrapped in triple quotes to avoid escaping newlines or quotes. Note that directives may be highly sensitive to minor changes in data or modeling options and may not generalize to other datasets.
`use_max_depth`	`Default` `Custom`
`max_depth`	integer	Maximum tree depth, from 0 to 1000. Used only if `use_max_depth = Custom`.
`prune_tree`	flag	Prune tree to avoid overfitting.
`use_std_err`	flag	Use maximum difference in risk (in Standard Errors).
`std_err_multiplier`	number	Maximum difference in risk, expressed as a multiple of the standard error.
`max_surrogates`	number	Maximum number of surrogates.
`use_percentage`	flag
`min_parent_records_pc`	number
`min_child_records_pc`	number
`min_parent_records_abs`	number
`min_child_records_abs`	number
`use_costs`	flag
`costs`	structured	Structured property.
`priors`	`Data` `Equal` `Custom`
`custom_priors`	structured	Structured property.
`adjust_priors`	flag
`trails`	number	Number of component models for boosting or bagging.
`set_ensemble_method`	`Voting` `HighestProbability` `HighestMeanProbability`	Default combining rule for categorical targets.
`range_ensemble_method`	`Mean` `Median`	Default combining rule for continuous targets.
`large_boost`	flag	Apply boosting to very large datasets.
`min_impurity`	number
`impurity_measure`	`Gini` `Twoing` `Ordered`
`train_pct`	number	Percentage of records held out as the overfit prevention set.
`set_random_seed`	flag	Replicate results by using a fixed random seed.
`seed`	number
`calculate_variable_importance`	flag
`calculate_raw_propensities`	flag
`calculate_adjusted_propensities`	flag
`adjusted_propensity_partition`	`Test` `Validation`
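Several properties in Table 1 are usually set together, for example when switching the node to a boosted ensemble. The following sketch groups such settings in one list and applies them in order. The names `RecordingNode`, `apply_properties`, and `BOOSTING_SETTINGS` are hypothetical, and the values are illustrative rather than recommended. `RecordingNode` is only a stand-in so that the sketch runs outside a stream. The sketch also assumes that setting `use_percentage` to `False` makes the `_abs` stopping-rule properties apply, which the property names suggest but the table does not state.

```python
class RecordingNode(object):
    """Hypothetical stand-in for a modeling node; it only records property values."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


def apply_properties(node, settings):
    """Set each (name, value) pair on the node, in the order given."""
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


# Illustrative values only; property names are taken from Table 1.
BOOSTING_SETTINGS = [
    ("objective", "Boosting"),
    ("trails", 10),  # spelled as listed in Table 1
    ("set_ensemble_method", "Voting"),
    ("use_percentage", False),  # assumed to switch to absolute record counts
    ("min_parent_records_abs", 40),
    ("min_child_records_abs", 20),
    ("set_random_seed", True),
    ("seed", 12345),
]

node = apply_properties(RecordingNode(), BOOSTING_SETTINGS)
print(sorted(node.properties.items()))
```

Inside a script, the stand-in would be replaced by a node created with `stream.createAt("cart", ...)`, as in the example. Keeping the settings in an ordered list rather than a dictionary keeps the order of the `setPropertyValue` calls explicit.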