autoclusternode properties

The Auto Cluster node estimates and compares clustering models, which identify groups of records that have similar characteristics. The node works in the same manner as other automated modeling nodes, allowing you to experiment with multiple combinations of options in a single modeling pass. Models can be compared using basic measures that filter and rank the cluster models by usefulness, including a measure based on the importance of particular fields.

Example

node = stream.create("autocluster", "My node")
node.setPropertyValue("ranking_measure", "Silhouette")
node.setPropertyValue("ranking_dataset", "Training")
node.setPropertyValue("enable_silhouette_limit", True)
node.setPropertyValue("silhouette_limit", 5)
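
When several discard limits are set together, it can be convenient to list the name/value pairs once and apply them in a loop. The following is a minimal sketch, not part of the product API: `apply_properties` and `StandInNode` are hypothetical names introduced here for illustration. `StandInNode` only records the values it receives so that the snippet can run outside Modeler; the property names and values come from the example above and Table 1.

```python
class StandInNode(object):
    """Hypothetical stand-in for a Modeler node; it only records values."""

    def __init__(self):
        self.values = {}

    def setPropertyValue(self, name, value):
        self.values[name] = value


def apply_properties(node, settings):
    """Call setPropertyValue once per (name, value) pair, in order."""
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


# Inside Modeler, replace StandInNode() with
# stream.create("autocluster", "My node") as in the example above.
node = apply_properties(StandInNode(), [
    ("ranking_measure", "Silhouette"),
    ("ranking_dataset", "Training"),
    ("enable_silhouette_limit", True),
    ("silhouette_limit", 5),
    ("enable_smallest_cluster_limit", True),
    ("smallest_cluster_units", "Counts"),
    ("smallest_cluster_limit_count", 10),
])
```

A list of pairs is used rather than a dictionary so that each enabling flag is set before the limit it controls; whether the node requires that order is an assumption, not something this page states.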

Table 1. autoclusternode properties
`autoclusternode` Properties	Values	Property description
`evaluation`	field	Note: Auto Cluster node only. Identifies the field for which an importance value will be calculated. Alternatively, can be used to identify how well the cluster differentiates the value of this field and, therefore, how well the model will predict this field.
`ranking_measure`	`Silhouette` `Num_clusters` `Size_smallest_cluster` `Size_largest_cluster` `Smallest_to_largest` `Importance`
`ranking_dataset`	`Training` `Test`
`summary_limit`	integer	Number of models to list in the report. Specify an integer between 1 and 100.
`enable_silhouette_limit`	flag
`silhouette_limit`	integer	Integer between 0 and 100.
`enable_number_less_limit`	flag
`number_less_limit`	number	Real number between 0.0 and 1.0.
`enable_number_greater_limit`	flag
`number_greater_limit`	number	Integer greater than 0.
`enable_smallest_cluster_limit`	flag
`smallest_cluster_units`	`Percentage` `Counts`
`smallest_cluster_limit_percentage`	number
`smallest_cluster_limit_count`	integer	Integer greater than 0.
`enable_largest_cluster_limit`	flag
`largest_cluster_units`	`Percentage` `Counts`
`largest_cluster_limit_percentage`	number
`largest_cluster_limit_count`	integer
`enable_smallest_largest_limit`	flag
`smallest_largest_limit`	number
`enable_importance_limit`	flag
`importance_limit_condition`	`Greater_than` `Less_than`
`importance_limit_greater_than`	number	Integer between 0 and 100.
`importance_limit_less_than`	number	Integer between 0 and 100.
`<algorithm>`	flag	Enables or disables the use of a specific algorithm.
`<algorithm>.<property>`	string	Sets a property value for a specific algorithm. See Setting algorithm properties for more information.
`number_of_models`	integer
`enable_model_build_time_limit`	boolean	(K-Means, Kohonen, TwoStep, SVM, KNN, Bayes Net and Decision List models only.) Sets a maximum time limit for any one model. For example, if a particular model requires an unexpectedly long time to train because of some complex interaction, you probably don't want it to hold up your entire modeling run.
`model_build_time_limit`	integer	Time spent on model build.
`enable_stop_after_time_limit`	boolean	(Neural Network, K-Means, Kohonen, TwoStep, SVM, KNN, Bayes Net and C&R Tree models only.) Stops a run after a specified number of hours. All models generated up to that point will be included in the model nugget, but no further models will be produced.
`stop_after_time_limit`	double	Run time limit (hours).
`stop_if_valid_model`	boolean	Stops a run when a model passes all criteria specified under the Discard settings.
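
Several properties in Table 1 document a numeric range. A script could check values against those ranges before passing them to the node. The sketch below is illustrative only: `DOCUMENTED_RANGES` and `out_of_range` are hypothetical names, the ranges are copied from the table, and treating "between" as inclusive of both endpoints is an assumption.

```python
# Ranges copied from the property descriptions in Table 1.
# Assumption: "between a and b" includes both endpoints.
DOCUMENTED_RANGES = {
    "summary_limit": (1, 100),
    "silhouette_limit": (0, 100),
    "number_less_limit": (0.0, 1.0),
    "importance_limit_greater_than": (0, 100),
    "importance_limit_less_than": (0, 100),
}


def out_of_range(settings):
    """Return the names in settings whose values fall outside the ranges."""
    bad = []
    for name in sorted(settings):
        if name in DOCUMENTED_RANGES:
            low, high = DOCUMENTED_RANGES[name]
            if not (low <= settings[name] <= high):
                bad.append(name)
    return bad


# silhouette_limit is documented as 0 to 100, so 150 is reported.
print(out_of_range({"summary_limit": 20, "silhouette_limit": 150}))
```

Properties without a documented range, such as `ranking_measure`, are ignored by this check rather than rejected.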