Reference: Algorithm Options
Options you can specify for each machine learning algorithm.
The following tables list all applicable parameters for each algorithm, along with their default values.
Decision Stump
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| num_decimal_places | int | 2 | Set the number of decimal places. |
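
As a rough illustration of how the defaults in a table like this might be combined with user-supplied values, the sketch below keeps the Decision Stump defaults in a plain dictionary. The `resolve_options` helper is hypothetical and not part of any product API; only the option names and default values come from the table above.

```python
# Defaults copied from the Decision Stump table above.
DECISION_STUMP_DEFAULTS = {
    "batch_size": 100,
    "do_not_check_capabilities": False,
    "num_decimal_places": 2,
}


def resolve_options(defaults, overrides):
    """Hypothetical helper: merge user overrides into the defaults.

    Rejects unknown option names and values whose type differs from
    the type of the corresponding default.
    """
    resolved = dict(defaults)
    for name, value in overrides.items():
        if name not in defaults:
            raise KeyError(f"unknown option: {name}")
        if type(value) is not type(defaults[name]):
            raise TypeError(f"{name} expects {type(defaults[name]).__name__}")
        resolved[name] = value
    return resolved


# Override one option; the others keep their documented defaults.
options = resolve_options(DECISION_STUMP_DEFAULTS, {"batch_size": 500})
print(options)
```

The same pattern applies to the other tables in this reference by swapping in their option names and defaults.
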
Decision Tree
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| binary_splits | Boolean | false | Enable binary splits. |
| collapse_tree | Boolean | true | Set to allow collapsing the tree. |
| confidence_factor | float | 0.25 | Set the confidence threshold for pruning. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| do_not_mask_split_point_actual_value | Boolean | false | Set whether the split point actual value is to be masked. |
| min_num_obj | int | 2 | Set the minimum number of instances per leaf. |
| num_decimal_places | int | 2 | Set the number of decimal places. |
| num_folds | int | 3 | Set the number of folds for reduced error pruning. One fold is used as pruning set. |
| reduced_error_pruning | Boolean | false | Enable reduced error pruning. This is false if the unpruned parameter is set to true. |
| save_instance_data | Boolean | false | Set whether the instance data is to be saved. If set to true, it does not clean up after the tree has been built. |
| seed | int | 1 | Set the seed for random data shuffling. |
| subtree_raising | Boolean | true | Enable sub-tree raising. |
| unpruned | Boolean | false | Enable use of an unpruned tree. This is false if the reduced_error_pruning parameter is set to true. |
| use_laplace | Boolean | false | Allow Laplace smoothing for predicted probabilities. |
| use_md_correction | Boolean | true | Allow MDL correction for info gain on numeric attributes. |
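
The table describes `unpruned` and `reduced_error_pruning` as mutually exclusive, and says that one of the `num_folds` folds is used as the pruning set. The sketch below shows one way a caller might check these options before submitting them. Both functions are hypothetical helpers written for illustration, and rejecting the conflicting combination outright is an assumption about how a caller might want to handle it.

```python
def check_pruning_options(options):
    """Hypothetical check: reject options that request both an unpruned
    tree and reduced error pruning, since the table treats them as
    mutually exclusive."""
    unpruned = options.get("unpruned", False)
    reduced_error_pruning = options.get("reduced_error_pruning", False)
    if unpruned and reduced_error_pruning:
        raise ValueError(
            "unpruned and reduced_error_pruning cannot both be true"
        )
    return options


def pruning_fraction(num_folds=3):
    """Fraction of the training data held out for pruning when one of
    num_folds folds is used as the pruning set."""
    if num_folds < 2:
        raise ValueError("num_folds must be at least 2")
    return 1.0 / num_folds


check_pruning_options({"reduced_error_pruning": True, "num_folds": 3})
print(pruning_fraction(3))  # one third of the data with the default of 3 folds
```
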
K-Nearest Neighbors
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| knn | int | 1 | Set the number of neighbors the learner will use. |
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| cross_validate | Boolean | false | Set whether hold-one-out cross-validation is used to select the best k value. |
| distance_weighting | enum [weight_none, weight_inverse, weight_similarity] | weight_none | Set the distance weighting method used. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| mean_squared | Boolean | false | Set whether the mean squared error is used rather than mean absolute error when doing cross-validation. |
| nearest_neighbours_search_algorithm | enum [LinearNNSearch, BallTree, CoverTree, KDTree, FilteredNeighbourSearch] | LinearNNSearch | Set the search algorithm used to find the nearest neighbor(s). |
| num_decimal_places | int | 2 | Set the number of decimal places. |
| window_size | int | 0 (no window size) | Set the maximum number of instances allowed in the training pool. |
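
To make the effect of `distance_weighting` more concrete, the sketch below performs a weighted vote among the nearest neighbors. The table only names the enum values, so the formulas used here (weight 1 for `weight_none`, 1/distance for `weight_inverse`, and 1 - distance for `weight_similarity`) are assumptions based on the usual meaning of these terms, and the function names are hypothetical.

```python
from collections import defaultdict


def neighbor_weight(distance, method="weight_none"):
    """Weight of one neighbor's vote (assumed meaning of each enum value)."""
    if method == "weight_none":
        return 1.0
    if method == "weight_inverse":
        # Small constant avoids division by zero for identical points.
        return 1.0 / (distance + 1e-9)
    if method == "weight_similarity":
        # Assumes distances are normalized to the range [0, 1].
        return 1.0 - distance
    raise ValueError(f"unknown distance weighting: {method}")


def vote(neighbors, method="weight_none"):
    """Pick the label with the largest total weight.

    neighbors: list of (distance, label) pairs for the knn nearest instances.
    """
    totals = defaultdict(float)
    for distance, label in neighbors:
        totals[label] += neighbor_weight(distance, method)
    return max(totals, key=totals.get)


# Three neighbors (knn = 3): one very close "a", two distant "b".
neighbors = [(0.1, "a"), (0.8, "b"), (0.9, "b")]
for method in ("weight_none", "weight_inverse", "weight_similarity"):
    print(method, vote(neighbors, method))
```

With no weighting the two distant neighbors outvote the close one; with either weighting scheme the close neighbor dominates.
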
Logistic Regression
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| matx_its | int | -1 | Set the maximum number of iterations. |
| num_decimal_places | int | 4 | Set the number of decimal places. |
| ridge | double | 1.00E-08 | Set the ridge in the log-likelihood. |
| use_conjugate_gradient_descent | Boolean | false | Set whether conjugate gradient descent is used. |
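
The `ridge` option can be pictured as a penalty term added to the objective that logistic regression minimizes. The sketch below assumes a binary model and the common squared-weight (L2) form of the penalty; the table does not state the exact formula, so treat this as an illustration of the idea only. The function name and argument layout are hypothetical.

```python
import math


def penalized_nll(weights, rows, labels, ridge=1e-8):
    """Negative log-likelihood of a binary logistic model plus an
    assumed ridge penalty of ridge * sum(w^2)."""
    nll = 0.0
    for x, y in zip(rows, labels):
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
        nll -= y * math.log(p) + (1 - y) * math.log(1 - p)
    penalty = ridge * sum(w * w for w in weights)
    return nll + penalty


rows = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]]
labels = [1, 0, 1]
weights = [1.0, -1.0]

# A larger ridge value raises the objective for the same weights,
# which pushes the optimizer toward smaller weights.
print(penalized_nll(weights, rows, labels, ridge=1e-8))
print(penalized_nll(weights, rows, labels, ridge=0.5))
```
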
Multilayer Perceptron
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| learning_rate | double | 0.3 | Set the learning rate for the backward propagation. The value should be between 0 and 1. |
| momentum | double | 0.2 | Set the momentum rate for the backward propagation algorithm. The value should be between 0 and 1. |
| num_decimal_places | int | 2 | Set the number of decimal places. |
| seed | int | 0 | Set the value used to seed the random number generator. |
| validation_set_size | int | 0 | Set the percentage size of the validation set used to terminate training. The value should be between 0 and 1. |
| validation_threshold | int | 20 | Set the number of consecutive increases of error allowed for validation testing before training terminates. |
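
The sketch below illustrates how `learning_rate`, `momentum`, and `validation_threshold` typically interact during training. It uses the textbook momentum update and a simple count of consecutive increases in validation error. Both function names are hypothetical, and the exact stopping comparison (stopping once the count reaches the threshold) is an assumption, since the table does not specify it.

```python
def momentum_step(weight, gradient, previous_delta,
                  learning_rate=0.3, momentum=0.2):
    """One textbook backpropagation update with momentum.

    Returns the new weight and the applied change, which is fed back
    in as previous_delta on the next step.
    """
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta


def stopping_epoch(validation_errors, validation_threshold=20):
    """Return the 1-based epoch at which training would stop, or None
    if the error never rises validation_threshold times in a row."""
    consecutive_increases = 0
    previous = None
    for epoch, error in enumerate(validation_errors, start=1):
        if previous is not None and error > previous:
            consecutive_increases += 1
        else:
            consecutive_increases = 0
        if consecutive_increases >= validation_threshold:
            return epoch
        previous = error
    return None


new_weight, delta = momentum_step(weight=1.0, gradient=0.5, previous_delta=0.1)
print(new_weight, delta)

errors = [0.50, 0.40, 0.45, 0.50, 0.55, 0.30]
print(stopping_epoch(errors, validation_threshold=3))
```
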
Naive Bayes
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| num_decimal_places | int | 2 | Set the number of decimal places. |
| use_kernel_estimator | Boolean | false | Use a kernel density estimator rather than a normal distribution for numeric attributes. This is false if use_supervised_discretization is set to true. |
| use_supervised_discretization | Boolean | false | Use supervised discretization to process numeric attributes. This is false if use_kernel_estimator is set to true. |
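
To show why `use_kernel_estimator` can matter, the sketch below compares a single normal distribution with a Gaussian kernel density estimate on a numeric attribute whose values fall into two clusters. This is a generic illustration, not the product's implementation; the function names and the fixed `bandwidth` value are assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def normal_estimate(x, values):
    """Density under one normal distribution fitted to all values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return gaussian_pdf(x, mean, math.sqrt(variance))


def kernel_estimate(x, values, bandwidth):
    """Density as the average of one Gaussian kernel per observed value."""
    return sum(gaussian_pdf(x, v, bandwidth) for v in values) / len(values)


# Attribute values for one class, clustered around 1 and around 5.
values = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]

# At x = 3.0, between the clusters, the single normal distribution
# reports a fairly high density while the kernel estimate is near zero.
print(normal_estimate(3.0, values))
print(kernel_estimate(3.0, values, bandwidth=0.3))
```
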
Random Forests
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| bag_size_percent | int | 100 | Specify the size of each bag, as a percentage of the training set size. |
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| break_ties_randomly | Boolean | false | Set whether to break ties randomly when several attributes look equally good. |
| calc_out_of_bag | Boolean | false | Set whether to calculate the out-of-bag error. |
| compute_attribute_importance | Boolean | false | Set whether to compute and output attribute importance (mean impurity decrease method). |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| max_depth | int | 0 | Set the maximum depth of the tree, 0 for unlimited. |
| num_decimal_places | int | 2 | Set the number of decimal places. |
| num_execution_slots | int | 1 | Set the number of execution slots. |
| num_features | int | 0 | Set the number of attributes to randomly investigate. |
| num_iterations | int | 100 | Set the number of iterations. |
| seed | int | 0 | Set the value used to seed the random number generator. |
| store_out_of_bag_predictions | Boolean | false | Set whether to store out-of-bag predictions in internal evaluation object. |
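
The following sketch illustrates what `bag_size_percent`, `seed`, and the out-of-bag options refer to: each tree is trained on a random sample (a bag) drawn with replacement, and the instances left out of a bag can be used to estimate error. The function names are hypothetical and the sampling shown is generic bootstrap sampling, not necessarily the exact procedure used by the product.

```python
import random


def draw_bag(num_instances, bag_size_percent=100, seed=0):
    """Draw instance indices with replacement for one tree's bag."""
    rng = random.Random(seed)  # same seed gives the same bag
    bag_size = num_instances * bag_size_percent // 100
    return [rng.randrange(num_instances) for _ in range(bag_size)]


def out_of_bag(num_instances, bag):
    """Indices that never appear in the bag."""
    return sorted(set(range(num_instances)) - set(bag))


bag = draw_bag(num_instances=200, bag_size_percent=50, seed=0)
oob = out_of_bag(200, bag)
print(len(bag), len(oob))
```

Because sampling is done with replacement, some instances appear more than once in a bag and others not at all, which is what makes an out-of-bag error estimate possible.
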
Support Vector Machines
| Option Name | Data Type | Default Value | Description |
|---|---|---|---|
| batch_size | int | 100 | Set the preferred batch size for batch prediction. |
| c | double | 1 | Set the complexity constant C. |
| build_calibration_models | Boolean | false | Set whether to fit calibration models to SVM outputs. |
| checks_turned_off | Boolean | false | Disable all checks. |
| do_not_check_capabilities | Boolean | false | Set to true to skip the capability check. |
| epsilon | double | 1.00E-12 | Set the epsilon for round-off error. |
| filter_type | enum [filter_none, filter_normalize, filter_standardize] | filter_normalize | Set how the training data will be transformed. |
| kernel | enum [PolyKernel, NormalizedPolyKernel, PrecomputedKernelMatrixKernel, Puk, RBFKernel, StringKernel] | PolyKernel | Specify the Kernel to use. |
| num_decimal_places | int | 2 | Set the number of decimal places. |
| num_folds | int | -1 | Set the number of folds for the internal cross-validation. |
| random_seed | int | 1 | Set the random number seed. |
| tolerance_parameter | double | 1.00E-03 | Set the tolerance parameter. |
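
The `filter_type` values correspond to two common ways of rescaling numeric attributes before training. The sketch below shows the usual definitions: normalization rescales values to the range 0 to 1, and standardization rescales them to zero mean and unit standard deviation. The `apply_filter` function is a hypothetical helper, and the use of the sample standard deviation is an assumption.

```python
import statistics


def apply_filter(values, filter_type="filter_normalize"):
    """Rescale one numeric attribute according to filter_type."""
    if filter_type == "filter_none":
        return list(values)
    if filter_type == "filter_normalize":
        low, high = min(values), max(values)
        if high == low:
            return [0.0 for _ in values]  # constant attribute
        return [(v - low) / (high - low) for v in values]
    if filter_type == "filter_standardize":
        mean = statistics.mean(values)
        std = statistics.stdev(values)  # sample standard deviation (assumed)
        return [(v - mean) / std for v in values]
    raise ValueError(f"unknown filter type: {filter_type}")


values = [2.0, 4.0, 6.0]
print(apply_filter(values, "filter_normalize"))
print(apply_filter(values, "filter_standardize"))
```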