Reference: Algorithm Options
Options you can specify for each machine learning algorithm.
The following tables list the applicable parameters for each algorithm, along with their default values.
Decision Stump
Option Name | Data Type | Default Value | Description |
---|---|---|---|
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
num_decimal_places | int | 2 | Set the number of decimal places. |
Decision Tree
Option Name | Data Type | Default Value | Description |
---|---|---|---|
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
binary_splits | Boolean | false | Enable binary splits. |
collapse_tree | Boolean | true | Set to allow collapsing the tree. |
confidence_factor | float | 0.25 | Set the confidence threshold for pruning. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
do_not_mask_split_point_actual_value | Boolean | false | Set whether the split point actual value is to be masked. |
min_num_obj | int | 2 | Set the minimum number of instances per leaf. |
num_decimal_places | int | 2 | Set the number of decimal places. |
num_folds | int | 3 | Set the number of folds for reduced error pruning. One fold is used as pruning set. |
reduced_error_pruning | Boolean | false | Enable reduced error pruning. This is false if the unpruned parameter is set to true. |
save_instance_data | Boolean | false | Set whether the instance data is to be saved. If set to true, it does not clean up after the tree has been built. |
seed | int | 1 | Set the seed for random data shuffling. |
subtree_raising | Boolean | true | Enable sub-tree raising. |
unpruned | Boolean | false | Enable using an unpruned tree. This is false if the reduced_error_pruning parameter is set to true. |
use_laplace | Boolean | false | Allow Laplace smoothing for predicted probabilities. |
use_md_correction | Boolean | true | Allow MDL correction for info gain on numeric attributes. |
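
The descriptions of `reduced_error_pruning` and `unpruned` imply that the two options cannot both be in effect. A minimal sketch of that rule follows, assuming the options are collected in a plain Python dict. The helper `check_decision_tree_options` and the dict layout are hypothetical and not part of any product API; only the option names and defaults come from the table above.

```python
# Hypothetical helper: option names and defaults come from the table above,
# but the dict layout and the function itself are illustrative only.
DECISION_TREE_DEFAULTS = {
    "reduced_error_pruning": False,
    "unpruned": False,
    "num_folds": 3,
    "confidence_factor": 0.25,
}


def check_decision_tree_options(overrides):
    """Merge overrides into the defaults and reject conflicting pruning flags."""
    options = {**DECISION_TREE_DEFAULTS, **overrides}
    if options["reduced_error_pruning"] and options["unpruned"]:
        raise ValueError("reduced_error_pruning and unpruned cannot both be true")
    return options
```

The table states that each flag is treated as false when the other is true. This sketch raises an error instead, so that a conflicting configuration is visible to the caller rather than silently adjusted.
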
K-Nearest Neighbors
Option Name | Data Type | Default Value | Description |
---|---|---|---|
knn | int | 1 | Set the number of neighbors the learner will use. |
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
cross_validate | Boolean | false | Set whether hold-one-out cross-validation is used to select the best k-value. |
distance_weighting | enum [weight_none, weight_inverse, weight_similarity] | weight_none | Set the distance weighting method used. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
mean_squared | Boolean | false | Set whether the mean squared error is used rather than mean absolute error when doing cross-validation. |
nearest_neighbours_search_algorithm | enum [LinearNNSearch, BallTree, CoverTree, KDTress, FilteredNeighbourSearch] | LinearNNSearch | Set the search algorithm used to find the nearest neighbors. |
num_decimal_places | int | 2 | Set the number of decimal places. |
window_size | int | 0 (no window size) | Set the maximum number of instances allowed in the training pool. |
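
The table does not define the three `distance_weighting` values. One common reading, assumed here and not confirmed by this reference, is that `weight_inverse` weights a neighbor by 1/distance and `weight_similarity` weights it by 1 - distance, with distances normalized to the range 0 to 1. The function `neighbor_weight` and its `eps` parameter are hypothetical.

```python
def neighbor_weight(distance, method="weight_none", eps=1e-9):
    """Return the vote weight of one neighbor (assumed semantics, see the note above)."""
    if method == "weight_none":
        # Every neighbor counts equally.
        return 1.0
    if method == "weight_inverse":
        # eps avoids division by zero when the neighbor is an exact match.
        return 1.0 / (distance + eps)
    if method == "weight_similarity":
        # Assumes distance has been normalized to the range [0, 1].
        return 1.0 - distance
    raise ValueError(f"unknown distance_weighting: {method}")
```

With the default `weight_none`, every one of the `knn` neighbors contributes the same vote, so the weighting only matters when `knn` is greater than 1.
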
Logistic Regression
Option Name | Data Type | Default Value | Description |
---|---|---|---|
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
matx_its | int | -1 | Set the maximum number of iterations. |
num_decimal_places | int | 4 | Set the number of decimal places. |
ridge | double | 1.00E-08 | Set the ridge in the log-likelihood. |
use_conjugate_gradient_descent | Boolean | false | Set whether conjugate gradient descent is used. |
Multilayer Perceptron
Option Name | Data Type | Default Value | Description |
---|---|---|---|
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
learning_rate | double | 0.3 | Set the learning rate for the backward propagation algorithm. The value should be between 0 and 1. |
momentum | double | 0.2 | Set the momentum rate for the backward propagation algorithm. The value should be between 0 and 1. |
num_decimal_places | int | 2 | Set the number of decimal places. |
seed | int | 0 | Set the value used to seed the random number generator. |
validation_set_size | int | 0 | Set the percentage size of the validation set used to terminate training. The value should be between 0 and 1. |
validation_threshold | int | 20 | Set the number of consecutive increases of error allowed for validation testing before training terminates. |
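
The `validation_threshold` description can be read as an early-stopping rule: training ends once the validation error has increased that many times in a row. A minimal sketch of this reading follows. The function `stopping_epoch` is hypothetical, and comparing each epoch against the immediately preceding one is an assumption that this reference does not spell out.

```python
def stopping_epoch(validation_errors, validation_threshold=20):
    """Return the index of the epoch at which training would stop, or None."""
    consecutive_increases = 0
    for epoch in range(1, len(validation_errors)):
        if validation_errors[epoch] > validation_errors[epoch - 1]:
            consecutive_increases += 1
        else:
            # Any epoch without an increase resets the count.
            consecutive_increases = 0
        if consecutive_increases >= validation_threshold:
            return epoch
    return None
```

With the default `validation_set_size` of 0 there is no validation set, so under this reading the rule would never trigger.
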
Naive Bayes
Option Name | Data Type | Default Value | Description |
---|---|---|---|
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
num_decimal_places | int | 2 | Set the number of decimal places. |
use_kernel_estimator | Boolean | false | Use kernel density estimator rather than normal distribution for numeric attributes. This is false if use_supervised_discretization is set to true. |
use_supervised_discretization | Boolean | false | Use supervised discretization to process numeric attributes. This is false if use_kernel_estimator is set to true. |
Random Forests
Option Name | Data Type | Default Value | Description |
---|---|---|---|
bag_size_percent | int | 100 | Specify the size of each bag, as a percentage of the training set size. |
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
break_ties_randomly | Boolean | false | Set whether to break ties randomly when several attributes look equally good. |
calc_out_of_bag | Boolean | false | Set whether to calculate the out-of-bag error. |
compute_attribute_importance | Boolean | false | Set whether to compute and output attribute importance (mean impurity decrease method). |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
max_depth | int | 0 | Set the maximum depth of the tree, 0 for unlimited. |
num_decimal_places | int | 2 | Set the number of decimal places. |
num_execution_slots | int | 1 | Set the number of execution slots. |
num_features | int | 0 | Set the number of attributes to randomly investigate. |
num_iterations | int | 100 | Set the number of iterations. |
seed | int | 0 | Set the value used to seed the random number generator. |
store_out_of_bag_predictions | Boolean | false | Set whether to store out-of-bag predictions in internal evaluation object. |
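
To illustrate how `bag_size_percent` and `seed` interact, the sketch below draws the instance indices for one bag. It assumes standard bagging, that is, sampling with replacement, which this reference does not state explicitly. The function `draw_bag` is hypothetical.

```python
import random


def draw_bag(num_instances, bag_size_percent=100, seed=0):
    """Draw instance indices for one bag, sampling with replacement (assumed)."""
    rng = random.Random(seed)
    # The bag size is a percentage of the training set size.
    bag_size = num_instances * bag_size_percent // 100
    return [rng.randrange(num_instances) for _ in range(bag_size)]
```

Reusing the same `seed` reproduces the same bag, which is what makes repeated runs comparable.
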
Support Vector Machines
Option Name | Data Type | Default Value | Description |
---|---|---|---|
batch_size | int | 100 | Set the preferred batch size for batch prediction. |
c | double | 1 | Set the complexity constant C. |
build_calibration_models | Boolean | false | Set whether to fit calibration models to SVM outputs. |
checks_turned_off | Boolean | false | Disable all checks. |
do_not_check_capabilities | Boolean | false | Set whether to skip the capability check. |
epsilon | double | 1.00E-12 | Set the epsilon for round-off error. |
filter_type | enum [filter_none, filter_normalize, filter_standardize] | filter_normalize | Set how the training data will be transformed. |
kernel | enum [PolyKernel, NormalizedPolyKernel, PrecomputedKernelMatrixKernel, Puk, RBFKernel, StringKernel] | PolyKernel | Specify the Kernel to use. |
num_decimal_places | int | 2 | Set the number of decimal places. |
num_folds | int | -1 | Set the number of folds for the internal cross-validation. |
random_seed | int | 1 | Set the random number seed. |
tolerance_parameter | double | 1.00E-03 | Set the tolerance parameter. |
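
The `filter_type` values are not defined in the table. The sketch below shows the usual meaning of the two transformations, assumed here rather than taken from this reference: normalization rescales a numeric attribute to the range 0 to 1, and standardization shifts it to zero mean and unit standard deviation. The function `apply_filter` is hypothetical, and the use of the population standard deviation is an arbitrary choice for the example.

```python
from statistics import mean, pstdev


def apply_filter(values, filter_type="filter_normalize"):
    """Transform one numeric attribute column (assumed semantics)."""
    if filter_type == "filter_none":
        return list(values)
    if filter_type == "filter_normalize":
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant column maps to 0.0 to avoid division by zero.
        return [0.0 if span == 0 else (v - lo) / span for v in values]
    if filter_type == "filter_standardize":
        mu, sigma = mean(values), pstdev(values)
        return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]
    raise ValueError(f"unknown filter_type: {filter_type}")
```

For example, `apply_filter([2.0, 4.0, 6.0])` rescales the column to `[0.0, 0.5, 1.0]`.
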