Cross Validator (Regression)
performs K-fold Cross Validation on a regression dataset.
Overview
Transform-type Snap
-
Does not support Ultra Pipelines
Cross Validator (Regression) performs K-fold Cross Validation on a regression dataset.
If you want to perform K-fold Cross Validation on a classification dataset, use the Cross Validator (Classification) Snap instead.

Prerequisites
- The data from upstream Snap must be in tabular format (no nested structure).
- This Snap automatically derives the schema (field names and types) from the first document. Therefore, the first document must not have any missing values.
Limitations and known issues
None.
Snap views
View | Description | Examples of upstream and downstream Snaps |
---|---|---|
Input | The regression dataset. | Any Snap that generates a dataset document.
Examples:
|
Output | Statistical information about the performance of the selected algorithm on the dataset. | CSV Formatter/JSON Formatter Snap and File Writer Snap can be used to write the output statistics to file. |
Error |
Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:
Learn more about Error handling in Pipelines. |
Snap settings
-
Suggestion icon (
): Indicates a list that is dynamically populated based on the configuration.
-
Expression icon (
): Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
-
Add icon (
): Indicates that you can add fields in the field set.
-
Remove icon (
): Indicates that you can remove fields from the field set.
Field / Field set | Type | Description |
---|---|---|
Label | String | Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if there are more than one of the same Snap in the pipeline. |
Label field | Required. The label or output field in the dataset that the model will be trained to predict.
This value must be numeric.
Example: |
|
Algorithm | Required. The regression algorithm that builds the model.
Valid values:
Default value: K-Nearest Neighbors |
|
Options | The parameters to be applied on the selected algorithm.
Each algorithm has a different set of parameters to be configured in this property.
If specifying multiple parameters, separate them with a comma ",". If blank, the default values are applied for all the parameters. Valid values: Refer to Options for Algorithms. Default value: None Examples:
|
|
Fold | integer | Required. The number of folds.
Valid values: 2 through the maximum integer Default value: 10 |
Use random seed | checkbox | If selected, the value of Random seed is applied to the randomizer to get reproducible results.
Default status: Selected |
Random seed | integer | Required. A number used as the static seed for the randomizer.
Default value: 12345 |
Snap execution | Dropdown list | Select one of the three modes in which the Snap executes.
Available options are:
|
Troubleshooting
None.