Data Automation
SnapLogic’s Data Automation solution for cloud data platforms accelerates the movement of data from legacy systems, applications, and many other sources into the cloud data warehouse (CDW). This flexible solution delivers traditional ETL integrations to achieve complex transformations of data in-flight, as well as ELT capabilities to transform data in-place, such that data does not move out of the CDW. Coupled with comprehensive API management capabilities, SnapLogic’s Data Automation solution drives full end-to-end automation of data processes, accelerates the loading/transformation of data into the CDW, and ultimately streamlines business processes for quicker data-to-decision delivery.
Key New Features
- ELT Snap Pack
- Added the following Snaps:
- ELT Router
- ELT Cast Function
- ELT String Function
- Enhanced the ELT SCD2 Snap to improve its flexibility and efficiency.
- Added the following Snaps:
- FDL
- Added Google Sheets as a source endpoint.
New Snaps
ELT
- Introduced a new Snap ELT Router to enable routing input SQL queries into multiple output views based on the given conditional expressions.
- Introduced ELT Cast Function Snap to convert a data type of a column in the input SQL string into other supported data types.
- Introduced ELT String Function Snap to support the various string functions supported by the different databases.
Fast Data Loader
- Added Google Sheets as a source endpoint in Fast Data Loader (FDL). Google Sheets enables you to read data from the spreadsheet and write the data to the specified destination endpoint such as SAP-DWC or Snowflake. Learn how: Configuring your Google Sheets Account as a Data Source
Enhancements
- Enhanced the following Snaps to display the final SQL query in their output preview upon Pipeline validation.
- ELT Load
- ELT Merge Into
- ELT Insert-Select
- Enhanced the Configuring ELT Database Accounts to support OAuth2-based authentication on the target Snowflake database.
- Enhanced the ELT Select,
ELT Insert-Select,
ELT SCD2,
ELT Merge Into, and
ELT Load Snaps
to display suggestions on the Schema Name field based on the Default Database Name provided in the Snap Account configuration
when the Database Name is not specified in the respective Snap.
- Improved usability of the suggestions features for these Snaps by making them case-insensitive.
For example, typing default in the Schema Name field displays both
default
andDEFAULT
, if they co-exist. You do not need to type DEFAULT to invoke and select the schema nameDEFAULT
from the suggestions list.
- Improved usability of the suggestions features for these Snaps by making them case-insensitive.
For example, typing default in the Schema Name field displays both
- Enhanced the ELT SCD2 Snap to address different feature requests and issues raised by multiple customers.
These changes provide more flexibility in configuring your SCD2 operations using this Snap.
- Removed Check for nulls and duplicates in source field and added two dropdown lists - Null Value Behavior and Invalid Row Handling.
- Refer to the ELT SCD2 Scenarios to learn more.
- Made the following items in the Meaning field of the Target Table Temporal Fields fieldset mandatory while making the Invalid historical rows flag optional.
- Current row
- Historical row
- Enhanced the ELT Aggregate and ELT Window Functions Snaps to support the following functions across all supported CDWs:
- KURTOSIS
- MODE
- SKEW
Note: Although these functions are not natively supported in Redshift, Azure Synapse, DLP and BigQuery, SnapLogic enables you to use these functions on all the supported target CDWs. - Enhanced the ELT Aggregate Snap to support the following GROUP BY features across all supported CDWs:
- Group by Cube
- Group by Grouping Sets
- Group by Rollup
- Automatic GROUP BY for all input columns.
- Fixed an issue with ELT Merge Into Snap where the Snap erroneously modified the target table column name when the column name contained the target table name.
- Fixed an issue in ELT SCD2 Snap where the Snap causes incorrect results with Snowflake targets, when:
- The Historical Row End Date value is provided.
- Nulls and Invalid rows are recognized, but one or more start dates in the source are null.
- Fixed the issue in ELT Transform Snap where the Output Schema of the Snap does not populate all the column names from its Input Schema.
Known Issues
- [This issue is fixed in the July 2022 Release]: In the case of BigQuery, the ELT Load Snap fails to load data from files in S3 or Redshift when you specify more than one individual file in the File List fieldset.
This is due to the API restrictions in Google BigQuery for loading separate files/tables from S3 and Redshift.
Alternatively, you can use the File Name Pattern field to load data from multiple files in S3 or Redshift to your BigQuery instance.
Note: For the best performance of your load operation, we recommend that you always use the File Name Pattern field to load data from multiple files at a source location (S3 or Redshift) instead of specifying the individual file names/paths in the File List fieldset.
- If you have dropped one or more columns from your target table outside of the ELT Load Snap and then use this Snap to add these column(s) back and load data into the table, the Snap adds the new columns to the target table, but not the corresponding data from the source file.
- When running without Sub-Query Pushdown Optimization (SPDO), ELT Pipelines that contain an ELT Aggregate Snap configured with one ore more GROUP BY ROLLUP fields do not verify the column data types while inserting the Snap output values in the target table. This may lead to incorrect data written to the target table. However, as long as SPDO is on, the same Pipeline runs without this issue.
- In the case of Azure Synapse, if you are configuring the Start_Date column as a Target Table Natural Key in the ELT SCD2 Snap, the Snap fails in each of the following scenarios:
- The source table/file contains one or more null values.
- The target table is empty.
- The End date of current row or End date of historical row has a static value, for example:
2021-01-01
.
- [This issue is fixed in the June 2022 Release]: When you are enclosing the column names within backticks (
`<Column_Name>`
) for creating a new target table in a DLP instance using the ELT Load Snap — typically to ensure that the column names are used verbatim — you may encounter the error: ERROR processing query/statement: no viable alternative at input 'FROM'(line X, pos XX). To prevent this error, edit the header in your source file to exclude any special characters, thereby avoiding the use of backticks in the target table column names. - [This issue is fixed in the July 2022 Release]: The ELT Select Snap does not honor the LIMIT clause included to query a BigQuery instance, during Pipeline validation. However, this issue does not occur during Pipeline runtime.
- [This issue is fixed in the July 2022 Release]: In the case of Azure Synapse, BigQuery, DLP, and Redshift CDWs, the SKEW function available in the General Aggregate Functions List of the ELT Aggregate Snap returns
null
instead of the actual division-by-zero exception for certain ranges of values (for example, for a series of exactly same values).