This example demonstrates how to use the Feature Synthesis Snap to enrich a customer dataset
by generating features from a related transaction dataset.
The pipeline combines the customer and transaction datasets by joining on the common
$customer_id field. The Feature Synthesis Snap then automatically synthesizes per-customer
features, such as the total purchase amount and the item count.
Download this pipeline.
- Configure the CSV Generator Snaps to generate the base and reference datasets.
The Customers dataset contains:
  - $customer_id
  - $firstname
  - $lastname
  - $create_time
The Transactions dataset contains:
  - $transaction_id
  - $customer_id
  - $source
  - $num_item
  - $total
- Connect each CSV Generator Snap to a Type Converter Snap to normalize data types.
Correct data typing is required for the Feature Synthesis Snap to detect relationships
and generate features accurately.
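As a rough illustration of why typing matters, the following Python sketch parses CSV text in which every value starts out as a string and converts the numeric transaction fields before aggregating them. It is not how the Type Converter Snap is implemented. The sample values and the `convert_transaction` helper are hypothetical; only the field names come from this example.

```python
import csv
import io

# Hypothetical sample rows: field names follow the Transactions dataset,
# the values are made up for illustration.
TRANSACTIONS_CSV = """transaction_id,customer_id,source,num_item,total
t1,c1,web,2,19.98
t2,c1,store,1,5.00
t3,c2,web,4,42.50
"""


def convert_transaction(row):
    """Return a copy of the row with numeric fields parsed from strings."""
    typed = dict(row)
    typed["num_item"] = int(row["num_item"])
    typed["total"] = float(row["total"])
    return typed


# csv.DictReader yields every value as a string.
raw_rows = list(csv.DictReader(io.StringIO(TRANSACTIONS_CSV)))
typed_rows = [convert_transaction(r) for r in raw_rows]

# Numeric aggregation only makes sense after conversion.
print(type(raw_rows[0]["total"]).__name__)      # str
print(type(typed_rows[0]["total"]).__name__)    # float
print(sum(r["num_item"] for r in typed_rows))   # 7
```

Without the conversion step, summing the raw `total` values would raise a `TypeError`, because the values are strings rather than numbers.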
- Configure the Feature Synthesis Snap with the following settings:
  - Base view: customer
  - Reference view: transaction
  - Join field: $customer_id
The Snap identifies one-to-many relationships and synthesizes aggregated features from
the transaction dataset per customer.
- Review the output from the Feature Synthesis Snap.
The Snap appends new fields to each customer record, such as:
  - Total number of transactions
  - Sum of $total
  - Average $num_item per transaction
A JSON-formatted preview is also available to inspect the complete feature set per customer.
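To make the one-to-many aggregation concrete, the sketch below computes the three features listed above in plain Python. It is only an approximation of the idea, not the Snap's actual algorithm or output schema. The sample records, the `synthesize_features` function, and the output field names (`transaction_count`, `total_sum`, `num_item_mean`) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, already converted to the correct types.
customers = [
    {"customer_id": "c1", "firstname": "Ada", "lastname": "Lovelace"},
    {"customer_id": "c2", "firstname": "Alan", "lastname": "Turing"},
    {"customer_id": "c3", "firstname": "Grace", "lastname": "Hopper"},
]
transactions = [
    {"transaction_id": "t1", "customer_id": "c1", "num_item": 2, "total": 20.0},
    {"transaction_id": "t2", "customer_id": "c1", "num_item": 1, "total": 5.0},
    {"transaction_id": "t3", "customer_id": "c2", "num_item": 4, "total": 42.5},
]


def synthesize_features(customers, transactions, key="customer_id"):
    """Append per-customer aggregates of the transactions to each customer."""
    # Group the "many" side (transactions) by the join field.
    grouped = defaultdict(list)
    for txn in transactions:
        grouped[txn[key]].append(txn)

    enriched = []
    for customer in customers:
        rows = grouped.get(customer[key], [])
        count = len(rows)
        features = {
            "transaction_count": count,
            "total_sum": sum(r["total"] for r in rows),
            # The average is undefined for customers with no transactions.
            "num_item_mean": sum(r["num_item"] for r in rows) / count if count else None,
        }
        enriched.append({**customer, **features})
    return enriched


for record in synthesize_features(customers, transactions):
    print(record)
```

Each output record keeps the original customer fields and adds the aggregates. A customer with no transactions, such as `c3` here, gets a count of 0 and no average.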
To successfully reuse pipelines:
- Download and import the pipeline into the SnapLogic Platform.
- Configure Snap accounts, as applicable.
- Provide pipeline parameters, as applicable.