Merge Stage
The Merge stage joins two datasets already stored in the pipeline context. It supports inner, left, right, and full outer joins using one or more keys, and writes the combined result to a target field.
What the stage does
- Source fields: Choose a left and a right field from the context. Each must be an array of rows, where every row is itself an array or an object.
- Join keys: Provide one or more keys for each side. Keys can be column indexes (for row arrays) or property names (for row objects); composite keys are comma-separated.
- Join types: Inner, Left, Right, Full outer. Unmatched rows are kept or discarded according to the join type.
- Output shape: If both sides are arrays, the stage outputs an array of arrays with left columns followed by right columns. If either side has objects, it outputs merged objects. Missing-side cells are filled with null (arrays) or omitted/empty (objects).
- Target field: Defaults to the left_right_join naming pattern; set a custom name to branch results.
- Error handling: Throws if the sources are not arrays or if join keys are missing on either side.
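The behavior listed above can be sketched in plain Python for the array-of-arrays case. This is an illustrative sketch, not the stage's actual implementation: merge_array_rows and its parameters are hypothetical names, rows are assumed to have a uniform width on each side, and appending unmatched right rows at the end of the output is an assumption about ordering.

```python
from collections import defaultdict


def merge_array_rows(left, right, left_keys, right_keys, how="inner"):
    """Join two lists of row arrays on column indexes (hypothetical helper)."""
    if not isinstance(left, list) or not isinstance(right, list):
        raise TypeError("both sources must be arrays of rows")
    if not left_keys or not right_keys:
        raise ValueError("join keys are required on both sides")

    # Widths are used to pad the missing side with None (null).
    left_width = max((len(row) for row in left), default=0)
    right_width = max((len(row) for row in right), default=0)

    # Index right rows by their (possibly composite) key.
    index = defaultdict(list)
    for pos, row in enumerate(right):
        index[tuple(row[k] for k in right_keys)].append(pos)

    out, matched_right = [], set()
    for row in left:
        hits = index.get(tuple(row[k] for k in left_keys), [])
        for pos in hits:
            matched_right.add(pos)
            out.append(list(row) + list(right[pos]))  # left columns, then right
        if not hits and how in ("left", "full"):
            out.append(list(row) + [None] * right_width)

    if how in ("right", "full"):
        for pos, row in enumerate(right):
            if pos not in matched_right:
                out.append([None] * left_width + list(row))
    return out


left = [[1, "Ada"], [2, "Grace"]]
right = [[1, "book"], [3, "pen"]]
print(merge_array_rows(left, right, [0], [0], how="full"))
# [[1, 'Ada', 1, 'book'], [2, 'Grace', None, None], [None, None, 3, 'pen']]
```

A composite key is expressed here as a list of indexes, for example left_keys=[0, 2], which corresponds to a comma-separated entry in the stage.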
Configure the Merge stage
- Pick the Left field and Right field from the context.
- Enter Left join keys and Right join keys. Use indexes for tables of arrays (e.g., 0) or property names for objects (e.g., id). For composite keys, enter a comma-separated list (e.g., country, city).
- Choose a Join type: Inner, Left, Right, or Full outer.
- Optional: set an Output field to name the merged dataset; leave blank to auto-name from the sources and join type.
- Run the stage (or Run All). The stage logs the join summary and writes the merged rows to the output path.
Example: customers with orders
- Left field: customers
- Right field: orders
- Left keys: id
- Right keys: customer_id
- Join type: Inner
- Output field: customers_orders_inner
The result contains only the rows where a customer's id matches an order's customer_id. Switching to Left keeps all customers, Right keeps all orders, and Full outer keeps the unmatched rows from both sides.
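To make the effect of each join type concrete, here is a small Python sketch of the same example with object rows. The sample data and the join_objects helper are hypothetical and only mimic the documented behavior; in particular, letting right-side properties overwrite left-side properties of the same name is an assumption, not something the stage is known to do.

```python
def join_objects(left, right, left_key, right_key, how="inner"):
    """Join two lists of row objects on a single key (hypothetical helper)."""
    out, matched_right = [], set()
    for l_row in left:
        hits = [i for i, r_row in enumerate(right)
                if r_row[right_key] == l_row[left_key]]
        matched_right.update(hits)
        # One merged object per matching pair.
        out.extend({**l_row, **right[i]} for i in hits)
        if not hits and how in ("left", "full"):
            out.append(dict(l_row))  # right-side properties are omitted
    if how in ("right", "full"):
        out.extend(dict(r_row) for i, r_row in enumerate(right)
                   if i not in matched_right)
    return out


customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 9.5},
    {"order_id": 12, "customer_id": 3, "total": 40.0},
]

for how in ("inner", "left", "right", "full"):
    rows = join_objects(customers, orders, "id", "customer_id", how)
    print(how, len(rows))
# inner 2
# left 3
# right 3
# full 4
```

With this sample data, the customer with id 2 has no orders and the order with customer_id 3 has no customer, so each outer join adds exactly one unmatched row.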
Tips for reliable merges
- Normalize keys first: Use a Transform stage to trim or lowercase key fields before joining.
- Check duplicates: When a key value appears more than once on either side, the join produces one row for every matching pair (a Cartesian product within that key). Dedupe keys first if needed.
- Branch outputs: Keep the merged result in a new path so you can compare with original sources in Visualize stages.
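The first two tips can be checked before running a merge. The sketch below is plain Python, not a Transform stage configuration; normalize_key, duplicate_keys, and expected_inner_rows are hypothetical helper names, and the sample rows are made up for illustration.

```python
from collections import Counter


def normalize_key(value):
    """Trim and lowercase string keys; leave other types unchanged."""
    return value.strip().lower() if isinstance(value, str) else value


def key_counts(rows, key):
    """Count how often each normalized key value occurs."""
    return Counter(normalize_key(row[key]) for row in rows)


def duplicate_keys(rows, key):
    """Return the key values that occur more than once."""
    return {k: n for k, n in key_counts(rows, key).items() if n > 1}


def expected_inner_rows(left, right, left_key, right_key):
    """Each key contributes (left count) x (right count) joined rows."""
    l_counts = key_counts(left, left_key)
    r_counts = key_counts(right, right_key)
    return sum(n * r_counts[k] for k, n in l_counts.items())


left = [
    {"email": " Ada@Example.com "},
    {"email": "ada@example.com"},
    {"email": "grace@example.com"},
]
right = [
    {"email": "ada@example.com"},
    {"email": "ADA@example.com"},
    {"email": "linus@example.com"},
]

print(duplicate_keys(left, "email"))                       # {'ada@example.com': 2}
print(expected_inner_rows(left, right, "email", "email"))  # 4
```

Here two left rows and two right rows share the same normalized key, so an inner join would return four rows for that one key. Without normalization, only one of those pairs would match.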