Filter Stage
The Filter stage keeps only the rows in a dataset that match your rules. It reads data from the pipeline's shared context, applies your conditions, and writes the filtered rows back to the context at the path you choose.
What the stage does
- Source / output paths — You must pick a source field from the context. The output path defaults to the source; set a different path to keep both versions.
- Row conditions — Evaluate one or more rules on a column using a literal value or another column in the same row.
- Operators — Equals, Not Equals, Contains (case-insensitive), Greater Than (>), Less Than (<), Greater Than or Equal (>=), Less Than or Equal (<=), Matches Regex (JavaScript regex).
- Type-aware comparisons — Numbers and dates are detected and compared numerically/chronologically; all other values are compared as strings.
- Logic chaining — Combine multiple conditions with AND / OR in order.
- Row range — Limit filtering to a start/end row (inclusive) before conditions are applied.
- Array-only input — The stage expects an array of rows (arrays or objects). Non-array sources raise an error; empty sources log an error and stop.
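The operator and comparison rules above can be sketched in JavaScript. This is a minimal illustration, not the stage's actual implementation: the names toComparable, compareValues, OPERATORS, and evaluateCondition, and the shape of the condition object, are hypothetical. The sketch also assumes that only ISO 8601 strings (such as 2024-01-05) are detected as dates and that Equals uses the same type-aware comparison as the ordering operators.

```javascript
// Hypothetical helpers: the real stage internals are not shown in this document.

// Assumption: only ISO 8601 date strings are treated as dates.
const ISO_DATE = /^\d{4}-\d{2}-\d{2}(T.*)?$/;

// Return { kind, value } for numbers and dates, or null for everything else.
function toComparable(value) {
  if (typeof value === "number") {
    return Number.isFinite(value) ? { kind: "number", value } : null;
  }
  if (typeof value !== "string" || value.trim() === "") return null;
  const asNumber = Number(value);
  if (!Number.isNaN(asNumber)) return { kind: "number", value: asNumber };
  if (ISO_DATE.test(value)) {
    const asDate = Date.parse(value);
    if (!Number.isNaN(asDate)) return { kind: "date", value: asDate };
  }
  return null;
}

// Returns -1, 0, or 1. Numeric/chronological when both sides share a kind,
// otherwise a plain string comparison.
function compareValues(left, right) {
  const a = toComparable(left);
  const b = toComparable(right);
  if (a && b && a.kind === b.kind) return Math.sign(a.value - b.value);
  const sa = String(left);
  const sb = String(right);
  return sa < sb ? -1 : sa > sb ? 1 : 0;
}

const OPERATORS = {
  equals: (l, r) => compareValues(l, r) === 0,
  notEquals: (l, r) => compareValues(l, r) !== 0,
  // Contains is case-insensitive.
  contains: (l, r) => String(l).toLowerCase().includes(String(r).toLowerCase()),
  gt: (l, r) => compareValues(l, r) > 0,
  lt: (l, r) => compareValues(l, r) < 0,
  gte: (l, r) => compareValues(l, r) >= 0,
  lte: (l, r) => compareValues(l, r) <= 0,
  // Matches Regex uses JavaScript regex syntax.
  matchesRegex: (l, r) => new RegExp(r).test(String(l)),
};

// condition: { column, operator, targetType: "value" | "column", target }
// `column` is an index for row arrays or a property name for row objects.
function evaluateCondition(row, condition) {
  const left = row[condition.column];
  const right =
    condition.targetType === "column" ? row[condition.target] : condition.target;
  return OPERATORS[condition.operator](left, right);
}

// "200" and "150" are detected as numbers, so this is a numeric comparison.
console.log(
  evaluateCondition(["NY", "200", "150"], {
    column: 1, operator: "gt", targetType: "column", target: 2,
  })
); // true
```

The type detection matters for values stored as text: compared as strings, "9" sorts after "10", whereas the numeric comparison above orders them correctly.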
Configure the Filter stage
- Add a Filter stage and choose the Source field (context path of the table to read).
- Set an Output field. Leave it blank to overwrite the source, or enter a new path to keep a copy (e.g. sales_filtered).
- Pick a column (index for row arrays, or property name for row objects).
- Select an operator (Equals, Contains, >, Regex, etc.).
- Choose a target type:
- Value: Provide a literal value to compare against.
- Column: Compare against another column in the same row.
- Optionally set Start row and End row (inclusive indexes) to scope the filter.
- Add more conditions as needed. Additional rules use AND/OR chaining in the order shown.
- Click Run Stage to preview, or Run All to execute the pipeline. The stage logs how many rows were filtered and updates the context at the output path.
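The interaction between the row range, condition chaining, and input checks can be sketched as follows. The function names (combineInOrder, runFilter) and the options object are hypothetical. The sketch makes three assumptions that this document does not confirm: conditions are combined strictly left to right with no AND-over-OR precedence, rows outside the Start/End range are excluded from the output, and an empty source yields an empty result after logging.

```javascript
// Combine per-condition results left to right.
// joins[i] ("AND" or "OR") connects results[i] and results[i + 1].
// Assumption: no operator precedence, evaluation is strictly in order.
function combineInOrder(results, joins) {
  if (results.length === 0) return true; // no conditions: keep the row
  return results.slice(1).reduce(
    (acc, next, i) => (joins[i] === "OR" ? acc || next : acc && next),
    results[0]
  );
}

// predicates: one function per condition, each taking a row and returning a boolean.
function runFilter(source, { startRow = 0, endRow = Infinity, predicates = [], joins = [] } = {}) {
  if (!Array.isArray(source)) {
    throw new TypeError("Filter stage expects an array of rows");
  }
  if (source.length === 0) {
    console.error("Filter stage: source is empty, nothing to filter");
    return [];
  }
  // The inclusive row range is applied before any condition is evaluated.
  // Assumption: rows outside the range are dropped rather than passed through.
  const scoped = source.slice(startRow, endRow + 1);
  const kept = scoped.filter((row) =>
    combineInOrder(predicates.map((test) => test(row)), joins)
  );
  console.log(`Filter stage: kept ${kept.length} of ${scoped.length} rows`);
  return kept;
}

const rows = [[1], [2], [3], [4], [5]];
// Rows 1..3 are [2], [3], [4]; of those, only values above 2 remain.
console.log(runFilter(rows, { startRow: 1, endRow: 3, predicates: [(r) => r[0] > 2] }));
// [ [ 3 ], [ 4 ] ]
```

Under the left-to-right assumption, the rules A OR B AND C are read as (A OR B) AND C, so the order in which conditions are listed can change the result.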
Example: keep winners
Input dataset:
Region | Sales | Target
-------|-------|-------
NY | 200 | 150
CA | 100 | 200
TX | 300 | 250
Filter setup:
- Input path: sales
- Output path: sales.filtered
- Condition: Column 1 (Sales) > Column 2 (Target)
- Logic: AND (has no effect here, since there is only one rule)
Result:
Region | Sales | Target
-------|-------|-------
NY | 200 | 150
TX | 300 | 250
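The same result can be reproduced in a few lines of JavaScript. This sketch assumes the rows are stored as arrays without the header row and that column indexes are zero-based, so Region is 0, Sales is 1, and Target is 2. The constant names are illustrative only.

```javascript
// Rows as arrays: [Region, Sales, Target]
const sales = [
  ["NY", 200, 150],
  ["CA", 100, 200],
  ["TX", 300, 250],
];

const SALES = 1;  // zero-based column index
const TARGET = 2;

// Keep rows where Sales is greater than Target.
const filtered = sales.filter((row) => row[SALES] > row[TARGET]);

console.log(filtered.map((row) => row[0])); // [ 'NY', 'TX' ]
```

CA is dropped because its sales (100) fall below its target (200).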
Tips for reliable pipelines
- Validate upstream: Combine with a Validate stage to catch missing or malformed values before filtering.
- Branch outputs: Write filtered data to a new path so you can compare it with the original in downstream Visualize stages.