Gridscript

Schema Stage

The Schema stage analyzes a table in the pipeline context and infers a lightweight schema. It samples rows to detect field types, nullability, and uniqueness, then writes a { fields: [...] } schema object back into the context without modifying the original data.
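As a mental model, inference amounts to a single pass over the sampled rows. The sketch below illustrates the idea in TypeScript; the function name inferSchema and its internals are assumptions for illustration, not Gridscript's actual implementation.

    // Illustrative single-pass inference over a sample of rows.
    // Assumes object rows; array rows would use indices as field names.
    function inferSchema(rows: Record<string, unknown>[], sampleSize = 100) {
      const sample = rows.slice(0, sampleSize);
      const stats = new Map<
        string,
        { types: Set<string>; seen: Set<unknown>; isNullable: boolean; isUnique: boolean; examples: unknown[] }
      >();

      for (const row of sample) {
        for (const [field, value] of Object.entries(row)) {
          let s = stats.get(field);
          if (!s) {
            s = { types: new Set(), seen: new Set(), isNullable: false, isUnique: true, examples: [] };
            stats.set(field, s);
          }
          if (value == null) { // catches both null and undefined
            s.isNullable = true;
            s.types.add("null");
          } else {
            s.types.add(typeof value);
            // Primitives compare by value; objects by reference.
            if (s.seen.has(value)) s.isUnique = false;
            s.seen.add(value);
            if (s.examples.length < 3) s.examples.push(value);
          }
        }
      }

      return {
        fields: [...stats.entries()].map(([field, s]) => ({
          field,
          types: [...s.types],
          isNullable: s.isNullable,
          isUnique: s.isUnique,
          examples: s.examples,
        })),
      };
    }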

What the stage does

  • Source field — Reads from a context field that must be an array of rows (arrays or objects). Errors if missing, not an array, or empty.
  • Output field — Writes the inferred schema to a target context path (defaults to {source}_schema when not set).
  • Sample size — Uses the first N rows (capped at the total number of rows) to infer the schema; sampling keeps analysis fast on large tables.
  • Field stats — For each column/key, tracks: detected types, whether it can be null/undefined, whether values are unique across the sample, and up to three example values.
  • Output shape — Writes { fields: [{ field, types, isNullable, isUnique, examples }] } to the target field in the context; the shape is spelled out as types after this list.
  • No mutation — Does not change the source table; it only adds or updates the schema object at the target path.
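Spelled out as TypeScript types, the output shape looks like this (the type names are assumptions for illustration; the shape matches the Output shape bullet above):

    interface InferredField {
      field: string;        // column name (or array index for array rows)
      types: string[];      // e.g. ["string"] or ["string", "null"]
      isNullable: boolean;  // some sampled value was null or undefined
      isUnique: boolean;    // no repeated values within the sample
      examples: unknown[];  // up to three example values
    }

    interface InferredSchema {
      fields: InferredField[];
    }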

Configure the Schema stage

  1. Choose the Source field (table to analyze) from the available context fields.
  2. Set an Output field (optional). If left blank, the stage will use {source}_schema (for example, orders_schema for a source named orders).
  3. Adjust the Sample size if needed. Larger samples provide more accurate uniqueness/nullability detection, but take longer to scan.
  4. Click Run Stage (or Run All) to infer the schema. On success, the pipeline context will include the schema object under the chosen output field.
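If you keep pipeline definitions in code or JSON, the equivalent stage configuration might look like the sketch below. The key names here are hypothetical, not a documented Gridscript format; the UI steps above are the supported way to configure the stage.

    // Hypothetical stage definition; key names are illustrative.
    const schemaStage = {
      type: "schema",
      source: "orders",         // context field holding the table to analyze
      output: "orders_schema",  // optional; defaults to `${source}_schema`
      sampleSize: 100,          // number of leading rows scanned
    };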

Example: infer customer schema

  • Source field: customers
  • Sample size: 100
  • Output field: customers_schema

After running the stage, the context contains customers_schema with one entry per column, showing detected types (e.g., ["string"] or ["string", "null"]), whether the field is nullable, whether values are unique across the sample, and a few example values. You can then feed this into Validate rules or use it as documentation for downstream consumers.
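An illustrative excerpt of what customers_schema could contain (all values invented for the example):

    // Illustrative excerpt; values are invented.
    const customers_schema = {
      fields: [
        { field: "id",    types: ["number"],         isNullable: false, isUnique: true,  examples: [101, 102, 103] },
        { field: "name",  types: ["string"],         isNullable: false, isUnique: false, examples: ["Ada", "Grace", "Ada"] },
        // How nulls count toward uniqueness is an assumption here.
        { field: "email", types: ["string", "null"], isNullable: true,  isUnique: true,  examples: ["ada@example.com"] },
      ],
    };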

Tips for using schema inference

  • Run after cleaning: Use Transform first so types and nulls reflect your normalized data, not raw imports.
  • Tune sample size: Increase it when you care about uniqueness or nullability across a large table; decrease it to speed up exploration.
  • Pair with validation: Use the inferred schema as a reference for building Validate rules (types, required fields, uniqueness); see the sketch after this list.
  • Document pipelines: Keep the schema output in the context so other stages — or collaborators — can inspect how a table is shaped at each step of the pipeline.
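As a rough illustration of the validation pairing, the sketch below derives basic row checks from an inferred schema. It is hand-rolled TypeScript under assumed names, not the Validate stage's actual rule format.

    // Hand-rolled sketch: derive basic row checks from an inferred schema.
    // Not the Validate stage's rule format; for illustration only.
    type FieldInfo = { field: string; types: string[]; isNullable: boolean };

    function violations(
      row: Record<string, unknown>,
      schema: { fields: FieldInfo[] }
    ): string[] {
      const errors: string[] = [];
      for (const { field, types, isNullable } of schema.fields) {
        const value = row[field];
        if (value === null || value === undefined) {
          if (!isNullable) errors.push(`${field}: unexpected null`);
          continue;
        }
        if (!types.includes(typeof value)) {
          errors.push(`${field}: got ${typeof value}, expected ${types.join(" | ")}`);
        }
      }
      return errors;
    }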