UNIFY COLUMNS IN THE DATASET

Computes the union of all column names found across every row of a dataset and gives each row a value for every column in that union. Values that a row lacks are filled according to the selected null-replacement strategy (empty, average, min, or max). Use this worker to normalise ragged or schema-inconsistent datasets before downstream processing.
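
The alignment step can be pictured with a minimal Python sketch. It is an illustration only, not the worker's implementation: the function name unify_columns, the list-of-dicts row representation, and the use of None for the 'empty' fill are all assumptions made for this example.

```python
from typing import Any


def unify_columns(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Give every row the union of all column names (hypothetical helper)."""
    # Collect column names in first-seen order so the result is deterministic.
    columns: list[str] = []
    seen: set[str] = set()
    for row in rows:
        for name in row:
            if name not in seen:
                seen.add(name)
                columns.append(name)
    # Rebuild each row with every column; absent values become None,
    # which stands in for the 'empty' strategy in this sketch.
    return [{name: row.get(name) for name in columns} for row in rows]


# A ragged dataset: the two rows do not share the same columns.
rows = [{"id": 1, "mass": 2.5}, {"id": 2, "speed": 40.0}]
unified = unify_columns(rows)
```

After the call, both rows in unified carry the columns id, mass, and speed, with None wherever the original row had no value.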

When to use

Classification: process.

Tagged: column_alignment, null_fill, process, ragged_dataset, schema_normalisation, unify_keys.

Inputs

Label | ID | Type | Default | Required | Description
Dataset | dataset | dataset | | | Input dataset whose rows may have inconsistent or missing columns. Accepts any tabular dataset object; leave empty only if the dataset is piped directly from an upstream worker.
Null Replacement Type | null_replacement_type | select | empty | | Strategy used to fill values for columns that are absent in a given row: ‘empty’ (blank/null), ‘average’ (column mean), ‘min’ (column minimum), or ‘max’ (column maximum). Defaults to ‘empty’.
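
The four null-replacement strategies can be sketched in Python as follows. This is a hedged illustration rather than the worker's code: the names fill_value and unify_with_fill are hypothetical, rows are assumed to be dicts, and the choice to leave a column unfilled (None) when it has no numeric values is an assumption, since the schema does not say how non-numeric columns are treated.

```python
from statistics import mean
from typing import Any, Optional


def fill_value(values: list[float], strategy: str) -> Optional[float]:
    """Pick the fill for one column from the values that are present."""
    if strategy == "empty" or not values:
        return None  # assumption: nothing to aggregate means no fill
    if strategy == "average":
        return mean(values)
    if strategy == "min":
        return min(values)
    if strategy == "max":
        return max(values)
    raise ValueError(f"unknown null_replacement_type: {strategy!r}")


def unify_with_fill(
    rows: list[dict[str, Any]], strategy: str = "empty"
) -> list[dict[str, Any]]:
    """Unify columns, filling absent cells per the chosen strategy."""
    # Union of column names, in first-seen order.
    columns = list(dict.fromkeys(name for row in rows for name in row))
    fills: dict[str, Optional[float]] = {}
    for name in columns:
        # Only numeric values that actually exist contribute to the fill.
        present = [
            row[name]
            for row in rows
            if name in row and isinstance(row[name], (int, float))
        ]
        fills[name] = fill_value(present, strategy)
    # Only columns absent from a row are filled; existing values are kept.
    return [{n: row.get(n, fills[n]) for n in columns} for row in rows]


rows = [{"id": 1, "mass": 2.0}, {"id": 2, "mass": 4.0}, {"id": 3}]
by_average = unify_with_fill(rows, "average")
by_min = unify_with_fill(rows, "min")
by_max = unify_with_fill(rows, "max")
by_empty = unify_with_fill(rows)
```

In this example the third row has no mass column, so it receives 3.0 under 'average', 2.0 under 'min', 4.0 under 'max', and None under 'empty'.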

Outputs

Label | ID | Type | Description
dataset_add_column_by_expression_output_1 | dataset_add_column_by_expression_output_1 | dataset | Unified dataset where every row contains the full set of columns discovered across all input rows, with missing values filled according to the selected null-replacement strategy.

Disciplines

  • data.dataset.transform

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_unify_keys


Auto-generated from transformation schema. Worker id: dataset_unify_keys. Schema hash: db7ef77693f4. Hand-curated docs in workerexamples/ override this page when present.