UNIFY COLUMNS IN THE DATASET

Finds the union of all column names present across every row in a dataset and ensures each row has a value for every discovered column. Missing values introduced by this alignment are filled according to the chosen null-replacement strategy (empty, average, min, or max). Use this worker to normalise ragged or schema-inconsistent datasets before downstream processing.
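The alignment described above can be sketched in plain Python. This is a minimal illustration, not the worker's actual implementation: it assumes each row is a dictionary, that the 'empty' strategy corresponds to `None`, and the function name `unify_columns` is hypothetical.

```python
def unify_columns(rows, fill=None):
    """Return rows that all share the union of the input column names."""
    # Collect column names in first-seen order so the output is deterministic.
    columns = []
    seen = set()
    for row in rows:
        for name in row:
            if name not in seen:
                seen.add(name)
                columns.append(name)
    # Rebuild each row with every column, filling absent ones with `fill`.
    return [{name: row.get(name, fill) for name in columns} for row in rows]


# Two rows with different columns (a "ragged" dataset).
ragged = [
    {"id": 1, "mass": 2.5},
    {"id": 2, "velocity": 10.0},
]
unified = unify_columns(ragged)
```

After the call, both rows in `unified` carry the columns `id`, `mass`, and `velocity`; the values that were absent in the input are `None`.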

When to use

Classification: process.

Tagged: column_alignment, null_fill, process, ragged_dataset, schema_normalisation, unify_keys.

Inputs

Label	ID	Type	Default	Required	Description
Dataset	dataset	dataset	—		Input dataset whose rows may have inconsistent or missing columns. Accepts any tabular dataset object; leave empty only if the dataset is piped directly from an upstream worker.
Null Replacement Type	null_replacement_type	select	empty		Strategy used to fill values for columns that are absent in a given row: ‘empty’ (blank/null), ‘average’ (column mean), ‘min’ (column minimum), or ‘max’ (column maximum); defaults to ‘empty’.
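The four null-replacement options in the table above can be illustrated with a short Python sketch. The semantics here are assumptions, not confirmed worker behaviour: the statistic is computed per column over the numeric values that are present, 'empty' maps to `None`, and a column with no numeric values falls back to `None`. The names `AGGREGATES`, `fill_value`, and `unify_with_strategy` are hypothetical.

```python
from statistics import mean

# Hypothetical mapping from the select option to an aggregate function.
AGGREGATES = {"average": mean, "min": min, "max": max}


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def fill_value(rows, column, strategy="empty"):
    """Pick the value used for rows that lack `column` (assumed semantics)."""
    if strategy == "empty":
        return None
    # Only numeric values that are actually present contribute to the statistic.
    present = [row[column] for row in rows if is_number(row.get(column))]
    # Assumption: with nothing numeric to aggregate, fall back to empty.
    return AGGREGATES[strategy](present) if present else None


def unify_with_strategy(rows, strategy="empty"):
    # Union of column names, kept in first-seen order.
    columns = list(dict.fromkeys(name for row in rows for name in row))
    fills = {name: fill_value(rows, name, strategy) for name in columns}
    return [{name: row.get(name, fills[name]) for name in columns} for row in rows]


rows = [
    {"id": 1, "mass": 2.0},
    {"id": 2, "mass": 4.0},
    {"id": 3},
]
filled = unify_with_strategy(rows, "average")
```

In this sketch the third row receives the mean of the two present `mass` values; choosing `"min"` or `"max"` would substitute the smallest or largest present value instead.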

Outputs

Label	ID	Type	Description
dataset_add_column_by_expression_output_1	dataset_add_column_by_expression_output_1	dataset	Unified dataset where every row contains the full set of columns discovered across all input rows, with missing values filled according to the selected null-replacement strategy.

Disciplines

data.dataset.transform

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_unify_keys

Auto-generated from transformation schema. Worker id: dataset_unify_keys. Schema hash: db7ef77693f4. Hand-curated docs in workerexamples/ override this page when present.