UNIFY COLUMNS IN THE DATASET
Finds the union of all column names present across every row in a dataset and ensures each row has a value for every discovered column. Missing values introduced by this alignment are filled according to the chosen null-replacement strategy (empty, average, min, or max). Use this worker to normalise ragged or schema-inconsistent datasets before downstream processing.
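The alignment step described above can be pictured with a minimal Python sketch. It is an illustration only, not the worker's implementation: the helper name `unify_columns`, the list-of-dicts row representation, and the use of `None` to stand in for the 'empty' strategy are all assumptions made for this example.

```python
def unify_columns(rows):
    """Return rows that all share the union of the column names seen in `rows`.

    Hypothetical helper: rows are modelled as dicts, and a missing value is
    represented by None (standing in for the 'empty' strategy).
    """
    # Collect column names in first-seen order so the output is deterministic.
    columns = []
    seen = set()
    for row in rows:
        for name in row:
            if name not in seen:
                seen.add(name)
                columns.append(name)
    # Rebuild every row with the full column set, filling gaps with None.
    return [{name: row.get(name) for name in columns} for row in rows]


# A ragged dataset: neither row carries all three columns.
ragged = [{"id": 1, "mass": 2.5}, {"id": 2, "thickness": 0.8}]
unified = unify_columns(ragged)
```

After the call, both rows in `unified` carry `id`, `mass`, and `thickness`, with `None` wherever the original row had no value.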
When to use
Classification: process.
Tagged: column_alignment, null_fill, process, ragged_dataset, schema_normalisation, unify_keys.
Inputs
| Label | ID | Type | Default | Required | Description |
|---|---|---|---|---|---|
| Dataset | dataset | dataset | — | | Input dataset whose rows may have inconsistent or missing columns. Accepts any tabular dataset object; leave empty only if the dataset is piped directly from an upstream worker. |
| Null Replacement Type | null_replacement_type | select | empty | | Strategy used to fill values for columns that are absent in a given row: ‘empty’ (blank/null), ‘average’ (column mean), ‘min’ (column minimum), or ‘max’ (column maximum); defaults to ‘empty’. |
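
The four null-replacement strategies can be sketched in Python as follows. This is a hedged illustration, not the worker's code: the function names `fill_value` and `unify_with_fill` are hypothetical, rows are modelled as dicts, and the sketch assumes that column statistics are computed over the values actually present in each column and that those values are numeric. How the worker itself treats non-numeric columns is not stated in this page.

```python
from statistics import mean


def fill_value(present, strategy):
    """Pick the replacement for a missing cell from the values present in a column."""
    # 'empty' always yields None; so does a column with no values at all.
    if strategy == "empty" or not present:
        return None
    if strategy == "average":
        return mean(present)
    if strategy == "min":
        return min(present)
    if strategy == "max":
        return max(present)
    raise ValueError(f"unknown strategy: {strategy!r}")


def unify_with_fill(rows, strategy="empty"):
    """Align rows on the union of their columns and fill the gaps per strategy."""
    # dict.fromkeys de-duplicates while keeping first-seen column order.
    columns = list(dict.fromkeys(name for row in rows for name in row))
    # Compute one fill value per column from the rows that do have that column.
    fills = {
        name: fill_value([row[name] for row in rows if name in row], strategy)
        for name in columns
    }
    return [{name: row.get(name, fills[name]) for name in columns} for row in rows]


rows = [{"a": 1.0, "b": 10.0}, {"a": 3.0}, {"b": 20.0}]
averaged = unify_with_fill(rows, "average")
```

With `"average"`, the second row receives the mean of the observed `b` values and the third row the mean of the observed `a` values; swapping in `"min"` or `"max"` uses the column extremes instead.
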
Outputs
| Label | ID | Type | Description |
|---|---|---|---|
| dataset_add_column_by_expression_output_1 | dataset_add_column_by_expression_output_1 | dataset | Unified dataset where every row contains the full set of columns discovered across all input rows, with missing values filled according to the selected null-replacement strategy. |
Disciplines
- data.dataset.transform
Runnable example
A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_unify_keys
Auto-generated from transformation schema. Worker id: dataset_unify_keys. Schema hash: db7ef77693f4. Hand-curated docs in workerexamples/ override this page when present.