JOIN DATASETS BASED ON PRIMARY KEYS
Joins tabular datasets using one of four strategies: a simple side-by-side column append, or an outer, inner, or left join on one or more shared primary-key columns. Use this worker to merge tabular datasets from different sources into a single dataset before downstream analysis or modeling.
When to use
Classification: process.
Tagged: dataset, inner_join, join, left_join, merge, outer_join, primary_key, process.
Inputs
| Label | ID | Type | Default | Required | Description |
|---|---|---|---|---|---|
| Dataset1 | dataset1 | dataset | — | | The primary (left-hand) dataset to join. Must be a tabular dataset object; the join fails if it is absent or invalid. |
| Dataset2 | dataset2 | dataset | — | | The secondary (right-hand) dataset(s) to join onto Dataset1. Supports repeated inputs, so multiple datasets can be merged sequentially. |
| Join Type | join_type | scalar | simple | | Join strategy: ‘simple’ (default; column append with no key matching), ‘outer’ (all rows from both datasets), ‘inner’ (only matching rows), or ‘left’ (all rows from Dataset1, matched rows from Dataset2). |
| Primary Keys | primarykeys | scalar | — | | Comma-separated column name(s) used as the matching key(s) between the two datasets (e.g. ‘id,run_id’). Leave blank for a simple side-by-side column append. |
| Datset1 Columns To Include | datset1_columns_to_include | text | — | | Subset of columns from Dataset1 to carry into the output. Leave empty to include all Dataset1 columns. |
| Datset2 Columns To Include | datset2_columns_to_include | text | — | | Subset of columns from Dataset2 to carry into the output. Leave empty to include all Dataset2 columns. |
| Prefix for Dataset1 | prefixfordataset1columns | scalar | — | | Prefix for Dataset1 columns. |
| Prefix for Dataset 2 | prefixfordataset2columns | scalar | — | | Prefix for Dataset2 columns. |
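
The join types listed above can be illustrated with a small plain-Python sketch that treats each dataset as a list of row dictionaries. This is not the worker's implementation: the function name `join_rows`, the sample columns, the use of `None` for missing values, and the truncation to the shorter input in the ‘simple’ case are all assumptions made for illustration.

```python
def join_rows(left, right, keys, join_type="inner"):
    """Illustrative join of two lists of row dicts (hypothetical helper)."""
    if join_type == "simple":
        # Positional side-by-side append, no key matching.
        # Assumption: truncate to the shorter input.
        return [{**l, **r} for l, r in zip(left, right)]
    if join_type not in {"inner", "left", "outer"}:
        raise ValueError(f"unsupported join_type: {join_type!r}")

    def key_of(row):
        return tuple(row[k] for k in keys)

    # Index right-hand rows by key so each left row can find its matches.
    right_index = {}
    for row in right:
        right_index.setdefault(key_of(row), []).append(row)

    left_cols = [c for c in (left[0] if left else {}) if c not in keys]
    right_cols = [c for c in (right[0] if right else {}) if c not in keys]

    joined, matched = [], set()
    for row in left:
        k = key_of(row)
        matches = right_index.get(k, [])
        if matches:
            matched.add(k)
            joined.extend({**row, **m} for m in matches)
        elif join_type in {"left", "outer"}:
            # Keep the unmatched left row; right-hand columns become None.
            joined.append({**row, **dict.fromkeys(right_cols)})

    if join_type == "outer":
        # Append right rows whose key never appeared on the left.
        for k, rows in right_index.items():
            if k not in matched:
                joined.extend({**dict.fromkeys(left_cols), **r} for r in rows)
    return joined


# Hypothetical sample data sharing the key column "id".
runs = [{"id": 1, "mass": 10.0}, {"id": 2, "mass": 12.5}, {"id": 3, "mass": 9.8}]
results = [{"id": 2, "peak_force": 101.0}, {"id": 3, "peak_force": 98.5},
           {"id": 4, "peak_force": 110.2}]

inner_rows = join_rows(runs, results, ["id"], "inner")  # ids 2 and 3
left_rows = join_rows(runs, results, ["id"], "left")    # ids 1, 2, 3
outer_rows = join_rows(runs, results, ["id"], "outer")  # ids 1, 2, 3, 4
```

In this sketch, ‘inner’ keeps only the ids present in both inputs, ‘left’ keeps every row of the first input, and ‘outer’ keeps every row of both, filling the columns that have no match with `None`.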
Outputs
| Label | ID | Type | Description |
|---|---|---|---|
| dataset_join_output_1 | dataset_join_output_1 | dataset | Merged tabular dataset containing columns from both Dataset1 and Dataset2, combined according to the selected join type and primary keys. |
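
As a rough illustration of how the column-include and prefix inputs could shape the merged output, the sketch below filters and renames columns before an inner join on `id`. The helper `prepare`, the sample data, and the choice to leave key columns unprefixed are assumptions; the worker's actual handling of key columns and name collisions is not documented here.

```python
def prepare(rows, keys, include=None, prefix=""):
    """Keep key columns plus requested columns; prefix non-key names (hypothetical helper)."""
    prepared = []
    for row in rows:
        out = {}
        for col, value in row.items():
            if col in keys:
                # Assumption: key columns are always kept and never prefixed.
                out[col] = value
            elif not include or col in include:
                out[prefix + col] = value
        prepared.append(out)
    return prepared


# Hypothetical inputs that both carry a "mass" column.
sim = [{"id": 1, "mass": 10.0, "note": "a"}, {"id": 2, "mass": 12.5, "note": "b"}]
test = [{"id": 1, "mass": 10.2}, {"id": 2, "mass": 12.1}]

sim_p = prepare(sim, ["id"], include=["mass"], prefix="sim_")
test_p = prepare(test, ["id"], prefix="test_")

# Minimal inner join on "id": keep sim rows that have a matching test row.
test_by_id = {row["id"]: row for row in test_p}
merged = [{**row, **test_by_id[row["id"]]} for row in sim_p if row["id"] in test_by_id]
```

Prefixing keeps the two `mass` columns distinguishable in the output (`sim_mass` and `test_mass`), and the `note` column is dropped because it is not in the include list.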
Disciplines
- data.dataset.transform
Runnable example
A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_join
Auto-generated from transformation schema. Worker id: dataset_join. Schema hash: 991debb90eaa. Hand-curated docs in workerexamples/ override this page when present.