DATASET ADD COLUMN BY REGEXP
Adds a new column to a dataset by extracting values from an existing column using a regular expression with capture groups. The matched substring (at the specified capture-group index) is stored in the new column for every row. Use this worker when you need to parse structured text embedded in a column — e.g., extracting a numeric dimension from a filename or label string.
When to use
Classification: process.
Tagged: add_column, capture_group, dataset_transform, extract, regex, regexp, string_parsing.
Inputs
| Label | ID | Type | Default | Required | Description |
|---|---|---|---|---|---|
| Choose Dataset | dataset | dataset | — | | Input dataset (tabular) whose rows will be processed; must contain the source column specified in ‘column_name’. |
| New Column Name | new_column_name | scalar | — | | Name of the new column that will be appended to the dataset to hold the regex-extracted values. |
| Column For Regex | column_name | scalar | — | | Name of the existing column in the dataset whose cell values will be searched with the regular expression. |
| Regular Expression | regexp | scalar | — | | Regular expression (with at least one capture group) applied to each cell; e.g., ‘(\d+)_mm’ extracts the digit sequence immediately preceding ‘_mm’. |
| Choose Index | index | scalar | 1 | | 1-based index of the capture group whose match is written to the new column; defaults to 1 (first capture group). |
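
To illustrate how the 1-based capture-group index selects a value, here is a small sketch using Python's standard `re` module. It assumes the worker's regex dialect behaves like Python's for this pattern, which this page does not confirm; the label string and the two-group pattern are hypothetical.

```python
import re

# Hypothetical cell value; not taken from any real dataset.
label = "plate_12_mm_steel"

# One capture group: index 1 selects the digits immediately before "_mm".
single = re.search(r"(\d+)_mm", label)
thickness = single.group(1)

# Two capture groups: index 1 is the digits, index 2 is the trailing word.
double = re.search(r"(\d+)_mm_(\w+)", label)
digits = double.group(1)
material = double.group(2)

print(thickness, digits, material)  # 12 12 steel
```

With the default index of 1, the first parenthesized group is the one written to the new column; a higher index only makes sense when the pattern contains that many groups.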
Outputs
| Label | ID | Type | Description |
|---|---|---|---|
| dataset_add_column_by_regexp_output_1 | dataset_add_column_by_regexp_output_1 | dataset | A copy of the input dataset with the new column appended, containing the regex-extracted substring for each row. |
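
The transformation described above can be approximated in plain Python as follows. This is a minimal sketch, not the worker's implementation: the function name `add_column_by_regexp`, the list-of-dicts dataset representation, the sample file names, and the choice to store `None` when a row does not match are all assumptions made for illustration.

```python
import re

def add_column_by_regexp(rows, new_column_name, column_name, regexp, index=1):
    """Return a copy of rows with new_column_name holding capture group `index`."""
    pattern = re.compile(regexp)
    result = []
    for row in rows:
        match = pattern.search(str(row[column_name]))
        # Assumption: rows without a match get None; the worker's actual
        # behavior for non-matching rows is not documented on this page.
        value = match.group(index) if match else None
        new_row = dict(row)  # copy so the input dataset is left unchanged
        new_row[new_column_name] = value
        result.append(new_row)
    return result

# Hypothetical input dataset with a single source column.
dataset = [
    {"file": "bracket_12_mm.key"},
    {"file": "bracket_8_mm.key"},
    {"file": "bracket_unknown.key"},
]

output = add_column_by_regexp(dataset, "thickness", "file", r"(\d+)_mm")
for row in output:
    print(row)
```

Note that the extracted values are strings; converting them to numbers would be a separate step.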
Disciplines
- data.dataset.transform
Runnable example
A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_add_column_by_regexp
Auto-generated from transformation schema. Worker id: dataset_add_column_by_regexp. Schema hash: 94ca0fbc9be2. Hand-curated docs in workerexamples/ override this page when present.