PARSES SEVERAL CSV FILES BASED ON EXTENSION AND COLLECTS THE ROWS TO CREATE A DATASET

Scans a collection of CSV (or similarly delimited) files filtered by file extension, parses each file using the specified header and value rows, and concatenates all rows into a single unified dataset. Use this worker when you need to batch-ingest multiple CSV files from a directory and merge their row-level data for downstream analysis.
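The scan, parse, and concatenate flow described above can be sketched in plain Python. This is an illustration only, not the worker's implementation: the function name `collect_rows`, its parameters, the zero-based row indexing, the UTF-8 encoding, and the comma delimiter are all assumptions made for the example.

```python
import csv
from pathlib import Path


def collect_rows(directory, extensions="", header_row=0, value_row=None):
    """Merge rows from every matching file in `directory` into one list of dicts.

    Hypothetical helper: names, defaults, and zero-based indexing are assumptions.
    """
    # A blank extension list means "accept every file".
    wanted = {e.strip().lower().lstrip(".") for e in extensions.split(",") if e.strip()}
    # Assumption: data starts right after the header when value_row is not given.
    first_data = header_row + 1 if value_row is None else value_row

    dataset = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if wanted and path.suffix.lower().lstrip(".") not in wanted:
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        if len(rows) <= header_row:
            continue  # file is too short to contain the header row
        header = rows[header_row]
        for values in rows[first_data:]:
            if values:  # skip blank lines
                dataset.append(dict(zip(header, values)))
    return dataset
```

Calling `collect_rows("data", "csv,txt")` on a hypothetical `data` directory would return one list containing the rows of every `.csv` and `.txt` file in it, in file-name order.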

When to use

Classification: process.

Tagged: batch, collect, concatenate, csv, dataset, extension-filter, ingest, multi-file.

Inputs

Label	ID	Type	Default	Description
Files Extensions Separated By Comma	files_extensions_separatedby_comma	string	—	Comma-separated list of file extensions to include when scanning for input files (e.g. ‘csv,txt’); leave blank to process all files regardless of extension.
Header Row	header_row	integer	—	Zero-based (or one-based, per platform convention) row index that contains the column header names; leave blank to use the first row as the header.
Value Row	value_row	string	—	Row index or range (as a string) at which data values begin; leave blank to start reading immediately after the header row.
Suppress From Header	suppressfrom_header	string	—	Comma-separated list of column names or patterns to exclude from the parsed header, effectively dropping those columns from every file before merging.
Limit Columns To	limit_columnsto	string	—	Comma-separated list of column names to retain in the output dataset; all other columns are discarded. Leave blank to keep all columns.
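As a rough illustration of how the Suppress From Header and Limit Columns To settings could combine, the sketch below parses both comma-separated strings and filters a header. The helper names `parse_list` and `select_columns` are hypothetical. The sketch also assumes exact name matching (pattern matching is not modelled) and that suppression is applied before the limit; the worker may behave differently.

```python
def parse_list(text):
    """Split a comma-separated setting into trimmed, non-empty names."""
    return [item.strip() for item in (text or "").split(",") if item.strip()]


def select_columns(header, suppress="", limit=""):
    """Return the header names that survive the suppress and limit settings.

    Assumption: exact matches only, suppress applied first, then limit.
    """
    suppressed = set(parse_list(suppress))
    limited = set(parse_list(limit))
    kept = [name for name in header if name not in suppressed]
    if limited:  # a blank limit keeps every remaining column
        kept = [name for name in kept if name in limited]
    return kept
```

For example, `select_columns(["id", "name", "notes", "score"], suppress="notes", limit="id, score")` returns `["id", "score"]`, while leaving both settings blank returns the header unchanged.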

Outputs

Label	ID	Type	Description
csv_collect_rows_to_dataset_output_1	csv_collect_rows_to_dataset_output_1	dataset	Unified tabular dataset produced by row-wise concatenation of all matching CSV files, with columns filtered and named according to the header, suppress, and limit-columns settings.

Disciplines

data.dataset.ingest
data.dataset.transform
data.io.csv

Auto-generated from transformation schema. Worker id: csv_collect_rows_to_dataset. Schema hash: 902905f9bc5f. Hand-curated docs in workerexamples/ override this page when present.