PARSES SEVERAL CSV FILES BASED ON EXTENSION AND COLLECTS THE ROWS TO CREATE A DATASET

Scans a collection of CSV (or similarly-delimited) files filtered by file extension, parses each file using the specified header and value rows, and concatenates all rows into a single unified dataset. Use this worker when you need to batch-ingest multiple CSV files from a directory and merge their row-level data for downstream analysis.
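
The behaviour described above can be approximated with a minimal Python sketch. This is not the worker's implementation: the function name collect_rows, its parameters, the zero-based row indexing, the comma delimiter, and the UTF-8 encoding are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def collect_rows(directory, extensions="csv", header_row=0, value_row=None):
    """Hypothetical helper: merge rows from every matching file in `directory`.

    Assumptions (not confirmed by the worker schema): row indexes are
    zero-based, files are comma-delimited, and files are UTF-8 encoded.
    """
    # A blank extension list means "process every file".
    wanted = {e.strip().lower().lstrip(".") for e in extensions.split(",") if e.strip()}
    # A missing value row means "start right after the header row".
    first_value = header_row + 1 if value_row is None else value_row

    dataset = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if wanted and path.suffix.lower().lstrip(".") not in wanted:
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        if len(rows) <= header_row:
            continue  # file too short to contain the header row
        header = rows[header_row]
        for values in rows[first_value:]:
            if values:  # skip completely empty lines
                dataset.append(dict(zip(header, values)))
    return dataset
```

Calling collect_rows("some_directory", "csv,txt") would return one list of dictionaries, one per data row, in file-name order. Sorting the file names is a choice made here only to keep the sketch deterministic.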

When to use

Classification: process.

Tagged: batch, collect, concatenate, csv, dataset, extension-filter, ingest, multi-file.

Inputs

Label | ID | Type | Default | Required | Description
Files Extensions Separated By Comma | files_extensions_separatedby_comma | string | | | Comma-separated list of file extensions to include when scanning for input files (e.g. ‘csv,txt’); leave blank to process all files regardless of extension.
Header Row | header_row | integer | | | Zero-based (or one-based, per platform convention) row index that contains the column header names; leave blank to use the first row as the header.
Value Row | value_row | string | | | Row index or range (as a string) at which data values begin; leave blank to start reading immediately after the header row.
Suppress From Header | suppressfrom_header | string | | | Comma-separated list of column names or patterns to exclude from the parsed header, effectively dropping those columns from every file before merging.
Limit Columns To | limit_columnsto | string | | | Comma-separated list of column names to retain in the output dataset; all other columns are discarded. Leave blank to keep all columns.
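
As a rough illustration of how the two column settings could interact, the sketch below applies a suppress list and then a limit list to already-parsed rows. The helper names parse_list and filter_columns are hypothetical, only exact column-name matching is shown (the pattern syntax is not specified here), and applying suppress before limit is an assumption.

```python
def parse_list(text):
    """Split a comma-separated setting into trimmed, non-empty names."""
    return [item.strip() for item in text.split(",") if item.strip()]


def filter_columns(rows, suppress_from_header="", limit_columns_to=""):
    """Hypothetical helper: drop suppressed columns, then keep only limited ones.

    Assumptions: names match exactly, and a blank setting leaves the
    columns untouched.
    """
    suppressed = set(parse_list(suppress_from_header))
    limited = set(parse_list(limit_columns_to))
    result = []
    for row in rows:
        kept = {name: value for name, value in row.items() if name not in suppressed}
        if limited:
            kept = {name: value for name, value in kept.items() if name in limited}
        result.append(kept)
    return result
```

With both settings blank, the rows pass through unchanged. A column that appears in both lists ends up dropped under this ordering.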

Outputs

Label | ID | Type | Description
csv_collect_rows_to_dataset_output_1 | csv_collect_rows_to_dataset_output_1 | dataset | Unified tabular dataset produced by row-wise concatenation of all matching CSV files, with columns filtered and named according to the header, suppress, and limit-columns settings.

Disciplines

  • data.dataset.ingest
  • data.dataset.transform
  • data.io.csv

Auto-generated from transformation schema. Worker id: csv_collect_rows_to_dataset. Schema hash: 902905f9bc5f. Hand-curated docs in workerexamples/ override this page when present.