.. _auto_csv_collect_rows_to_dataset:

*PARSES SEVERAL CSVS FILES BASED ON EXTENSION AND COLLECTS THE ROWS TO CREATE A DATASET*
========================================================================================

Scans a collection of CSV (or similarly-delimited) files filtered by file
extension, parses each file using the specified header and value rows, and
concatenates all rows into a single unified dataset. Use this worker when you
need to batch-ingest multiple CSV files from a directory and merge their
row-level data for downstream analysis.

When to use
-----------

Classification: **process**.

Tagged: ``batch``, ``collect``, ``concatenate``, ``csv``, ``dataset``,
``extension-filter``, ``ingest``, ``multi-file``.

Inputs
------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20 20 20

   * - Label
     - ID
     - Type
     - Default
     - Required
     - Description
   * - Files Extensions Separated By Comma
     - files_extensions_separatedby_comma
     - string
     - —
     -
     - Comma-separated list of file extensions to include when scanning for
       input files (e.g. 'csv,txt'); leave blank to process all files
       regardless of extension.
   * - Header Row
     - header_row
     - integer
     - —
     -
     - Zero-based (or one-based, per platform convention) row index that
       contains the column header names; leave blank to use the first row as
       the header.
   * - Value Row
     - value_row
     - string
     - —
     -
     - Row index or range (as a string) at which data values begin; leave
       blank to start reading immediately after the header row.
   * - Suppress From Header
     - suppressfrom_header
     - string
     - —
     -
     - Comma-separated list of column names or patterns to exclude from the
       parsed header, effectively dropping those columns from every file
       before merging.
   * - Limit Columns To
     - limit_columnsto
     - string
     - —
     -
     - Comma-separated list of column names to retain in the output dataset;
       all other columns are discarded. Leave blank to keep all columns.

Outputs
-------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20

   * - Label
     - ID
     - Type
     - Description
   * - csv_collect_rows_to_dataset_output_1
     - csv_collect_rows_to_dataset_output_1
     - dataset
     - Unified tabular dataset produced by row-wise concatenation of all
       matching CSV files, with columns filtered and named according to the
       header, suppress, and limit-columns settings.

Disciplines
-----------

- data.dataset.ingest
- data.dataset.transform
- data.io.csv

.. raw:: html

Auto-generated from transformation schema. Worker id: csv_collect_rows_to_dataset. Schema hash: 902905f9bc5f. Hand-curated docs in workerexamples/ override this page when present.
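
The behavior described on this page can be approximated outside the platform
with a short Python sketch. This is not the worker's implementation: the
function ``collect_rows`` and its parameter names are hypothetical, and the
sketch assumes comma-delimited files, a zero-based ``header_row``, a
``value_row`` given as a single integer index rather than a range, and exact
name matching (no patterns) for suppressed columns.

.. code-block:: python

   import csv
   from pathlib import Path


   def _split(value):
       """Split a comma-separated option string into trimmed, non-empty parts."""
       return [part.strip() for part in value.split(",") if part.strip()]


   def collect_rows(directory, extensions="", header_row=0, value_row=None,
                    suppress_from_header="", limit_columns_to=""):
       """Hypothetical stand-in for the worker: merge rows of matching files."""
       wanted = {ext.lower().lstrip(".") for ext in _split(extensions)}
       suppress = set(_split(suppress_from_header))
       keep = set(_split(limit_columns_to))
       # Assumption: values start right after the header unless told otherwise.
       first_value = header_row + 1 if value_row is None else value_row

       dataset = []
       for path in sorted(Path(directory).iterdir()):
           if not path.is_file():
               continue
           # A blank extension list means every file is processed.
           if wanted and path.suffix.lower().lstrip(".") not in wanted:
               continue
           with path.open(newline="") as handle:
               rows = list(csv.reader(handle))
           if len(rows) <= header_row:
               continue  # file too short to contain the header row
           header = rows[header_row]
           for values in rows[first_value:]:
               record = dict(zip(header, values))
               record = {k: v for k, v in record.items() if k not in suppress}
               if keep:
                   record = {k: v for k, v in record.items() if k in keep}
               dataset.append(record)
       return dataset

Calling ``collect_rows("data", extensions="csv,txt", suppress_from_header="x")``
on a hypothetical ``data`` directory would return one list of dictionaries
covering every ``.csv`` and ``.txt`` file, with column ``x`` dropped from each
row.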