.. _auto_csv_collect_rows_to_dataset:

*PARSES SEVERAL CSV FILES BASED ON EXTENSION AND COLLECTS THE ROWS TO CREATE A DATASET*
=======================================================================================

Scans a collection of CSV (or similarly delimited) files filtered by file extension, parses each file using the specified header and value rows, and concatenates all rows into a single unified dataset. Use this worker when you need to batch-ingest multiple CSV files from a directory and merge their row-level data for downstream analysis.
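
The scan, parse, and concatenate steps described above can be sketched in plain Python. This is a minimal illustration, not the worker's actual implementation: the names ``parse_extensions`` and ``collect_rows`` are hypothetical, row indices are assumed to be zero-based, and ``value_row`` is assumed to be a single index rather than a range.

.. code-block:: python

   import csv
   from pathlib import Path


   def parse_extensions(spec):
       """Turn 'csv, .TXT' into {'.csv', '.txt'}; a blank spec gives an empty set."""
       parts = (p.strip().lower() for p in spec.split(","))
       return {p if p.startswith(".") else "." + p for p in parts if p}


   def collect_rows(directory, extensions="", header_row=0, value_row=None):
       """Concatenate the rows of every matching file into one list of dicts.

       Assumptions (not confirmed by the worker schema): indices are
       zero-based and value_row is a single index, not a range.
       """
       wanted = parse_extensions(extensions)
       # Default: data starts on the row right after the header.
       first_value_row = header_row + 1 if value_row is None else int(value_row)
       dataset = []
       for path in sorted(Path(directory).iterdir()):
           if not path.is_file():
               continue
           # An empty extension set means "accept every file".
           if wanted and path.suffix.lower() not in wanted:
               continue
           with path.open(newline="", encoding="utf-8") as handle:
               rows = list(csv.reader(handle))
           if len(rows) <= header_row:
               continue  # file too short to contain the header row
           header = rows[header_row]
           for values in rows[first_value_row:]:
               # zip() silently drops values beyond the header length.
               dataset.append(dict(zip(header, values)))
       return dataset

Under these assumptions, calling ``collect_rows("data", "csv,txt")`` on a hypothetical ``data`` directory would return one list of row dictionaries covering every ``.csv`` and ``.txt`` file in it, visited in file-name order.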

When to use
-----------

Classification: **process**.

Tagged: ``batch``, ``collect``, ``concatenate``, ``csv``, ``dataset``, ``extension-filter``, ``ingest``, ``multi-file``.

Inputs
------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20 20 20

   * - Label
     - ID
     - Type
     - Default
     - Required
     - Description
   * - Files Extensions Separated By Comma
     - files_extensions_separatedby_comma
     - string
     - —
     - 
     - Comma-separated list of file extensions to include when scanning for input files (e.g. 'csv,txt'); leave blank to process all files regardless of extension.
   * - Header Row
     - header_row
     - integer
     - —
     - 
     - Index of the row that contains the column header names (zero-based or one-based, depending on platform convention); leave blank to use the first row as the header.
   * - Value Row
     - value_row
     - string
     - —
     - 
     - Row index or range (as a string) at which data values begin; leave blank to start reading immediately after the header row.
   * - Suppress From Header
     - suppressfrom_header
     - string
     - —
     - 
     - Comma-separated list of column names or patterns to exclude from the parsed header, effectively dropping those columns from every file before merging.
   * - Limit Columns To
     - limit_columnsto
     - string
     - —
     - 
     - Comma-separated list of column names to retain in the output dataset; all other columns are discarded. Leave blank to keep all columns.
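
The interaction between the suppress and limit settings can be illustrated with a short Python sketch. It is an assumption-laden example rather than the worker's own logic: ``split_list`` and ``select_columns`` are hypothetical names, and "patterns" are interpreted here as shell-style wildcards via the standard ``fnmatch`` module, which the schema does not confirm.

.. code-block:: python

   from fnmatch import fnmatchcase


   def split_list(spec):
       """Split a comma-separated setting into trimmed, non-empty items."""
       return [item.strip() for item in spec.split(",") if item.strip()]


   def select_columns(header, suppress="", limit=""):
       """Return the header columns that survive both filters, in order.

       Assumption: suppress entries are shell-style wildcards matched
       case-sensitively; limit entries are exact column names.
       """
       suppressed = split_list(suppress)
       limited = set(split_list(limit))
       kept = []
       for name in header:
           if any(fnmatchcase(name, pattern) for pattern in suppressed):
               continue  # dropped by the suppress setting
           if limited and name not in limited:
               continue  # not listed in a non-blank limit setting
           kept.append(name)
       return kept

With a header of ``id``, ``tmp_a``, ``tmp_b``, ``value``, a suppress setting of ``tmp_*`` keeps ``id`` and ``value``. Because both settings only remove columns, the result in this sketch does not depend on which filter is applied first.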

Outputs
-------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20

   * - Label
     - ID
     - Type
     - Description
   * - csv_collect_rows_to_dataset_output_1
     - csv_collect_rows_to_dataset_output_1
     - dataset
     - Unified tabular dataset produced by row-wise concatenation of all matching CSV files, with columns filtered and named according to the header, suppress, and limit-columns settings.

Disciplines
-----------

- data.dataset.ingest
- data.dataset.transform
- data.io.csv

.. raw:: html

   <hr style="margin-top:2em">
   <p style="font-size:11px;color:#888">
   Auto-generated from <code>transformation</code> schema. Worker id: <code>csv_collect_rows_to_dataset</code>. Schema hash: <code>902905f9bc5f</code>. Hand-curated docs in <code>workerexamples/</code> override this page when present.
   </p>