.. _auto_dataset_remove_unique_columns:

*DATASET REMOVE UNIQUE COLUMNS*
===============================

Removes columns from a dataset whose ratio of unique values exceeds a specified threshold, helping to eliminate high-cardinality identifier-like columns before modeling or analysis. Use this worker to clean datasets by dropping columns that carry little statistical signal due to near-unique values in every row.

When to use
-----------

Classification: **process**.

Tagged: ``column_filtering``, ``data_cleaning``, ``high_cardinality``, ``preprocessing``, ``remove_unique_columns``, ``uniqueness_ratio``.

Inputs
------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20 20 20

   * - Label
     - ID
     - Type
     - Default
     - Required
     - Description
   * - Choose Dataset
     - dataset_1
     - dataset
     - —
     -
     - Input dataset (tabular) from which high-cardinality columns will be removed; accepts any dataset object available in the workflow.
   * - Uniqueness Ratio
     - uniqueness_ratio
     - scalar
     - 0.05
     -
     - Fraction threshold (0–1) above which a column is considered too unique and dropped; defaults to 0.05, meaning columns where more than 5% of values are unique relative to the total row count are removed.

Outputs
-------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20

   * - Label
     - ID
     - Type
     - Description
   * - dataset_remove_unique_columns_output_1
     - dataset_remove_unique_columns_output_1
     - dataset
     - Cleaned dataset with all columns whose uniqueness ratio exceeds the specified threshold removed, preserving the original row order and remaining column values.

Disciplines
-----------

- ai_ml.preprocessing
- data.dataset.transform

Runnable example
----------------

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: ``/api/workflow/example?id=dataset_remove_unique_columns``.
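
The thresholding rule described for ``uniqueness_ratio`` can be illustrated with a short Python sketch. It is not the worker's implementation: the function name ``remove_unique_columns``, the dict-of-lists dataset representation, and the handling of empty columns are assumptions made for illustration. How the worker itself treats missing values or a ratio exactly equal to the threshold is not specified on this page; the sketch keeps a column whose ratio equals the threshold, following the "exceeds" wording above.

```python
from typing import Any, Dict, List


def remove_unique_columns(
    columns: Dict[str, List[Any]],
    uniqueness_ratio: float = 0.05,
) -> Dict[str, List[Any]]:
    """Hypothetical helper: keep only columns whose unique-value ratio
    does not exceed ``uniqueness_ratio``."""
    kept: Dict[str, List[Any]] = {}
    for name, values in columns.items():
        n_rows = len(values)
        # Ratio of distinct values to total rows.
        # Assumption: an empty column has ratio 0 and is kept.
        ratio = len(set(values)) / n_rows if n_rows else 0.0
        if ratio <= uniqueness_ratio:
            kept[name] = values  # values and row order are left untouched
    return kept


# Illustrative 100-row dataset (made-up values).
data = {
    "row_id": list(range(100)),              # 100 distinct / 100 rows = 1.00
    "status": ["ok", "fail"] * 50,           # 2 distinct / 100 rows = 0.02
    "batch": [i % 10 for i in range(100)],   # 10 distinct / 100 rows = 0.10
}

cleaned = remove_unique_columns(data, uniqueness_ratio=0.05)
print(list(cleaned))  # ['status']
```

With the default threshold of 0.05, only ``status`` survives in this example. Raising the threshold to 0.5 would also keep ``batch``, while the identifier-like ``row_id`` column is still dropped.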

Auto-generated from transformation schema. Worker id: dataset_remove_unique_columns. Schema hash: c08c94da978a. Hand-curated docs in workerexamples/ override this page when present.