GET DATASET SUMMARY
Computes a grouped descriptive summary (count, mean, min, max, std, etc.) of a dataset, partitioned by one or more grouping columns. Use this worker to quickly profile subsets of tabular data by category or experimental condition.
When to use
Classification: process.
Tagged: describe, eda, group_by, grouped_summary, profiling, statistics, tabular.
Inputs
| Label | ID | Type | Default | Required | Description |
|---|---|---|---|---|---|
| Dataset | dataset | dataset | — | | Input tabular dataset to summarize; accepts any d3VIEW dataset object. Leave empty only if the dataset is piped in from an upstream worker. |
| Group By | group_by | scalar | — | | One or more column names whose unique value combinations define the groups. Leave blank to compute a single global summary across the entire dataset. |
| Choose Columns | columns | scalar | — | | Subset of numeric or categorical columns to include in the summary. Leave blank to include all columns in the dataset. |
Outputs
| Label | ID | Type | Description |
|---|---|---|---|
| dataset_get_summary_output_1 | dataset_get_summary_output_1 | dataset | Grouped summary dataset where each row corresponds to one group and columns contain descriptive statistics (e.g., count, mean, std, min, max) for each selected input column. |
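The grouping behavior described above can be illustrated outside d3VIEW with a minimal standard-library Python sketch. It is not the worker's implementation: the function name `grouped_summary`, the `<column>_<statistic>` output naming, the restriction to numeric values, and the use of the sample standard deviation are all assumptions made for illustration, and the worker's actual output layout may differ.

```python
from collections import defaultdict
from statistics import mean, stdev


def grouped_summary(rows, group_by=(), columns=None):
    """Return one summary row per unique combination of group_by values.

    Hypothetical helper that mirrors the Group By / Choose Columns inputs.
    """
    group_by = list(group_by)
    if columns is None:
        # Blank "Choose Columns": consider every non-grouping column.
        columns = [c for c in rows[0] if c not in group_by] if rows else []

    # Bucket rows by their tuple of grouping values. An empty group_by
    # yields the single key (), which gives one global summary.
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[g] for g in group_by)].append(row)

    summary = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))
        for col in columns:
            # Assumption: only numeric values are summarized in this sketch.
            values = [r[col] for r in members if isinstance(r[col], (int, float))]
            if not values:
                continue
            out[f"{col}_count"] = len(values)
            out[f"{col}_mean"] = mean(values)
            out[f"{col}_min"] = min(values)
            out[f"{col}_max"] = max(values)
            # Sample standard deviation needs at least two values.
            out[f"{col}_std"] = stdev(values) if len(values) > 1 else None
        summary.append(out)
    return summary


rows = [
    {"material": "steel", "stress": 410.0},
    {"material": "steel", "stress": 430.0},
    {"material": "aluminum", "stress": 270.0},
]

# One output row per material; omit group_by for a single global row.
for line in grouped_summary(rows, group_by=["material"]):
    print(line)
```

Calling `grouped_summary(rows)` with no `group_by` collapses all rows into one group, which corresponds to leaving Group By blank.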
Disciplines
- data.dataset.transform
- data.statistics
Runnable example
A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_get_grouped_summary
Auto-generated from transformation schema. Worker id: dataset_get_grouped_summary. Schema hash: 5d225919e09d. Hand-curated docs in workerexamples/ override this page when present.