GET DATASET SUMMARY¶

Computes a grouped descriptive summary (count, mean, min, max, std, etc.) of a dataset, partitioned by one or more grouping columns. Use this worker to quickly profile subsets of tabular data by category or experimental condition.

When to use¶

Classification: process.

Tagged: describe, eda, group_by, grouped_summary, profiling, statistics, tabular.

Inputs¶

Label	ID	Type	Default	Description
Dataset	dataset	dataset	—	Input tabular dataset to summarize. Accepts any d3VIEW dataset object; leave empty only if the dataset is supplied by an upstream worker.
Group By	group_by	scalar	—	One or more column names whose unique value combinations define the groups; leave blank to compute a single global summary across the entire dataset.
Choose Columns	columns	scalar	—	Subset of numeric or categorical columns to include in the summary; leave blank to include all columns in the dataset.
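
The grouping rule described for Group By can be illustrated with a minimal sketch in plain Python. This is not the worker's implementation; the function name partition_rows, the list-of-dicts row format, and the sample column names are all assumptions made for illustration. It shows how unique value combinations define groups, and how a blank Group By collapses to a single global group.

```python
from collections import defaultdict


def partition_rows(rows, group_by=None):
    """Split rows into groups keyed by the tuple of their group_by values.

    Hypothetical helper: an empty or missing group_by places every row
    in one global group, keyed by the empty tuple ().
    """
    keys = list(group_by or [])
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)
    return dict(groups)


# Sample data (invented for this sketch).
rows = [
    {"material": "steel", "temp": 20, "stress": 410.0},
    {"material": "steel", "temp": 20, "stress": 430.0},
    {"material": "steel", "temp": 100, "stress": 390.0},
    {"material": "alu", "temp": 20, "stress": 270.0},
]

# Two grouping columns: one group per unique (material, temp) combination.
by_pair = partition_rows(rows, ["material", "temp"])

# Blank group_by: a single group containing the whole dataset.
global_group = partition_rows(rows)
```

With two grouping columns the four sample rows fall into three groups, because the two steel rows at temp 20 share the same key.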

Outputs¶

Label	ID	Type	Description
dataset_get_summary_output_1	dataset_get_summary_output_1	dataset	Grouped summary dataset where each row corresponds to one group and columns contain descriptive statistics (e.g., count, mean, std, min, max) for each selected input column.
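
As a rough picture of the output shape described above (one row per group, statistics for each selected column), the following Python sketch builds such a table from a list of dicts. The function name grouped_summary, the "column_statistic" naming of output columns, and the use of the sample standard deviation are assumptions; the worker's actual column layout and std convention are not stated on this page.

```python
import statistics
from collections import defaultdict


def grouped_summary(rows, group_by, columns):
    """Return one summary dict per group (hypothetical output layout)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in group_by)].append(row)

    summary = []
    for key, members in groups.items():
        # Start each output row with the group's identifying values.
        out = dict(zip(group_by, key))
        for col in columns:
            values = [m[col] for m in members]
            out[f"{col}_count"] = len(values)
            out[f"{col}_mean"] = statistics.fmean(values)
            out[f"{col}_min"] = min(values)
            out[f"{col}_max"] = max(values)
            # Sample standard deviation needs at least two values.
            out[f"{col}_std"] = (
                statistics.stdev(values) if len(values) > 1 else None
            )
        summary.append(out)
    return summary


# Sample data (invented for this sketch).
rows = [
    {"material": "steel", "stress": 410.0},
    {"material": "steel", "stress": 430.0},
    {"material": "alu", "stress": 270.0},
]

summary = grouped_summary(rows, ["material"], ["stress"])
```

Here summary holds two rows, one for steel and one for alu. The alu group has a single observation, so this sketch reports its standard deviation as None rather than guessing a value.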

Disciplines¶

data.dataset.transform
data.statistics

Runnable example¶

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_get_grouped_summary

Auto-generated from transformation schema. Worker id: dataset_get_grouped_summary. Schema hash: 5d225919e09d. Hand-curated docs in workerexamples/ override this page when present.