GET DATASET SUMMARY

Computes a grouped descriptive summary (count, mean, min, max, std, etc.) of a dataset, partitioned by one or more grouping columns. Use this worker to quickly profile subsets of tabular data by category or experimental condition.

When to use

Classification: process.

Tagged: describe, eda, group_by, grouped_summary, profiling, statistics, tabular.

Inputs

Label | ID | Type | Default | Required | Description
Dataset | dataset | dataset | | | Input tabular dataset to summarize; accepts any d3VIEW dataset object. Leave empty only if the dataset is piped in from an upstream worker.
Group By | group_by | scalar | | | One or more column names whose unique value combinations define the groups. Leave blank to compute a single global summary across the entire dataset.
Choose Columns | columns | scalar | | | Subset of numeric or categorical columns to include in the summary. Leave blank to include all columns in the dataset.
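
The way the three inputs interact can be illustrated outside d3VIEW with a minimal pandas sketch. This is an analogue under stated assumptions, not the worker's implementation: the function grouped_summary, the helper _as_list, the sample columns, and the fixed list of statistics are all hypothetical, and the sketch summarizes numeric columns only even though the worker also accepts categorical ones.

```python
import pandas as pd

# Assumed statistic set; the worker's full list is not documented here.
STATS = ["count", "mean", "std", "min", "max"]

def _as_list(value):
    """Treat None or "" as blank; wrap a single name in a list."""
    if value is None or value == "":
        return []
    return [value] if isinstance(value, str) else list(value)

def grouped_summary(df, group_by=None, columns=None):
    """Hypothetical analogue of the Dataset, Group By and Choose Columns inputs."""
    keys = _as_list(group_by)
    # Blank "Choose Columns": every non-grouping column is a candidate.
    chosen = _as_list(columns) or [c for c in df.columns if c not in keys]
    # Simplification: this sketch summarizes numeric columns only.
    numeric = [c for c in chosen if pd.api.types.is_numeric_dtype(df[c])]
    if not keys:
        # Blank "Group By": one global summary across the whole dataset.
        return df[numeric].agg(STATS)
    return df.groupby(keys)[numeric].agg(STATS)

# Made-up sample data for illustration.
df = pd.DataFrame({
    "condition": ["a", "a", "b", "b"],
    "force": [1.0, 3.0, 10.0, 20.0],
    "label": ["x", "y", "x", "y"],
})

by_condition = grouped_summary(df, group_by=["condition"])
overall = grouped_summary(df)
```

Here by_condition has one row per value of condition, while overall has one row per statistic because no grouping column was given.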

Outputs

Label | ID | Type | Description
dataset_get_summary_output_1 | dataset_get_summary_output_1 | dataset | Grouped summary dataset where each row corresponds to one group and columns contain descriptive statistics (e.g., count, mean, std, min, max) for each selected input column.
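
To make the output layout concrete, the following pandas sketch builds a table with one row per group and one column per statistic of each selected input column. The column naming pattern (for example force_mean) and the sample data are assumptions made for illustration; the names the worker actually emits may differ.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    "condition": ["a", "a", "b"],
    "force": [1.0, 3.0, 10.0],
})

# One row per group; columns start as (input column, statistic) pairs.
wide = df.groupby("condition")[["force"]].agg(["count", "mean", "min", "max"])

# Flatten the pairs into single names such as "force_mean" (assumed naming).
wide.columns = [f"{col}_{stat}" for col, stat in wide.columns]

# Turn the grouping key back into an ordinary column.
summary = wide.reset_index()
```

The resulting summary table has two rows, one for each value of condition, with the grouping column first and the statistics columns after it.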

Disciplines

  • data.dataset.transform
  • data.statistics

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_get_grouped_summary


Auto-generated from transformation schema. Worker id: dataset_get_grouped_summary. Schema hash: 5d225919e09d. Hand-curated docs in workerexamples/ override this page when present.