GET DATASET SUMMARY

Computes per-column descriptive statistics (such as count, mean, standard deviation, minimum, 25th/50th/75th percentiles, and maximum) for a dataset, optionally restricted to a selected subset of columns. Use this worker to profile a dataset’s schema and descriptive statistics before further processing or modelling.

When to use

Classification: process.

Tagged: descriptive_stats, eda, profiling, schema, summary.

Inputs

Label	ID	Type	Default	Required	Description
Dataset	dataset	dataset	—		Input dataset to be summarised. Accepts any tabular dataset available in the workflow context; leave empty only if the dataset is piped implicitly from an upstream worker.
Choose Columns	columns	scalar	—		Optional comma-separated or multi-select list of column names to restrict the summary to; leave blank to include all columns in the dataset.
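
The column-selection behaviour described above can be sketched in Python. This is only an illustration under stated assumptions, not the worker's actual implementation: `resolve_columns` is a hypothetical helper, and raising an error on unknown column names is an assumed policy that the schema does not confirm.

```python
from typing import Iterable, List, Optional, Union


def resolve_columns(
    selection: Union[str, Iterable[str], None],
    available: List[str],
) -> List[str]:
    """Hypothetical helper: turn a 'Choose Columns' value into a column list.

    A blank or missing selection means "all columns", as the input
    description states. Everything else here is an assumption.
    """
    if selection is None:
        return list(available)

    # Accept either a comma-separated string or an already-split list.
    if isinstance(selection, str):
        names = [part.strip() for part in selection.split(",")]
    else:
        names = [str(part).strip() for part in selection]

    # Drop empty entries such as those produced by "a,,b" or "".
    names = [name for name in names if name]
    if not names:
        return list(available)

    # Assumed policy: fail loudly on names that are not in the dataset.
    unknown = [name for name in names if name not in available]
    if unknown:
        raise KeyError(f"Unknown column(s): {unknown}")
    return names
```

For example, `resolve_columns(" a , b ", ["a", "b", "c"])` yields `["a", "b"]`, while an empty string yields every available column.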

Outputs

Label	ID	Type	Description
dataset_get_summary_output_1	dataset_get_summary_output_1	dataset	Tabular dataset containing per-column descriptive statistics (e.g. count, mean, std, min, 25/50/75 percentiles, max) for the selected columns of the input dataset.
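
To make the shape of the output concrete, here is a minimal Python sketch that computes the same kind of per-column statistics with the standard library. It is an approximation under assumptions rather than the worker's code: `summarise` is a hypothetical function, the input is modelled as a list of row dictionaries, the standard deviation is the sample standard deviation, and percentiles use linear interpolation. The worker may use different conventions.

```python
import statistics
from typing import Dict, List, Optional, Sequence


def summarise(
    rows: Sequence[Dict[str, float]],
    columns: Optional[List[str]] = None,
) -> Dict[str, Dict[str, float]]:
    """Hypothetical sketch: per-column descriptive statistics."""
    if not rows:
        return {}

    # A blank selection means "all columns", taken from the first row.
    selected = columns or list(rows[0].keys())

    summary: Dict[str, Dict[str, float]] = {}
    for name in selected:
        # Skip missing values; assumes the remaining values are numeric.
        values = sorted(float(r[name]) for r in rows if r.get(name) is not None)
        if not values:
            summary[name] = {"count": 0}
            continue

        if len(values) > 1:
            # "inclusive" interpolates linearly between data points.
            q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
            std = statistics.stdev(values)  # sample standard deviation
        else:
            q1 = q2 = q3 = values[0]
            std = float("nan")  # undefined for a single observation

        summary[name] = {
            "count": len(values),
            "mean": statistics.fmean(values),
            "std": std,
            "min": values[0],
            "25%": q1,
            "50%": q2,
            "75%": q3,
            "max": values[-1],
        }
    return summary
```

Calling `summarise(rows, ["x"])` returns one entry per selected column, each holding the statistics listed in the output description.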

Disciplines

data.statistics

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_get_summary

Auto-generated from transformation schema. Worker id: dataset_get_summary. Schema hash: 9e0258aa4ca2. Hand-curated docs in workerexamples/ override this page when present.