GET DATASET SUMMARY

Computes a statistical summary (count, mean, min, max, std, etc.) for a dataset, optionally restricted to a selected subset of columns. Use this worker to quickly profile a dataset’s schema and descriptive statistics before further processing or modelling.

When to use

Classification: process.

Tagged: descriptive_stats, eda, profiling, schema, summary.

Inputs

Label | ID | Type | Default | Required | Description
Dataset | dataset | dataset | | | Input dataset to be summarised; accepts any tabular dataset available in the workflow context. Leave empty only if the dataset is piped implicitly from an upstream worker.
Choose Columns | columns | scalar | | | Optional comma-separated or multi-select list of column names to restrict the summary to; leave blank to include all columns in the dataset.
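
The "Choose Columns" behaviour described above (a comma-separated string or a multi-select list, with blank meaning all columns) can be sketched as follows. This is a minimal illustration, not the worker's actual code: the function name resolve_columns is hypothetical, and rejecting unknown column names with an error is an assumption, since the schema does not say how the worker treats them.

```python
def resolve_columns(choice, available):
    """Resolve a 'Choose Columns' value to an ordered list of column names.

    `choice` may be None, a comma-separated string, or a list of names.
    A blank choice selects every available column.
    """
    if choice is None:
        return list(available)
    if isinstance(choice, str):
        # Comma-separated form: "mass, speed"
        names = [part.strip() for part in choice.split(",")]
    else:
        # Multi-select form: ["mass", "speed"]
        names = [str(name).strip() for name in choice]
    names = [name for name in names if name]  # drop empty entries
    if not names:
        return list(available)
    # Assumption: unknown names are treated as an error rather than ignored.
    unknown = [name for name in names if name not in available]
    if unknown:
        raise ValueError(f"Unknown column(s): {', '.join(unknown)}")
    return names


available = ["mass", "speed", "label"]
all_columns = resolve_columns("", available)
subset = resolve_columns(" mass , speed ", available)
```

Here all_columns holds every column name and subset holds only the two requested ones, in the order they were given.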

Outputs

Label | ID | Type | Description
dataset_get_summary_output_1 | dataset_get_summary_output_1 | dataset | Tabular dataset containing per-column descriptive statistics (e.g. count, mean, std, min, 25/50/75 percentiles, max) for the selected columns of the input dataset.
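
To give a feel for the kind of per-column statistics listed above, here is a rough sketch using only the Python standard library. It is not the worker's implementation: the function name summarise, the list-of-dicts table layout, the use of the sample standard deviation, and the linearly interpolated ("inclusive") quartiles are all assumptions made for illustration, and the sketch expects numeric columns with at least two values each.

```python
import statistics


def summarise(rows, columns=None):
    """Return {column: {statistic: value}} for numeric columns.

    `rows` is a list of dicts (one dict per record); `columns` optionally
    restricts the summary, and None means every column in the first row.
    """
    if not rows:
        return {}
    selected = columns or list(rows[0].keys())
    summary = {}
    for col in selected:
        # Skip missing values so that count reflects non-null entries only.
        values = [row[col] for row in rows if row.get(col) is not None]
        if len(values) < 2:
            raise ValueError(f"Column {col!r} needs at least two values")
        # Assumption: quartiles use linear interpolation between data points.
        q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[col] = {
            "count": len(values),
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values),  # sample standard deviation
            "min": min(values),
            "25%": q1,
            "50%": q2,
            "75%": q3,
            "max": max(values),
        }
    return summary


rows = [
    {"mass": 1.0, "speed": 10.0},
    {"mass": 2.0, "speed": 20.0},
    {"mass": 3.0, "speed": 30.0},
    {"mass": 4.0, "speed": 40.0},
]
result = summarise(rows, columns=["mass"])
```

With the sample rows, result contains a single entry for "mass"; calling summarise(rows) without a column list would summarise both columns.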

Disciplines

  • data.statistics

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_get_summary


Auto-generated from transformation schema. Worker id: dataset_get_summary. Schema hash: 9e0258aa4ca2. Hand-curated docs in workerexamples/ override this page when present.