BIN VALUES BASED ON COLUMN AND RETURN THE BIN THAT HAS THE MOST NUMBER OF ITEMS

Divides a numeric dataset column into a specified number of equal-width bins and returns the boundary value (min or max) of the bin that contains the most items. Use this worker to quickly identify the most densely populated range within a distribution.

When to use

Classification: process.

Tagged: binning, column_stats, distribution, frequency, histogram, max_bin.

Inputs

Label | ID | Type | Default | Required | Description
Dataset | dataset_1 | dataset | | | Input dataset containing the numeric column to be binned; must be a tabular dataset with at least one numeric column.
Choose Column | target_column | text | | | Name of the numeric column within dataset_1 on which binning is performed; populated dynamically from the upstream dataset.
Num Of Bins | num_of_bins | text | 10 | | Number of equal-width bins to divide the column range into; defaults to 10. Increase for finer resolution, decrease for coarser groupings.
Bin Bound Type | min_max | select | max | | Which boundary of the most-populated bin to return: ‘min’ for the lower edge or ‘max’ (default) for the upper edge of that bin.

Outputs

Label | ID | Type | Description
Bin Value | bin_value | scalar | Scalar boundary value (lower or upper edge, per min_max selection) of the bin containing the highest number of data points.
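
Illustrative sketch

The following Python sketch approximates the behavior described by the inputs and output above: it splits the column range into num_of_bins equal-width bins, counts the values in each, and returns the lower or upper edge of the fullest bin. It is not the worker's implementation. The function name max_bin_bound is hypothetical, and several details are assumptions because the schema does not state them: the maximum value is placed in the last bin, ties go to the lowest bin, and a column with a single distinct value returns that value.

```python
from typing import Sequence


def max_bin_bound(values: Sequence[float], num_of_bins: int = 10,
                  min_max: str = "max") -> float:
    """Return the lower or upper edge of the most-populated equal-width bin.

    Hypothetical helper mirroring the worker's inputs (num_of_bins, min_max);
    edge handling and tie-breaking are assumptions, not documented behavior.
    """
    if not values:
        raise ValueError("values must not be empty")
    if num_of_bins < 1:
        raise ValueError("num_of_bins must be at least 1")
    if min_max not in ("min", "max"):
        raise ValueError("min_max must be 'min' or 'max'")

    lo, hi = min(values), max(values)
    if lo == hi:
        # Assumption: a constant column collapses to a single zero-width bin.
        return float(lo)

    width = (hi - lo) / num_of_bins

    counts = [0] * num_of_bins
    for v in values:
        # Assumption: bins are half-open [lower, upper), except the last bin,
        # which also includes the column maximum.
        idx = min(int((v - lo) / width), num_of_bins - 1)
        counts[idx] += 1

    # Assumption: ties resolve to the lowest-index bin.
    best = counts.index(max(counts))

    lower = lo + best * width
    upper = lo + (best + 1) * width
    return lower if min_max == "min" else upper


if __name__ == "__main__":
    column = [1, 2, 2.5, 3, 3.5, 10]
    print(max_bin_bound(column, num_of_bins=3, min_max="max"))
    print(max_bin_bound(column, num_of_bins=3, min_max="min"))
```

With the sample column and three bins, the range 1 to 10 is split into bins of width 3, and five of the six values fall into the first bin, so the sketch returns that bin's upper edge for ‘max’ and its lower edge for ‘min’.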

Disciplines

  • data.dataset.transform
  • data.statistics

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_get_max_bin_range


Auto-generated from transformation schema. Worker id: dataset_get_max_bin_range. Schema hash: 15eb4cfce292. Hand-curated docs in workerexamples/ override this page when present.