DATASET PROCESS CURVE COLUMNS

Processes one or more curve columns in a dataset through a configurable multi-step pipeline: monotonic enforcement, pre-digitization, smoothing (average, LOESS, or CFC filter), outlier removal, clipping, cross-row X-sync, and final digitization. Use this worker to standardize and clean raw curve data in bulk before downstream analysis or model training.
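The stage order above can be illustrated on a single curve. This is a minimal pure-Python sketch of three of the stages (monotonic-x enforcement, clipping, final digitization), not the worker's implementation: the function names are hypothetical, and the specific rules used here (dropping points whose x does not increase, linear interpolation when resampling) are assumptions, since this page does not document the underlying algorithms.

```python
from bisect import bisect_right


def enforce_monotonic_x(points):
    """Assumed rule: keep a point only if its x exceeds the last kept x."""
    kept = []
    for x, y in points:
        if not kept or x > kept[-1][0]:
            kept.append((x, y))
    return kept


def clip(points, xmin=-1e20, xmax=1e20, ymin=-1e20, ymax=1e20):
    """Remove points outside the x/y window; the defaults clip nothing."""
    return [(x, y) for x, y in points
            if xmin <= x <= xmax and ymin <= y <= ymax]


def digitize(points, n):
    """Resample onto n evenly spaced x-values (assumed: linear interpolation).

    n == 0 skips the step, matching the documented meaning of 0.
    """
    if n == 0 or len(points) < 2:
        return list(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, x1 = xs[0], xs[-1]
    out = []
    for i in range(n):
        x = x0 + (x1 - x0) * i / (n - 1) if n > 1 else x0
        # Index of the segment [xs[j], xs[j + 1]] that contains x.
        j = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        out.append((x, ys[j] + t * (ys[j + 1] - ys[j])))
    return out


# One raw curve with a backtracking x-value at (0.5, 9.0).
raw = [(0.0, 0.0), (1.0, 2.0), (0.5, 9.0), (2.0, 4.0), (3.0, 6.0)]

curve = enforce_monotonic_x(raw)   # drops (0.5, 9.0)
curve = clip(curve, xmax=2.0)      # drops (3.0, 6.0)
curve = digitize(curve, 5)         # 5 evenly spaced points on [0, 2]
print(curve)
```

Pre-digitization, smoothing, outlier removal, and X-sync are omitted here because their exact behavior is not specified on this page.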

When to use

Tagged: cfc_filter, clip, curve, dataset, digitize, loess, monotonic, outlier.

Inputs

Label | ID | Type | Default | Required | Description
Dataset | dataset | dataset | | | Input dataset (tabular) containing one or more curve-typed columns to be processed; must be non-empty.
Curve Columns | curve_columns | select | | | Names of the curve columns to process; if left empty, the worker returns the dataset unchanged.
Output Column Postfix | output_postfix | text | | | String appended to processed column names (e.g. ‘_processed’ yields ‘OUT_C1_processed’); leave blank to overwrite the original columns in place.
Monotonic Type | monotonic_type | select | x | | Axis on which to enforce monotonicity before all other steps: ‘x’ (default), ‘y’, ‘xy’, or ‘none’ to skip; applied uniformly to all selected columns.
Pre-Digitize Points | pre_digitize_points | text | 0 | | [per-column] Integer number of evenly spaced points for an optional pre-digitization pass; use 0 (default) to skip; comma-separated values apply different counts per column (e.g. ‘500,0,1000’).
Smooth Method | smooth_method | select | none | | Smoothing algorithm to apply after pre-digitization: ‘none’ (default, skip), ‘smooth’ (forward-backward average), ‘regression_smooth’ (LOESS), or ‘filter’ (CFC filter).
Smooth Points | smooth_points | scalar | 4 | | Number of averaging points (default 4) used when Smooth Method is ‘smooth’ (Average Smooth); ignored when another smooth method is selected.
Smooth Percentage | smooth_percentage | scalar | 30 | | Percentage of total points (default 30) used as the LOESS bandwidth when Smooth Method is ‘regression_smooth’ (Regression Smooth); ignored otherwise.
Filter Frequency | filter_freq | scalar | 60 | | Channel Frequency Class (CFC) cut-off frequency in Hz (default 60) used when Smooth Method is ‘filter’ (CFC Filter); ignored otherwise.
Remove Outlier Type | remove_outlier_type | select | none | | Strategy for removing outlier points after smoothing; set to ‘none’ (default) to skip; consult platform documentation for available type identifiers.
Clip X Min | clip_xmin | text | -1e20 | | [per-column] Minimum x-value for clipping; points with x below this threshold are removed (default -1e20, effectively no lower clip); comma-separated for per-column control.
Clip X Max | clip_xmax | text | 1e20 | | [per-column] Maximum x-value for clipping; points with x above this threshold are removed (default 1e20, effectively no upper clip); comma-separated for per-column control.
Clip Y Min | clip_ymin | text | -1e20 | | [per-column] Minimum y-value for clipping; points with y below this threshold are removed (default -1e20, effectively no lower clip); comma-separated for per-column control.
Clip Y Max | clip_ymax | text | 1e20 | | [per-column] Maximum y-value for clipping; points with y above this threshold are removed (default 1e20, effectively no upper clip); comma-separated for per-column control.
Sync X Start and End | sync_x_start_end | text | no | | [per-column] Whether to synchronize the x-axis start and/or end across all rows of a column before final digitization; ‘no’ (default) skips this step; comma-separated for per-column control.
Digitize Points | digitize_points | text | 100 | | [per-column] Number of evenly spaced points for the final digitization pass (default 100); use 0 to skip; comma-separated values apply different counts per column.
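The [per-column] inputs take comma-separated text such as ‘500,0,1000’. The sketch below shows one way such values could be expanded into a per-column mapping. The helper name expand_per_column is hypothetical, and two behaviors are assumptions not confirmed by this page: a single value is broadcast to every selected column, and a count mismatch raises an error. The settings dictionary only restates input IDs and defaults from the table; how parameters are actually submitted to the worker is not documented here.

```python
def expand_per_column(raw, columns, cast=float):
    """Map a comma-separated setting onto the selected columns.

    Assumptions (not confirmed by the worker docs): one value is
    broadcast to all columns; any other count mismatch is an error.
    """
    parts = [p.strip() for p in str(raw).split(",")]
    if len(parts) == 1:
        parts = parts * len(columns)
    if len(parts) != len(columns):
        raise ValueError(
            f"expected 1 or {len(columns)} values, got {len(parts)}: {raw!r}"
        )
    return {col: cast(p) for col, p in zip(columns, parts)}


# Illustrative settings keyed by the input IDs listed in the table.
# Column names are hypothetical.
settings = {
    "curve_columns": ["OUT_C1", "OUT_C2", "OUT_C3"],
    "output_postfix": "_processed",
    "monotonic_type": "x",
    "pre_digitize_points": "500,0,1000",  # per-column counts; 0 skips
    "smooth_method": "none",
    "clip_xmin": "-1e20",                 # single value for all columns
    "clip_xmax": "1e20",
    "sync_x_start_end": "no",
    "digitize_points": "100",
}

columns = settings["curve_columns"]
per_column = {
    "pre_digitize_points": expand_per_column(
        settings["pre_digitize_points"], columns, int),
    "clip_xmin": expand_per_column(settings["clip_xmin"], columns),
    "digitize_points": expand_per_column(
        settings["digitize_points"], columns, int),
    "sync_x_start_end": expand_per_column(
        settings["sync_x_start_end"], columns, str),
}
print(per_column["pre_digitize_points"])
```

Parsing the text once up front keeps each later stage simple: every stage can look up its own value by column name instead of re-splitting the string.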

Outputs

Label | ID | Type | Description
Processed Dataset | dataset | dataset | Dataset identical in structure to the input but with the selected curve columns replaced (or augmented, if an output postfix was specified) by their fully processed counterparts.
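The replace-versus-augment behavior of the output can be sketched as follows. The dataset is modeled here as a plain dictionary of column name to list of curves, which is an assumption made for illustration; the function name write_processed and the stand-in process callback are hypothetical, and the position of any added column in the real output is not specified on this page.

```python
def write_processed(dataset, curve_columns, output_postfix, process):
    """Return a copy of dataset with processed curve columns written back.

    A blank postfix overwrites the original column; a non-blank postfix
    adds a new column next to it. No selected columns means the dataset
    is returned unchanged.
    """
    if not curve_columns:
        return dataset
    result = dict(dataset)
    for col in curve_columns:
        result[col + output_postfix] = [process(c) for c in dataset[col]]
    return result


def first_two_points(curve):
    # Stand-in for the real processing pipeline.
    return curve[:2]


dataset = {
    "id": [1, 2],
    "OUT_C1": [
        [(0, 0), (1, 1), (2, 2)],
        [(0, 0), (1, 3), (2, 6)],
    ],
}

augmented = write_processed(dataset, ["OUT_C1"], "_processed", first_two_points)
replaced = write_processed(dataset, ["OUT_C1"], "", first_two_points)
unchanged = write_processed(dataset, [], "_processed", first_two_points)

print(sorted(augmented))  # original column kept, processed column added
print(sorted(replaced))   # same columns as the input
```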

Disciplines

  • data.curve.transform
  • data.dataset.transform
  • data.signal_processing

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_process_curve_columns


Auto-generated from platform schema. Worker id: dataset_process_curve_columns. Schema hash: c287abbde5a6. Hand-curated docs in workerexamples/ override this page when present.