FILL SPARSE REGION WITH MORE CURVES

Augments a dataset that contains curve columns by synthetically filling sparse regions with interpolated or scaled curves. It supports two modes: gap-filling, which inserts averaged curves between existing samples until a coefficient-of-variation (CoV) threshold is met, and nearby-record generation, which scales target curves by small noise factors drawn by Latin Hypercube Sampling (LHS) to densify a specific region of the design space.
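The gap-filling mode can be illustrated with a minimal sketch. It is not the worker's implementation: it assumes that all curves share one x grid, that the CoV is taken over the gaps between neighbouring gap-feature values, and that each new curve is the pointwise average of the two curves bounding the widest gap. The function names and the max_new safety cap are hypothetical.

```python
from statistics import mean, pstdev

def gap_feature(curve, gap_type="ymax"):
    """Reduce a curve, given as a list of (x, y) points, to one scalar."""
    ys = [y for _, y in curve]
    features = {"ymax": max(ys), "yfirst": ys[0], "ylast": ys[-1], "yavg": mean(ys)}
    return features[gap_type]

def gap_cov(features):
    """CoV (population std / mean) of the gaps between sorted neighbours.
    Assumption: the CoV is measured on the gaps, not on the raw features."""
    gaps = [b - a for a, b in zip(features, features[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0
    return pstdev(gaps) / mean(gaps)

def fill_gaps(curves, gap_type="ymax", threshold=0.2, nrow=None, max_new=100):
    """Insert averaged curves into the widest gap until the CoV target,
    the requested row count, or the (hypothetical) max_new cap is reached."""
    rows = [{"curve": c, "new_row": "no"} for c in curves]
    added = 0
    while True:
        rows.sort(key=lambda r: gap_feature(r["curve"], gap_type))
        feats = [gap_feature(r["curve"], gap_type) for r in rows]
        if gap_cov(feats) <= threshold:
            break
        if added >= max_new or (nrow is not None and len(rows) >= nrow):
            break
        # Locate the widest gap and average the two curves that bound it.
        i = max(range(len(feats) - 1), key=lambda k: feats[k + 1] - feats[k])
        a, b = rows[i]["curve"], rows[i + 1]["curve"]
        averaged = [(xa, (ya + yb) / 2) for (xa, ya), (_, yb) in zip(a, b)]
        rows.append({"curve": averaged, "new_row": "yes"})
        added += 1
    return rows

# Three curves of the same shape with ymax = 1, 2 and 4.
shape = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 0.8)]
curves = [[(x, y * s) for x, y in shape] for s in (1.0, 2.0, 4.0)]
filled = fill_gaps(curves)
```

In this example the gaps between the ymax values are 1 and 2, so the CoV starts near 0.33. One averaged curve with ymax = 3 evens the gaps out, and the loop stops. Bisection does not always reach a given CoV target, which is why the sketch also stops on nrow or max_new.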

When to use

Tagged: curve_augmentation, curve_interpolation, data_augmentation, doe, engss, gap_filling, lhs, noise_scaling.

Inputs

Label	ID	Type	Default	Required	Description
Dataset	dataset	dataset	—	✓	Input dataset containing one or more curve columns whose sparse regions are to be filled; must include the column referenced by curve_column_name.
curve_column_name	curve_column_name	text	—		Name of the dataset column that holds the curve objects (text, number, or curve type). If several names are given, only the first is used. Select it from the dependent list populated by the input dataset.
gap_type	gap_type	text	—		Scalar feature used to measure the gap between neighbouring curves. Accepted values are ymax, yfirst, ylast, and yavg; defaults to ymax if left blank.
threshold	threshold	text	—		Convergence criterion, expressed as a coefficient of variation (CoV): the ratio of the standard deviation to the mean of the gap-feature values. New curves are added until this target is reached; defaults to 0.2 if left blank.
nrow	nrow	text	—		Desired total row count of the output dataset after augmentation; leave blank to let the threshold criterion alone determine when to stop adding rows.
fill_type	fill_type	text	—		Augmentation strategy: use ‘gap’ for curves with similar shapes and value ranges, or ‘engss’ for engineering stress-strain curves (interpolates using xlast and ymax); defaults to ‘gap’.
Number of New Nearby Records	generate_nearby_records	text	—		Number of new synthetic rows to generate near each target curve by LHS-based noise scaling; set to 0 (the default) to use gap-filling mode instead.
Percentage of curve range to be used for generating noise	noise_level_percentage	text	—		Fraction of the full dataset range (xlast and ymax) used as the ±noise envelope when generating nearby records. Despite the label, the value is a dimensionless fraction, not a percentage: 0.05 means 5%. Defaults to 0.05.
Dataset with Target Curves	dataset_targets	dataset	—		Optional secondary dataset whose rows define the target curves around which nearby records are generated; if omitted, the primary input dataset is used as the target.
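The nearby-record inputs can be illustrated with a minimal sketch. It is not the worker's implementation: it assumes that "full dataset range" means max minus min of xlast and of ymax across the dataset, and that each new curve is the target curve scaled along x and y so that its xlast and ymax move by an LHS-sampled offset inside that envelope. The function names lhs and nearby_records are hypothetical.

```python
import random

def lhs(n, dims, rng):
    """n Latin Hypercube samples in [0, 1)^dims: each dimension is cut into
    n equal strata, and every stratum receives exactly one sample."""
    columns = []
    for _ in range(dims):
        col = [(k + rng.random()) / n for k in range(n)]
        rng.shuffle(col)  # decouple the strata across dimensions
        columns.append(col)
    return list(zip(*columns))

def nearby_records(target, dataset, n_new, noise_level=0.05, seed=0):
    """Generate n_new scaled copies of target (a list of (x, y) points).
    Assumes the target has non-zero xlast and ymax."""
    rng = random.Random(seed)
    xlasts = [c[-1][0] for c in dataset]
    ymaxs = [max(y for _, y in c) for c in dataset]
    # Assumption: the noise envelope is a fraction of (max - min).
    x_env = noise_level * (max(xlasts) - min(xlasts))
    y_env = noise_level * (max(ymaxs) - min(ymaxs))
    t_xlast = target[-1][0]
    t_ymax = max(y for _, y in target)
    records = []
    for u, v in lhs(n_new, 2, rng):
        sx = 1 + (2 * u - 1) * x_env / t_xlast  # x scale factor
        sy = 1 + (2 * v - 1) * y_env / t_ymax   # y scale factor
        records.append([(x * sx, y * sy) for x, y in target])
    return records

# Two curves: xlast = 1.0 and 2.0, ymax = 10 and 20.
dataset = [
    [(0.0, 0.0), (0.5, 10.0), (1.0, 6.0)],
    [(0.0, 0.0), (1.0, 20.0), (2.0, 12.0)],
]
new_curves = nearby_records(dataset[0], dataset, n_new=5)
```

With noise_level = 0.05, the envelope here is ±0.05 in xlast and ±0.5 in ymax, so every generated curve ends between x = 0.95 and x = 1.05 and peaks between 9.5 and 10.5. To mimic the optional dataset_targets input, the same function would be called once per target row.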

Outputs

Label	ID	Type	Description
Output New Dataset with More Rows	dataset	dataset	Augmented dataset containing all original rows plus the newly synthesised curve rows; a ‘new_row’ flag column (yes/no) distinguishes synthetic entries from originals.
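Downstream steps can use the new_row flag to separate synthetic rows from originals. The sketch below assumes the output dataset has been exported to a list of dictionaries; the id column and the helper name split_by_new_row are hypothetical.

```python
def split_by_new_row(rows):
    """Split rows into (originals, synthetic) using the new_row flag."""
    originals = [r for r in rows if r.get("new_row") == "no"]
    synthetic = [r for r in rows if r.get("new_row") == "yes"]
    return originals, synthetic

# Hypothetical export of an augmented dataset.
rows = [
    {"id": 1, "new_row": "no"},
    {"id": 2, "new_row": "yes"},
    {"id": 3, "new_row": "no"},
]
originals, synthetic = split_by_new_row(rows)
```

Keeping the two groups apart is useful, for example, when a model is trained on the augmented data but validated only against the original curves.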

Disciplines

ai_ml.preprocessing
data.curve.pair
data.curve.transform
data.dataset.transform
design_exploration.doe

Runnable example

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_fill_sparse_region_with_curves

Auto-generated from platform schema. Worker id: dataset_fill_sparse_region_with_curves. Schema hash: 90983071bc41. Hand-curated docs in workerexamples/ override this page when present.