GENERATE SAMPLING POINTS

Generates a structured set of sampling points (experiments) from a defined variable space using a choice of classical and advanced DOE strategies — including LHS, full factorial, D-Optimal, Taguchi, and Definitive Screening Design. Use this worker to create the design matrix at the start of any design-exploration, surrogate-training, or sensitivity-analysis workflow.
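To picture the design matrix this worker produces, here is a minimal stdlib-only sketch of one of the listed strategies, Latin hypercube sampling (LHS). This illustrates the general technique, not the worker's internal code; the variable names and bounds are hypothetical.

```python
import random

def lhs(variables, num_experiments, seed=0):
    """Latin hypercube sample: stratify each variable's range into
    num_experiments equal bins, draw one point per bin, and shuffle
    the bin order independently per variable."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in variables.items():
        bins = list(range(num_experiments))
        rng.shuffle(bins)
        width = (hi - lo) / num_experiments
        columns[name] = [lo + (b + rng.random()) * width for b in bins]
    # Rows of the design matrix, one dict per experiment
    return [{name: col[i] for name, col in columns.items()}
            for i in range(num_experiments)]

# Hypothetical variables with (min, max) bounds
design = lhs({"thickness": (1.0, 3.0), "radius": (10.0, 50.0)},
             num_experiments=10)
```

Each variable ends up with exactly one sample in each of its ten strata, which is what makes LHS space-filling with far fewer runs than a full factorial.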

When to use

Tagged: d-optimal, design_matrix, design_of_experiments, doe, dsd, fractional_factorial, full_factorial, lhs.

Inputs

Label ID Type Default Required Description
Variables variables dataset (complex) Dataset defining each design variable — provide name, type (continuous / discrete / constant), and min/default/max bounds; at least one variable is required to build the design space.
Sampling Type sampling_type select d-opt DOE strategy to use for generating points; choose one or more from D-Optimal, Full Factorial, LHS, D-Optimal GA, Space Filling, Taguchi Matrix, Definitive Screening Design, or Fractional Factorial — defaults to D-Optimal.
Points Per Variable num_points_per_variable scalar 3 Number of discrete level values sampled per variable when constructing the candidate set; default is 3 — increase for finer resolution in factorial-style methods.
Number Of Experiments num_experiments scalar 10 Total number of experimental runs (rows) to select for the final design matrix; default is 10 — ignored by methods that fix run count (e.g. full factorial).
Normalize Variables normalize select no   Whether to normalize all variable values to [0, 1] before generating points; default is ‘no’ — set to ‘yes’ when variables have very different scales.
Use Per Variable Samples use_per_variable_num_samples select no   Whether to use each variable’s own sample-count setting instead of the global num_points_per_variable; default is ‘no’ — set to ‘yes’ when variables require different resolution levels.
Split Input Ranges split_input_ranges select no   When yes, split each eligible continuous variable’s [min,max] into sub-ranges (size controlled by step_percentage) and run DOE for each bucket, aggregating the results. A variable is eligible when its max/min ratio exceeds 100/step_percentage. Variables listed in log_variables are split in log-space automatically (bucket boundaries become log-spaced in the original domain), which is recommended for variables spanning multiple orders of magnitude.
Step Percentage step_percentage scalar 20   Size of each sub-range as a percentage of the variable’s full range. Also sets the eligibility threshold: a variable is split when max/min > 100/step_percentage. Default 20 gives 5 buckets and a 5x span threshold.
Apply Split To All Variables apply_split_to_all_variables select no   When yes, split every continuous variable regardless of how wide its range is. When no, only split variables whose relative range exceeds step_percentage.
Constraints constraints dataset   The dataset should contain columns named needle, condition, and target, where needle is the variable name, condition is the comparison operator (e.g. gt, lt), and target is the value or variable to check against. For example, to require X2 to be greater than X1, set needle=X2, condition=GT, target=X1.
Filter Constants filter_constants select no   Exclude constant variables prior to sampling
Reverse Constraints Order reverse_contraints_order select yes    
Design Prefix prefix scalar ITER_ Prefix used when naming generated designs.
Design Iteration design_iter scalar 1 Iteration number associated with the generated designs.
Type of Design design_type select no   When set to ‘yes’, return just the first design.
Merge points based on Proximity proximity_merge select no   When multiple sampling schemes are selected, this merges closely neighboring points based on Euclidean distance.
Proximity Merge Threshold proximity_merge_threshold scalar 0.01   The Euclidean distance for each row is multiplied by this number to obtain the threshold value. Rows whose values fall within this tolerance are replaced with their averaged values.
Proximity Treatment proximity_treatment string merge   The default treatment merges the close points; they can instead be removed.
Baseline Detection baseline_type string value_match   Selects the baseline design from the generated experiments. If no match is found, the first row is used as the baseline.
Cross Sample Size num_cross_size text   Number for the Cross Sample Size
Log Variables log_variables textarea   Comma- or newline-separated list of variable names to sample in log space. Their min/max/default are log-transformed before sampling and exp-transformed back on output. When split_input_ranges is yes, these variables also get log-spaced buckets (use for variables like beta that span several orders of magnitude).
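The split_input_ranges, step_percentage, and log_variables rules above can be sketched as follows. This is a stdlib-only illustration of the stated formulas (100/step_percentage buckets, eligibility when the max/min ratio exceeds that count, log-spaced bucket edges for log variables), not the worker's implementation; the bounds are hypothetical.

```python
import math

def split_buckets(lo, hi, step_percentage=20, log_space=False):
    """Split [lo, hi] into 100/step_percentage sub-ranges when the
    variable is eligible (max/min ratio exceeds 100/step_percentage).
    Log-space splitting makes the bucket edges log-spaced in the
    original domain, suiting variables that span orders of magnitude."""
    n = int(100 / step_percentage)      # e.g. 20% -> 5 buckets
    if hi / lo <= n:                    # eligibility threshold
        return [(lo, hi)]               # too narrow to split
    if log_space:
        a, b = math.log(lo), math.log(hi)
        edges = [math.exp(a + (b - a) * i / n) for i in range(n + 1)]
    else:
        edges = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return list(zip(edges[:-1], edges[1:]))

# A variable spanning several orders of magnitude (hypothetical bounds)
buckets = split_buckets(1e-3, 1e2, step_percentage=20, log_space=True)
```

With the default step_percentage of 20, each log-spaced bucket above covers one factor of 10, so the DOE run for each bucket samples a comparable relative range.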

Outputs

Label ID Type Description
Experiments experiments dataset Dataset of generated experimental runs — each row is one simulation/test point with columns corresponding to the input variable names and their assigned values.
Baseline Design baseline_design dataset Dataset containing the baseline (default) design point derived from the variable definitions — useful as a reference run for delta-comparisons and normalization.
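The needle/condition/target constraint schema described in the Inputs table can be sketched as a row filter over the generated experiments. Only the GT/LT operators from the documentation's example are shown, and treating target as either a literal value or another variable name is an assumption drawn from that example; the worker's full operator set may differ.

```python
def satisfies(row, needle, condition, target):
    """Evaluate one constraint row (needle, condition, target) against
    a design point. target may be a literal or the name of another
    variable, as in the needle=X2, condition=GT, target=X1 example."""
    lhs = row[needle]
    rhs = row[target] if isinstance(target, str) and target in row else target
    op = condition.lower()
    if op == "gt":
        return lhs > rhs
    if op == "lt":
        return lhs < rhs
    raise ValueError(f"unsupported condition: {condition}")

def apply_constraints(experiments, constraints):
    """Keep only the design points satisfying every constraint."""
    return [row for row in experiments
            if all(satisfies(row, c["needle"], c["condition"], c["target"])
                   for c in constraints)]

# Hypothetical experiments and the documented X2 > X1 constraint
runs = [{"X1": 1.0, "X2": 2.0}, {"X1": 3.0, "X2": 2.5}]
kept = apply_constraints(runs,
                         [{"needle": "X2", "condition": "GT", "target": "X1"}])
```

Here the second run is dropped because its X2 (2.5) is not greater than its X1 (3.0), so only constraint-satisfying rows reach the output experiments dataset.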

Disciplines

  • design_exploration.doe

Auto-generated from platform schema. Worker id: doe_sampling_point_generator. Schema hash: aebc6d685ac6. Hand-curated docs in workerexamples/ override this page when present.