GENERATE SAMPLING POINTS

Generates a structured set of sampling points (experiments) from a defined variable space using a choice of classical and advanced DOE strategies — including LHS, full factorial, D-Optimal, Taguchi, and Definitive Screening Design. Use this worker to create the design matrix at the start of any design-exploration, surrogate-training, or sensitivity-analysis workflow.
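To picture the design matrix this worker produces, here is a minimal stdlib-only sketch of one of the listed strategies, Latin hypercube sampling (LHS). This illustrates the general technique, not the worker's internal code; the variable names and bounds are hypothetical.

```python
import random

def lhs(variables, num_experiments, seed=0):
    """Latin hypercube sample: stratify each variable's range into
    num_experiments equal bins, draw one point per bin, and shuffle
    the bin order independently per variable."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in variables.items():
        bins = list(range(num_experiments))
        rng.shuffle(bins)
        width = (hi - lo) / num_experiments
        columns[name] = [lo + (b + rng.random()) * width for b in bins]
    # Rows of the design matrix, one dict per experiment
    return [{name: col[i] for name, col in columns.items()}
            for i in range(num_experiments)]

# Hypothetical variables with (min, max) bounds
design = lhs({"thickness": (1.0, 3.0), "radius": (10.0, 50.0)},
             num_experiments=10)
```

Each variable ends up with exactly one sample in each of its ten strata, which is what makes LHS space-filling with far fewer runs than a full factorial.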

When to use

Tagged: d-optimal, design_matrix, design_of_experiments, doe, dsd, fractional_factorial, full_factorial, lhs.

Inputs

Label ID Type Default Required Description
Variables variables dataset (complex) Dataset defining each design variable — provide name, type (continuous / discrete / constant), and min/default/max bounds; at least one variable is required to build the design space.
Sampling Type sampling_type select d-opt DOE strategy to use for generating points; choose one or more from D-Optimal, Full Factorial, LHS, D-Optimal GA, Space Filling, Taguchi Matrix, Definitive Screening Design, or Fractional Factorial — defaults to D-Optimal.
Points Per Variable num_points_per_variable scalar 3 Number of discrete level values sampled per variable when constructing the candidate set; default is 3 — increase for finer resolution in factorial-style methods.
Number Of Experiments num_experiments scalar 10 Total number of experimental runs (rows) to select for the final design matrix; default is 10 — ignored by methods that fix run count (e.g. full factorial).
Normalize Variables normalize select no   Whether to normalize all variable values to [0, 1] before generating points; default is ‘no’ — set to ‘yes’ when variables have very different scales.
Use Per Variable Samples use_per_variable_num_samples select no   Whether to use each variable’s own sample-count setting instead of the global num_points_per_variable; default is ‘no’ — set to ‘yes’ when variables require different resolution levels.
Split Input Ranges split_input_ranges select no   When yes, split each eligible continuous variable’s [min,max] into sub-ranges (size controlled by step_percentage) and run DOE for each bucket, aggregating the results. A variable is eligible when its max/min ratio exceeds 100/step_percentage. Variables listed in log_variables are split in log-space automatically (bucket boundaries become log-spaced in the original domain), which is recommended for variables spanning multiple orders of magnitude.
Step Percentage step_percentage scalar 20   Size of each sub-range as a percentage of the variable’s full range. Also sets the eligibility threshold: a variable is split when max/min > 100/step_percentage. Default 20 gives 5 buckets and a 5x span threshold.
Apply Split To All Variables apply_split_to_all_variables select no   When yes, split every continuous variable regardless of how wide its range is. When no, only split variables whose relative range exceeds step_percentage.
Constraints constraints dataset   The dataset should contain columns named needle, condition, and target, where needle is the variable name, condition is the comparison operator (e.g. gt, lt), and target is the value or variable to check against. For example, to require X2 to be greater than X1, set needle=X2, condition=GT, target=X1.
Filter Constants filter_constants select no   Exclude constant variables prior to sampling
Reverse Constraints Order reverse_contraints_order select yes    
Design Prefix prefix scalar ITER_ Prefix used when naming generated designs.
Design Iteration design_iter scalar 1 Iteration number associated with the generated designs.
Type of Design design_type select no   When set to ‘yes’, return just the first design.
Merge points based on Proximity proximity_merge select no   When multiple sampling schemes are selected, this merges closely neighboring points based on Euclidean distance.
Proximity Merge Threshold proximity_merge_threshold scalar 0.01   The Euclidean distance for each row is multiplied by this number to obtain the threshold value. Rows whose values fall within this tolerance are replaced with their averaged values.
Proximity Treatment proximity_treatment string merge   The default treatment merges the close points; they can instead be removed.
Baseline Detection baseline_type string value_match   Selects the baseline design from the generated experiments. If no match is found, the first row is used as the baseline.
Cross Sample Size num_cross_size text   Number for the Cross Sample Size
Log Variables log_variables textarea   Comma- or newline-separated list of variable names to sample in log space. Their min/max/default are log-transformed before sampling and exp-transformed back on output. When split_input_ranges is yes, these variables also get log-spaced buckets (use for variables like beta that span several orders of magnitude).
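The split_input_ranges, step_percentage, and log_variables rules above can be sketched as follows. This is a stdlib-only illustration of the stated formulas (100/step_percentage buckets, eligibility when the max/min ratio exceeds that count, log-spaced bucket edges for log variables), not the worker's implementation; the bounds are hypothetical.

```python
import math

def split_buckets(lo, hi, step_percentage=20, log_space=False):
    """Split [lo, hi] into 100/step_percentage sub-ranges when the
    variable is eligible (max/min ratio exceeds 100/step_percentage).
    Log-space splitting makes the bucket edges log-spaced in the
    original domain, suiting variables that span orders of magnitude."""
    n = int(100 / step_percentage)      # e.g. 20% -> 5 buckets
    if hi / lo <= n:                    # eligibility threshold
        return [(lo, hi)]               # too narrow to split
    if log_space:
        a, b = math.log(lo), math.log(hi)
        edges = [math.exp(a + (b - a) * i / n) for i in range(n + 1)]
    else:
        edges = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return list(zip(edges[:-1], edges[1:]))

# A variable spanning several orders of magnitude (hypothetical bounds)
buckets = split_buckets(1e-3, 1e2, step_percentage=20, log_space=True)
```

With the default step_percentage of 20, each log-spaced bucket above covers one factor of 10, so the DOE run for each bucket samples a comparable relative range.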

Outputs

Label ID Type Description
Experiments experiments dataset Dataset of generated experimental runs — each row is one simulation/test point with columns corresponding to the input variable names and their assigned values.
Baseline Design baseline_design dataset Dataset containing the baseline (default) design point derived from the variable definitions — useful as a reference run for delta-comparisons and normalization.
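The needle/condition/target constraint schema described in the Inputs table can be sketched as a row filter over the generated experiments. Only the GT/LT operators from the documentation's example are shown, and treating target as either a literal value or another variable name is an assumption drawn from that example; the worker's full operator set may differ.

```python
def satisfies(row, needle, condition, target):
    """Evaluate one constraint row (needle, condition, target) against
    a design point. target may be a literal or the name of another
    variable, as in the needle=X2, condition=GT, target=X1 example."""
    lhs = row[needle]
    rhs = row[target] if isinstance(target, str) and target in row else target
    op = condition.lower()
    if op == "gt":
        return lhs > rhs
    if op == "lt":
        return lhs < rhs
    raise ValueError(f"unsupported condition: {condition}")

def apply_constraints(experiments, constraints):
    """Keep only the design points satisfying every constraint."""
    return [row for row in experiments
            if all(satisfies(row, c["needle"], c["condition"], c["target"])
                   for c in constraints)]

# Hypothetical experiments and the documented X2 > X1 constraint
runs = [{"X1": 1.0, "X2": 2.0}, {"X1": 3.0, "X2": 2.5}]
kept = apply_constraints(runs,
                         [{"needle": "X2", "condition": "GT", "target": "X1"}])
```

Here the second run is dropped because its X2 (2.5) is not greater than its X1 (3.0), so only constraint-satisfying rows reach the output experiments dataset.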

Disciplines

  • design_exploration.doe

Auto-generated from platform schema. Worker id: doe_sampling_point_generator. Schema hash: aebc6d685ac6. Hand-curated docs in workerexamples/ override this page when present.