DATASET SORT BY ROW¶

Sorts the rows of a dataset by their similarity (or difference) to a reference row from a compare dataset, using raw, absolute, or squared difference metrics. Returns the top-N closest or most-distant rows in ascending or descending order, optionally restricting comparison to specific columns and applying normalization.

When to use¶

Classification: process.

Tagged: compare, dataset_sort, diff, filter, rank, row_distance, sort.

Inputs¶

Label	ID	Type	Default	Description
Dataset	dataset	dataset	—	The input dataset whose rows will be scored and sorted against the reference row; each row is treated as a numeric vector for difference computation.
Compare Dataset	compare_row	dataset	—	Reference dataset whose first row is used as the comparison vector; all rows in ‘dataset’ are compared against this single row.
Diff Type	diff_type	scalar	raw_diff	Method used to compute the difference between each row and the reference row: ‘raw_diff’ (signed), ‘abs_diff’ (absolute value), or ‘sq_diff’ (squared); defaults to ‘raw_diff’.
Order	order	scalar	desc	Sort order for the output rows based on their computed difference score: ‘desc’ (largest difference first) or ‘asc’ (smallest difference first); defaults to ‘desc’.
Limit	limit	scalar	1	Maximum number of rows to return after sorting; defaults to 1 (return only the top-scoring row).
Return Type	return_type	scalar	original	Controls the content of the returned dataset: ‘original’ returns the original row values, or an alternative mode returns the computed difference values; defaults to ‘original’.
Columns To Match	columns	scalar	—	Columns used to compute the difference and find the closest match.
Normalize	normalize	scalar	no	Whether to normalize columns before sorting; defaults to ‘no’.
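
The three ‘diff_type’ options can be illustrated with a minimal Python sketch. This is not the worker's implementation: the function name row_score, the dictionary row format, and above all the choice to sum per-column differences into a single score are assumptions made for illustration, since the schema does not state how column differences are combined.

```python
def row_score(row, reference, diff_type="raw_diff", columns=None):
    """Score one row against the reference row.

    Assumption: per-column differences are summed into one score.
    The worker's actual aggregation is not documented here.
    """
    # Restrict the comparison to the requested columns, else use them all.
    names = columns if columns is not None else list(reference)
    total = 0.0
    for name in names:
        diff = row[name] - reference[name]  # raw_diff keeps the sign
        if diff_type == "abs_diff":
            diff = abs(diff)
        elif diff_type == "sq_diff":
            diff = diff * diff
        elif diff_type != "raw_diff":
            raise ValueError(f"unknown diff_type: {diff_type}")
        total += diff
    return total


reference = {"mass": 10.0, "speed": 4.0}
row = {"mass": 12.0, "speed": 1.0}

raw = row_score(row, reference, "raw_diff")        # 2 + (-3) = -1
absolute = row_score(row, reference, "abs_diff")   # 2 + 3 = 5
squared = row_score(row, reference, "sq_diff")     # 4 + 9 = 13
mass_only = row_score(row, reference, "abs_diff", columns=["mass"])  # 2
```

Under this summing assumption, signed differences can cancel each other, so ‘raw_diff’ may rank a distant row as close; ‘abs_diff’ or ‘sq_diff’ avoid that when the goal is a nearest-match search.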

Outputs¶

Label	ID	Type	Description
Dataset	dataset	dataset	Sorted and filtered dataset containing up to ‘limit’ rows ranked by their difference score relative to the reference row, in the specified order and return type format.
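
How ‘order’, ‘limit’, and ‘return_type’ shape the output can be sketched as follows. This is a hedged illustration only: the function sort_by_row, the list-of-dictionaries dataset format, the summed absolute-difference score, and the treatment of any ‘return_type’ other than ‘original’ as "return the differences" are all assumptions. Normalization is not modelled.

```python
def sort_by_row(dataset, compare, order="desc", limit=1, return_type="original"):
    """Rank rows of `dataset` against the first row of `compare`.

    Assumptions: rows are dicts of numbers, the score is the sum of
    absolute per-column differences, and any return_type other than
    'original' yields the per-column differences instead of the row.
    """
    reference = compare[0]  # only the first compare row is used
    scored = []
    for row in dataset:
        diffs = {name: abs(row[name] - reference[name]) for name in reference}
        scored.append((sum(diffs.values()), row, diffs))

    # 'desc' puts the largest difference first, 'asc' the smallest.
    scored.sort(key=lambda item: item[0], reverse=(order == "desc"))
    top = scored[:limit]

    if return_type == "original":
        return [row for _, row, _ in top]
    return [diffs for _, _, diffs in top]


dataset = [
    {"x": 1.0, "y": 1.0},
    {"x": 5.0, "y": 5.0},
    {"x": 2.0, "y": 3.0},
]
compare = [{"x": 2.0, "y": 2.0}]

closest_two = sort_by_row(dataset, compare, order="asc", limit=2)
farthest = sort_by_row(dataset, compare)  # defaults: order='desc', limit=1
closest_diffs = sort_by_row(dataset, compare, order="asc", return_type="diff")
```

With the defaults (‘desc’, limit 1) the result is the single most-distant row, so a closest-match lookup needs ‘order’ set to ‘asc’.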

Disciplines¶

data.dataset.transform
data.statistics

Runnable example¶

A runnable example is registered for this worker. Open the example workflow on the d3VIEW canvas: /api/workflow/example?id=dataset_sort_by_row

Auto-generated from transformation schema. Worker id: dataset_sort_by_row. Schema hash: 4195bb5e3bdd. Hand-curated docs in workerexamples/ override this page when present.