.. _auto_ontology_dataset:

*ONTOLOGY DATASET EXTRACTOR*
============================

Extracts entities from an RDF/SPARQL ontology store and flattens them into a tabular dataset ready for machine learning. Each row represents one entity of the specified type; columns are derived from the entity's own properties or from related child entities traversed via configurable relationship paths, with separate input-feature and target-label column groups.

When to use
-----------

Tagged: ``arc2``, ``dataset``, ``feature_extraction``, ``ml_prep``, ``ontology``, ``rdf``, ``sparql``, ``tabular``.

Inputs
------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20 20 20

   * - Label
     - ID
     - Type
     - Default
     - Required
     - Description
   * - Ontology Store Name
     - ontology_store_name
     - text
     - —
     - ✓
     - Name of the ARC2 RDF ontology store to query; the prefix 'arc2_ontology_' is added automatically if omitted (e.g., enter 'my_model' to target the store 'arc2_ontology_my_model').
   * - Entity Type (Rows)
     - entity_type
     - text
     - —
     - ✓
     - RDF entity type whose instances become dataset rows (e.g., 'SystemModel'); must match a type present in the ontology store.
   * - Include Name Column
     - include_name_column
     - select
     - yes
     -
     - Controls whether a human-readable name column is added alongside the entity ID column; default 'yes' includes both ID and Name, 'no' includes ID only.
   * - Input Columns Relationship Path
     - input_cols_relationship_path
     - text
     - —
     -
     - Comma-separated chain of RDF relationship names to traverse from the row entity to the child nodes that become input feature columns (e.g., 'hasSimulation,hasResponse'); leave empty to use the entity's own properties directly.
   * - Input Columns Data Mode
     - input_cols_data
     - select
     - binary
     -
     - Encoding mode for input columns: 'binary' (default) writes 1/0 for child presence and requires a non-empty relationship path, while 'properties' writes the numeric/string property values of the resolved child nodes.
   * - Input Columns Filter Key
     - input_cols_filter_key
     - text
     - —
     -
     - Optional property name used to filter which child nodes contribute input columns (e.g., 'name'); matching is case-insensitive substring containment; leave empty to include all children.
   * - Input Columns Filter Value
     - input_cols_filter_value
     - text
     - —
     -
     - Substring value to match against the property specified in input_cols_filter_key (e.g., 'PERFORMANCE'); leave empty if no filtering is needed.
   * - Target Columns Relationship Path
     - target_cols_relationship_path
     - text
     - —
     -
     - Comma-separated RDF relationship path to the child nodes that become target/label columns (e.g., 'hasSimulation,hasResponse'); leave empty to produce a dataset with no target columns.
   * - Target Columns Data Mode
     - target_cols_data
     - select
     - properties
     -
     - Encoding mode for target columns: 'properties' (default) writes child property values; 'binary' writes 1/0 for child presence and requires a non-empty relationship path.
   * - Target Columns Filter Key
     - target_cols_filter_key
     - text
     - —
     -
     - Optional property name used to filter which child nodes contribute target columns; uses substring containment matching; leave empty to include all children on the target path.
   * - Target Columns Filter Value
     - target_cols_filter_value
     - text
     - —
     -
     - Substring value to match against the property specified in target_cols_filter_key; leave empty if no filtering is needed on the target path.

Outputs
-------

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 20

   * - Label
     - ID
     - Type
     - Description
   * - Dataset
     - dataset
     - dataset
     - Tabular dataset (d3VIEW dataset type) with one row per entity instance and columns for entity ID, optional name, all resolved input features, and all resolved target labels.
   * - Input Columns
     - input_columns
     - array
     - Ordered array of column name strings that correspond to the input (feature) columns in the extracted dataset, ready for direct use in ML worker feature-selection fields.
   * - Target Columns
     - target_columns
     - array
     - Ordered array of column name strings that correspond to the target (label) columns in the extracted dataset, ready for direct use in ML worker target-selection fields.
   * - Summary
     - summary
     - text
     - Human-readable text summary of the extraction result, including entity type queried, total row count, and number of input and target columns generated.

Disciplines
-----------

- ai_ml.preprocessing
- data.dataset.transform
- platform.ontology
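
Example
-------

The input descriptions above imply a small amount of logic: store-name prefixing, comma-separated path traversal, case-insensitive substring filtering, and 1/0 encoding in 'binary' mode. The following is a minimal Python sketch of that logic, not the worker's actual implementation. It assumes an in-memory structure of nested dictionaries as a stand-in for the ontology store, assumes that columns are named after each child's 'name' property, and uses hypothetical function names throughout.

.. code-block:: python

   STORE_PREFIX = "arc2_ontology_"


   def normalize_store_name(name):
       """Add the 'arc2_ontology_' prefix when it was omitted."""
       name = name.strip()
       return name if name.startswith(STORE_PREFIX) else STORE_PREFIX + name


   def parse_relationship_path(path):
       """Split a comma-separated relationship path into its steps."""
       return [step.strip() for step in path.split(",") if step.strip()]


   def resolve_children(entity, path):
       """Follow each relationship in turn, collecting the final child nodes."""
       nodes = [entity]
       for relationship in path:
           nodes = [child for node in nodes for child in node.get(relationship, [])]
       return nodes


   def matches_filter(node, key, value):
       """Case-insensitive substring match; an empty key keeps every node."""
       if not key:
           return True
       return value.lower() in str(node.get(key, "")).lower()


   def binary_columns(entities, path, key="", value=""):
       """Return (column names, rows) with 1/0 marking child presence per entity."""
       per_entity = []
       for entity in entities:
           children = resolve_children(entity, path)
           # Assumption: each column is named after the child's 'name' property.
           per_entity.append(
               {c["name"] for c in children if matches_filter(c, key, value)}
           )
       columns = sorted(set().union(*per_entity))
       rows = [[1 if col in names else 0 for col in columns] for names in per_entity]
       return columns, rows


   # Hypothetical toy data standing in for 'SystemModel' entities in the store.
   entities = [
       {"id": "m1", "name": "Model A", "hasSimulation": [
           {"name": "sim1", "hasResponse": [
               {"name": "PERFORMANCE_mass"}, {"name": "cost"}]}]},
       {"id": "m2", "name": "Model B", "hasSimulation": [
           {"name": "sim2", "hasResponse": [
               {"name": "performance_stiffness"}]}]},
   ]

   store = normalize_store_name("my_model")
   path = parse_relationship_path("hasSimulation,hasResponse")
   columns, rows = binary_columns(entities, path, key="name", value="PERFORMANCE")

In this sketch the filter value 'PERFORMANCE' keeps both the upper-case and lower-case response names and drops 'cost', which mirrors the case-insensitive containment rule described for ``input_cols_filter_key``. How the real worker names columns and orders them may differ.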

Auto-generated from platform schema. Worker id: ontology_dataset. Schema hash: fc7ecf5648e9. Hand-curated docs in workerexamples/ override this page when present.