Explore
=======

The ML Explore Plugin provides insights into individual features and into how those features relate to a target.

``python bin/lucy.egg plugins ml_explore -h``

Explore Arguments
^^^^^^^^^^^^^^^^^

Primary
-------

Main specifications for an Explore task.

.. list-table::
   :widths: 10 40
   :header-rows: 1

   * - Argument
     - Detail
   * - -input
     - **PATH**. File path to the input data.
   * - -output
     - **PATH**. File path where the output summary file will be saved. It is recommended that this file be saved as JSON. However, if ``-get-schema`` is the only Explore task used, the output can be saved as a CSV.
   * - -clabel
     - **STRING**. The name of the target (response/output) column.

Preprocessing
-------------

Options for manipulating the dataset before using it.

.. list-table::
   :widths: 10 40
   :header-rows: 1

   * - Argument
     - Detail
   * - -independents
     - **COMMA SEPARATED LIST**. A subset of column names to use for exploring.
   * - -normalize
     - **METHOD**. ('standard', 'minmax') Normalizes ALL numerical columns using the selected scaling method.
   * - -autoclean
     - **NONE**. Cleans the data based on a predefined set of rules.

Data Insight Tasks
------------------

The main tasks for Explore.

.. list-table::
   :widths: 10 40
   :header-rows: 1

   * - Argument
     - Detail
   * - -get-schema
     - **NONE**. Produces information about each column: count, null count, unique count, and data type. Depending on the data type, it also provides additional information such as range, mean, most frequent value, and bins.
   * - -feature-importance
     - **NONE**. Provides a score for each input feature that describes its relation to the column selected with ``-clabel``. Several measures are used for this; scores of the same type can be compared, but scores of different types should not be compared to one another.

.. include:: explore_examples.rst
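
As a rough illustration of the two ``-normalize`` methods, the sketch below shows the conventional definitions of 'standard' (zero mean, unit variance) and 'minmax' (rescale to the range 0 to 1) scaling for a single numerical column. It assumes the plugin follows these textbook definitions, which this page does not confirm. The function names ``standard_scale`` and ``minmax_scale`` are hypothetical and are not part of the plugin.

.. code-block:: python

   from statistics import mean, pstdev


   def standard_scale(values):
       """Conventional 'standard' scaling: subtract the mean, divide by the
       population standard deviation. Assumed, not taken from the plugin."""
       mu = mean(values)
       sigma = pstdev(values)
       if sigma == 0:
           # A constant column carries no spread; map every value to 0.
           return [0.0 for _ in values]
       return [(v - mu) / sigma for v in values]


   def minmax_scale(values):
       """Conventional 'minmax' scaling: map the minimum to 0 and the
       maximum to 1. Assumed, not taken from the plugin."""
       lo, hi = min(values), max(values)
       if hi == lo:
           # A constant column has no range; map every value to 0.
           return [0.0 for _ in values]
       return [(v - lo) / (hi - lo) for v in values]


   column = [2, 4, 6]
   print(minmax_scale(column))
   print(standard_scale(column))

Both functions operate on one column at a time, which matches the description of ``-normalize`` as applying to each numerical column.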