Explore
=======
The ML Explore plugin provides insights into individual features and into how
each feature relates to a target column.

To display the plugin's help text, run
``python bin/lucy.egg plugins ml_explore -h``.

Explore Arguments
^^^^^^^^^^^^^^^^^

Primary
-------
Main specifications for an Explore task.

.. list-table::
   :widths: 10 40
   :header-rows: 1

   * - Argument
     - Detail

   * - -input
     - **PATH**. File path to the input data.

   * - -output
     - **PATH**. File path where the output summary file will be saved. It is
       recommended that this file be saved as JSON. However, if '-get-schema'
       is the only Explore task used, the output can be saved as a CSV.

   * - -clabel
     - **STRING**. The name of the Target/Response/output column.
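
As an illustration, the primary arguments can be combined with one insight task
in a single invocation. The sketch below only assembles the command in Python
and does not run it. The file names ``data.csv`` and ``summary.json`` and the
column name ``target`` are placeholders, and the helper
``build_explore_command`` is hypothetical rather than part of the plugin. It
also assumes the arguments can be passed in this order.

.. code-block:: python

   import shlex

   def build_explore_command(input_path, output_path, clabel=None, tasks=()):
       """Assemble an ml_explore invocation as an argument list (not executed)."""
       cmd = ["python", "bin/lucy.egg", "plugins", "ml_explore",
              "-input", input_path, "-output", output_path]
       if clabel is not None:
           # -clabel names the target column.
           cmd += ["-clabel", clabel]
       # Task flags such as -get-schema take no value.
       cmd += list(tasks)
       return cmd

   cmd = build_explore_command("data.csv", "summary.json",
                               clabel="target", tasks=["-get-schema"])
   # Print a shell-ready string for copy and paste.
   print(shlex.join(cmd))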

Preprocessing
-------------
Options for manipulating the dataset before using it.

.. list-table::
   :widths: 10 40
   :header-rows: 1

   * - Argument
     - Detail

   * - -independents
     - **COMMA-SEPARATED LIST**. A subset of column names to use for
       exploration.

   * - -normalize
     - **METHOD**. ('standard', 'minmax') Normalizes all numerical columns
       using the selected scaling method.

   * - -autoclean
     - **NONE**. Cleans the data according to a predefined set of rules.
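
For reference, the two ``-normalize`` method names conventionally refer to the
transformations sketched below. This shows the textbook definitions, not the
plugin's implementation. The use of the population standard deviation and the
handling of constant columns are assumptions.

.. code-block:: python

   from statistics import mean, pstdev

   def standard_scale(values):
       """Shift to zero mean and scale to unit (population) standard deviation."""
       mu = mean(values)
       sigma = pstdev(values)
       if sigma == 0:
           # Assumption: a constant column maps to all zeros.
           return [0.0 for _ in values]
       return [(v - mu) / sigma for v in values]

   def minmax_scale(values):
       """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
       lo, hi = min(values), max(values)
       if hi == lo:
           # Assumption: a constant column maps to all zeros.
           return [0.0 for _ in values]
       return [(v - lo) / (hi - lo) for v in values]

   column = [2.0, 4.0, 6.0, 8.0]
   print(standard_scale(column))
   print(minmax_scale(column))

Min-max scaling bounds every value to the interval [0, 1], whereas standard
scaling leaves values unbounded but centered on zero.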

Data Insight Tasks
------------------
The main tasks for Explore.

.. list-table::
   :widths: 10 40
   :header-rows: 1

   * - Argument
     - Detail

   * - -get-schema
     - **NONE**. Produces information about each column: count, null count,
       unique count, and data type. Depending on the data type, additional
       information is provided: range, mean, most frequent value, bins, etc.

   * - -feature-importance
     - **NONE**. Provides a score for each input feature that reflects its
       relation to the selected '-clabel'. Several scoring measures exist;
       scores of the same type can be compared with one another, but scores of
       different types should not be.
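
To make the two tasks concrete, the sketch below computes a schema-like summary
for one column and one possible importance score in plain Python. It is an
illustration only. The function names, the treatment of ``None`` as a null, the
choice to include nulls in ``count``, and the use of absolute Pearson
correlation as the score are all assumptions and may differ from what the
plugin computes.

.. code-block:: python

   from collections import Counter
   from math import sqrt

   def summarize_column(values):
       """Schema-like summary for one column; None marks a missing value."""
       present = [v for v in values if v is not None]
       summary = {
           "count": len(values),  # assumption: count includes nulls
           "null_count": len(values) - len(present),
           "unique_count": len(set(present)),
       }
       numeric = bool(present) and all(
           isinstance(v, (int, float)) and not isinstance(v, bool)
           for v in present
       )
       if numeric:
           summary["data_type"] = "numeric"
           summary["range"] = (min(present), max(present))
           summary["mean"] = sum(present) / len(present)
       else:
           summary["data_type"] = "categorical"
           top = Counter(present).most_common(1)
           summary["most_frequent"] = top[0][0] if top else None
       return summary

   def abs_pearson(feature, target):
       """Absolute Pearson correlation, used here as one example score type."""
       n = len(feature)
       mf, mt = sum(feature) / n, sum(target) / n
       cov = sum((f - mf) * (t - mt) for f, t in zip(feature, target))
       var_f = sum((f - mf) ** 2 for f in feature)
       var_t = sum((t - mt) ** 2 for t in target)
       if var_f == 0 or var_t == 0:
           return 0.0  # a constant column carries no linear signal
       return abs(cov / sqrt(var_f * var_t))

   print(summarize_column([30, 45, None, 30]))
   print(abs_pearson([1, 2, 3, 4], [10, 20, 30, 40]))

A correlation-based score such as this one lies between 0 and 1. Other score
types use different scales, which is why scores of different types should not
be compared directly.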

.. include:: explore_examples.rst