*ML_DECISION_TREE_TRAIN¶

Description: Train or classify using Decision Tree

Syntax: ml_decision_tree_train(database_id,train_column,drop_columns)

Inputs¶
ID	Name	Type
1	Database	text
2	Column Name	file
3	Drop Columns	scalar

Outputs¶
ID	Name	Type	Remarks
1		text

*ML_LEARN_DTREEREG¶

Description: Decision Tree Regression

Syntax: ml_learn_dtreereg(dataset,independents,clabel,normalize,grid_response,autoclean,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
11	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
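
As a rough illustration of the Train ratio input, the sketch below splits rows so that a fraction `train_ratio` goes to training and the rest to testing. The helper name `split_by_train_ratio`, the shuffling, and the rounding rule are assumptions made for illustration; the worker's actual splitting logic is not documented here.

```python
import random

def split_by_train_ratio(rows, train_ratio=0.9, seed=0):
    """Shuffle rows and split them into (train, test) by train_ratio."""
    if not 0.0 < train_ratio <= 1.0:
        raise ValueError("train_ratio must be in (0, 1]")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(round(train_ratio * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]

train, test = split_by_train_ratio(range(20), train_ratio=0.9)
print(len(train), len(test))  # 18 2
```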

*ML_LEARN_GBOOSTR¶

Description: Gradient Boost Regressor

Syntax: ml_learn_gboostr(dataset,independents,clabel,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
8	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
9	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
10	Use Target Dependency	select	no	Useful for time-history predictions
11	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
12	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
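
The Run Grid Search input is described only as finding the best hyperparameters. A minimal sketch of the general idea follows, assuming an exhaustive search over a small grid and a toy one-dimensional ridge model; the helper names, the grid, and the scoring by validation error are hypothetical and not taken from the worker.

```python
from itertools import product

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form 1-D ridge fit without intercept: w = sum(x*y) / (sum(x*x) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def mse(w, xs, ys):
    """Mean squared error of the prediction w*x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(param_grid, train, valid):
    """Try every combination in param_grid; return (best_params, best_error)."""
    names = sorted(param_grid)
    best_params, best_err = None, float("inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        w = fit_ridge_1d(train[0], train[1], **params)
        err = mse(w, valid[0], valid[1])
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err

train = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # y = 2x exactly
valid = ([4.0, 5.0], [8.0, 10.0])
best_params, best_err = grid_search({"alpha": [0.0, 1.0, 10.0]}, train, valid)
print(best_params, best_err)  # {'alpha': 0.0} 0.0
```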

*ML_LEARN_BAYESIANRIDGECV¶

Description: BayesianRidge Regression with CV

Syntax: ml_learn_bayesianridgecv(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
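
The `leave_one_out` value listed for Selection of Validation refers to a standard scheme: each row is held out once while the model is trained on all the others. The generator below is only a sketch of that scheme; its name and index-based interface are assumptions, not the worker's interface.

```python
def leave_one_out(n_rows):
    """Yield (train_indices, test_indices) pairs, holding out one row at a time."""
    for i in range(n_rows):
        yield [j for j in range(n_rows) if j != i], [i]

splits = list(leave_one_out(4))
print(splits[0])  # ([1, 2, 3], [0])
```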

*ML_LEARN_LASSO¶

Description: Lasso Regression

Syntax: ml_learn_lasso(dataset,independents,clabel,degree,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Degree	select	none	Order of fit.
5	Normalize	select	none	Normalizes each numerical column.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Auto Clean	select	no	Clean a dataset using preset rules.
8	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
9	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
10	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
11	Use Target Dependency	select	no	Useful for time-history predictions
12	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
13	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
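
The Degree input is described as the order of fit. Assuming this means a polynomial order (an assumption, since the available values are not listed here), a degree-3 fit on one feature would use the expanded terms sketched below. The helper `polynomial_features` is hypothetical.

```python
def polynomial_features(x, degree):
    """Return [x, x**2, ..., x**degree] for a single numeric feature."""
    if degree < 1:
        raise ValueError("degree must be >= 1")
    return [x ** p for p in range(1, degree + 1)]

rows = [polynomial_features(x, degree=3) for x in (1.0, 2.0, 3.0)]
print(rows)  # [[1.0, 1.0, 1.0], [2.0, 4.0, 8.0], [3.0, 9.0, 27.0]]
```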

*ML_LEARN_RIDGE¶

Description: Ridge Regression

Syntax: ml_learn_ridge(dataset,independents,clabel,normalize,grid_response,autoclean,degree,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Degree	select	none	Order of fit.
8	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
9	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
10	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
11	Use Target Dependency	select	no	Useful for time-history predictions
12	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
13	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_SVR¶

Description: Support Vector Regression

Syntax: ml_learn_svr(dataset,independents,clabel,normalize,grid_response,autoclean,cv,train_ratio,dependent_targets,num_dig)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
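
Curve Digitize Points sets how many points represent a curve-valued column. One plausible reading, sketched below, is resampling each curve to a fixed number of points equally spaced in x by linear interpolation. The function `digitize_curve` and the equal-spacing rule are assumptions for illustration; the worker may digitize curves differently.

```python
def digitize_curve(points, num_dig):
    """Resample a curve given as (x, y) pairs, sorted by strictly increasing x,
    to num_dig points equally spaced in x, using linear interpolation."""
    if num_dig < 2 or len(points) < 2:
        raise ValueError("need at least two points and num_dig >= 2")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, x1 = xs[0], xs[-1]
    out = []
    j = 0  # index of the segment [xs[j], xs[j + 1]] that contains the current x
    for i in range(num_dig):
        x = x0 + (x1 - x0) * i / (num_dig - 1)
        while j < len(xs) - 2 and x > xs[j + 1]:
            j += 1
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        out.append((x, ys[j] + t * (ys[j + 1] - ys[j])))
    return out

curve = [(0.0, 0.0), (1.0, 10.0), (4.0, 40.0)]  # unevenly sampled y = 10x
resampled = digitize_curve(curve, num_dig=5)
print([x for x, _ in resampled])  # [0.0, 1.0, 2.0, 3.0, 4.0]
```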

*ML_LEARN_GPR¶

Description: Gaussian Process Regression

Syntax: ml_learn_gpr(dataset,independents,clabel,normalize,grid_response,autoclean,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
11	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_PREDICT_INTERACTIVE¶

Description: ML Interactive Prediction

Syntax: ml_predict_interactive(mfile,inputs,targets,schema,dataset,reference_dataset,raw_column_name)

Inputs¶
ID	Name	Type	Remarks
1	Model File	text	Trained model file
2	Inputs	scalar	Input features that were used while learning; these can come from an ML_PREDICT_INFO worker.
3	Targets	scalar	Target columns that were used while learning; these can come from an ML_PREDICT_INFO worker.
4	Schema	dataset	Schema that was used while learning; this can come from an ML_PREDICT_INFO worker.
5	Inputs For Prediction	predict_interactive	If the inputs, targets and schema are specified, a slider will be presented for selection
6	Reference Dataset	dataset	The original dataset with curves, which supplies the reference points.
7	Raw Column Name	scalar	If the prediction is for curve points, the name of the curve column used during training.

Outputs¶
ID	Name	Type	Remarks
1	Predictions	dataset

*ML_LEARN_LOGISTIC¶

Description: Logistic Regression

Syntax: ml_learn_logistic(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
11	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_PREDICT_INFO¶

Description: ML Predict Info

Syntax: ml_predict_info(mfile)

Inputs¶
ID	Name	Type	Default	Remarks
1	Saved Model File	text		Path of the model file from a previous training worker

Outputs¶
ID	Name	Type
1	Model File	text
2	Lucy JSON	json
3	Independents	scalar
4	Targets	scalar
5	Schema	dataset
6	Config Parameters	keyvalue

*ML_PREDICT¶

Description: ML Predict

Syntax: ml_predict(dataset,mfile,reference_dataset,raw_curve_column,inputs,targets,raw_vs_predictions)

Inputs¶
ID	Name	Type	Remarks
1	Prediction Dataset	dataset	Input dataset for prediction
2	Saved Model File	text
3	Learn Dataset	dataset	Useful for curve predictions
4	Raw Curve Column Name	text	Useful for curve predictions; the x-values of this curve are used to build the predicted curve
5	Inputs	text	Useful for prediction accuracy
6	Targets	text	Useful for prediction accuracy
7	Training Raw vs Predictions	dataset	Useful for prediction accuracy

Outputs¶
ID	Name	Type	Remarks
1	Predictions	dataset

*ML_LEARN_MLP_CLASSIFY¶

Description: MLP-Classifier

Syntax: ml_learn_mlp_classify(dataset,independents,clabel,train_ratio,normalize,autoclean)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
5	Normalize	select	none	Normalizes each numerical column.
6	Auto Clean	select	no	Clean a dataset using preset rules.

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_GNB¶

Description: Gaussian Naive Bayes

Syntax: ml_learn_gnb(dataset,independents,clabel,normalize,train_ratio,autoclean)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
6	Auto Clean	select	no	Clean a dataset using preset rules.

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_MLP¶

Description: MLP Regression

Syntax: ml_learn_mlp(dataset,independents,clabel,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,mlp_hidden_layer_sizes,mlp_max_iter,mlp_alpha,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
8	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
9	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
10	Use Target Dependency	select	no	Useful for time-history predictions
11	MLP Hidden Layer Sizes (csv)	text		MLP hidden layer sizes separated by commas. Each comma-delimited value represents one layer between the input and output layers
12	MLP Maximum Number of Iterations	text		Maximum number of training iterations for the MLP
13	MLP Strength of the L2 regularization term	text		Strength of the L2 Regularization term
14	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
15	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
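
The MLP Hidden Layer Sizes input is a comma-separated string with one value per hidden layer. The sketch below shows how such a string could be turned into a tuple of layer widths, which is the form scikit-learn's `MLPRegressor` accepts for `hidden_layer_sizes`. The parser name is hypothetical, and whether this worker passes the value to scikit-learn is an assumption.

```python
def parse_hidden_layer_sizes(csv_text):
    """Turn a string such as '64, 32' into the tuple (64, 32)."""
    sizes = tuple(int(tok) for tok in csv_text.split(",") if tok.strip())
    if not sizes or any(s <= 0 for s in sizes):
        raise ValueError("expected one or more positive integers")
    return sizes

layers = parse_hidden_layer_sizes("64, 32")
print(layers)  # (64, 32): two hidden layers with 64 and 32 units
```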

*ML_LEARN_DTREE¶

Description: Decision Tree Classifier

Syntax: ml_learn_dtree(dataset,independents,clabel,criterion,cv,train_ratio,normalize,autoclean)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Criterion	select	gini	The method used to measure the quality of the splits in a tree.
5	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
6	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
7	Normalize	select	none	Normalizes each numerical column.
8	Auto Clean	select	no	Clean a dataset using preset rules.

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
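
The default Criterion, `gini`, measures how mixed the class labels are in a node; a split is better when it leaves purer child nodes. The sketch below computes Gini impurity from its standard definition. Entropy is shown only as a common alternative criterion; whether this worker offers it is an assumption.

```python
from collections import Counter
from math import log2

def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k ** 2) over class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

mixed = ["a", "a", "b", "b"]
print(gini_impurity(mixed))  # 0.5
print(entropy(mixed))        # 1.0
```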

*ML_LEARN_RFC¶

Description: Random Forest Classifier

Syntax: ml_learn_rfc(dataset,independents,clabel,cv,train_ratio,normalize,autoclean)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
5	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
6	Normalize	select	none	Normalizes each numerical column.
7	Auto Clean	select	no	Clean a dataset using preset rules.

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_SVC¶

Description: Support Vector Classifier

Syntax: ml_learn_svc(dataset,independents,clabel,cv,train_ratio,normalize,autoclean)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
5	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
6	Normalize	select	none	Normalizes each numerical column.
7	Auto Clean	select	no	Clean a dataset using preset rules.

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_BAYESIANRIDGE¶

Description: BayesianRidge Regression

Syntax: ml_learn_bayesianridge(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
11	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_RFR¶

Description: Random Forest Regression

Syntax: ml_learn_rfr(dataset,independents,clabel,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
6	Auto Clean	select	no	Clean a dataset using preset rules.
7	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
8	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
9	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
10	Use Target Dependency	select	no	Useful for time-history predictions
11	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
12	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_MEANSHIFT¶

Description: Mean Shift Clustering

Syntax: ml_learn_meanshift(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_AFFINITYPROPAGATION¶

Description: Affinity Propagation

Syntax: ml_learn_affinitypropagation(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_KMEANS¶

Description: K-Means

Syntax: ml_learn_kmeans(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
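
For orientation, K-Means groups rows into the requested Number of Clusters by alternating between assigning each row to its nearest center and moving each center to the mean of its rows. The sketch below is plain Lloyd's algorithm with a deliberately simple initialisation (the first `nc` points); it is not the worker's implementation and does not cover the `auto` cluster-count option.

```python
def kmeans(points, nc, iterations=20):
    """Plain Lloyd's algorithm on tuples of floats; centers start at the first nc points."""
    centers = [tuple(p) for p in points[:nc]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(nc), key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(nc):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
labels, centers = kmeans(points, nc=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```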

*ML_EXPLORE_RUN_PCA¶

Description: RunPCA

Syntax: ml_explore_run_pca(dataset,independents,normalize)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Normalize	select	none	Normalizes each numerical column.

Outputs¶
ID	Name	Type	Remarks
1	Lucy-ML JSON output	json
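
The Normalize input matters for PCA because columns with large numeric ranges would otherwise dominate the components. The available normalization methods are not listed here, so the sketch below shows just one common choice, z-score standardization, as an assumption; the helper `zscore_columns` is hypothetical.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # constant columns are only centered
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]

scaled = zscore_columns([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
print(scaled[1])  # [0.0, 0.0]: the middle row sits at the mean of both columns
```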

*ML_EXPLORE_GET_SCHEMA¶

Description: Get Schema

Syntax: ml_explore_get_schema(dataset,independents,autoclean)

Inputs¶
ID	Name	Type	Remarks
1	Dataset	dataset
2	Input Features	select	A subset of columns that will be used to train the model.
3	Auto Clean	select	Clean a dataset using preset rules.

Outputs¶
ID	Name	Type	Remarks
1	Lucy-ML JSON output	json

*ML_LEARN_ELASTICNET¶

Description: ElasticNet Regression

Syntax: ml_learn_elasticnet(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
8	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
9	Use Target Dependency	select	no	Useful for time-history predictions
10	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
11	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_SPECTRALCLUSTERING¶

Description: Spectral clustering

Syntax: ml_learn_spectralclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_LINEAR¶

Description: Linear Regression

Syntax: ml_learn_linear(dataset,independents,clabel,degree,normalize,grid_response,autoclean,cv,train_ratio,run_grid_search,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Degree	select	none	Order of fit.
5	Normalize	select	none	Normalizes each numerical column.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Auto Clean	select	no	Clean a dataset using preset rules.
8	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
9	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
10	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
11	Use Target Dependency	select	no	Useful for time-history predictions
12	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
13	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_SGDR¶

Description: SGD Regression

Syntax: ml_learn_sgdr(dataset,independents,clabel,degree,normalize,grid_response,autoclean,cv,train_ratio,optimize,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Degree	select	none	Order of fit.
5	Normalize	select	none	Normalizes each numerical column.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Auto Clean	select	no	Clean a dataset using preset rules.
8	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
9	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
10	Optimize	select		Obtain the input points and resulting Optimum.
11	Use Target Dependency	select	no	Useful for time-history predictions
12	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
13	Skip including training data in model	select	no	While saving the model, we skip including the training data

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_LEARN_AUTO¶

Description: Auto Regression

Syntax: ml_learn_auto(dataset,independents,clabel,regression_types,normalize,grid_response,autoclean,cv,train_ratio,run_grid_search,dependent_targets,num_dig,skip_trainingdata_in_model,gs_scoring_type,cv_scoring_type,ref_dataset,ref_ratio)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Regression Type	select	Array	Available regression methods
5	Normalize	select	none	Normalizes each numerical column.
6	Generate Regression Line/Surface	select	yes	Used to visualize a regression line or surface.
7	Auto Clean	select	no	Clean a dataset using preset rules.
8	Selection of Validation	select	leave_one_out	Method used to validate the model. The default is test_train_split
9	Train ratio	text	0.9	Fraction of the data used for training; the remainder is used for testing.
10	Run Grid Search	select	no	Runs grid search to find the best hyperparameters
11	Use Target Dependency	select	no	Useful for time-history predictions
12	Curve Digitize Points	select	none	Number of digitized points to use when the Input or the Target column is a curve
13	Skip including training data in model	select	no	While saving the model, we skip including the training data
14	Grid Search Scoring Type	select	r2	How to measure error
15	Cross-Validation Scoring Type	select	neg_mean_squared_error	How to measure error
16	Verification Dataset	dataset		The trained model will be tested using this dataset
17	Hold Back Ratio	scalar	0	When this ratio is non-zero, the model will be trained on the dataset and tested on the verification dataset

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
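
The Train ratio, Grid Search Scoring Type and Cross-Validation Scoring Type inputs above can be illustrated with a small sketch. It assumes that the train ratio is applied as a fraction of the rows and that the scoring names follow their usual definitions (r2 as the coefficient of determination, neg_mean_squared_error as the negated mean squared error). The function names, the shuffling and the seed are hypothetical and are not taken from the worker itself.

```python
import random


def split_by_train_ratio(rows, train_ratio=0.9, seed=0):
    """Shuffle rows and split them into (train, test) parts.

    Assumption: the worker's actual splitting strategy is not documented;
    a seeded shuffle is used here only to make the sketch repeatable.
    """
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_ratio * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def neg_mean_squared_error(y_true, y_pred):
    """Negated mean squared error, so that larger values mean a better fit."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return -mse


train_rows, test_rows = split_by_train_ratio(range(10), train_ratio=0.9)
```

With the default ratio of 0.9 and ten rows, nine rows go to training and one to testing. Both scores are arranged so that a higher value indicates a better model.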

*ML_LEARN_BIRCHCLUSTERING¶

Description: Birch clustering

Syntax: ml_learn_birchclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	When set to yes, the training data is not included in the saved model

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
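
The Normalize input is described only as normalizing each numerical column, and the available methods are not listed here. As an illustration, the sketch below applies min-max scaling, one common choice, to the numeric columns of a small table and leaves other columns untouched. The function name and the list-of-dicts table layout are assumptions made for this example.

```python
def min_max_normalize(rows):
    """Scale every numeric column of a list-of-dicts table to the range [0, 1].

    Assumption: min-max scaling stands in for whatever method the worker
    applies; non-numeric columns are copied unchanged.
    """
    if not rows:
        return []
    numeric_cols = [
        col
        for col in rows[0]
        if all(
            isinstance(r[col], (int, float)) and not isinstance(r[col], bool)
            for r in rows
        )
    ]
    normalized = [dict(r) for r in rows]
    for col in numeric_cols:
        lo = min(r[col] for r in rows)
        hi = max(r[col] for r in rows)
        span = hi - lo
        for r in normalized:
            # A constant column has no spread, so map it to 0.0.
            r[col] = (r[col] - lo) / span if span else 0.0
    return normalized


table = [
    {"x": 0, "name": "a"},
    {"x": 5, "name": "b"},
    {"x": 10, "name": "c"},
]
scaled = min_max_normalize(table)
```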

*ML_LEARN_DBSCANCLUSTERING¶

Description: DBSCAN clustering

Syntax: ml_learn_dbscanclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	When set to yes, the training data is not included in the saved model

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea

*ML_EXPLORE_FEATURE_IMPORTANCE¶

Description: Feature Importance

Syntax: ml_explore_feature_importance(dataset,independents,clabel,normalize,autoclean)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Target Column	select		The name of the target column.
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select		Clean a dataset using preset rules.

Outputs¶
ID	Name	Type	Remarks
1	Lucy-ML JSON output	json

*ML_LEARN_AGGLOMERATIVECLUSTERING¶

Description: Agglomerative Clustering

Syntax: ml_learn_agglomerativeclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset
2	Input Features	select		A subset of columns that will be used to train the model.
3	Number of Clusters	text	3	Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4	Normalize	select	none	Normalizes each numerical column.
5	Auto Clean	select	no	Clean a dataset using preset rules.
6	Skip including training data in model	select	no	When set to yes, the training data is not included in the saved model

Outputs¶
ID	Name	Type
1	Lucy ML JSON Output	json
2	Responses from Learn	dataset
3	Visualizations from Learn	dataset
4	Score	number
5	Model Path	textarea
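
To show how the Number of Clusters input drives agglomerative clustering, here is a minimal sketch of the general technique: every point starts as its own cluster, and the two closest clusters are merged until the requested number remains. It uses single linkage and Euclidean distance, which are assumptions for this example. The worker's linkage, distance metric and "auto" mode are not documented here, and the function name is hypothetical.

```python
import math


def agglomerative_clusters(points, n_clusters=3):
    """Return one cluster label per point using single-linkage merging."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and the number of points")

    # Start with one cluster per point; clusters hold point indices.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[x].extend(clusters[y])
        del clusters[y]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


sample = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
labels = agglomerative_clusters(sample, n_clusters=3)
```

This brute-force version recomputes all pairwise distances on every merge, so it is only suitable for small illustrative datasets.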

*ML_SIMLYTIKS_TERMINAL¶

Description: Learns and predicts the commands for Simlytiks based on data

Syntax: ml_simlytiks_terminal(server_url,dataset,schema,visualizations,type,uid,query)

Inputs¶
ID	Name	Type	Remarks
1	Server URL	text	URL of the server running d3VIEW Terminal Server
2	Dataset	dataset	Input dataset. View more: https://www.d3view.com/docs/master/workflows/Glossary.html#datasetinput
3	Schema	dataset	Schema of the dataset that contains the column information. View more: https://www.d3view.com/docs/master/workflows/Glossary.html#datasetinput
4	Visualizations	json	JSON file that contains the visualization information
5	Learn Or Predict	select	Whether the terminal should learn or predict
6	UID	text	Unique Identifier to use while predicting
7	Predict Query	textarea	Text to predict Simlytiks command

Outputs¶
ID	Name	Type	Remarks
1	UID	text
2	Prediction	json

*ML_CLEAN_AUTOCLEAN¶

Description: Auto Clean

Syntax: ml_clean_autoclean(dataset)

Inputs¶
ID	Name	Type	Default	Remarks
1	Dataset	dataset

Outputs¶
ID	Name	Type	Remarks
1	Lucy ML JSON Output	json
2	Cleaned Dataset	dataset

*ML_IMAGE_CLASSIFICATION_LEARN¶

Description: Connects to a d3VIEW Image Classification Server to learn based on annotations generated from Image Annotator

Syntax: ml_image_classification_learn(base_url,dataset,config)

Inputs¶
ID	Name	Type	Remarks
1	Base URL	text	URL of the server running d3VIEW Image classification
2	LDWA	dataset	A set of manually annotated images generated by the image annotation utility worker in Workflows. View more: https://www.d3view.com/docs/master/workflows/Glossary.html#datasetinput
3	Configuration Options	keyvalue	Configuration options. View more: https://www.d3view.com/docs/master/workflows/Glossary.html#keyvalueinput

Outputs¶
ID	Name	Type
1	UID	text
2	Model Path	textarea
3	Base Url	text
4	Status Url	text
5	Logs	dataset