*ML_LEARN_RFC

Description: Random Forest Classifier

Syntax: ml_learn_rfc(dataset,independents,clabel,cv,train_ratio,normalize,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
5 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
6 Normalize select none Normalizes each numerical column.
7 Auto Clean select no Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
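
The Selection of Validation and Train ratio inputs decide how rows are divided between training and testing. The following is a plain-Python illustration of the two ideas only, not d3VIEW code; the function names `train_test_split` and `leave_one_out` are hypothetical and the shuffling step is an assumption.

```python
import random

def train_test_split(rows, train_ratio=0.9, seed=0):
    """Shuffle rows and keep the first train_ratio share for training."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]

def leave_one_out(rows):
    """Yield (train, test) pairs in which each row is held out exactly once."""
    for i in range(len(rows)):
        yield rows[:i] + rows[i + 1:], [rows[i]]

rows = list(range(20))
train, test = train_test_split(rows, train_ratio=0.9)
print(len(train), len(test))  # 18 2

folds = list(leave_one_out(rows))
print(len(folds))  # 20
```

With a single split the model is scored once on the held-out rows; with leave-one-out it is refit once per row, which costs more but uses every row for testing.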

*ML_LEARN_DTREE

Description: Decision Tree Classifier

Syntax: ml_learn_dtree(dataset,independents,clabel,criterion,cv,train_ratio,normalize,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Criterion select gini The method used to measure the quality of the splits in a tree.
5 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
6 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
7 Normalize select none Normalizes each numerical column.
8 Auto Clean select no Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
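
The Criterion input selects how split quality is measured, with gini as the default. As a rough sketch of what such a measure computes (plain Python, not d3VIEW code), Gini impurity is shown next to entropy, a common alternative; whether the worker offers entropy is an assumption, and all function names are hypothetical.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of the label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_impurity(left, right, measure):
    """Size-weighted impurity of a candidate split."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)

labels = ["a", "a", "b", "b"]
print(gini(labels))     # 0.5
print(entropy(labels))  # 1.0
print(split_impurity(["a", "a"], ["b", "b"], gini))  # 0.0
```

A tree builder would evaluate `split_impurity` for each candidate split and keep the one with the lowest value.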

*ML_LEARN_ELASTICNET

Description: ElasticNet Regression

Syntax: ml_learn_elasticnet(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
11 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_BAYESIANRIDGE

Description: BayesianRidge Regression

Syntax: ml_learn_bayesianridge(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
11 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_BAYESIANRIDGECV

Description: BayesianRidge Regression with CV

Syntax: ml_learn_bayesianridgecv(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
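
Curve Digitize Points sets how many points represent a curve-valued input or target. This reference does not say how the points are chosen; the sketch below assumes resampling at equally spaced x values with linear interpolation, which is one common way to bring curves to a fixed length. The function name `digitize` is hypothetical, and the code expects strictly increasing x values and at least two output points.

```python
def digitize(curve, num_points):
    """Resample (x, y) pairs at num_points equally spaced x values."""
    xs = [p[0] for p in curve]
    ys = [p[1] for p in curve]
    step = (xs[-1] - xs[0]) / (num_points - 1)
    out = []
    j = 0  # index of the segment [xs[j], xs[j + 1]] that contains x
    for i in range(num_points):
        x = xs[0] + i * step
        while j < len(xs) - 2 and xs[j + 1] < x:
            j += 1
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        out.append((x, ys[j] + t * (ys[j + 1] - ys[j])))
    return out

curve = [(0.0, 0.0), (1.0, 2.0), (4.0, 8.0)]
resampled = digitize(curve, 5)
print([round(y, 6) for _, y in resampled])  # [0.0, 2.0, 4.0, 6.0, 8.0]
```

Resampling every curve to the same number of points gives each row the same number of values, which is what a tabular learner needs.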

*ML_DECISION_TREE_TRAIN

Description: Train or classify using Decision Tree

Syntax: ml_decision_tree_train(database_id,train_column,drop_columns)

Inputs
ID Name Type Default Remarks
1 Database text    
2 Column Name file    
3 Drop Columns scalar    
Outputs
ID Name Type Remarks
1   text  

*ML_LEARN_LOGISTIC

Description: Logistic Regression

Syntax: ml_learn_logistic(dataset,independents,clabel,normalize,autoclean,grid_response,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
11 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_SGDR

Description: SGD Regression

Syntax: ml_learn_sgdr(dataset,independents,clabel,degree,normalize,grid_response,autoclean,cv,train_ratio,optimize,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Degree select none Order of fit.
5 Normalize select none Normalizes each numerical column.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Auto Clean select no Clean a dataset using preset rules.
8 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
9 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
10 Optimize select   Obtain the input points and resulting Optimum.
11 Use Target Dependency select no Useful for time-history predictions
12 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
13 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
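
Degree is described only as "Order of fit". One common reading, assumed here and not confirmed by this reference, is that each input is expanded into polynomial terms up to that order before the regression is fitted. A minimal plain-Python sketch with hypothetical function names:

```python
def polynomial_features(x, degree):
    """Return [x, x**2, ..., x**degree] for a single numeric input."""
    return [x ** d for d in range(1, degree + 1)]

def predict(x, intercept, coefficients):
    """Evaluate a fitted polynomial; its order is len(coefficients)."""
    feats = polynomial_features(x, len(coefficients))
    return intercept + sum(c * f for c, f in zip(coefficients, feats))

print(polynomial_features(2.0, 3))     # [2.0, 4.0, 8.0]
print(predict(2.0, 1.0, [0.5, 0.25]))  # 3.0
```

The regression itself stays linear in its coefficients; only the inputs are expanded.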

*ML_LEARN_LASSO

Description: Lasso Regression

Syntax: ml_learn_lasso(dataset,independents,clabel,degree,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Degree select none Order of fit.
5 Normalize select none Normalizes each numerical column.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Auto Clean select no Clean a dataset using preset rules.
8 Run Grid Search select no Runs grid search to find the best hyperparameters
9 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
10 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
11 Use Target Dependency select no Useful for time-history predictions
12 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
13 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_RIDGE

Description: Ridge Regression

Syntax: ml_learn_ridge(dataset,independents,clabel,normalize,grid_response,autoclean,degree,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Degree select none Order of fit.
8 Run Grid Search select no Runs grid search to find the best hyperparameters
9 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
10 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
11 Use Target Dependency select no Useful for time-history predictions
12 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
13 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
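
Run Grid Search looks for the best hyperparameters. The idea can be sketched in plain Python with a one-feature ridge model fitted without an intercept: try each candidate regularization strength, score it on held-out data, and keep the best. This is an illustration only; the function names, the candidate values, and the sample data are all made up for the example.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope for y ~ w * x with no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def mse(xs, ys, w):
    """Mean squared error of the prediction w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(train, valid, alphas):
    """Return (alpha, error, slope) with the lowest validation error."""
    best = None
    for alpha in alphas:
        w = fit_ridge_1d(*train, alpha)
        err = mse(*valid, w)
        if best is None or err < best[1]:
            best = (alpha, err, w)
    return best

# Illustrative data: y is roughly 2 * x.
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
valid = ([5.0, 6.0], [10.0, 12.0])
alpha, err, w = grid_search(train, valid, [0.0, 0.1, 1.0, 10.0])
print(alpha, round(w, 2))  # 0.0 1.99
```

With several hyperparameters the loop would run over every combination of candidate values, so the cost grows quickly with the size of the grid.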

*ML_LEARN_DTREEREG

Description: Decision Tree Regression

Syntax: ml_learn_dtreereg(dataset,independents,clabel,normalize,grid_response,autoclean,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
11 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_GPR

Description: Gaussian Process Regression

Syntax: ml_learn_gpr(dataset,independents,clabel,normalize,grid_response,autoclean,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
11 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_GBOOSTR

Description: Gradient Boost Regressor

Syntax: ml_learn_gboostr(dataset,independents,clabel,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Run Grid Search select no Runs grid search to find the best hyperparameters
8 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
9 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
10 Use Target Dependency select no Useful for time-history predictions
11 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
12 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_RFR

Description: Random Forest Regression

Syntax: ml_learn_rfr(dataset,independents,clabel,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Run Grid Search select no Runs grid search to find the best hyperparameters
8 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
9 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
10 Use Target Dependency select no Useful for time-history predictions
11 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
12 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_SVR

Description: Support Vector Regression

Syntax: ml_learn_svr(dataset,independents,clabel,normalize,grid_response,autoclean,cv,train_ratio,dependent_targets,num_dig)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
8 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
9 Use Target Dependency select no Useful for time-history predictions
10 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_MLP

Description: MLP Regression

Syntax: ml_learn_mlp(dataset,independents,clabel,normalize,grid_response,autoclean,run_grid_search,cv,train_ratio,dependent_targets,mlp_hidden_layer_sizes,mlp_max_iter,mlp_alpha,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
6 Auto Clean select no Clean a dataset using preset rules.
7 Run Grid Search select no Runs grid search to find the best hyperparameters
8 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
9 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
10 Use Target Dependency select no Useful for time-history predictions
11 MLP Hidden Layer Sizes (csv) text   MLP hidden layer sizes separated by commas. Each value represents one layer between the input and output layers
12 MLP Maximum Number of Iterations text   Maximum number of training iterations
13 MLP Strength of the L2 regularization term text   Strength of the L2 Regularization term
14 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
15 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
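
MLP Hidden Layer Sizes is entered as comma-separated text, one value per hidden layer. A small sketch of how such text could be turned into layer sizes follows; the function name `parse_hidden_layers` and the validation rule are assumptions for illustration, not the worker's actual parsing.

```python
def parse_hidden_layers(text):
    """Turn '64, 32' into (64, 32); empty text yields an empty tuple."""
    sizes = tuple(int(part) for part in text.split(",") if part.strip())
    if any(size <= 0 for size in sizes):
        raise ValueError("layer sizes must be positive integers")
    return sizes

print(parse_hidden_layers("64, 32"))  # (64, 32)
print(parse_hidden_layers("100"))     # (100,)
```

Here "64, 32" describes two hidden layers, with 64 units in the first and 32 in the second.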

*ML_LEARN_MLP_CLASSIFY

Description: MLP-Classifier

Syntax: ml_learn_mlp_classify(dataset,independents,clabel,train_ratio,normalize,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
5 Normalize select none Normalizes each numerical column.
6 Auto Clean select no Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_GNB

Description: Gaussian Naive Bayes

Syntax: ml_learn_gnb(dataset,independents,clabel,normalize,train_ratio,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
6 Auto Clean select no Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_PREDICT_INTERACTIVE

Description: ML Interactive Prediction

Syntax: ml_predict_interactive(mfile,inputs,targets,schema,dataset,reference_dataset,raw_column_name)

Inputs
ID Name Type Default Remarks
1 Model File text   Trained model file
2 Inputs scalar   Input features used while learning. These can come from an ML_PREDICT_INFO worker
3 Targets scalar   Targets used while learning. These can come from an ML_PREDICT_INFO worker
4 Schema dataset   Schema used while learning. This can come from an ML_PREDICT_INFO worker
5 Inputs For Prediction predict_interactive   If the inputs, targets and schema are specified, a slider will be presented for selection
6 Reference Dataset dataset   The original dataset that contains the curves
7 Raw Column Name scalar   If the prediction is for curve points, the name of the curve column used during training
Outputs
ID Name Type Remarks
1 Predictions dataset  

*ML_PREDICT_INFO

Description: ML Predict Info

Syntax: ml_predict_info(mfile)

Inputs
ID Name Type Default Remarks
1 Saved Model File text   Path of the model file from a previous training worker
Outputs
ID Name Type Remarks
1 Model File text  
2 Lucy JSON json  
3 Independents scalar  
4 Targets scalar  
5 Schema dataset  
6 Config Parameters keyvalue  
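
Several ML_PREDICT_INTERACTIVE inputs can come from an ML_PREDICT_INFO worker. The sketch below shows one plausible wiring between the two in plain Python. The output and argument names are taken from the tables and Syntax lines in this reference, but the pairing (for example Independents to inputs) is inferred from the names, and the `wire` helper and sample values are hypothetical.

```python
INFO_OUTPUTS = ["Model File", "Lucy JSON", "Independents",
                "Targets", "Schema", "Config Parameters"]
INTERACTIVE_ARGS = ["mfile", "inputs", "targets", "schema",
                    "dataset", "reference_dataset", "raw_column_name"]

# Assumed pairing of ML_PREDICT_INFO outputs to ML_PREDICT_INTERACTIVE arguments.
WIRING = {
    "Model File": "mfile",
    "Independents": "inputs",
    "Targets": "targets",
    "Schema": "schema",
}

def wire(info_values):
    """Map ML_PREDICT_INFO outputs onto ML_PREDICT_INTERACTIVE arguments."""
    args = dict.fromkeys(INTERACTIVE_ARGS)  # unwired arguments stay None
    for out_name, arg_name in WIRING.items():
        args[arg_name] = info_values[out_name]
    return args

# Placeholder values, for illustration only.
info_values = {
    "Model File": "model.bin",
    "Lucy JSON": {},
    "Independents": ["thickness", "velocity"],
    "Targets": ["peak_force"],
    "Schema": [],
    "Config Parameters": {},
}
args = wire(info_values)
print(args["mfile"], args["inputs"], args["targets"])
```

The remaining arguments (`dataset`, `reference_dataset`, `raw_column_name`) are left unset here because they are supplied separately at prediction time.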

*ML_PREDICT

Description: ML Predict

Syntax: ml_predict(dataset,mfile,reference_dataset,raw_curve_column)

Inputs
ID Name Type Default Remarks
1 Prediction Dataset dataset   Input Dataset for Prediction
2 Saved Model File text    
3 Learn Dataset dataset   Useful for curve predictions
4 Raw Curve Column Name text   Useful for curve predictions so we can use the x-values of this curve to build the Predicted curve
Outputs
ID Name Type Remarks
1 Predictions dataset  

*ML_LEARN_SVC

Description: Support Vector Classifier

Syntax: ml_learn_svc(dataset,independents,clabel,cv,train_ratio,normalize,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
5 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
6 Normalize select none Normalizes each numerical column.
7 Auto Clean select no Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_AUTO

Description: Auto Regression

Syntax: ml_learn_auto(dataset,independents,clabel,regression_types,normalize,grid_response,autoclean,cv,train_ratio,run_grid_search,dependent_targets,num_dig,skip_trainingdata_in_model,gs_scoring_type,cv_scoring_type,ref_dataset,ref_ratio)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Regression Type select Array Available regression methods
5 Normalize select none Normalizes each numerical column.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Auto Clean select no Clean a dataset using preset rules.
8 Selection of Validation select leave_one_out Validates the model using one of several methods. The default is test_train_split
9 Train ratio text 0.9 Fraction of the data used for training (for example, 0.9 is 90%); the remainder is used for testing
10 Run Grid Search select no Runs grid search to find the best hyperparameters
11 Use Target Dependency select no Useful for time-history predictions
12 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve
13 Skip including training data in model select no While saving the model, we skip including the training data
14 Grid Search Scoring Type select r2 How to measure error
15 Cross-Validation Scoring Type select r2 How to measure error
16 Verification Dataset dataset   The trained model will be tested using this dataset
17 Hold Back Ratio scalar 0 When this ratio is non-zero, the model will be trained on the dataset and tested on the verification dataset
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
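
Both scoring-type inputs default to r2. For reference, the coefficient of determination can be computed as below in plain Python. This is the standard definition, not d3VIEW code, and the sample numbers are illustrative.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]
print(round(r2_score(actual, [1.1, 1.9, 3.2, 3.8]), 2))  # 0.98
print(r2_score(actual, actual))                          # 1.0
print(r2_score(actual, [2.5, 2.5, 2.5, 2.5]))            # 0.0
```

A score of 1.0 is a perfect fit, 0.0 matches always predicting the mean, and negative values are worse than that baseline.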

*ML_IMAGE_CLASSIFICATION_LEARN

Description: Connects to a d3VIEW Image Classification Server to learn based on annotations generated from Image Annotator

Syntax: ml_image_classification_learn(base_url,dataset,config)

Inputs
ID Name Type Default Remarks
1 Base URL text   URL of the server running d3VIEW Image classification
2 LDWA dataset   A set of manually annotated images generated by the image annotation utility worker in Workflows
3 Configuration Options keyvalue   Configuration Options
Outputs
ID Name Type Remarks
1 UID text  
2 Model Path textarea  
3 Base Url text  
4 Status Url text  
5 Logs dataset  

*ML_SIMLYTIKS_TERMINAL

Description: Learns and predicts the commands for Simlytiks based on data

Syntax: ml_simlytiks_terminal(server_url,dataset,schema,visualizations,type,uid,query)

Inputs
ID Name Type Default Remarks
1 Server URL text   URL of the server running d3VIEW Terminal Server
2 Dataset dataset   Input Dataset
3 Schema dataset   Schema of the dataset that contains the column information
4 Visualizations json   JSON file that contains the visualization information
5 Learn Or Predict select   Learn or Predict for the terminal
6 UID text   Unique Identifier to use while predicting
7 Predict Query textarea   Text to predict Simlytiks command
Outputs
ID Name Type Remarks
1 UID text  
2 Prediction json  

*ML_LEARN_MEANSHIFT

Description: Mean Shift Clustering

Syntax: ml_learn_meanshift(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_AFFINITYPROPAGATION

Description: Affinity Propagation

Syntax: ml_learn_affinitypropagation(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum cluster size will be found based on distance optimization
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no While saving the model, we skip including the training data
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_EXPLORE_RUN_PCA

Description: RunPCA

Syntax: ml_explore_run_pca(dataset,independents,normalize)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Normalize select none Normalizes each numerical column.
Outputs
ID Name Type Remarks
1 Lucy-ML JSON output json  

*ML_EXPLORE_GET_SCHEMA

Description: Get Schema

Syntax: ml_explore_get_schema(dataset,independents,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Auto Clean select   Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy-ML JSON output json  

*ML_LEARN_AGGLOMERATIVECLUSTERING

Description: Agglomerative Clustering

Syntax: ml_learn_agglomerativeclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum number of clusters will be found based on distance optimization.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no If yes, the training data is left out when the model is saved.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_SPECTRALCLUSTERING

Description: Spectral Clustering

Syntax: ml_learn_spectralclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum number of clusters will be found based on distance optimization.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no If yes, the training data is left out when the model is saved.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_LINEAR

Description: Linear Regression

Syntax: ml_learn_linear(dataset,independents,clabel,degree,normalize,grid_response,autoclean,cv,train_ratio,run_grid_search,dependent_targets,num_dig,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Degree select none Order of fit.
5 Normalize select none Normalizes each numerical column.
6 Generate Regression Line/Surface select yes Used to visualize a regression line or surface.
7 Auto Clean select no Clean a dataset using preset rules.
8 Selection of Validation select leave_one_out Ability to validate the model using different methods. The default is test_train_split.
9 Train ratio text 0.9 Fraction of the data to be used for training. The remainder will be used for testing.
10 Run Grid Search select no Runs grid search to find the best hyperparameters.
11 Use Target Dependency select no Useful for time-history predictions.
12 Curve Digitize Points select none Number of digitized points to use when the Input or the Target column is a curve.
13 Skip including training data in model select no If yes, the training data is left out when the model is saved.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
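
The two validation methods named for the Selection of Validation input, test_train_split and leave_one_out, differ in how rows are held out for scoring. The sketch below illustrates both on row indices only. It is not the ml_learn_linear implementation: the function names are hypothetical, and the rounding of the training count and the use of a leading block for training are assumptions.

```python
def train_test_split_indices(n_rows, train_ratio=0.9):
    """Split row indices into a leading training block and a trailing test block."""
    n_train = round(n_rows * train_ratio)  # assumption: round to nearest row
    indices = list(range(n_rows))
    return indices[:n_train], indices[n_train:]

def leave_one_out_indices(n_rows):
    """Yield (train, test) index lists, holding out exactly one row per fold."""
    for held_out in range(n_rows):
        train = [i for i in range(n_rows) if i != held_out]
        yield train, [held_out]

# With 20 rows and a train ratio of 0.9, 18 rows train and 2 rows test.
train, test = train_test_split_indices(20, train_ratio=0.9)

# Leave-one-out on 4 rows produces 4 folds, each testing a single row.
folds = list(leave_one_out_indices(4))
```

Leave-one-out fits the model once per row, so it is typically much slower than a single split on large datasets; the Train ratio input is only meaningful for the split-based method.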

*ML_LEARN_BIRCHCLUSTERING

Description: BIRCH Clustering

Syntax: ml_learn_birchclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum number of clusters will be found based on distance optimization.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no If yes, the training data is left out when the model is saved.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_LEARN_DBSCANCLUSTERING

Description: DBSCAN Clustering

Syntax: ml_learn_dbscanclustering(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum number of clusters will be found based on distance optimization.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no If yes, the training data is left out when the model is saved.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  

*ML_EXPLORE_FEATURE_IMPORTANCE

Description: Feature Importance

Syntax: ml_explore_feature_importance(dataset,independents,clabel,normalize,autoclean)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Target Column select   The name of the target column.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select   Clean a dataset using preset rules.
Outputs
ID Name Type Remarks
1 Lucy-ML JSON output json  

*ML_LEARN_KMEANS

Description: K-Means

Syntax: ml_learn_kmeans(dataset,independents,nc,normalize,autoclean,skip_trainingdata_in_model)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
2 Input Features select   A subset of columns that will be used to train the model.
3 Number of Clusters text 3 Total number of clusters to group the data into. If set to auto, an optimum number of clusters will be found based on distance optimization.
4 Normalize select none Normalizes each numerical column.
5 Auto Clean select no Clean a dataset using preset rules.
6 Skip including training data in model select no If yes, the training data is left out when the model is saved.
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Responses from Learn dataset  
3 Visualizations from Learn dataset  
4 Score number  
5 Model Path textarea  
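
To show what the Number of Clusters input (nc) controls, the following is a minimal sketch of textbook K-Means (Lloyd's algorithm) in plain Python. How ml_learn_kmeans is implemented internally is not stated in this reference, so the seeding with the first nc points, the Euclidean distance, and the function name kmeans are all assumptions. The auto setting is not covered.

```python
import math

def kmeans(points, nc=3, max_iter=100):
    """Group points into nc clusters with Lloyd's algorithm."""
    # Assumption: seed the centroids with the first nc points (deterministic).
    centroids = [list(p) for p in points[:nc]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(nc), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(nc):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
    return labels, centroids

# Nine 2-D points forming three well-separated groups.
points = [(0, 0), (10, 10), (20, 0),
          (1, 1), (11, 9), (19, 1),
          (0, 1), (10, 11), (21, 0)]
labels, centroids = kmeans(points, nc=3)
```

Each entry of labels is the cluster index assigned to the matching row, and centroids holds one center per cluster. Because K-Means relies on distances, columns on very different scales can distort the grouping, which is where the Normalize input becomes relevant.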

*ML_CLEAN_AUTOCLEAN

Description: Auto Clean

Syntax: ml_clean_autoclean(dataset)

Inputs
ID Name Type Default Remarks
1 Dataset dataset    
Outputs
ID Name Type Remarks
1 Lucy ML JSON Output json  
2 Cleaned Dataset dataset