Delete subcommand

You have seen that BigMLer is an agile tool that lets you create a great number of resources easily. This is a tremendous help, but it can also leave your account cluttered with resources you no longer need. To keep track of each newly created remote resource, use the --resources-log flag followed by the name of the log file you choose.

bigmler --train data/iris.csv --resources-log my_log.log

Each new resource created by that command will cause its id to be appended as a new line of the log file.
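Since the log holds one id per line, and ids take the type/hash form shown in the examples of this section, it is easy to inspect. The following is a minimal sketch, not part of BigMLer, that groups logged ids by resource type. The function name group_ids_by_type is hypothetical.

```python
from collections import defaultdict


def group_ids_by_type(log_lines):
    """Group resource ids such as 'source/50a2...' by their type prefix."""
    groups = defaultdict(list)
    for line in log_lines:
        resource_id = line.strip()
        if not resource_id:
            continue  # skip blank lines
        # The text before the first slash is the resource type.
        resource_type = resource_id.split("/", 1)[0]
        groups[resource_type].append(resource_id)
    return dict(groups)


# Sample lines, shaped like the contents of a resources log.
sample_log = [
    "source/50a2bb64035d0706db0006cc\n",
    "dataset/50a1f441035d0706d9000371\n",
]
grouped = group_ids_by_type(sample_log)
```

To read a real log, pass the open file object instead of the sample list, for example group_ids_by_type(open("my_log.log")).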

BigMLer can also help you delete these resources. The delete subcommand offers many options. For instance, deleting a comma-separated list of ids:

bigmler delete \
        --ids source/50a2bb64035d0706db0006cc,dataset/50a1f441035d0706d9000371

or deleting the resources listed in a file:

bigmler delete --from-file to_delete.log

where to_delete.log contains a resource id per line.
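A file in that layout can be produced by any script. As a rough sketch, assuming you already hold a list of ids and only want to delete some of them, the hypothetical helper write_ids_file below keeps the ids of one resource type and writes them one per line.

```python
import os
import tempfile


def write_ids_file(ids, path, resource_type=None):
    """Write one resource id per line, optionally keeping a single type."""
    kept = [
        resource_id
        for resource_id in ids
        if resource_type is None or resource_id.startswith(resource_type + "/")
    ]
    with open(path, "w") as handle:
        for resource_id in kept:
            handle.write(resource_id + "\n")
    return kept


ids = [
    "source/50a2bb64035d0706db0006cc",
    "dataset/50a1f441035d0706d9000371",
]
# A temporary directory keeps the example from touching your working files.
path = os.path.join(tempfile.mkdtemp(), "to_delete.log")
write_ids_file(ids, path, resource_type="source")

with open(path) as handle:
    written = handle.read().splitlines()
```

The resulting file can then be given to bigmler delete --from-file.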

As we’ve previously seen, each BigMLer command execution generates a bunch of remote resources whose ids are stored in files located in a directory that can be set using the --output-dir option. The bigmler delete subcommand can retrieve the ids stored in such files by using the --from-dir option.

bigmler --train data/iris.csv --output-dir my_BigMLer_output_dir
bigmler delete --from-dir my_BigMLer_output_dir

The last command will delete all the remote resources generated by the first command, retrieving their ids from the files in the my_BigMLer_output_dir directory.

You can also delete resources based on the tags they are associated with:

bigmler delete --all-tag my_tag

or restricting the operation to a specific type:

bigmler delete --source-tag my_tag
bigmler delete --dataset-tag my_tag
bigmler delete --model-tag my_tag
bigmler delete --prediction-tag my_tag
bigmler delete --evaluation-tag my_tag
bigmler delete --ensemble-tag my_tag
bigmler delete --batch-prediction-tag my_tag
bigmler delete --cluster-tag my_tag
bigmler delete --centroid-tag my_tag
bigmler delete --batch-centroid-tag my_tag
bigmler delete --anomaly-tag my_tag
bigmler delete --anomaly-score-tag my_tag
bigmler delete --batch-anomaly-score-tag my_tag
bigmler delete --project-tag my_tag
bigmler delete --logistic-regression-tag my_tag
bigmler delete --linear-regression-tag my_tag
bigmler delete --time-series-tag my_tag
bigmler delete --deepnet-tag my_tag
bigmler delete --topic-model-tag my_tag
bigmler delete --topic-distribution-tag my_tag
bigmler delete --association-tag my_tag

You can also delete resources by date. The options --newer-than and --older-than let you specify a reference date: resources created after that date (--newer-than) or before it (--older-than) will be deleted. Both options can be combined to set a range of dates. The allowed values are:

  • dates in a YYYY-MM-DD format

  • integers, that will be interpreted as number of days before now

  • resource id, the creation datetime of the resource will be used

Thus,

bigmler delete --newer-than 2

will delete all resources created less than two days ago (now being 2014-03-23 14:00:00.00000, their creation time will be greater than 2014-03-21 14:00:00.00000).

bigmler delete --older-than 2014-03-20 --newer-than 2014-03-19

will delete all resources created on March 19th, 2014 (creation time between 2014-03-19 00:00:00 and 2014-03-20 00:00:00) and

bigmler delete --newer-than source/532db2b637203f3f1a000104

will delete all resources created after the source/532db2b637203f3f1a000104 was created.
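The three kinds of values can be pictured as different ways of arriving at one reference datetime. The sketch below only illustrates that idea and is not BigMLer's actual implementation. The function reference_datetime and its lookup_created argument, a stand-in for whatever call retrieves a resource's creation time, are hypothetical.

```python
from datetime import datetime, timedelta


def reference_datetime(value, now, lookup_created=None):
    """Turn a --newer-than / --older-than style value into a datetime.

    lookup_created is a hypothetical callable that returns the creation
    datetime of a resource id; it stands in for a remote lookup.
    """
    if "/" in value:
        # Resource id: use the creation datetime of that resource.
        if lookup_created is None:
            raise ValueError("a lookup function is needed for resource ids")
        return lookup_created(value)
    try:
        # Integer: number of days before now.
        return now - timedelta(days=int(value))
    except ValueError:
        # Otherwise expect a YYYY-MM-DD date (midnight of that day).
        return datetime.strptime(value, "%Y-%m-%d")


now = datetime(2014, 3, 23, 14, 0, 0)
two_days_ago = reference_datetime("2", now)
march_19 = reference_datetime("2014-03-19", now)
```

With now set as in the example above, the value 2 resolves to 2014-03-21 14:00:00, and 2014-03-19 resolves to midnight of that day.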

You can also combine both types of options, for instance to delete the sources tagged as my_tag that were created after a certain date

bigmler delete --newer-than 2 --source-tag my_tag

And finally, you can restrict the deletion to certain kinds of resources by passing a comma-separated list of resource types to the --resource-types option. For example,

bigmler delete --older-than 2 --resource-types source,model

will delete the sources and models created more than two days ago.

Additionally, you can use the --resource-types option to tell which type of resources to exclude from deletion if the --exclude-types flag is added to the call.

bigmler delete --older-than 2 --resource-types source,model --exclude-types

That command will delete all the resources that are older than two days except for sources and models.
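The way --exclude-types inverts the selection can be sketched in a few lines. This is an illustration only: the function types_to_delete is hypothetical, and ALL_TYPES is a partial list used for the example, not the full set of types BigMLer knows about.

```python
# Partial, illustrative list of resource types.
ALL_TYPES = ["source", "dataset", "model", "ensemble", "prediction"]


def types_to_delete(all_types, resource_types, exclude=False):
    """Return the types selected by a comma-separated list.

    With exclude=True the listed types are kept out and every other
    type is selected, mirroring the effect of --exclude-types.
    """
    listed = [name.strip() for name in resource_types.split(",") if name.strip()]
    if exclude:
        return [name for name in all_types if name not in listed]
    return [name for name in all_types if name in listed]


selected = types_to_delete(ALL_TYPES, "source,model")
inverted = types_to_delete(ALL_TYPES, "source,model", exclude=True)
```

Here selected holds only sources and models, while inverted holds every other type in the list.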

You can simulate a delete subcommand using the --dry-run flag

bigmler delete --newer-than source/532db2b637203f3f1a000104 \
               --source-tag my_source --dry-run

The output of the command will be a list of the resources that would be deleted if the --dry-run flag were removed. In this case, they will be sources that contain the tag my_source and were created after the one given as the --newer-than value. The first 15 resources will be logged to the console, and the complete list can be found in the bigmler_sessions file.
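When delete calls are launched from a script, one cautious pattern is to make the simulation the default. The helper below is a hypothetical sketch, not part of BigMLer: it only assembles the argument list, using options described in this section, and appends --dry-run unless told otherwise.

```python
def build_delete_command(selectors, dry_run=True):
    """Return the argument list for a `bigmler delete` call.

    selectors maps option names from this section (for example
    "--source-tag") to their values. Nothing is executed here.
    """
    command = ["bigmler", "delete"]
    for option, value in selectors.items():
        command.extend([option, value])
    if dry_run:
        # Simulate by default so nothing is removed by accident.
        command.append("--dry-run")
    return command


preview = build_delete_command({
    "--newer-than": "source/532db2b637203f3f1a000104",
    "--source-tag": "my_source",
})
```

The resulting list could be handed to subprocess.run(preview, check=True) on a machine where bigmler is installed. Passing dry_run=False builds the real deletion command once the simulated output has been reviewed.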

A similar option that does not delete the resources immediately is --bin.

bigmler delete --newer-than 3 --resource-types source \
               --source-tag my_source --bin

By setting that flag, all the selected resources are moved to a newly created Trash bin project in your account. That allows the user to inspect the selected resources before deletion and delete them in an efficient way by deleting the Trash bin project.

By default, only finished resources are selected to be deleted. If you want to delete other resources, you can select them by choosing their status:

bigmler delete --older-than 2 --status faulty

would remove all failed resources created more than two days ago.

Also, you can apply a filter based on the filters used in the API list query strings (see the API documentation).

bigmler delete --filter "name__icontains=iris"

Delete Subcommand Options

--ids LIST_OF_IDS

Comma-separated list of ids to be deleted

--from-file FILE_OF_IDS

Path to a file containing the resources’ ids to be deleted

--from-dir

Path to a directory where BigMLer has stored its session data and created resources

--all-tag TAG

Retrieves resources that were tagged with tag to delete them

--source-tag TAG

Retrieves sources that were tagged with tag to delete them

--dataset-tag TAG

Retrieves datasets that were tagged with tag to delete them

--model-tag TAG

Retrieves models that were tagged with tag to delete them

--prediction-tag TAG

Retrieves predictions that were tagged with tag to delete them

--evaluation-tag TAG

Retrieves evaluations that were tagged with tag to delete them

--ensemble-tag TAG

Retrieves ensembles that were tagged with tag to delete them

--batch-prediction-tag TAG

Retrieves batch predictions that were tagged with tag to delete them

--cluster-tag TAG

Retrieves clusters that were tagged with tag to delete them

--centroid-tag TAG

Retrieves centroids that were tagged with tag to delete them

--batch-centroid-tag TAG

Retrieves batch centroids that were tagged with tag to delete them

--anomaly-tag TAG

Retrieves anomalies that were tagged with tag to delete them

--anomaly-score-tag TAG

Retrieves anomaly scores that were tagged with tag to delete them

--batch-anomaly-score-tag TAG

Retrieves batch anomaly scores that were tagged with tag to delete them

--logistic-regression-tag TAG

Retrieves logistic regressions that were tagged with tag to delete them

--linear-regression-tag TAG

Retrieves linear regressions that were tagged with tag to delete them

--topic-model-tag TAG

Retrieves topic models that were tagged with tag to delete them

--topic-distribution-tag TAG

Retrieves topic distributions that were tagged with tag to delete them

--batch-topic-distribution-tag TAG

Retrieves batch topic distributions that were tagged with tag to delete them

--time-series-tag TAG

Retrieves time series that were tagged with tag to delete them

--forecast-tag TAG

Retrieves forecasts that were tagged with tag to delete them

--deepnet-tag TAG

Retrieves deepnets that were tagged with tag to delete them

--project-tag TAG

Retrieves projects that were tagged with tag to delete them

--association-tag TAG

Retrieves associations that were tagged with tag to delete them

--older-than DATE

Retrieves resources created before the specified date. Date can be any YYYY-MM-DD string, an integer meaning the number of days before the current datetime or a resource id, meaning the creation datetime of the resource

--newer-than DATE

Retrieves resources created after the specified date. Date can be any YYYY-MM-DD string, an integer meaning the number of days before the current datetime or a resource id, meaning the creation datetime of the resource

--resource-types

Comma-separated list of types of resources to be deleted. Allowed values are source, dataset, model, ensemble, prediction, batch_prediction, cluster, centroid, batch_centroid, etc.

--exclude-types

When used together with --resource-types, the list of types provided will be excluded from deletion. Only the remaining resource types will be deleted.

--filter FILTER_EXPRESSION

Filter expression as used in the API list calls query string

--dry-run

Delete simulation. No removal.

--status

Status codes used in the filter to retrieve the resources to be deleted. The possible values are: finished, faulty, waiting, queued, started, in progress, summarized, uploading, unknown, runnable

Export subcommand

The bigmler export subcommand is intended to help generate the code needed to integrate your BigML models into other applications. To produce a prediction using a BigML model, you just need a function that receives the new test case data as its argument and returns the prediction (and a confidence). The bigmler export subcommand retrieves the JSON information of your existing decision tree model in BigML, generates the code for this function from it, and stores it in a file that can be imported or copied directly into your application.
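To make the idea concrete, here is a hand-written illustration of the general shape such a function could take. It is not real BigMLer output: the function name, the field name, the thresholds and the confidence values are all invented for the example.

```python
def predict_species(data):
    """Return a (prediction, confidence) pair for one test case.

    Hand-written illustration only: the field name, thresholds and
    confidence values are invented and do not come from a real model.
    """
    petal_length = data.get("petal length")
    if petal_length is None:
        # No usable input: answer as if stopping at the root of the tree.
        return "Iris-setosa", 0.25
    if petal_length <= 2.5:
        return "Iris-setosa", 0.9
    if petal_length <= 4.8:
        return "Iris-versicolor", 0.8
    return "Iris-virginica", 0.8


prediction, confidence = predict_species({"petal length": 1.4})
```

The point is that the whole model collapses into plain conditionals, so the function can run inside an application without calling the API.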

Obviously, the function syntax will depend on the model and the language used in your application, so these will be the options we need to provide:

bigmler export --model model/532db2b637203f3f1a001304 \
               --language javascript --output-dir my_exports

This command will create a JavaScript version of the function that produces the predictions and store it in a file named model_532db2b637203f3f1a001304.js (after the model ID) in the my_exports directory.

Models can currently be exported in Python, JavaScript and R. For models whose fields are numeric or categorical, the command also supports creating MySQL functions and separate Tableau expressions for both the prediction and the confidence.

You can also generate the code for all the models in an ensemble with a single bigmler export command, using the --ensemble option followed by the corresponding ensemble ID. The code for each model will be stored in a separate file, named after the model ID with the slash replaced by an underscore.

bigmler export --ensemble ensemble/532db2b637203f3f1a001307 \
               --language javascript --output-dir my_ensemble
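If a script needs to locate the exported files afterwards, the naming rule can be reproduced as in the sketch below. The helper export_path is hypothetical. Only the .js extension is confirmed by the example in this section, and the extensions listed for Python and R are assumptions.

```python
import os

# Only ".js" is confirmed by the text; the other extensions are assumptions.
EXTENSIONS = {"javascript": ".js", "python": ".py", "r": ".R"}


def export_path(model_id, language, output_dir):
    """Build the expected file path for an exported model.

    The slash in the model id becomes an underscore, as described above.
    """
    file_name = model_id.replace("/", "_") + EXTENSIONS[language]
    return os.path.join(output_dir, file_name)


path = export_path("model/532db2b637203f3f1a001304", "javascript", "my_exports")
file_name = os.path.basename(path)
```

For the single-model command shown earlier, this yields model_532db2b637203f3f1a001304.js inside my_exports.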