AutoML API

orca.automl.auto_estimator

A general estimator that supports automatic model tuning. It lets users fit a model and search for its best hyperparameters.

class zoo.orca.automl.auto_estimator.AutoEstimator(model_builder, logs_dir='/tmp/auto_estimator_logs', resources_per_trial=None, remote_dir=None, name=None)

Bases: object

Example

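>>> import torch.nn as nn
>>> from zoo.orca.automl.auto_estimator import AutoEstimator
>>> # model_creator, get_optimizer, create_linear_search_space, data and
>>> # validation_data are user-defined and not shown here.
>>> auto_est = AutoEstimator.from_torch(model_creator=model_creator,
...                                     optimizer=get_optimizer,
...                                     loss=nn.BCELoss(),
...                                     logs_dir="/tmp/zoo_automl_logs",
...                                     resources_per_trial={"cpu": 2},
...                                     name="test_fit")
>>> auto_est.fit(data=data,
...              validation_data=validation_data,
...              search_space=create_linear_search_space(),
...              n_sampling=4,
...              epochs=1,
...              metric="accuracy")
>>> best_model = auto_est.get_best_model()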

static from_torch(*, model_creator, optimizer, loss, logs_dir='/tmp/auto_estimator_logs', resources_per_trial=None, name=None)

Create an AutoEstimator for PyTorch.

Parameters

model_creator – PyTorch model creator function.
optimizer – PyTorch optimizer creator function or PyTorch optimizer name (string). If you pass an optimizer name, specify the learning rate search space under the key “lr” or LR_NAME (from zoo.orca.automl.pytorch_utils import LR_NAME). If no learning rate search space is specified, the default learning rate of 1e-3 is used for all estimators.
loss – PyTorch loss instance, PyTorch loss creator function, or PyTorch loss name (string).
logs_dir – Local directory to save logs and results. Defaults to “/tmp/auto_estimator_logs”.
resources_per_trial – Dict. Resources for each trial, e.g. {“cpu”: 2}.
name – Name of the auto estimator.

Returns

An AutoEstimator object.

static from_keras(*, model_creator, logs_dir='/tmp/auto_estimator_logs', resources_per_trial=None, name=None)

Create an AutoEstimator for TensorFlow Keras.

Parameters

model_creator – TensorFlow Keras model creator function.
logs_dir – Local directory to save logs and results. Defaults to “/tmp/auto_estimator_logs”.
resources_per_trial – Dict. Resources for each trial, e.g. {“cpu”: 2}.
name – Name of the auto estimator.

Returns

An AutoEstimator object.

fit(data, epochs=1, validation_data=None, metric=None, metric_mode=None, metric_threshold=None, n_sampling=1, search_space=None, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)

Automatically fit the model and search for the best hyperparameters.

Parameters

data – Training data. If the AutoEstimator is created with from_torch, data can be a tuple of ndarrays or a function that takes a config dictionary as its parameter and returns a PyTorch DataLoader. If the AutoEstimator is created with from_keras, data can be a tuple of ndarrays. A tuple of ndarrays should have the form (x, y), where x is the training input data and y is the training target data.
epochs – Maximum number of epochs to train in each trial. Defaults to 1. If metric_threshold is also set, a trial stops as soon as it either reaches metric_threshold or has been trained for the given number of epochs.
validation_data – Validation data. Its type should be the same as that of data.
metric – String. The name of the evaluation metric to optimize, e.g. “mse”.
metric_mode – One of [“min”, “max”]. “max” means that a greater metric value is better. You don’t have to specify metric_mode if you use a built-in metric from zoo.automl.common.metrics.Evaluator.
metric_threshold – A trial is terminated when the metric threshold is met.
n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid is repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.
search_space – A dict describing the search space.
search_alg – str. Any searcher supported by Ray Tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”).
search_alg_params – Extra parameters for the search algorithm besides search_space, metric and searcher mode.
scheduler – str. Any scheduler supported by Ray Tune.
scheduler_params – Parameters for the scheduler.
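
The interaction between n_sampling, grid entries and the per-trial stopping rule can be pictured with a small plain-Python sketch. It only illustrates the behaviour described in the parameter list and is not the library's implementation: expand_trials, run_trial and the list/callable encoding of the search space are hypothetical names and conventions invented for this example, and the n_sampling=-1 case is not modelled.

```python
import itertools
import random


def expand_trials(search_space, n_sampling, seed=0):
    """Toy expansion of a search space into trial configs.

    Hypothetical encoding (not the hp API): a list value is treated as a
    grid, a callable value is treated as a sampler that receives an RNG.
    """
    rng = random.Random(seed)
    grid_keys = [k for k, v in search_space.items() if isinstance(v, list)]
    # With no grid keys, product() yields a single empty point.
    grid_points = list(itertools.product(*(search_space[k] for k in grid_keys)))
    configs = []
    for _ in range(n_sampling):      # the whole grid is repeated n_sampling times
        for point in grid_points:
            config = dict(zip(grid_keys, point))
            for key, spec in search_space.items():
                if key not in grid_keys:
                    config[key] = spec(rng)   # fresh sample for every trial
            configs.append(config)
    return configs


def run_trial(metric_per_epoch, epochs, metric_threshold=None, metric_mode="min"):
    """Return (epochs_run, last_metric) for one simulated trial.

    The trial stops when the threshold is reached or after `epochs` epochs,
    whichever comes first.
    """
    last = None
    epochs_run = 0
    for value in metric_per_epoch[:epochs]:
        epochs_run += 1
        last = value
        if metric_threshold is not None:
            if metric_mode == "min":
                reached = value <= metric_threshold
            else:
                reached = value >= metric_threshold
            if reached:
                break
    return epochs_run, last


space = {
    "hidden_size": [32, 64],                       # grid entry
    "dropout": [0.1, 0.2],                         # grid entry
    "lr": lambda rng: rng.uniform(0.001, 0.1),     # sampled entry
}
configs = expand_trials(space, n_sampling=3)
print(len(configs))  # 12: a 2 x 2 grid repeated 3 times

# A trial minimising "mse" stops at epoch 3, when 0.2 <= 0.3.
print(run_trial([0.9, 0.5, 0.2, 0.1], epochs=4, metric_threshold=0.3))  # (3, 0.2)
```

In this sketch each of the 12 configurations gets its own sampled lr, so repeating the grid still explores new values for the non-grid entries.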

get_best_model()

Return the best model found by the AutoEstimator.

Returns: The best model instance.

get_best_config()

Return the best config found by the AutoEstimator.

Returns: A dictionary of the best hyperparameters.
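
As a rough picture of what “best” means here, the following plain-Python sketch picks the winning configuration from a list of finished trials according to metric_mode. pick_best_config and the (config, metric) pair layout are assumptions made for this illustration; the actual selection happens inside the library.

```python
def pick_best_config(trial_results, metric_mode):
    """Return the config whose metric is best under metric_mode.

    trial_results is a list of (config_dict, metric_value) pairs; this layout
    is hypothetical and chosen only for the example.
    """
    if metric_mode not in ("min", "max"):
        raise ValueError('metric_mode must be "min" or "max"')
    if not trial_results:
        raise ValueError("trial_results is empty")
    choose = min if metric_mode == "min" else max
    best_config, _ = choose(trial_results, key=lambda pair: pair[1])
    return best_config


trials = [
    ({"lr": 0.1, "hidden_size": 32}, 0.81),
    ({"lr": 0.01, "hidden_size": 64}, 0.93),
    ({"lr": 0.001, "hidden_size": 64}, 0.88),
]
print(pick_best_config(trials, "max"))  # {'lr': 0.01, 'hidden_size': 64}
print(pick_best_config(trials, "min"))  # {'lr': 0.1, 'hidden_size': 32}
```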