AutoML API
orca.automl.auto_estimator
A general estimator that supports automatic model tuning. It lets users fit a model and search for its best hyperparameters.
- class zoo.orca.automl.auto_estimator.AutoEstimator(model_builder, logs_dir='/tmp/auto_estimator_logs', resources_per_trial=None, remote_dir=None, name=None)
Bases: object
Example
>>> auto_est = AutoEstimator.from_torch(model_creator=model_creator,
...                                     optimizer=get_optimizer,
...                                     loss=nn.BCELoss(),
...                                     logs_dir="/tmp/zoo_automl_logs",
...                                     resources_per_trial={"cpu": 2},
...                                     name="test_fit")
>>> auto_est.fit(data=data,
...              validation_data=validation_data,
...              search_space=create_linear_search_space(),
...              n_sampling=4,
...              epochs=1,
...              metric="accuracy")
>>> best_model = auto_est.get_best_model()
- static from_torch(*, model_creator, optimizer, loss, logs_dir='/tmp/auto_estimator_logs', resources_per_trial=None, name=None)
Create an AutoEstimator for PyTorch.
- Parameters
model_creator – PyTorch model creator function.
optimizer – PyTorch optimizer creator function or PyTorch optimizer name (string). If you pass an optimizer name, specify the learning rate search space under the key "lr" or LR_NAME (from zoo.orca.automl.pytorch_utils import LR_NAME). If no learning rate search space is specified, the default learning rate of 1e-3 is used for all estimators.
loss – PyTorch loss instance, PyTorch loss creator function, or PyTorch loss name (string).
logs_dir – Local directory to save logs and results. Defaults to "/tmp/auto_estimator_logs".
resources_per_trial – Dict. Resources for each trial, e.g. {"cpu": 2}.
name – Name of the auto estimator.
- Returns
an AutoEstimator object.
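The creator functions passed to from_torch can be sketched as below. This is a minimal illustration, not the library's own code: it assumes that model_creator receives the sampled config dictionary and that an optimizer creator receives the model followed by that same config. LinearNet and the config keys "in_features", "hidden_size" and "dropout" are hypothetical names chosen for the example; only "lr" comes from the parameter description above.

```python
import torch
import torch.nn as nn


class LinearNet(nn.Module):
    """Hypothetical binary classifier used only for illustration."""

    def __init__(self, in_features, hidden_size, dropout):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_features, hidden_size),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_size, 1),
            nn.Sigmoid(),  # outputs in (0, 1), suitable for nn.BCELoss
        )

    def forward(self, x):
        return self.layers(x)


def model_creator(config):
    # Assumption: called once per trial with the sampled hyperparameters.
    return LinearNet(
        in_features=config.get("in_features", 50),
        hidden_size=config["hidden_size"],
        dropout=config["dropout"],
    )


def get_optimizer(model, config):
    # Assumption: receives the model built by model_creator plus the same config.
    return torch.optim.SGD(model.parameters(), lr=config["lr"])


# One sampled configuration, as a trial might see it (hypothetical values).
sample_config = {"hidden_size": 16, "dropout": 0.2, "lr": 0.01}
model = model_creator(sample_config)
optimizer = get_optimizer(model, sample_config)
```

Keeping every tunable value in the config dictionary means the same two functions can serve all trials; only the sampled config changes between them.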
- static from_keras(*, model_creator, logs_dir='/tmp/auto_estimator_logs', resources_per_trial=None, name=None)
Create an AutoEstimator for TensorFlow Keras.
- Parameters
model_creator – TensorFlow Keras model creator function.
logs_dir – Local directory to save logs and results. Defaults to "/tmp/auto_estimator_logs".
resources_per_trial – Dict. Resources for each trial, e.g. {"cpu": 2}.
name – Name of the auto estimator.
- Returns
an AutoEstimator object.
- fit(data, epochs=1, validation_data=None, metric=None, metric_mode=None, metric_threshold=None, n_sampling=1, search_space=None, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)
Automatically fit the model and search for the best hyperparameters.
- Parameters
data – Training data. If the AutoEstimator is created with from_torch, data can be a tuple of ndarrays or a function that takes a config dictionary as its parameter and returns a PyTorch DataLoader. If the AutoEstimator is created with from_keras, data can be a tuple of ndarrays. A tuple of ndarrays should have the form (x, y), where x is the training input data and y is the training target data.
epochs – Maximum number of epochs to train in each trial. Defaults to 1. If metric_threshold is also set, a trial stops as soon as either the metric reaches metric_threshold or the trial has trained for the given number of epochs.
validation_data – Validation data. Its type should be the same as that of data.
metric – String. The name of the evaluation metric to optimize, e.g. "mse".
metric_mode – One of ["min", "max"]. "max" means a greater metric value is better. You don't have to specify metric_mode if you use a built-in metric from zoo.automl.common.metrics.Evaluator.
metric_threshold – A trial is terminated when the metric threshold is met.
n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid is repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.
search_space – A dict that defines the search space.
search_alg – str. Any searcher supported by Ray Tune (i.e. "variant_generator", "random", "ax", "dragonfly", "skopt", "hyperopt", "bayesopt", "bohb", "nevergrad", "optuna", "zoopt" and "sigopt").
search_alg_params – Extra parameters for the search algorithm besides search_space, metric and searcher mode.
scheduler – str. Any scheduler supported by Ray Tune.
scheduler_params – Parameters for the scheduler.
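The interaction of n_sampling, epochs, metric_mode and metric_threshold described above can be sketched in plain Python. This is an illustrative model of the documented behaviour, not code from the library: count_trials and trial_should_stop are hypothetical helpers, the grid is represented as a plain dict of value lists rather than hp.grid_search objects, and the comparison direction for the threshold (at least the threshold for "max", at most for "min") is an assumption.

```python
def count_trials(grid_axes, n_sampling):
    """Trials launched when every grid combination is repeated n_sampling times.

    grid_axes maps each grid-searched hyperparameter to its list of values.
    Only positive n_sampling is modelled here; -1 (unbounded sampling) is not.
    """
    grid_size = 1
    for values in grid_axes.values():
        grid_size *= len(values)
    return grid_size * n_sampling


def trial_should_stop(epochs_done, metric_value, epochs, metric_mode,
                      metric_threshold=None):
    """Return True once a trial hits the epoch budget or the metric threshold."""
    if epochs_done >= epochs:
        return True  # epoch budget exhausted
    if metric_threshold is None:
        return False  # no threshold set, so only the epoch budget applies
    if metric_mode == "max":
        return metric_value >= metric_threshold
    return metric_value <= metric_threshold


# Two grid axes with 2 and 3 values give 6 combinations, repeated 4 times.
total = count_trials({"hidden_size": [16, 32], "dropout": [0.1, 0.2, 0.3]},
                     n_sampling=4)
```

With these inputs total is 24, so a grid search combined with a large n_sampling can multiply the number of trials quickly.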