tpcp.optimize.optuna.OptunaSearch

class tpcp.optimize.optuna.OptunaSearch(pipeline: PipelineT, create_study: Callable[[], Study], create_search_space: Callable[[Trial], None], *, scoring: Optional[Union[Callable[[PipelineT, DatasetT], Union[T, Aggregator[Any], Dict[str, Union[T, Aggregator[Any]]], Dict[str, Union[T, Aggregator[Any], Dict[str, Union[T, Aggregator[Any]]]]]]], Scorer[PipelineT, DatasetT, Union[T, Aggregator[Any], Dict[str, Union[T, Aggregator[Any]]]]]]], score_name: Optional[str] = None, n_trials: Optional[int] = None, timeout: Optional[float] = None, callbacks: Optional[List[Callable[[Study, FrozenTrial], None]]] = None, gc_after_trial: bool = False, n_jobs: int = 1, show_progress_bar: bool = False, return_optimized: bool = True)

GridSearch equivalent using Optuna.

An opinionated parameter optimization for simple (i.e. non-optimizable) pipelines that can be used as a replacement for GridSearch.

Parameters:
pipeline

A tpcp pipeline with some hyper-parameters that should be optimized. This should be a normal (i.e. non-optimizable) pipeline when using this class.

create_study

A callable that returns an Optuna study instance to be used for the optimization. It is called without arguments as part of the optimize method. The resulting study object can be accessed via self.study_ after the optimization is finished. The study is created via a callable, rather than passed in directly, so that a separate study can be created each time optimize is called by an external wrapper (e.g. cross_validate).

create_search_space

A callable that takes a Trial object as input and calls suggest_* methods on it to define the search space.

scoring

A callable that can score a single data point given a pipeline. This function should return either a single score or a dictionary of scores. If scoring is None the default score method of the pipeline is used instead.

Note that if scoring returns a dictionary, score_name must be set to the name of the score that should be used for ranking.

score_name

The name of the score that should be used for ranking in case the scoring function returns a dictionary of values.

n_trials

The number of trials. If this argument is set to None, the number of trials is unlimited; in this case you should use timeout instead. Because Optuna is called internally by this wrapper, you cannot set up a study without limits and end it using CTRL+C (as suggested by the Optuna docs): doing so would stop the entire execution flow.

timeout

Stop the study after the given number of seconds. If this argument is set to None, the study is executed without a time limit. In this case you should use n_trials to limit the execution.

return_optimized

If True, a pipeline object with the overall best parameters is created. The optimized pipeline object is stored as optimized_pipeline_.

callbacks

List of callback functions that are invoked at the end of each trial. Each function must accept two parameters with the following types in this order: Study and FrozenTrial.

n_jobs

Number of parallel jobs to use (default = 1 -> single process, -1 -> all available cores). This uses joblib with the multiprocessing backend to parallelize the optimization.

Warning

Read the notes on multiprocessing in the CustomOptunaOptimize documentation before using this feature.

show_progress_bar

Whether to show a progress bar during the optimization.

gc_after_trial

Run the garbage collector after each trial. See the Optuna documentation for more details.

Other Parameters:
dataset

The dataset instance passed to the optimize method
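As a rough illustration of the create_study and create_search_space callables described above, the following sketch defines a study factory and a search-space function. The parameter names threshold and window_size are hypothetical, and the idea that suggested names correspond to parameter names of the pipeline is an assumption, not something stated on this page. _RecordingTrial is a stand-in that only records the suggested ranges, so the search space can be inspected without running Optuna.

```python
from typing import Any, Dict


def create_study():
    """Return a fresh Optuna study on every call (study factory)."""
    # Imported lazily so this sketch can be loaded without Optuna installed.
    import optuna

    return optuna.create_study(direction="maximize")


def create_search_space(trial) -> None:
    """Define the search space by calling suggest_* methods on the trial."""
    # "threshold" and "window_size" are hypothetical pipeline parameters.
    trial.suggest_float("threshold", 0.1, 1.0)
    trial.suggest_int("window_size", 5, 50)


class _RecordingTrial:
    """Stand-in for an Optuna Trial that only records the suggested ranges."""

    def __init__(self) -> None:
        self.space: Dict[str, Any] = {}

    def suggest_float(self, name: str, low: float, high: float) -> float:
        self.space[name] = (low, high)
        return low

    def suggest_int(self, name: str, low: int, high: int) -> int:
        self.space[name] = (low, high)
        return low


# Inspect the search space without starting an optimization.
recorder = _RecordingTrial()
create_search_space(recorder)
print(recorder.space)
```

Both callables would then be passed to the constructor together with the pipeline and at least one of n_trials or timeout. Because create_study is a factory rather than a study object, each call to optimize can start from its own study.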

Attributes:
search_results_

Detailed results of the study.

optimized_pipeline_

An instance of the input pipeline with the best parameter set. This is only available if return_optimized is True.

best_params_

Parameters of the best trial in the Study.

best_score_

Best score reached in the study.

best_trial_

Best trial in the Study.

study_

The study object itself, i.e. the object returned by create_study and used for the optimization.

multimetric_

Whether the scorer returned multiple scores.
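A scoring callable that returns a dictionary might look like the sketch below. It follows the (pipeline, datapoint) call signature from the class signature; the attribute names prediction_ and reference, as well as the _ConstantPipeline stand-in, are hypothetical and only serve to make the example runnable without tpcp.

```python
from types import SimpleNamespace
from typing import Dict


def scoring(pipeline, datapoint) -> Dict[str, float]:
    """Score a single datapoint and return several named metrics."""
    # `prediction_` and `reference` are hypothetical attribute names.
    result = pipeline.safe_run(datapoint)
    error = abs(result.prediction_ - datapoint.reference)
    return {"abs_error": error, "neg_abs_error": -error}


class _ConstantPipeline:
    """Stand-in pipeline that always predicts the same value."""

    def __init__(self, value: float) -> None:
        self.value = value

    def safe_run(self, datapoint) -> "_ConstantPipeline":
        self.prediction_ = self.value
        return self


scores = scoring(_ConstantPipeline(3.0), SimpleNamespace(reference=5.0))
print(scores)
```

With a function like this, score_name would have to name one of the returned keys (for example "neg_abs_error" when the study maximizes its objective), since only a single value can be used for ranking.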

Methods

clone()

Create a new instance of the class with all parameters copied over.

create_objective()

Create the objective function for optuna.

get_params([deep])

Get parameters for this algorithm.

optimize(dataset, **_)

Optimize the objective over the dataset and find the best parameter combination.

return_optimized_pipeline(pipeline, dataset, ...)

Return the pipeline with the best parameters of a study.

run(datapoint)

Run the optimized pipeline.

safe_run(datapoint)

Run the optimized pipeline.

score(datapoint)

Run the score method of the optimized pipeline.

set_params(**params)

Set the parameters of this Algorithm.

__init__(pipeline: PipelineT, create_study: Callable[[], Study], create_search_space: Callable[[Trial], None], *, scoring: Optional[Union[Callable[[PipelineT, DatasetT], Union[T, Aggregator[Any], Dict[str, Union[T, Aggregator[Any]]], Dict[str, Union[T, Aggregator[Any], Dict[str, Union[T, Aggregator[Any]]]]]]], Scorer[PipelineT, DatasetT, Union[T, Aggregator[Any], Dict[str, Union[T, Aggregator[Any]]]]]]], score_name: Optional[str] = None, n_trials: Optional[int] = None, timeout: Optional[float] = None, callbacks: Optional[List[Callable[[Study, FrozenTrial], None]]] = None, gc_after_trial: bool = False, n_jobs: int = 1, show_progress_bar: bool = False, return_optimized: bool = True)

_call_optimize(study: Study, objective: Callable[[Trial], Union[float, Sequence[float]]])

Call the optuna study.

This is a separate method to make it easy to modify how the study is called.
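The functions passed via callbacks are invoked at the end of each trial with the study and the finished trial. A minimal sketch of such a callback is shown below: it stops the study once a trial reaches a target value, using Optuna's Study.stop() and FrozenTrial.value. The helper name stop_at, the target of 0.9, and the assumption of a maximizing study are illustrative; _StudyStub only stands in for a real study so the behaviour can be checked in isolation.

```python
from types import SimpleNamespace
from typing import Callable


def stop_at(target: float) -> Callable:
    """Build a callback that stops the study once a trial reaches `target`."""

    def callback(study, trial) -> None:
        # trial.value can be None, e.g. for failed trials.
        if trial.value is not None and trial.value >= target:
            study.stop()

    return callback


class _StudyStub:
    """Stand-in for an Optuna Study that only tracks whether stop() was called."""

    def __init__(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


study = _StudyStub()
callback = stop_at(0.9)

callback(study, SimpleNamespace(value=0.5))  # below the target
stopped_after_first = study.stopped

callback(study, SimpleNamespace(value=0.95))  # reaches the target
stopped_after_second = study.stopped
```

Such a callback would be passed as callbacks=[stop_at(0.9)]. How it interacts with n_jobs greater than 1 is not covered here.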

clone() -> Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

create_objective() -> Callable[[Trial, PipelineT, DatasetT], Union[float, Sequence[float]]]

Create the objective function for optuna.

This is an internal function and should not be called directly.

get_params(deep: bool = True) -> Dict[str, Any]

Get parameters for this algorithm.

Parameters:
deep

Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two "_" at the end).

Returns:
params

Parameter names mapped to their values.
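The double-underscore prefix for nested parameters can be illustrated with a small stand-in class. This is not the tpcp implementation; _Params and the parameter names algo, cutoff, and threshold are hypothetical and only mimic the naming scheme described above.

```python
from typing import Any, Dict


class _Params:
    """Minimal stand-in that mimics the nested naming scheme; not tpcp code."""

    def __init__(self, **params: Any) -> None:
        self._names = list(params)
        for name, value in params.items():
            setattr(self, name, value)

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        out: Dict[str, Any] = {}
        for name in self._names:
            value = getattr(self, name)
            out[name] = value
            if deep and hasattr(value, "get_params"):
                # Parameters of nested objects get the prefix "<name>__".
                for sub_name, sub_value in value.get_params(deep=True).items():
                    out[f"{name}__{sub_name}"] = sub_value
        return out


filter_algo = _Params(cutoff=5.0)
pipeline = _Params(algo=filter_algo, threshold=0.3)

deep_names = sorted(pipeline.get_params(deep=True))
shallow_names = sorted(pipeline.get_params(deep=False))
print(deep_names)
print(shallow_names)
```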

optimize(dataset: DatasetT, **_: Any) -> Self

Optimize the objective over the dataset and find the best parameter combination.

This method calls self.create_objective to obtain the objective function that should be optimized.

Parameters:
dataset

The dataset used for optimization.

return_optimized_pipeline(pipeline: PipelineT, dataset: DatasetT, study: Study) -> PipelineT

Return the pipeline with the best parameters of a study.

This is an internal function and should not be called directly.

run(datapoint: DatasetT) -> PipelineT

Run the optimized pipeline.

This is a wrapper to maintain API compatibility with Pipeline.

safe_run(datapoint: DatasetT) -> PipelineT

Run the optimized pipeline.

This is a wrapper to maintain API compatibility with Pipeline.

score(datapoint: DatasetT) -> Union[float, Dict[str, float]]

Run the score method of the optimized pipeline.

This is a wrapper to maintain API compatibility with Pipeline.

set_params(**params: Any) -> Self

Set the parameters of this Algorithm.

To set parameters of nested objects, use nested_object_name__para_name=value.
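How a key such as algo__cutoff could be routed to a nested object is sketched below. The classes are stand-ins written for this illustration, not tpcp code, and the names algo, cutoff, and threshold are hypothetical; the sketch simply splits each key at the first double underscore and forwards the remainder to the nested object.

```python
from typing import Any


class _Settable:
    """Stand-in base class that dispatches "a__b" keys to nested objects."""

    def set_params(self, **params: Any) -> "_Settable":
        for key, value in params.items():
            # Split "algo__cutoff" into ("algo", "__", "cutoff").
            name, sep, rest = key.partition("__")
            if not hasattr(self, name):
                raise ValueError(f"Unknown parameter: {name}")
            if sep:
                # Forward the remaining key to the nested object.
                getattr(self, name).set_params(**{rest: value})
            else:
                setattr(self, name, value)
        return self


class _Filter(_Settable):
    def __init__(self, cutoff: float = 1.0) -> None:
        self.cutoff = cutoff


class _Pipeline(_Settable):
    def __init__(self, algo: _Filter, threshold: float = 0.5) -> None:
        self.algo = algo
        self.threshold = threshold


pipeline = _Pipeline(_Filter())
pipeline.set_params(threshold=0.3, algo__cutoff=5.0)
print(pipeline.threshold, pipeline.algo.cutoff)
```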

Examples using tpcp.optimize.optuna.OptunaSearch

Custom Optuna Optimizer

Build-in Optuna Optimizers
