tpcp.validate.cross_validate(optimizable: BaseOptimize, dataset: Dataset, *, groups: list[str | tuple[str, ...]] | None = None, mock_labels: list[str | tuple[str, ...]] | None = None, scoring: Callable | None = None, cv: int | BaseCrossValidator | Iterator | None = None, n_jobs: int | None = None, verbose: int = 0, optimize_params: dict[str, Any] | None = None, propagate_groups: bool = True, propagate_mock_labels: bool = True, pre_dispatch: str | int = '2*n_jobs', return_train_score: bool = False, return_optimizer: bool = False, progress_bar: bool = True)

Evaluate a pipeline on a dataset using cross validation.

This function follows the interface of sklearn's cross_validate as closely as possible. If the tpcp documentation is missing some information, the respective sklearn documentation might be helpful.


optimizable

An optimizable class instance like GridSearch/GridSearchCV or a Pipeline wrapped in an Optimize object (OptimizablePipeline).


dataset

A Dataset containing all information.


groups

Group labels for samples used by the cross validation helper, in case a grouped CV is used (e.g. GroupKFold). Check the documentation of the Dataset class and the respective example for information on how to generate group labels for tpcp datasets.

The groups will be passed to the optimizer's optimize method under the same name if propagate_groups is True.
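To illustrate what group labels guarantee in a grouped CV, here is a minimal pure-Python sketch. It is not how GroupKFold or tpcp assign groups to folds internally (the round-robin assignment and the helper name `grouped_k_fold` are assumptions made for illustration); it only shows the property that no group ends up in both the train and the test set of a fold.

```python
def grouped_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs so that no group is split across both sets.

    Hypothetical helper for illustration only; not part of tpcp or sklearn.
    """
    unique_groups = sorted(set(groups))
    if n_splits > len(unique_groups):
        raise ValueError("n_splits cannot exceed the number of distinct groups")
    # Assign whole groups to folds in a simple round-robin fashion.
    fold_of_group = {g: i % n_splits for i, g in enumerate(unique_groups)}
    for fold in range(n_splits):
        test_idx = [i for i, g in enumerate(groups) if fold_of_group[g] == fold]
        train_idx = [i for i, g in enumerate(groups) if fold_of_group[g] != fold]
        yield train_idx, test_idx


# One group label per data point, e.g. a participant id (made-up values).
groups = ["p1", "p1", "p2", "p2", "p3", "p3"]
splits = list(grouped_k_fold(groups, n_splits=3))
```

With these labels, both recordings of a participant always land on the same side of a split.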


mock_labels

The value of mock_labels is passed as the y parameter to the cross-validation helper’s split method. This can be helpful if you want to use stratified cross validation. Usually, the stratified CV classes use y (i.e. the label) to stratify the data. However, in tpcp, we don’t have a dedicated y, as data and labels are both stored in a single data structure. If you want to stratify the data (e.g. based on patient cohorts), you can create your own list of labels/groups that should be used for stratification and pass it to mock_labels instead.

The labels will be passed to the optimizer's optimize method under the same name if propagate_mock_labels is True (similar to how groups are handled).
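As a rough illustration of why a stratified splitter needs a label per data point, the following pure-Python sketch distributes the indices of each label evenly over the folds. The helper `stratified_folds` and the cohort names are hypothetical, and the assignment rule is a simplification, not the algorithm used by sklearn's stratified CV classes.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits):
    """Yield (train_idx, test_idx) pairs with a similar label mix in every fold.

    Hypothetical helper for illustration only; not part of tpcp or sklearn.
    """
    fold_of_index = {}
    seen_per_label = defaultdict(int)
    for idx, label in enumerate(labels):
        # The n-th occurrence of a label goes to fold n % n_splits.
        fold_of_index[idx] = seen_per_label[label] % n_splits
        seen_per_label[label] += 1
    for fold in range(n_splits):
        test_idx = [i for i in range(len(labels)) if fold_of_index[i] == fold]
        train_idx = [i for i in range(len(labels)) if fold_of_index[i] != fold]
        yield train_idx, test_idx


# One stratification label per data point, e.g. the patient cohort (made-up values).
mock_labels = ["control", "control", "control", "control", "patient", "patient"]
stratified_splits = list(stratified_folds(mock_labels, n_splits=2))
test_label_counts = [
    Counter(mock_labels[i] for i in test_idx) for _, test_idx in stratified_splits
]
```

Each test set here contains two "control" and one "patient" data point, mirroring the 2:1 ratio of the full list.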


scoring

A callable that can score a single data point given a pipeline. This function should return either a single score or a dictionary of scores. If scoring is None, the default score method of the optimizable is used instead.
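A scoring callable of this shape could look like the sketch below. `ThresholdPipeline`, the dictionary-based data points, and the score names are hypothetical stand-ins rather than tpcp classes, and averaging the per-data-point values is an assumption about how the scores are aggregated.

```python
from statistics import mean


class ThresholdPipeline:
    """Hypothetical stand-in for a pipeline; not a tpcp class."""

    def __init__(self, threshold):
        self.threshold = threshold

    def run(self, datapoint):
        # Store the result on the instance and return the instance itself.
        self.prediction_ = datapoint["value"] > self.threshold
        return self


def score(pipeline, datapoint):
    """Score a single data point and return a dictionary of named scores."""
    pipeline = pipeline.run(datapoint)
    correct = pipeline.prediction_ == datapoint["reference"]
    return {
        "accuracy": float(correct),
        "margin": abs(datapoint["value"] - pipeline.threshold),
    }


# Made-up data points with a measured value and a reference label.
datapoints = [
    {"value": 0.9, "reference": True},
    {"value": 0.2, "reference": False},
    {"value": 0.6, "reference": False},
]
single_scores = [score(ThresholdPipeline(0.5), d) for d in datapoints]
# Assumption: the aggregated score is the mean over all data points.
aggregated = {name: mean(s[name] for s in single_scores) for name in single_scores[0]}
```

Returning a dictionary yields one aggregated value and one list of single values per score name.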


cv

An integer specifying the number of folds in a K-Fold cross validation or a valid cross validation helper. The default (None) will result in a 5-fold cross validation. For further inputs check the sklearn documentation.
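For intuition about what an integer cv means, the sketch below generates contiguous K-Fold splits in pure Python. The helper `k_fold_indices` is hypothetical; it assumes an unshuffled split in which the first folds take one extra sample when the data does not divide evenly.

```python
def k_fold_indices(n_samples, n_splits=5):
    """Yield (train_idx, test_idx) pairs for an unshuffled K-Fold split.

    Hypothetical helper for illustration only; not part of tpcp or sklearn.
    """
    # The first (n_samples % n_splits) folds get one extra sample.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


# Seven data points split into the default number of five folds.
folds = list(k_fold_indices(7, n_splits=5))
test_sets = [test_idx for _, test_idx in folds]
```

Every data point appears in exactly one test set, so each one is scored exactly once over the full cross validation.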


n_jobs

Number of jobs to run in parallel. One job is created per CV fold. The default (None) means 1 job at a time, and hence no parallel computing.


verbose

The verbosity level (larger number -> higher verbosity). At the moment this only affects Parallel.


optimize_params

Additional parameters that are forwarded to the optimize method.


propagate_groups

In case your optimizable is a cross-validation-based optimizer (e.g. GridSearchCV) and you are using a grouped cross validation, you probably want to use the same grouped CV for the outer and the inner cross validation. If propagate_groups is True, the group labels belonging to the training set of each fold are passed to the optimize method of the optimizable. This only has an effect if groups are specified.
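The subsetting that this propagation implies can be sketched as follows. The outer splits and the helper `train_groups_per_fold` are hypothetical; the point is only that each fold forwards the group labels of its own training indices, so an inner grouped CV sees labels that match its data.

```python
def train_groups_per_fold(groups, splits):
    """Return, for each fold, the group labels of the training indices only.

    Hypothetical helper for illustration only; not part of tpcp.
    """
    return [[groups[i] for i in train_idx] for train_idx, _ in splits]


# One group label per data point (made-up values).
groups = ["p1", "p1", "p2", "p2", "p3", "p3"]
# Hypothetical outer splits as (train_idx, test_idx) pairs.
outer_splits = [
    ([2, 3, 4, 5], [0, 1]),
    ([0, 1, 4, 5], [2, 3]),
    ([0, 1, 2, 3], [4, 5]),
]
forwarded = train_groups_per_fold(groups, outer_splits)
```

The forwarded list has the same length as the training set of the fold, which is what an inner grouped splitter would need.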


propagate_mock_labels

For the same reason as propagate_groups, you might also want to forward the value provided for mock_labels to the optimization workflow.


pre_dispatch

The number of jobs that should be pre-dispatched. For an explanation see the documentation of Parallel.


return_train_score

If True, the performance on the train set is returned in addition to the test set performance. Note that this increases the runtime. If True, the fields train_data_labels, train_score, and train_single_score are available in the results.


return_optimizer

Whether the optimized instance of the input optimizable should be returned. If True, the field optimizer is available in the results.


progress_bar

True/False to enable/disable a tqdm progress bar.


Returns a dictionary with results. Each element is either a list or array of length n_folds. The dictionary can be passed directly to the pandas DataFrame constructor for a better representation.
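The per-fold layout can be made concrete with a small sketch. The dictionary below is hand-written with made-up numbers, not real output, and the transposition is done with the standard library only; passing the same dictionary to the pandas DataFrame constructor would give one row per fold and one column per key.

```python
# Hypothetical result dictionary for a 3-fold run; the numbers are made up.
results = {
    "test_score": [0.81, 0.78, 0.84],
    "train_score": [0.92, 0.90, 0.93],
}

# Every entry holds one value per fold.
n_folds = len(results["test_score"])

# Transpose the dictionary of lists into one record per fold.
rows = [
    {key: values[fold] for key, values in results.items()}
    for fold in range(n_folds)
]
```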

The following fields are in the results:

test_score / test_{scorer-name}

The aggregated value of a score over all data-points. If a single score is used for scoring, then the generic name “score” is used. Otherwise, multiple columns with the name of the respective scorer exist.

test_single_score / test_single_{scorer-name}

The individual scores per data point per fold. For each fold, this is a list of values of length len(test_set).


test_data_labels

A list of data labels of the test set in the order the single score values are provided. These can be used to associate the single_score values with a certain data point.

train_score / train_{scorer-name}

Results for the train set of each fold.

train_single_score / train_single_{scorer-name}

Results for individual data points in the train set of each fold.


train_data_labels

The data labels for the train set.


optimize_time

Time required to optimize the pipeline in each fold.


score_time

Cumulative time required to score all data points in the test set.


optimizer

The optimized instances per fold. One instance per fold is returned. The optimized version of the pipeline can be obtained via the optimized_pipeline_ attribute on the instance.
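To show how the per-fold single scores can be matched to their data points, here is a small sketch on a hand-written result dictionary. The key name test_data_labels is assumed by analogy with train_data_labels, and the tuple-shaped labels and the numbers are made up for illustration.

```python
# Hypothetical results of a 2-fold run; keys follow the fields described above.
results = {
    "test_single_score": [[0.9, 0.7], [0.8, 0.6]],
    "test_data_labels": [
        [("p1", "t1"), ("p1", "t2")],
        [("p2", "t1"), ("p2", "t2")],
    ],
}

# Pair every single score with the label of the data point it belongs to.
score_by_label = {}
for labels, scores in zip(results["test_data_labels"], results["test_single_score"]):
    for label, single_score in zip(labels, scores):
        score_by_label[label] = single_score
```

Because every data point is part of exactly one test set in a standard K-Fold scheme, the resulting mapping contains one score per data point.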

Examples using tpcp.validate.cross_validate

Custom Optuna Optimizer


Cross Validation
