.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/validation/_02_cross_validation.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_validation__02_cross_validation.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_validation__02_cross_validation.py:


.. _cross_validation:

Cross Validation
================

Whenever using some sort of trainable algorithm it is important to clearly separate the training and the testing data to
get an unbiased result.
Usually this is achieved by a train-test split.
However, if you don't have that much data, there is always a risk that one random train-test split, will provide
better (or worse) results than another.
In these cases it is a good idea to use cross-validation.
In this procedure, you perform multiple train-test splits and average the results over all "folds".
For more information see our :ref:`evaluation guide <algorithm_evaluation>` and the `sklearn guide on cross
validation <https://scikit-learn.org/stable/modules/cross_validation.html>`_.

In this example, we will learn how to use the :func:`~tpcp.validate.cross_validate` function implemented in
tcpc.
For this, we will redo the example on :ref:`optimizable pipelines <optimize_pipelines>` but we will perform the final
evaluation via cross-validation.
If you want to have more information on how the dataset and pipeline is built, head over to this example.
Here we will just copy the code over.

.. GENERATED FROM PYTHON SOURCE LINES 25-26

Dataset

.. GENERATED FROM PYTHON SOURCE LINES 26-37

.. code-block:: default

    from pathlib import Path

    from examples.datasets.datasets_final_ecg import ECGExampleData

    try:
        HERE = Path(__file__).parent
    except NameError:
        HERE = Path().resolve()
    data_path = HERE.parent.parent / "example_data/ecg_mit_bih_arrhythmia/data"
    example_data = ECGExampleData(data_path)


.. GENERATED FROM PYTHON SOURCE LINES 38-39

Pipeline

.. GENERATED FROM PYTHON SOURCE LINES 39-71

.. code-block:: default

    import pandas as pd

    from examples.algorithms.algorithms_qrs_detection_final import OptimizableQrsDetector
    from tpcp import OptimizableParameter, OptimizablePipeline, Parameter, cf


    class MyPipeline(OptimizablePipeline):
        algorithm: Parameter[OptimizableQrsDetector]
        algorithm__min_r_peak_height_over_baseline: OptimizableParameter[float]

        r_peak_positions_: pd.Series

        def __init__(self, algorithm: OptimizableQrsDetector = cf(OptimizableQrsDetector())):
            self.algorithm = algorithm

        def self_optimize(self, dataset: ECGExampleData, **kwargs):
            ecg_data = [d.data["ecg"] for d in dataset]
            r_peaks = [d.r_peak_positions_["r_peak_position"] for d in dataset]
            # Note: We need to clone the algorithm instance, to make sure we don't leak any data between runs.
            algo = self.algorithm.clone()
            self.algorithm = algo.self_optimize(ecg_data, r_peaks, dataset.sampling_rate_hz)
            return self

        def run(self, datapoint: ECGExampleData):
            # Note: We need to clone the algorithm instance, to make sure we don't leak any data between runs.
            algo = self.algorithm.clone()
            algo.detect(datapoint.data, datapoint.sampling_rate_hz)

            self.r_peak_positions_ = algo.r_peak_positions_
            return self


.. GENERATED FROM PYTHON SOURCE LINES 72-76

The Scorer
----------
The scorer is identical to the scoring function used in the other examples.
The F1-score is still the most important parameter for our comparison.

.. GENERATED FROM PYTHON SOURCE LINES 76-94

.. code-block:: default

    from examples.algorithms.algorithms_qrs_detection_final import match_events_with_reference, precision_recall_f1_score


    def score(pipeline: MyPipeline, datapoint: ECGExampleData):
        # We use the `safe_run` wrapper instead of just run. This is always a good idea.
        # We don't need to clone the pipeline here, as GridSearch will already clone the pipeline internally and `run`
        # will clone it again.
        pipeline = pipeline.safe_run(datapoint)
        tolerance_s = 0.02  # We just use 20 ms for this example
        matches = match_events_with_reference(
            pipeline.r_peak_positions_.to_numpy(),
            datapoint.r_peak_positions_.to_numpy(),
            tolerance=tolerance_s * datapoint.sampling_rate_hz,
        )
        precision, recall, f1_score = precision_recall_f1_score(matches)
        return {"precision": precision, "recall": recall, "f1_score": f1_score}


.. GENERATED FROM PYTHON SOURCE LINES 95-102

Data Splitting
--------------
Before performing a cross validation, we need to decide on the number of folds and type of splits.
In `tpcp` we support all cross validation iterators provided in
`sklearn <https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators>`__.

To keep the runtime low for this example, we are going to use a 3-fold CV.

.. GENERATED FROM PYTHON SOURCE LINES 102-106

.. code-block:: default

    from sklearn.model_selection import KFold

    cv = KFold(n_splits=3)


.. GENERATED FROM PYTHON SOURCE LINES 107-113

Cross Validation
----------------
Now we have all the pieces for the final cross validation.
First we need to create instances of our data and pipeline.
Then we need to wrap our pipeline instance into an :class:`~tpcp.optimize.Optimize` wrapper.
Finally, we can call `tpcp.validate.cross_validate`.

.. GENERATED FROM PYTHON SOURCE LINES 113-125

.. code-block:: default

    from tpcp.optimize import Optimize
    from tpcp.validate import cross_validate

    pipe = MyPipeline()
    optimizable_pipe = Optimize(pipe)

    results = cross_validate(
        optimizable_pipe, example_data, scoring=score, cv=cv, return_optimizer=True, return_train_score=True
    )
    result_df = pd.DataFrame(results)
    result_df


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    CV Folds:   0%|          | 0/3 [00:00<?, ?it/s]
    Datapoints:   0%|          | 0/4 [00:00<?, ?it/s]
    Datapoints:  50%|█████     | 2/4 [00:00<00:00, 14.66it/s]
    Datapoints: 100%|██████████| 4/4 [00:00<00:00, 14.34it/s]    Datapoints: 100%|██████████| 4/4 [00:00<00:00, 14.37it/s]

    Datapoints:   0%|          | 0/8 [00:00<?, ?it/s]
    Datapoints:  25%|██▌       | 2/8 [00:00<00:00, 13.97it/s]
    Datapoints:  50%|█████     | 4/8 [00:00<00:00, 13.81it/s]
    Datapoints:  75%|███████▌  | 6/8 [00:00<00:00, 13.87it/s]
    Datapoints: 100%|██████████| 8/8 [00:00<00:00, 14.18it/s]    Datapoints: 100%|██████████| 8/8 [00:00<00:00, 14.05it/s]
    CV Folds:  33%|███▎      | 1/3 [00:01<00:02,  1.34s/it]
    Datapoints:   0%|          | 0/4 [00:00<?, ?it/s]
    Datapoints:  50%|█████     | 2/4 [00:00<00:00, 14.60it/s]
    Datapoints: 100%|██████████| 4/4 [00:00<00:00, 14.61it/s]    Datapoints: 100%|██████████| 4/4 [00:00<00:00, 14.58it/s]

    Datapoints:   0%|          | 0/8 [00:00<?, ?it/s]
    Datapoints:  25%|██▌       | 2/8 [00:00<00:00, 15.01it/s]
    Datapoints:  50%|█████     | 4/8 [00:00<00:00, 14.88it/s]
    Datapoints:  75%|███████▌  | 6/8 [00:00<00:00, 14.85it/s]
    Datapoints: 100%|██████████| 8/8 [00:00<00:00, 14.88it/s]    Datapoints: 100%|██████████| 8/8 [00:00<00:00, 14.88it/s]
    CV Folds:  67%|██████▋   | 2/3 [00:02<00:01,  1.36s/it]
    Datapoints:   0%|          | 0/4 [00:00<?, ?it/s]
    Datapoints:  50%|█████     | 2/4 [00:00<00:00, 15.11it/s]
    Datapoints: 100%|██████████| 4/4 [00:00<00:00, 14.94it/s]    Datapoints: 100%|██████████| 4/4 [00:00<00:00, 14.94it/s]

    Datapoints:   0%|          | 0/8 [00:00<?, ?it/s]
    Datapoints:  25%|██▌       | 2/8 [00:00<00:00, 15.06it/s]
    Datapoints:  50%|█████     | 4/8 [00:00<00:00, 14.67it/s]
    Datapoints:  75%|███████▌  | 6/8 [00:00<00:00, 14.63it/s]
    Datapoints: 100%|██████████| 8/8 [00:00<00:00, 14.82it/s]    Datapoints: 100%|██████████| 8/8 [00:00<00:00, 14.77it/s]
    CV Folds: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]    CV Folds: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>score_time</th>
          <th>optimize_time</th>
          <th>train_data_labels</th>
          <th>test_data_labels</th>
          <th>optimizer</th>
          <th>test_precision</th>
          <th>test_recall</th>
          <th>test_f1_score</th>
          <th>test_single_precision</th>
          <th>test_single_recall</th>
          <th>test_single_f1_score</th>
          <th>train_precision</th>
          <th>train_recall</th>
          <th>train_f1_score</th>
          <th>train_single_precision</th>
          <th>train_single_recall</th>
          <th>train_single_f1_score</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>0.304279</td>
          <td>0.416829</td>
          <td>[(group_2, 106), (group_3, 108), (group_1, 114...</td>
          <td>[(group_1, 100), (group_2, 102), (group_3, 104...</td>
          <td>Optimize(optimize_with_info=True, pipeline=MyP...</td>
          <td>0.975938</td>
          <td>0.978213</td>
          <td>0.977058</td>
          <td>[0.9995600527936648, 0.9723119520073835, 0.962...</td>
          <td>[0.9995600527936648, 0.9634202103337905, 0.968...</td>
          <td>[0.9995600527936648, 0.9678456591639871, 0.965...</td>
          <td>0.940084</td>
          <td>0.775055</td>
          <td>0.813250</td>
          <td>[0.9470529470529471, 0.9077277970011534, 0.919...</td>
          <td>[0.9353724716329551, 0.44639818491208166, 0.15...</td>
          <td>[0.9411764705882353, 0.5984790874524715, 0.260...</td>
        </tr>
        <tr>
          <th>1</th>
          <td>0.299644</td>
          <td>0.485168</td>
          <td>[(group_1, 100), (group_2, 102), (group_3, 104...</td>
          <td>[(group_2, 106), (group_3, 108), (group_1, 114...</td>
          <td>Optimize(optimize_with_info=True, pipeline=MyP...</td>
          <td>0.918491</td>
          <td>0.647829</td>
          <td>0.710830</td>
          <td>[0.9192846785886902, 0.8467005076142132, 0.910...</td>
          <td>[0.938332511100148, 0.47305728871242203, 0.189...</td>
          <td>[0.9287109375, 0.6069868995633187, 0.313656387...</td>
          <td>0.954602</td>
          <td>0.952627</td>
          <td>0.953583</td>
          <td>[0.9995600527936648, 0.9725022914757103, 0.962...</td>
          <td>[0.9995600527936648, 0.9702789208962048, 0.968...</td>
          <td>[0.9995600527936648, 0.9713893339436942, 0.965...</td>
        </tr>
        <tr>
          <th>2</th>
          <td>0.293074</td>
          <td>0.493561</td>
          <td>[(group_1, 100), (group_2, 102), (group_3, 104...</td>
          <td>[(group_3, 119), (group_1, 121), (group_2, 123...</td>
          <td>Optimize(optimize_with_info=True, pipeline=MyP...</td>
          <td>0.941766</td>
          <td>0.909605</td>
          <td>0.925066</td>
          <td>[0.9984909456740443, 1.0, 0.9993416721527321, ...</td>
          <td>[0.9989934574735783, 0.9243156199677939, 1.0, ...</td>
          <td>[0.99874213836478, 0.9606694560669455, 0.99967...</td>
          <td>0.965628</td>
          <td>0.799168</td>
          <td>0.833407</td>
          <td>[0.9995600527936648, 0.972183588317107, 0.9638...</td>
          <td>[0.9995600527936648, 0.9588477366255144, 0.968...</td>
          <td>[0.9995600527936648, 0.9654696132596684, 0.965...</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 126-133

Understanding the Results
-------------------------
The cross validation provides a lot of outputs (some of them can be disabled using the function parameters).
To simplify things a little, we will split the output into four parts:

The main output are the test set performance values.
Each row corresponds to performance in respective fold.

.. GENERATED FROM PYTHON SOURCE LINES 133-136

.. code-block:: default

    performance = result_df[["test_precision", "test_recall", "test_f1_score"]]
    performance


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>test_precision</th>
          <th>test_recall</th>
          <th>test_f1_score</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>0.975938</td>
          <td>0.978213</td>
          <td>0.977058</td>
        </tr>
        <tr>
          <th>1</th>
          <td>0.918491</td>
          <td>0.647829</td>
          <td>0.710830</td>
        </tr>
        <tr>
          <th>2</th>
          <td>0.941766</td>
          <td>0.909605</td>
          <td>0.925066</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 137-140

The final generalization performance you would report is usually the average over all folds.
The STD can also be interesting, as it tells you how stable your optimization is and if your splits provide
comparable data distributions.

.. GENERATED FROM PYTHON SOURCE LINES 140-143

.. code-block:: default

    generalization_performance = performance.agg(["mean", "std"])
    generalization_performance


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>test_precision</th>
          <th>test_recall</th>
          <th>test_f1_score</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>mean</th>
          <td>0.945399</td>
          <td>0.845216</td>
          <td>0.870985</td>
        </tr>
        <tr>
          <th>std</th>
          <td>0.028895</td>
          <td>0.174350</td>
          <td>0.141113</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 144-150

If you need more insight into the results (e.g. when the std of your results is high), you can inspect the
individual score for each data point.
In this example this is only a list with a single element per score, as we only had a single datapoint per fold.
In a real scenario, this will be a list of all datapoints.
Inspecting this list can help to identify potential issues with certain parts of your dataset.
To link the performance values to a specific datapoint, you can look at the `test_data_labels` field.

.. GENERATED FROM PYTHON SOURCE LINES 150-155

.. code-block:: default

    single_performance = result_df[
        ["test_single_precision", "test_single_recall", "test_single_f1_score", "test_data_labels"]
    ]
    single_performance


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>test_single_precision</th>
          <th>test_single_recall</th>
          <th>test_single_f1_score</th>
          <th>test_data_labels</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>[0.9995600527936648, 0.9723119520073835, 0.962...</td>
          <td>[0.9995600527936648, 0.9634202103337905, 0.968...</td>
          <td>[0.9995600527936648, 0.9678456591639871, 0.965...</td>
          <td>[(group_1, 100), (group_2, 102), (group_3, 104...</td>
        </tr>
        <tr>
          <th>1</th>
          <td>[0.9192846785886902, 0.8467005076142132, 0.910...</td>
          <td>[0.938332511100148, 0.47305728871242203, 0.189...</td>
          <td>[0.9287109375, 0.6069868995633187, 0.313656387...</td>
          <td>[(group_2, 106), (group_3, 108), (group_1, 114...</td>
        </tr>
        <tr>
          <th>2</th>
          <td>[0.9984909456740443, 1.0, 0.9993416721527321, ...</td>
          <td>[0.9989934574735783, 0.9243156199677939, 1.0, ...</td>
          <td>[0.99874213836478, 0.9606694560669455, 0.99967...</td>
          <td>[(group_3, 119), (group_1, 121), (group_2, 123...</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 156-160

Even further insight is provided by the train results (if activated in parameters).
These are the performance results on the train set and can indicate if the training provided meaningful results and
can also indicate over-fitting, if the performance of the test set is much worse than the performance on the train
set.

.. GENERATED FROM PYTHON SOURCE LINES 160-173

.. code-block:: default

    train_performance = result_df[
        [
            "train_precision",
            "train_recall",
            "train_f1_score",
            "train_single_precision",
            "train_single_recall",
            "train_single_f1_score",
            "train_data_labels",
        ]
    ]
    train_performance


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>train_precision</th>
          <th>train_recall</th>
          <th>train_f1_score</th>
          <th>train_single_precision</th>
          <th>train_single_recall</th>
          <th>train_single_f1_score</th>
          <th>train_data_labels</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>0.940084</td>
          <td>0.775055</td>
          <td>0.813250</td>
          <td>[0.9470529470529471, 0.9077277970011534, 0.919...</td>
          <td>[0.9353724716329551, 0.44639818491208166, 0.15...</td>
          <td>[0.9411764705882353, 0.5984790874524715, 0.260...</td>
          <td>[(group_2, 106), (group_3, 108), (group_1, 114...</td>
        </tr>
        <tr>
          <th>1</th>
          <td>0.954602</td>
          <td>0.952627</td>
          <td>0.953583</td>
          <td>[0.9995600527936648, 0.9725022914757103, 0.962...</td>
          <td>[0.9995600527936648, 0.9702789208962048, 0.968...</td>
          <td>[0.9995600527936648, 0.9713893339436942, 0.965...</td>
          <td>[(group_1, 100), (group_2, 102), (group_3, 104...</td>
        </tr>
        <tr>
          <th>2</th>
          <td>0.965628</td>
          <td>0.799168</td>
          <td>0.833407</td>
          <td>[0.9995600527936648, 0.972183588317107, 0.9638...</td>
          <td>[0.9995600527936648, 0.9588477366255144, 0.968...</td>
          <td>[0.9995600527936648, 0.9654696132596684, 0.965...</td>
          <td>[(group_1, 100), (group_2, 102), (group_3, 104...</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 174-176

The final level of debug information is provided via the timings (note the long runtime in fold 0 can be explained
by the jit-compiler used in `BarthDtw`) ...

.. GENERATED FROM PYTHON SOURCE LINES 176-179

.. code-block:: default

    timings = result_df[["score_time", "optimize_time"]]
    timings


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>score_time</th>
          <th>optimize_time</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>0.304279</td>
          <td>0.416829</td>
        </tr>
        <tr>
          <th>1</th>
          <td>0.299644</td>
          <td>0.485168</td>
        </tr>
        <tr>
          <th>2</th>
          <td>0.293074</td>
          <td>0.493561</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 180-184

... and the optimized pipeline object.
This is the actual trained object generated in this fold.
You can apply it to other data for testing or inspect the actual object for further debug information that might be
stored on it.

.. GENERATED FROM PYTHON SOURCE LINES 184-187

.. code-block:: default

    optimized_pipeline = result_df["optimizer"][0]
    optimized_pipeline


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Optimize(optimize_with_info=True, pipeline=MyPipeline(algorithm=OptimizableQrsDetector(high_pass_filter_cutoff_hz=1, max_heart_rate_bpm=200.0, min_r_peak_height_over_baseline=1.0, r_peak_match_tolerance_s=0.01)), safe_optimize=True)


.. GENERATED FROM PYTHON SOURCE LINES 188-190

.. code-block:: default

    optimized_pipeline.optimized_pipeline_.get_params()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    {'algorithm__high_pass_filter_cutoff_hz': 1, 'algorithm__max_heart_rate_bpm': 200.0, 'algorithm__min_r_peak_height_over_baseline': 0.6322168257130579, 'algorithm__r_peak_match_tolerance_s': 0.01, 'algorithm': OptimizableQrsDetector(high_pass_filter_cutoff_hz=1, max_heart_rate_bpm=200.0, min_r_peak_height_over_baseline=0.6322168257130579, r_peak_match_tolerance_s=0.01)}


.. GENERATED FROM PYTHON SOURCE LINES 191-200

Further Notes
-------------
We also support grouped cross validation.
Check the :ref:`dataset guide <custom_dataset_basics>` on how you can group the data before cross-validation or
generate data labels to be used with `GroupedKFold`.

`Optimize` is just an example of an optimizer that can be passed to cross validation.
You can pass any `tpcp` optimizer like `GridSearch` or `GridSearchCV` or custom optimizer that implement the
`tpcp.optimize.BaseOptimize` interface.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 6.580 seconds)

**Estimated memory usage:**  28 MB


.. _sphx_glr_download_auto_examples_validation__02_cross_validation.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: _02_cross_validation.py <_02_cross_validation.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: _02_cross_validation.ipynb <_02_cross_validation.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_