.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/recipies/_02_dataclasses.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_recipies__02_dataclasses.py:


.. _dataclasses:

Dataclass and Attrs support
===========================
When using `tpcp` you have to write a lot of classes with a lot of parameters.
For each class you need to repeat all parameter names up to 3 times (in the class-level annotation, in the `__init__`
signature, and in the assignment inside the init), even before writing any documentation.

Below you can see the relevant part of the `QRSDetector` algorithm we implemented in another example.
Even though it has only 3 parameters, it requires over 20 lines of code to define the basic initialization.

.. GENERATED FROM PYTHON SOURCE LINES 13-44

.. code-block:: default

    import pandas as pd

    from tpcp import Algorithm, Parameter


    class QRSDetector(Algorithm):
        _action_methods = "detect"

        # Input Parameters
        high_pass_filter_cutoff_hz: Parameter[float]
        max_heart_rate_bpm: Parameter[float]
        min_r_peak_height_over_baseline: Parameter[float]

        # Results
        r_peak_positions_: pd.Series

        # Some internal constants
        _HIGH_PASS_FILTER_ORDER: int = 4

        def __init__(
            self,
            max_heart_rate_bpm: float = 200.0,
            min_r_peak_height_over_baseline: float = 1.0,
            high_pass_filter_cutoff_hz: float = 0.5,
        ):
            self.max_heart_rate_bpm = max_heart_rate_bpm
            self.min_r_peak_height_over_baseline = min_r_peak_height_over_baseline
            self.high_pass_filter_cutoff_hz = high_pass_filter_cutoff_hz


.. GENERATED FROM PYTHON SOURCE LINES 45-54

Luckily, Python has a built-in solution for that, called `dataclasses`.
With that, we can write the class above much more compactly.
The only downside is that the annotation of result fields and constants is a little more verbose, and you **need** to
make sure that these attributes are excluded from the init.
Otherwise, tpcp will explode ;)

Note, if you are using Python >=3.10, we highly recommend using the `kw_only` option for dataclasses, which prevents
some of the inheritance issues of dataclasses.

.. GENERATED FROM PYTHON SOURCE LINES 54-76

.. code-block:: default

    from dataclasses import dataclass, field
    from typing import ClassVar


    # We disable the automatic repr generation, as tpcp already provides one. The default one might cause errors.
    @dataclass(repr=False)
    class QRSDetector(Algorithm):
        _action_methods: ClassVar[str] = "detect"

        # Input Parameters (same defaults as in the manual `__init__` above)
        high_pass_filter_cutoff_hz: Parameter[float] = 0.5
        max_heart_rate_bpm: Parameter[float] = 200.0
        min_r_peak_height_over_baseline: Parameter[float] = 1.0

        # Results
        # We need to add the special field annotation to exclude the attribute from the init
        r_peak_positions_: pd.Series = field(init=False, repr=False)

        # Some internal constants
        # Using the ClassVar annotation marks this value as a constant, and dataclasses will ignore it.
        _HIGH_PASS_FILTER_ORDER: ClassVar[int] = 4


.. GENERATED FROM PYTHON SOURCE LINES 77-78

We still get all parameters in the init:

.. GENERATED FROM PYTHON SOURCE LINES 78-81

.. code-block:: default


    QRSDetector(high_pass_filter_cutoff_hz=4, max_heart_rate_bpm=200, min_r_peak_height_over_baseline=1)



.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    QRSDetector(high_pass_filter_cutoff_hz=4, max_heart_rate_bpm=200, min_r_peak_height_over_baseline=1)



.. GENERATED FROM PYTHON SOURCE LINES 82-92

Inheritance
-----------
Creating child classes of `dataclasses` is also simple.
Instead of repeating all parameters, you just need to specify the new ones.
However, you need to make sure that you also apply the `dataclass` decorator to the child class!

.. warning::
    New parameters will be added at the end of the positional order in the init method.
    To avoid passing the wrong values to the wrong parameters, we highly recommend passing parameters only by name and
    not by position, or using the `kw_only` parameter of dataclasses supported in Python >=3.10.

.. GENERATED FROM PYTHON SOURCE LINES 92-102

.. code-block:: default


    @dataclass(repr=False)
    class ModifiedQRSDetector(QRSDetector):
        new_parameter: Parameter[float] = 3


    ModifiedQRSDetector(
        high_pass_filter_cutoff_hz=4, max_heart_rate_bpm=200, min_r_peak_height_over_baseline=1, new_parameter=3
    )



.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    ModifiedQRSDetector(high_pass_filter_cutoff_hz=4, max_heart_rate_bpm=200, min_r_peak_height_over_baseline=1, new_parameter=3)



.. GENERATED FROM PYTHON SOURCE LINES 103-117

Inheritance from complex tpcp classes
--------------------------------------
While inheriting from other dataclasses works without issues, be aware that you cannot subclass a class that is not a
`dataclass` and also has an `__init__` method!
For example, you cannot subclass :class:`~tpcp.optimize.GridSearch` with a dataclass, as it already defines its own
`__init__`.
In this case you need to use a regular class and manually repeat all parent parameters (and call `super().__init__()`).

While this might not be a big deal for the GridSearch class, as you are not expected to subclass it on a regular basis,
it can become annoying for classes like `~tpcp.Dataset` and `~tpcp.optimize.optuna.CustomOptunaOptimize`, which already
have an init and which you need to subclass to work with them.

For these two classes (and other classes with predefined inits that we expect you to subclass), we provide an
`as_dataclass` class method that returns a dataclass version of the respective class:

.. GENERATED FROM PYTHON SOURCE LINES 117-135

.. code-block:: default

    from itertools import product

    from tpcp import Dataset


    @dataclass(repr=False)
    class CustomDataset(Dataset.as_dataclass()):  # Note the `as_dataclass` call here!
        def create_index(self) -> pd.DataFrame:
            return pd.DataFrame(
                list(product(("patient_1", "patient_2", "patient_3"), ("test_1", "test_2"), ("1", "2"))),
                columns=["patient", "test", "extra"],
            )

        custom_param: float = 2  # This must have a default value, as the base class has parameters with defaults


    CustomDataset(custom_param=3)



.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    CustomDataset [12 groups/rows]

        patient    test    extra
    0   patient_1  test_1  1
    1   patient_1  test_1  2
    2   patient_1  test_2  1
    3   patient_1  test_2  2
    4   patient_2  test_1  1
    5   patient_2  test_1  2
    6   patient_2  test_2  1
    7   patient_2  test_2  2
    8   patient_3  test_1  1
    9   patient_3  test_1  2
    10  patient_3  test_2  1
    11  patient_3  test_2  2


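The requirement that ``custom_param`` has a default comes from how dataclasses build the init signature: a field
without a default may not follow a field with a default.
The following is a minimal sketch of that rule using only the standard library.
It assumes Python >=3.10 for ``kw_only``, and ``PlainBase``, ``KwOnlyBase`` and their field names are hypothetical
stand-ins, not tpcp classes.

.. code-block:: python

    from dataclasses import dataclass, fields


    @dataclass
    class PlainBase:
        # Hypothetical stand-in for a parent class whose parameters all have defaults.
        parent_param: int = 0


    try:

        @dataclass
        class BrokenChild(PlainBase):
            custom_param: float  # no default after a parent field that has one

        plain_child_error = None
    except TypeError as e:
        # Raised while the decorator builds the positional init signature.
        plain_child_error = e


    @dataclass(kw_only=True)
    class KwOnlyBase:
        parent_param: int = 0


    @dataclass(kw_only=True)
    class KwOnlyChild(KwOnlyBase):
        custom_param: float  # allowed: keyword-only arguments have no ordering constraint


    child = KwOnlyChild(custom_param=3)
    field_names = [f.name for f in fields(child)]

With ``kw_only=True`` the child can add required parameters, at the cost of all parameters having to be passed by name,
which is the recommended calling style anyway.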
.. GENERATED FROM PYTHON SOURCE LINES 136-141

Mutable Defaults
----------------
In `tpcp` we usually deal with the issue of mutable defaults by using the :class:`~tpcp.CloneFactory`
(:func:`~tpcp.cf`).
However, when using dataclasses, we can use the (more elegant) `field` annotation to define mutable defaults.

.. GENERATED FROM PYTHON SOURCE LINES 141-160

.. code-block:: default



    @dataclass(repr=False)
    class FilterAlgorithm(Algorithm):
        _action_methods: ClassVar[str] = "filter"

        # Input Parameters
        cutoff_hz: Parameter[float] = 2
        order: Parameter[int] = 5

        # Results
        filtered_signal_: pd.Series = field(init=False, repr=False)


    @dataclass(repr=False)
    class HigherLevelFilter(QRSDetector):
        # The factory is called for every new instance, so the default object is never shared.
        filter_algorithm: Parameter[FilterAlgorithm] = field(
            default_factory=lambda: FilterAlgorithm(cutoff_hz=3, order=2)
        )



.. GENERATED FROM PYTHON SOURCE LINES 161-162

We can see that each instance gets its own copy of the default value.

.. GENERATED FROM PYTHON SOURCE LINES 162-169

.. code-block:: default


    v1 = HigherLevelFilter()
    v2 = HigherLevelFilter()

    nested_object_is_different = v1.filter_algorithm is not v2.filter_algorithm
    nested_object_is_different



.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    True



.. GENERATED FROM PYTHON SOURCE LINES 170-185

Attrs
-----
A popular alternative to dataclasses is `attrs` (_`attrs.org`).
It has similar features to `dataclasses`, plus some additional ones that can be helpful.
It also supports `kw_only` for all Python versions (`kw_only` is great! Use it).

You can use it simply by replacing the `dataclass` decorator with the `attrs.define` decorator in most examples above.
Further, `attrs` has a `field` function that works like `dataclasses.field`.
The only difference is that `default_factory` is called `factory`.

.. warning::
    `attrs` creates classes using `slots` instead of `__dict__` by default.
    This does not work nicely with tpcp!
    Use the `slots=False` parameter of `define`.

Here are all the classes from above using attrs.

.. GENERATED FROM PYTHON SOURCE LINES 185-222

.. code-block:: default

    from attrs import Factory, define, field


    @define(kw_only=True, slots=False, repr=False)  # Slots don't play nicely with tpcp!
    class QRSDetector(Algorithm):
        _action_methods: ClassVar[str] = "detect"

        # Input Parameters (same defaults as in the manual `__init__` at the top)
        high_pass_filter_cutoff_hz: Parameter[float] = 0.5
        max_heart_rate_bpm: Parameter[float] = 200.0
        min_r_peak_height_over_baseline: Parameter[float] = 1.0

        # Results
        r_peak_positions_: pd.Series = field(init=False)

        # Some internal constants
        _HIGH_PASS_FILTER_ORDER: ClassVar[int] = 4


    @define(kw_only=True, slots=False, repr=False)  # Slots don't play nicely with tpcp!
    class FilterAlgorithm(Algorithm):
        _action_methods: ClassVar[str] = "filter"

        # Input Parameters
        cutoff_hz: Parameter[float] = 2
        order: Parameter[int] = 5

        # Results
        filtered_signal_: pd.Series = field(init=False)


    @define(kw_only=True, slots=False, repr=False)  # Slots don't play nicely with tpcp!
    class HigherLevelFilter(QRSDetector):
        filter_algorithm: Parameter[FilterAlgorithm] = Factory(lambda: FilterAlgorithm(cutoff_hz=3, order=2))


    HigherLevelFilter()



.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    HigherLevelFilter(filter_algorithm=FilterAlgorithm(cutoff_hz=3, order=2), high_pass_filter_cutoff_hz=0.5, max_heart_rate_bpm=200.0, min_r_peak_height_over_baseline=1.0)



.. GENERATED FROM PYTHON SOURCE LINES 223-224

To support subclassing tpcp classes with existing inits, we provide an `as_attrs` method on the respective classes.

.. GENERATED FROM PYTHON SOURCE LINES 224-238

.. code-block:: default



    @define(kw_only=True, slots=False, repr=False)  # Slots don't play nicely with tpcp!
    class CustomDataset(Dataset.as_attrs()):  # Note the `as_attrs` call here!
        custom_param: float  # We don't need a default, as we are using `kw_only` in define

        def create_index(self) -> pd.DataFrame:
            return pd.DataFrame(
                list(product(("patient_1", "patient_2", "patient_3"), ("test_1", "test_2"), ("1", "2"))),
                columns=["patient", "test", "extra"],
            )


    CustomDataset(custom_param=3)



.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    CustomDataset [12 groups/rows]

        patient    test    extra
    0   patient_1  test_1  1
    1   patient_1  test_1  2
    2   patient_1  test_2  1
    3   patient_1  test_2  2
    4   patient_2  test_1  1
    5   patient_2  test_1  2
    6   patient_2  test_2  1
    7   patient_2  test_2  2
    8   patient_3  test_1  1
    9   patient_3  test_1  2
    10  patient_3  test_2  1
    11  patient_3  test_2  2


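The attrs examples above do not repeat the mutable-default check from the dataclass section, and they do not show what
``slots=False`` changes.
The sketch below illustrates both with plain `attrs` classes.
``Inner``, ``Outer`` and ``SlottedInner`` are hypothetical stand-ins without any tpcp base class, so this does not
reproduce the tpcp-specific problem with slots; it only shows the general difference between the two modes.

.. code-block:: python

    from attrs import Factory, define


    @define(kw_only=True, slots=False)
    class Inner:
        # Hypothetical stand-in for a nested algorithm object.
        cutoff_hz: float = 2
        order: int = 5


    @define(kw_only=True, slots=False)
    class Outer:
        # `Factory` is called once per instance, so the default object is never shared.
        inner: Inner = Factory(lambda: Inner(cutoff_hz=3, order=2))


    @define  # attrs default: slots=True
    class SlottedInner:
        cutoff_hz: float = 2


    o1, o2 = Outer(), Outer()
    nested_object_is_different = o1.inner is not o2.inner
    nested_values_are_equal = o1.inner == o2.inner

    # With slots=False, instances keep a `__dict__`; slotted instances do not.
    dict_class_has_dict = hasattr(o1, "__dict__")
    slotted_class_has_dict = hasattr(SlottedInner(), "__dict__")

Each ``Outer`` instance gets its own, equal-valued ``Inner`` object, mirroring the ``nested_object_is_different`` check
from the dataclass version.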
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.442 seconds)

**Estimated memory usage:**  9 MB


.. _sphx_glr_download_auto_examples_recipies__02_dataclasses.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: _02_dataclasses.py <_02_dataclasses.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: _02_dataclasses.ipynb <_02_dataclasses.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_