Composite-Algorithms and Pipelines

Sometimes a pipeline or algorithm requires a list of parameters or nested objects. Because tpcp cannot support parameters whose names are not known when the class is defined, such cases need to be handled via composite fields.

A composite field is a parameter that expects a value of the shape [(name1, sub_para1), (name2, sub_para2), ...]. The sub-parameters can themselves be tpcp objects.

Because it is difficult to know at runtime whether a parameter is meant to be a composite field, you need to explicitly list all fields that should be treated as composite fields in the _composite_params attribute when defining the class:

import dataclasses
import traceback
from typing import Optional

from tpcp import Pipeline
from tpcp.exceptions import ValidationError


@dataclasses.dataclass
class Workflow(Pipeline):
    _composite_params = ("pipelines",)

    pipelines: Optional[list[tuple[str, Pipeline]]] = None

    def __init__(self, pipelines=None):
        self.pipelines = pipelines

That’s it! Now tpcp knows that pipelines is a composite field and will complain if we try to assign something invalid. A composite field must either be None or a list of tuples as explained above. Note that in the example below the assignment itself succeeds; the error is raised when the parameters are read via get_params:

instance = Workflow()
instance.pipelines  # Our default value of None
instance.pipelines = "something invalid"
try:
    print(instance.get_params())
except ValidationError:
    traceback.print_exc()
Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/tpcp/checkouts/latest/examples/recipies/_03_composite_objects.py", line 47, in <module>
    print(instance.get_params())
  File "/home/docs/checkouts/readthedocs.org/user_builds/tpcp/checkouts/latest/tpcp/_base.py", line 370, in get_params
    return _get_params(self, deep)
  File "/home/docs/checkouts/readthedocs.org/user_builds/tpcp/checkouts/latest/tpcp/_base.py", line 500, in _get_params
    _assert_is_allowed_composite_value(v, key, i)
  File "/home/docs/checkouts/readthedocs.org/user_builds/tpcp/checkouts/latest/tpcp/_base.py", line 456, in _assert_is_allowed_composite_value
    raise ValidationError(
tpcp.exceptions.ValidationError: The provided parameters for the composite field pipelines does not seem to be the right type. It should be a sequence of `(name, value)` tuples, but the obj at position 0 in the sequence was not a tuple but:
`s`
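
The `s` at the end of the message comes from iterating over the string: its first element is the character `s`, which is not a tuple. The following is a minimal plain-Python sketch of such a check. It is not taken from tpcp's source, and the function name `check_composite_value` is hypothetical:

```python
from typing import Any


def check_composite_value(value: Any, field_name: str) -> None:
    """Raise ValueError unless value is None or a sequence of (name, value) tuples.

    Hypothetical helper for illustration only; tpcp's internal check may differ.
    """
    if value is None:
        return
    for position, item in enumerate(value):
        # Iterating over a string yields single characters, so
        # "something invalid" fails at position 0 with the item "s".
        if not (isinstance(item, tuple) and len(item) == 2):
            raise ValueError(
                f"Composite field {field_name!r}: item at position {position} "
                f"is not a (name, value) tuple but {item!r}"
            )


check_composite_value(None, "pipelines")  # allowed: no value set
check_composite_value([("pipe1", object())], "pipelines")  # allowed: list of tuples

try:
    check_composite_value("something invalid", "pipelines")
except ValueError as error:
    print(error)
```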

While you could set the individual sub-parameters of a composite field to whatever you want, the real value of explicit composite fields lies in using tpcp objects as values:

@dataclasses.dataclass
class MyPipeline(Pipeline):
    param: float = 4
    param2: int = 10


workflow_instance = Workflow(pipelines=[("pipe1", MyPipeline()), ("pipe2", MyPipeline(param2=5))])

We can now use get_params to get a deep inspection of the nested objects:

workflow_instance.get_params(deep=True)
{'pipelines__pipe1': MyPipeline(param=4, param2=10), 'pipelines__pipe1__param': 4, 'pipelines__pipe1__param2': 10, 'pipelines__pipe2': MyPipeline(param=4, param2=5), 'pipelines__pipe2__param': 4, 'pipelines__pipe2__param2': 5, 'pipelines': [('pipe1', MyPipeline(param=4, param2=10)), ('pipe2', MyPipeline(param=4, param2=5))]}
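
To make the key layout easier to follow, here is a rough sketch of how such a flattened dictionary could be built. It uses plain dataclasses instead of tpcp objects, and the names `Step`, `Flow` and `flatten_params` are hypothetical stand-ins, so it should be read as an illustration of the naming scheme and not as tpcp's implementation:

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional


@dataclass
class Step:
    """Stand-in for a nested object with two plain parameters."""

    param: float = 4
    param2: int = 10


@dataclass
class Flow:
    """Stand-in for the workflow; `pipelines` is its only composite field."""

    pipelines: Optional[list[tuple[str, Step]]] = None


def flatten_params(obj: Any, composite_fields: tuple[str, ...] = ()) -> dict[str, Any]:
    """Return the parameters of obj, expanding composite fields into "__" keys."""
    out: dict[str, Any] = {}
    for field in fields(obj):
        value = getattr(obj, field.name)
        if field.name in composite_fields and value is not None:
            for name, sub in value:
                # One key per named entry of the composite field ...
                out[f"{field.name}__{name}"] = sub
                if is_dataclass(sub):
                    # ... plus one key per parameter of the nested object.
                    for sub_key, sub_value in flatten_params(sub).items():
                        out[f"{field.name}__{name}__{sub_key}"] = sub_value
        out[field.name] = value
    return out


flow = Flow(pipelines=[("pipe1", Step()), ("pipe2", Step(param2=5))])
params = flatten_params(flow, composite_fields=("pipelines",))
for key, value in params.items():
    print(key, "->", value)
```

Each level of nesting adds one `__`-separated segment: the field name, then the entry name, then the parameter of the nested object.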

Or we can set params using the following syntax:

workflow_instance = workflow_instance.set_params(pipelines__pipe1__param=2, pipelines__pipe2=MyPipeline(param2=4))
workflow_instance.get_params(deep=True)
{'pipelines__pipe1': MyPipeline(param=2, param2=10), 'pipelines__pipe1__param': 2, 'pipelines__pipe1__param2': 10, 'pipelines__pipe2': MyPipeline(param=4, param2=4), 'pipelines__pipe2__param': 4, 'pipelines__pipe2__param2': 4, 'pipelines': [('pipe1', MyPipeline(param=2, param2=10)), ('pipe2', MyPipeline(param=4, param2=4))]}
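
One way to picture how such keys are resolved inside a composite field: the first segment selects the named entry, and the optional remainder selects an attribute of that entry. The sketch below works on a plain list of tuples and is only an assumption about the general idea; `Step` and `set_nested` are hypothetical names and not part of tpcp:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Step:
    """Stand-in for a nested object with two plain parameters."""

    param: float = 4
    param2: int = 10


def set_nested(steps: list[tuple[str, Any]], key: str, value: Any) -> list[tuple[str, Any]]:
    """Apply one "name" or "name__attribute" update to a list of (name, obj) tuples."""
    name, _sep, attribute = key.partition("__")
    names = [step_name for step_name, _obj in steps]
    if name not in names:
        # Unknown entry names are rejected instead of being created silently.
        raise KeyError(f"No entry named {name!r}; existing names: {names}")
    updated = []
    for step_name, obj in steps:
        if step_name == name:
            if attribute:
                setattr(obj, attribute, value)  # "pipe1__param": change one attribute in place
            else:
                obj = value  # "pipe2": replace the whole object
        updated.append((step_name, obj))
    return updated


steps = [("pipe1", Step()), ("pipe2", Step(param2=5))]
steps = set_nested(steps, "pipe1__param", 2)
steps = set_nested(steps, "pipe2", Step(param2=4))
print(steps)
```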

Note that it is not possible to set parameters for keys that don’t exist yet! In that case, you have to rebuild the full list manually.
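
Rebuilding the list can be as simple as copying the existing tuples and appending the new entry. The helper below is a hypothetical convenience function written in plain Python (it is not provided by tpcp) that also guards against duplicate names:

```python
from typing import Any, Optional


def with_added_step(
    steps: Optional[list[tuple[str, Any]]], name: str, obj: Any
) -> list[tuple[str, Any]]:
    """Return a new list of (name, obj) tuples with one more entry appended.

    The input list is left untouched; None is treated as an empty list.
    """
    existing = list(steps or [])
    if any(step_name == name for step_name, _obj in existing):
        raise ValueError(f"An entry named {name!r} already exists")
    return [*existing, (name, obj)]


old_steps = [("pipe1", "first"), ("pipe2", "second")]
new_steps = with_added_step(old_steps, "pipe3", "third")
print(new_steps)
```

With the objects from this example, the rebuilt list would then be assigned as a whole, for instance via `workflow_instance.set_params(pipelines=new_steps)`, assuming the top-level `pipelines` key is set like any other parameter.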
