tpcp.misc.TypedIterator

class tpcp.misc.TypedIterator(data_type: type[DataclassT], aggregations: Sequence[tuple[str, Callable[[list[TypedIteratorResultTuple[InputTypeT, DataclassT]]], Any]]] = cf([]))

Helper to iterate over data and collect results.

Parameters:
data_type

A dataclass that defines the result type you expect from each iteration.

aggregations

An optional list of aggregations to apply to the results, in the form [(result_name, aggregation_function), ...]. Each aggregation function receives raw_results_ as input and can return an arbitrary object. If a result name is in the list, its aggregation is applied when the result is accessed via results_ (i.e. results_.{result_name}). If no aggregation is defined for a result, a plain list of all results is returned.

Note: It is possible to define aggregations with names that are not present in the result dataclass. This makes it possible to provide multiple aggregations of the same data. These results can still be accessed via the additional_results_ attribute.

NULL_VALUE

(Class attribute) The value used to initialize the result dataclass. It remains in the results if no result was set for a specific attribute in one or more iterations.

IteratorResult

(Class attribute) Type alias for the result type of the iterator. raw_results_ will be a list of these. Note that when used outside of the class, this type is a generic without a concrete type for the input and result fields.
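The aggregation form described above can be sketched with the standard library alone. This is a simplified illustration, not tpcp's implementation: StepResult, ResultTuple (a stand-in for TypedIteratorResultTuple, reduced to the input and result fields), total_steps and mean_steps are hypothetical names, and the splitting logic at the end only mimics how results_ and additional_results_ are described here.

```python
from dataclasses import dataclass, fields
from typing import Any, NamedTuple


@dataclass
class StepResult:
    """Hypothetical result dataclass (the role of `data_type`)."""

    n_steps: int
    label: str


class ResultTuple(NamedTuple):
    """Simplified stand-in for TypedIteratorResultTuple (assumption)."""

    input: Any
    result: StepResult


def total_steps(raw_results: list[ResultTuple]) -> int:
    # An aggregation receives all raw results and may return any object.
    return sum(r.result.n_steps for r in raw_results)


def mean_steps(raw_results: list[ResultTuple]) -> float:
    return total_steps(raw_results) / len(raw_results)


# Same [(result_name, aggregation_function), ...] form as `aggregations`.
aggregations = [
    ("n_steps", total_steps),    # name matches a dataclass field
    ("mean_steps", mean_steps),  # name not present in the dataclass
]

raw_results = [
    ResultTuple("a", StepResult(3, "x")),
    ResultTuple("b", StepResult(5, "y")),
]

field_names = [f.name for f in fields(StepResult)]
aggregated = {name: func(raw_results) for name, func in aggregations}

# Fields without an aggregation fall back to a plain list of all values.
results = {}
for name in field_names:
    if name in aggregated:
        results[name] = aggregated[name]
    else:
        results[name] = [getattr(r.result, name) for r in raw_results]

# Aggregations with names outside the dataclass end up in a separate dict.
additional = {n: v for n, v in aggregated.items() if n not in field_names}
```

In this sketch, results plays the role of results_ and additional plays the role of additional_results_.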

Attributes:
results_

The aggregated results.

additional_results_

A dictionary with additional results that were created by aggregators with names not present in the result dataclass.

raw_results_

List of all results as TypedIteratorResultTuple instances. This is the input to the aggregation functions. An attribute of the result dataclass instance has the value _NOT_SET if no result was set for it. To check for this, you can use isinstance(val, TypedIterator.NULL_VALUE), or use the TypedIterator.filter_iterator_results method to remove all results with a NULL_VALUE.

done_

A dictionary indicating whether a specific iterator is done. This usually only has the key __main__ for the main iteration triggered by iterate. However, subclasses can define nested iterations with more complex logic. The value is True if the respective iteration is done, False if it is currently running, and missing if it was never started. If you try to access the results while the main iterator is not done, an error is raised.

Methods

clone()

Create a new instance of the class with all parameters copied over.

get_params([deep])

Get parameters for this algorithm.

iterate(iterable)

Iterate over the given iterable and yield the input and a new empty result object for each iteration.

set_params(**params)

Set the parameters of this Algorithm.

IteratorResult

filter_iterator_results

clone() -> Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

get_params(deep: bool = True) -> dict[str, Any]

Get parameters for this algorithm.

Parameters:
deep

Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).

Returns:
params

Parameter names mapped to their values.
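The prefix convention for nested parameters can be sketched as follows. This is a minimal stand-in written for illustration, not tpcp's base class: ParamsMixin, Smoother and Pipeline are hypothetical names, and the sketch assumes that every constructor argument is stored under an attribute of the same name.

```python
import inspect
from typing import Any


class ParamsMixin:
    """Hypothetical stand-in that mimics the described get_params behaviour."""

    def get_params(self, deep: bool = True) -> dict[str, Any]:
        # Parameter names are read from the constructor signature.
        names = [
            n for n in inspect.signature(type(self).__init__).parameters
            if n != "self"
        ]
        params = {n: getattr(self, n) for n in names}
        if deep:
            # Copy the items first, because the dict is extended in the loop.
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    nested = value.get_params(deep=True)
                    for sub_name, sub_value in nested.items():
                        # Two underscores separate object name and param name.
                        params[f"{name}__{sub_name}"] = sub_value
        return params


class Smoother(ParamsMixin):
    def __init__(self, cutoff: float = 5.0):
        self.cutoff = cutoff


class Pipeline(ParamsMixin):
    def __init__(self, smoother: Smoother, window: int = 10):
        self.smoother = smoother
        self.window = window


pipe = Pipeline(Smoother(cutoff=2.5))
shallow = pipe.get_params(deep=False)  # keys: smoother, window
deep = pipe.get_params(deep=True)      # adds smoother__cutoff
```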

iterate(iterable: Iterable[T]) -> Iterator[tuple[T, DataclassT]]

Iterate over the given iterable and yield the input and a new empty result object for each iteration.

Parameters:
iterable

The iterable to iterate over.

Yields:
input, result_object

The input and a new empty result object. The result object is a dataclass instance of the type defined in self.data_type. All values of the result object are set to TypedIterator.NULL_VALUE by default.
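The iterate pattern described above can be illustrated with a small standard-library sketch. It is a simplified stand-in and not tpcp's implementation: MiniIterator, Result and the NOT_SET sentinel are hypothetical names, and the raw results are stored as plain (input, result) tuples.

```python
from dataclasses import dataclass, fields
from typing import Any, Iterable, Iterator


class _NotSet:
    """Hypothetical sentinel type that plays the role of NULL_VALUE."""

    def __repr__(self) -> str:
        return "_NOT_SET"


NOT_SET = _NotSet()


@dataclass
class Result:
    """Hypothetical result dataclass."""

    length: Any
    upper: Any


class MiniIterator:
    """Simplified stand-in for the described iterate behaviour."""

    def __init__(self, data_type: type):
        self.data_type = data_type
        self.raw_results_: list[tuple[Any, Any]] = []

    def iterate(self, iterable: Iterable[Any]) -> Iterator[tuple[Any, Any]]:
        self.raw_results_ = []
        for item in iterable:
            # Every field of the new result object starts as the sentinel.
            empty = self.data_type(
                **{f.name: NOT_SET for f in fields(self.data_type)}
            )
            # The same object is stored and yielded, so values assigned in
            # the loop body are visible in raw_results_ afterwards.
            self.raw_results_.append((item, empty))
            yield item, empty


iterator = MiniIterator(Result)
for word, result in iterator.iterate(["ab", "cde"]):
    result.length = len(word)
    if len(word) > 2:
        result.upper = word.upper()  # only set for some iterations

lengths = [r.length for _, r in iterator.raw_results_]
uppers = [r.upper for _, r in iterator.raw_results_]
```

Here the first entry of uppers keeps the sentinel, because no value was assigned to that attribute in the first iteration.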

property results_: DataclassT

The aggregated results.

Note that this returns an instance of the result object, even though the data types of the attributes might differ depending on the aggregation function. We still decided that it makes sense to return an instance of the result object, as this allows the attributes to be autocompleted, even though the associated types might not be correct.

set_params(**params: Any) -> Self

Set the parameters of this Algorithm.

To set parameters of nested objects, use nested_object_name__para_name=.
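The nested naming scheme can be sketched with a small stand-in. This is an illustration under stated assumptions, not tpcp's implementation: ParamsMixin, Smoother and Pipeline are hypothetical names, and parameter validation is omitted.

```python
from typing import Any


class ParamsMixin:
    """Hypothetical stand-in that mimics the described set_params behaviour."""

    def set_params(self, **params: Any) -> "ParamsMixin":
        for key, value in params.items():
            # Split once at the first double underscore.
            name, sep, sub_key = key.partition("__")
            if sep:
                # Delegate the remainder of the key to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


class Smoother(ParamsMixin):
    def __init__(self, cutoff: float = 5.0):
        self.cutoff = cutoff


class Pipeline(ParamsMixin):
    def __init__(self, smoother: Smoother, window: int = 10):
        self.smoother = smoother
        self.window = window


pipe = Pipeline(Smoother())
returned = pipe.set_params(window=20, smoother__cutoff=1.0)
```

Returning the object itself, as the Self return type in the signature suggests, lets calls be chained.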