MacroFloatAggregator

class tpcp.validate.MacroFloatAggregator(*, groupby: str | list[str], group_agg: Callable[[DataFrame], float] = <function mean>, final_agg: Callable[[DataFrame], float] = <function mean>, final_agg_name: str = 'macro', return_raw_scores: bool = True)

Aggregate first on the provided groupby level and then aggregate these results.

The aggregation returns the individual values per group and the final aggregated value.

Parameters:
groupby

The dataset index columns to group by. This must be a valid subset of the dataset index columns used in the scoring.

group_agg

The function that is applied per group. Only functions that return a single float value are supported. Due to the internal implementation, this function is actually passed a Series of float values. The default is the mean.

final_agg

The function that is applied across the per-group results. Only functions that return a single float value are supported. Due to the internal implementation, this function is actually passed a Series of float values. The default is the mean.

final_agg_name

The name of the final aggregated value. This is the key in the returned dictionary.

return_raw_scores

If True, the raw scores are returned in the result.
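The two-stage aggregation described by the parameters above can be sketched in plain Python. This is an illustration of the idea only, not tpcp's implementation: the function `macro_aggregate`, its `groups` argument, and the naming of the per-group keys are assumptions made for this example. Only the use of `final_agg_name` as the key of the final value is taken from the parameter description.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Hashable, Sequence


def macro_aggregate(
    values: Sequence[float],
    groups: Sequence[Hashable],
    group_agg: Callable[[Sequence[float]], float] = mean,
    final_agg: Callable[[Sequence[float]], float] = mean,
    final_agg_name: str = "macro",
) -> dict[str, float]:
    """Hypothetical helper: aggregate per group, then across the group results."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")

    # Stage 1: collect the values that belong to each group label.
    grouped: dict[Hashable, list[float]] = defaultdict(list)
    for value, group in zip(values, groups):
        grouped[group].append(value)

    # One float per group. Using the label as the key is an assumption.
    per_group = {str(label): group_agg(vals) for label, vals in grouped.items()}

    # Stage 2: aggregate across the per-group results.
    result = dict(per_group)
    result[final_agg_name] = final_agg(list(per_group.values()))
    return result


scores = [0.5, 1.0, 0.25]
labels = ["a", "a", "b"]
result = macro_aggregate(scores, labels)
```

With these inputs, group "a" contributes two values and group "b" one, so the macro value weights both groups equally and differs from the plain mean over all three scores.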

Methods

__call__(value)

Set the value of the aggregator.

aggregate(values, datapoints)

Aggregate the values.

clone()

Create a new instance of the class with all parameters copied over.

get_params([deep])

Get parameters for this algorithm.

get_value()

Return the value wrapped by aggregator.

set_params(**params)

Set the parameters of this Algorithm.

__init__(*, groupby: str | list[str], group_agg: Callable[[DataFrame], float] = <function mean>, final_agg: Callable[[DataFrame], float] = <function mean>, final_agg_name: str = 'macro', return_raw_scores: bool = True) -> None

_assert_is_all_valid(values: Sequence[Any], _key_name: str)

Check if all scoring values are consistently of the same type.

This method is called on the first aggregator instance encountered for a scoring value.

Its role is to check whether all other values are of the same type (i.e. the same class and the same config) as the first one.

_get_emtpy_instance() -> Self

Return an empty instance of the aggregator with the same config, but no value.

aggregate(values: Sequence[T], datapoints: Sequence[Dataset]) -> dict[str, float]

Aggregate the values.

clone() -> Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

get_params(deep: bool = True) -> dict[str, Any]

Get parameters for this algorithm.

Parameters:
deep

Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two "_" at the end).

Returns:
params

Parameter names mapped to their values.
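The effect of `deep` on the returned mapping can be illustrated with a small stand-alone sketch. The classes `Params`, `Inner`, and `Outer` and the `_param_names` attribute are hypothetical and exist only for this example; they are not tpcp's implementation, they only mimic the double-underscore prefix convention described above.

```python
from typing import Any


class Params:
    """Hypothetical minimal base class, not tpcp's implementation."""

    _param_names: tuple = ()

    def get_params(self, deep: bool = True) -> dict[str, Any]:
        params: dict[str, Any] = {}
        for name in self._param_names:
            value = getattr(self, name)
            params[name] = value
            # With deep=True, nested objects contribute their own parameters,
            # prefixed with "<name>__".
            if deep and hasattr(value, "get_params"):
                for sub_name, sub_value in value.get_params(deep=True).items():
                    params[f"{name}__{sub_name}"] = sub_value
        return params


class Inner(Params):
    _param_names = ("threshold",)

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold


class Outer(Params):
    _param_names = ("inner", "window")

    def __init__(self, inner: Inner, window: int = 10) -> None:
        self.inner = inner
        self.window = window


outer = Outer(Inner(threshold=0.3))
shallow = outer.get_params(deep=False)  # keys: inner, window
deep = outer.get_params(deep=True)      # additionally: inner__threshold
```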

get_value() -> T

Return the value wrapped by aggregator.

set_params(**params: Any) -> Self

Set the parameters of this Algorithm.

To set parameters of nested objects, use the form nested_object_name__para_name=value.
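How a double-underscore key can be routed to a nested object is sketched below. The classes `Configurable`, `Inner`, and `Outer` are hypothetical and serve only to demonstrate the naming convention; this is an assumption about the general mechanism, not a copy of tpcp's code.

```python
from typing import Any


class Configurable:
    """Hypothetical minimal base class, not tpcp's implementation."""

    def set_params(self, **params: Any) -> "Configurable":
        for key, value in params.items():
            # Split on the first "__" only: the head names an attribute of this
            # object, the tail is forwarded to the nested object.
            name, sep, rest = key.partition("__")
            if not hasattr(self, name):
                raise ValueError(f"Unknown parameter: {name!r}")
            if sep:
                getattr(self, name).set_params(**{rest: value})
            else:
                setattr(self, name, value)
        return self


class Inner(Configurable):
    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold


class Outer(Configurable):
    def __init__(self, inner: Inner, window: int = 10) -> None:
        self.inner = inner
        self.window = window


outer = Outer(Inner())
returned = outer.set_params(window=20, inner__threshold=0.9)
```

Returning the instance itself mirrors the `Self` return type in the signature and allows calls to be chained.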

Examples using tpcp.validate.MacroFloatAggregator

Custom Scorer
