tpcp.parallel.delayed
- tpcp.parallel.delayed(func)
Wrap a function to be used in a parallel context.
This is a modified version of joblib.delayed that can run arbitrary callbacks when delayed is called in the main process and when the function is called in the parallel process.
This is useful, for example, to restore the global config in the parallel process. For this to work, callbacks must first be registered using register_global_parallel_callback.
All uses of delayed in tpcp use this implementation. This means you can configure your custom callbacks and expect them to work in all tpcp functions that use multiprocessing.
If you need to write your own multiprocessing method using joblib, refer to the example below.
Notes
The getters are called as soon as the delayed function is called in the main process. This means that if you call delayed long before the actual parallel execution, the getters might not capture the correct state of the global variables.
Examples
This example shows how to use delayed together with register_global_parallel_callback to make sure the global scikit-learn config is restored in the parallel processes used in tpcp. Note that sklearn has its own workaround for this, which is not compatible with tpcp.
>>> from tpcp.parallel import delayed, register_global_parallel_callback
>>> from joblib import Parallel
>>> from sklearn import get_config, set_config
>>>
>>> set_config(assume_finite=True)
>>> def callback():
...     def setter(config):
...         set_config(**config)
...
...     return get_config(), setter
>>>
>>> def worker_func():
...     # This is what would be called in the parallel process
...     # We just return the config here for demonstration purposes
...     config = get_config()
...     return config["assume_finite"]
>>>
>>> # register the callback
>>> register_global_parallel_callback(callback)
>>> # call the worker function in parallel
>>> Parallel(n_jobs=2)(delayed(worker_func)() for _ in range(2))
[True, True]