espnet3.parallel.env_provider.EnvironmentProvider
class espnet3.parallel.env_provider.EnvironmentProvider(config: DictConfig)
Bases: ABC
A base interface to build and inject per-process environments.
This class separates responsibilities for constructing shared resources (e.g., dataset/model/tokenizer) that need to be created either once on the driver or once per worker process.
Subclasses should implement build_env_local and make_worker_setup_fn to define how the environment is built in local (driver) execution and in distributed (worker) execution, respectively.
- Parameters: config (DictConfig) - A Hydra/OmegaConf configuration object that contains all parameters needed to build the environment.
Notes
- The environment returned by these builders must be a plain dictionary of lightweight, pickleable objects or handles that are safe to share within a worker process.
- For distributed runs, heavy initialization should be done inside the worker setup function so each worker constructs its own copy.
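The split between driver-side and worker-side construction can be pictured with a small stand-in. This is only a sketch: `ProviderSketch`, `DemoProvider`, `build_model`, and `BUILD_LOG` are hypothetical names, and a plain dict replaces the DictConfig so that the snippet runs without ESPnet or Dask installed. It shows that creating the setup function builds nothing heavy; construction happens only when the function is called, which in a distributed run would be on the worker.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class ProviderSketch(ABC):
    """Hypothetical stand-in mirroring the two abstract methods on this page."""

    def __init__(self, config: Dict[str, Any]):
        # Assumption: a plain dict stands in for the DictConfig.
        self.config = config

    @abstractmethod
    def build_env_local(self) -> Dict[str, Any]:
        ...

    @abstractmethod
    def make_worker_setup_fn(self) -> Callable[[], Dict[str, Any]]:
        ...


BUILD_LOG: List[str] = []  # records every "heavy" construction


def build_model(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical heavy builder; here it only logs and returns a dict."""
    BUILD_LOG.append("model")
    return {"hidden": cfg["hidden"]}


class DemoProvider(ProviderSketch):
    def build_env_local(self) -> Dict[str, Any]:
        # Local run: build everything right away on the driver.
        return {"model": build_model(self.config)}

    def make_worker_setup_fn(self) -> Callable[[], Dict[str, Any]]:
        cfg = self.config  # capture only the lightweight config

        def setup() -> Dict[str, Any]:
            # Distributed run: the heavy build is deferred until this is called.
            return {"model": build_model(cfg)}

        return setup


provider = DemoProvider({"hidden": 4})
setup_fn = provider.make_worker_setup_fn()
builds_before_call = len(BUILD_LOG)  # nothing has been built yet
worker_env = setup_fn()              # this call would happen on a worker
builds_after_call = len(BUILD_LOG)
```

Capturing only the configuration in the closure keeps the object that is shipped to workers small, which is the motivation given in the second note.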
abstractmethod build_env_local() → Dict[str, Any]
Build the environment once on the driver for local execution.
This method is called in purely local runs (no Dask). Typical usage is to instantiate dataset/model/tokenizer directly and return them as a dictionary.
- Returns: A dictionary containing environment objects (e.g., {"dataset": ds, "model": md, ...}).
- Return type: Dict[str, Any]
- Raises: NotImplementedError - If the subclass does not override this method.
Example
>>> class MyProvider(EnvironmentProvider):
...     def build_env_local(self):
...         ds = build_dataset(self.config)
...         md = build_model(self.config)
...         return {"dataset": ds, "model": md}
abstractmethod make_worker_setup_fn() → Callable[[], Dict[str, Any]]
Create a worker setup function for distributed execution.
The returned callable is executed once per worker to build that worker's environment, and its result is cached via a Dask WorkerPlugin. The resulting dictionary is later injected into user functions by matching its keys against the names of their keyword parameters.
- Returns: A zero-arg function that returns an environment dictionary when called on a worker.
- Return type: Callable[[], Dict[str, Any]]
- Raises: NotImplementedError - If the subclass does not override this method.
Example
>>> class MyProvider(EnvironmentProvider):
...     def make_worker_setup_fn(self):
...         cfg = self.config
...         def setup():
...             ds = build_dataset(cfg)
...             md = build_model(cfg)
...             return {"dataset": ds, "model": md}
...         return setup  # return setup function!
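The name-matching injection mentioned above can be illustrated with the standard library alone. This is a minimal sketch of the idea, not ESPnet's actual implementation: `call_with_env` and `score` are hypothetical names, and the environment dictionary is built by hand rather than by a worker setup function. Only the environment entries whose keys match a parameter name of the user function are passed in.

```python
import inspect
from typing import Any, Callable, Dict


def call_with_env(fn: Callable[..., Any], env: Dict[str, Any], *args: Any) -> Any:
    """Call fn, passing only the env entries whose keys match fn's parameter names."""
    params = inspect.signature(fn).parameters
    injected = {name: value for name, value in env.items() if name in params}
    return fn(*args, **injected)


def score(item: int, model: Dict[str, int]) -> int:
    # Hypothetical user function: asks for "model" by name, ignores "dataset".
    return item * model["scale"]


# Hand-built environment standing in for what a setup function would return.
env = {"dataset": [1, 2, 3], "model": {"scale": 10}}
results = [call_with_env(score, env, item) for item in env["dataset"]]
```

Under this scheme the keys of the returned dictionary act as the contract: renaming a key, say from "model" to "net", would require renaming the matching parameter in every user function that consumes it.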