espnet2.train.distributed_utils.get_world_size
espnet2.train.distributed_utils.get_world_size(prior=None, launcher: str | None = None) → int
Get the world size for distributed training.
The world size refers to the total number of processes participating in the distributed training. If the prior argument is provided, it will be used as the world size. Otherwise, the function will attempt to read the world size from environment variables based on the launcher type.
- Parameters:
- prior (Optional[int]) – The world size to use if specified.
- launcher (Optional[str]) – The type of launcher used, e.g., “slurm”, “mpi”, or None.
- Returns: The world size, which defaults to 1 if no valid value can be determined.
- Return type: int
- Raises:
- RuntimeError – If the specified launcher is not supported or if the process is not launched correctly for the specified launcher.
Examples
>>> get_world_size() # Assuming WORLD_SIZE=4 in the environment
4
>>> get_world_size(prior=2)
2
>>> get_world_size(launcher='slurm') # Assuming SLURM_NTASKS=3
3
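The lookup order described above can be sketched as a small standalone function. This is a minimal illustration, not the ESPnet implementation: `resolve_world_size` and its `env` parameter are hypothetical names introduced here so the logic can be exercised without touching the real environment. The variable names `WORLD_SIZE` and `SLURM_NTASKS` are taken from the examples above. How launchers other than “slurm” are handled is not specified on this page, so the sketch simply rejects them.

```python
import os
from typing import Mapping, Optional


def resolve_world_size(
    prior: Optional[int] = None,
    launcher: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical stand-in that mirrors the documented lookup order."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env

    # 1. An explicitly supplied value always wins.
    if prior is not None:
        return int(prior)

    # 2. Launcher-specific variable (assumption: SLURM_NTASKS for "slurm").
    if launcher == "slurm":
        if "SLURM_NTASKS" not in env:
            raise RuntimeError("SLURM_NTASKS is not set for launcher='slurm'")
        return int(env["SLURM_NTASKS"])
    if launcher is not None:
        # Assumption: anything else is treated as unsupported in this sketch.
        raise RuntimeError(f"launcher={launcher!r} is not handled in this sketch")

    # 3. Generic variable, defaulting to a single process.
    return int(env.get("WORLD_SIZE", "1"))


# Usage with explicit environment mappings instead of the real environment.
print(resolve_world_size(env={"WORLD_SIZE": "4"}))                        # 4
print(resolve_world_size(prior=2, env={"WORLD_SIZE": "4"}))               # 2
print(resolve_world_size(launcher="slurm", env={"SLURM_NTASKS": "3"}))    # 3
print(resolve_world_size(env={}))                                         # 1
```

Passing the environment as a mapping keeps the sketch easy to test; the documented function reads the process environment directly.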