espnet2.train.distributed_utils.get_node_rank
Get Node Rank.
This function is used for “multiprocessing distributed” mode. In that mode, the initial RANK equals the node ID, and the real rank used by torch.distributed is computed as (nGPU * NodeID) + LOCAL_RANK.
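For instance, in a hypothetical job with 2 nodes and 4 GPUs per node, the ranks assigned to the spawned processes work out as follows:

>>> ngpu, num_nodes = 4, 2  # hypothetical 2-node, 4-GPU-per-node job
>>> [ngpu * node_id + local_rank
...  for node_id in range(num_nodes)
...  for local_rank in range(ngpu)]
[0, 1, 2, 3, 4, 5, 6, 7]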
- Parameters:
- prior (Optional[int]) – The prior rank to return if provided.
- launcher (Optional[str]) – The launcher type, e.g., “slurm” or “mpi”.
- Returns: The node rank or None if not determined.
- Return type: Optional[int]
- Raises:
- RuntimeError – If launcher is “slurm” but the process was not launched by srun.
- RuntimeError – If the number of Slurm tasks (SLURM_NTASKS) does not equal the number of nodes, i.e., ntasks_per_node is not 1 (see the sketch after this list).
- RuntimeError – If an unsupported launcher is specified.
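As a rough illustration of the Slurm-related checks above (a sketch under assumed behavior, not ESPnet’s exact implementation), the error conditions can be expressed with standard Slurm environment variables, all of which are set by srun:

import os

def node_rank_under_slurm_sketch() -> int:
    # Hypothetical helper approximating get_node_rank(launcher="slurm").
    if "SLURM_PROCID" not in os.environ:
        # No Slurm task context: the process was not launched by srun.
        raise RuntimeError("This process does not seem to be launched by 'srun'")
    # One task per node is assumed, so the total task count must equal
    # the number of nodes in the job step.
    if os.environ["SLURM_STEP_NUM_NODES"] != os.environ["SLURM_NTASKS"]:
        raise RuntimeError("Run with --ntasks-per-node=1")
    # With one task per node, the node ID doubles as the node rank.
    return int(os.environ["SLURM_NODEID"])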
Examples
Returns the node rank of the current process, assuming the proper environment variables are set:

>>> get_node_rank()
0
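A prior rank, if given, is returned unchanged without consulting the environment (per the prior parameter above):

>>> get_node_rank(prior=3)
3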
NOTE
This function assumes that ntasks_per_node is 1. If this assumption is violated, the behavior may be undefined.