espnet3.publication.inference_session.InferenceSession
class espnet3.publication.inference_session.InferenceSession(model: Any, *, input_key: str | Sequence[str] = 'speech', output_fn_path: str | None = None, output_fn=None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True, bundle_root: Path | None = None, bundle_metadata: Mapping[str, Any] | None = None, artifacts: Mapping[str, Any] | None = None)
Bases: object
User-facing inference wrapper for packaged ESPnet models.
This class exposes a small direct-inference API around packaged ESPnet backends such as espnet2.bin.asr_inference.Speech2Text. It is intended for use outside the stage runner, for example from a pixi shell session or a standalone Python environment after installing the model dependencies.
Construction paths:
- an inference config via from_config()
- a packaged model tag via from_pretrained()
- already resolved artifact paths via from_artifacts()
When enable_user_code=True, the bundle root and recipe-managed import directories such as src/ are added to sys.path before backend construction. This is meant for explicitly trusted recipe code bundled with the published model.
- Parameters:
  - model – Instantiated inference backend.
  - input_key – Input field name or names expected by the backend.
  - output_fn_path – Optional dotted output-function path compatible with the recipe output_fn(data=..., model_output=..., idx=...) contract.
  - output_fn – Optional already-imported output function. Use this instead of output_fn_path when the caller already has the function.
  - prefer_model_batch – Whether forward_batch() should try a single batched backend call before falling back to per-sample execution.
  - fallback_to_single_on_batch_error – Whether forward_batch() should transparently fall back to per-sample execution after a batched call fails.
  - bundle_root – Optional unpacked model bundle root.
  - bundle_metadata – Optional metadata loaded from meta.yaml.
  - artifacts – Optional resolved artifact mapping used to build the backend.
- Raises: ValueError – If both output_fn_path and output_fn are provided.
Notes
This wrapper does not require dataset objects. Single-sample inference accepts either a raw value for single-input models or a mapping that contains the configured input_key fields.
Examples
>>> session = InferenceSession.from_pretrained(
... "espnet/some_model",
... trust_user_code=True,
... )
>>> result = session(audio_array)
>>> batch = session.forward_batch([audio_a, audio_b])

Initialize the inference session.
forward(sample: Any, *, idx: Any = 0) → Any
Run inference for a single sample.
- Parameters:
  - sample – Either a raw input value for single-input models or a mapping containing the configured input key(s).
  - idx – Optional sample identifier forwarded to output_fn.
- Returns: Backend output, or the transformed output from output_fn.
- Return type: Any
- Raises:
  - KeyError – If a required input field is missing.
  - RuntimeError – If a scalar sample is used with multiple input keys.
forward_batch(samples: Sequence[Any], *, indices: Sequence[Any] | None = None, use_model_batch: bool | None = None, fallback_to_single_on_error: bool | None = None) → list[Any]
Run inference for a batch of samples.
- Parameters:
  - samples – Sequence of raw inputs or sample mappings.
  - indices – Optional per-sample identifiers forwarded to output_fn during fallback execution.
  - use_model_batch – Whether to try one batched backend call. When None, uses prefer_model_batch configured on the session.
  - fallback_to_single_on_error – Whether to fall back to per-sample execution after a failed batched call. When None, uses the session default.
- Returns: One output per sample.
- Return type: list[Any]
- Raises:
  - RuntimeError – If batched backend execution fails and fallback is disabled.
  - ValueError – If indices length does not match samples.
classmethod from_artifacts(artifacts: Mapping[str, Any], *, backend_class: str | type | None = None, input_key: str | Sequence[str] | None = None, output_fn_path: str | None = None, enable_user_code: bool | None = None, trust_user_code: bool = False, user_code_paths: Sequence[str] | None = None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True, **backend_kwargs: Any) → InferenceSession
Build a session from already resolved model artifact paths.
- Parameters:
  - artifacts – Mapping returned by a model downloader or unpacker.
  - backend_class – Optional backend class override. When omitted, this method first looks for an inference config in the bundle and instantiates config.model from there.
  - input_key – Optional input field override. When omitted and an inference config is available, config.input_key is used.
  - output_fn_path – Optional dotted recipe output-function path.
  - enable_user_code – Whether to activate bundled user code before backend construction.
  - trust_user_code – Explicit opt-in required when enable_user_code=True.
  - user_code_paths – Relative import roots under the bundle root.
  - prefer_model_batch – Whether forward_batch() should try a single batched backend call first.
  - fallback_to_single_on_batch_error – Whether failed batched calls should fall back to per-sample execution.
  - **backend_kwargs – Additional constructor kwargs forwarded to the backend class.
- Returns: Artifact-backed inference session.
- Return type: InferenceSession
- Raises:
  - RuntimeError – If user code was requested but the bundle root could not be inferred.
  - ValueError – If untrusted user code execution was requested.
classmethod from_config(inference_config: Mapping[str, Any] | DictConfig, *, enable_user_code: bool | None = None, trust_user_code: bool = False, user_code_paths: Sequence[str] | None = None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True) → InferenceSession
Build a session directly from an ESPnet3 inference config.
- Parameters:
  - inference_config – Inference config containing at least model and optionally input_key, output_fn, and recipe_dir.
  - enable_user_code – Whether to add recipe-local user code such as src/ to sys.path before backend construction.
  - trust_user_code – Explicit opt-in required when enable_user_code=True.
  - user_code_paths – Relative import roots under recipe_dir to add when user code is enabled. Defaults to ("src",).
  - prefer_model_batch – Whether batched backend execution should be attempted by default in forward_batch().
  - fallback_to_single_on_batch_error – Whether failed batched calls should fall back to per-sample execution.
- Returns: Config-backed inference session.
- Return type: InferenceSession
- Raises:
  - RuntimeError – If user code was requested but recipe_dir is unavailable.
  - ValueError – If untrusted user code execution was requested.
Notes
Backend construction reuses the same device-resolution logic as the stage inference provider.
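A minimal config sketch for from_config(). Only the top-level keys (model, input_key, recipe_dir) come from the documented contract; the Hydra-style `_target_` key and everything inside the model section are assumptions, and the paths and class names are hypothetical:

```python
# Hypothetical inference config accepted by from_config(); the model
# section's schema depends on the packaged recipe, not this sketch.
inference_config = {
    "model": {
        # Assumed instantiation target; the published bundle defines the real one.
        "_target_": "espnet2.bin.asr_inference.Speech2Text",
        "asr_train_config": "exp/asr_train/config.yaml",
        "asr_model_file": "exp/asr_train/valid.acc.ave.pth",
    },
    "input_key": "speech",           # field forward() reads from sample mappings
    "recipe_dir": "egs3/my_recipe",  # root for user_code_paths (default ("src",))
}
```

With enable_user_code and trust_user_code set, recipe_dir-relative roots such as src/ are added to sys.path before the backend is built.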
classmethod from_pretrained(model_tag: str, *, backend_class: str | type | None = None, downloader_class: str | type = 'espnet_model_zoo.downloader.ModelDownloader', input_key: str | Sequence[str] | None = None, output_fn_path: str | None = None, enable_user_code: bool | None = None, trust_user_code: bool = False, user_code_paths: Sequence[str] | None = None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True, **backend_kwargs: Any) → InferenceSession
Download a packaged model and build an inference session from it.
- Parameters:
  - model_tag – Pretrained model identifier understood by espnet_model_zoo.
  - backend_class – Optional backend class override. When omitted, the published inference config is used to instantiate model.
  - downloader_class – Dotted model-downloader class path or class object. Defaults to espnet_model_zoo.downloader.ModelDownloader.
  - input_key – Optional input field override.
  - output_fn_path – Optional dotted recipe output-function path.
  - enable_user_code – Whether to activate bundled user code such as src/ before backend construction.
  - trust_user_code – Explicit opt-in required when enable_user_code=True.
  - user_code_paths – Relative import roots under the unpacked bundle.
  - prefer_model_batch – Whether forward_batch() should try a single batched backend call first.
  - fallback_to_single_on_batch_error – Whether failed batched calls should fall back to per-sample execution.
  - **backend_kwargs – Additional constructor kwargs forwarded to the backend class.
- Returns: Downloaded inference session.
- Return type: InferenceSession
- Raises: Any exception raised by the configured downloader or backend constructor.
Examples
>>> session = InferenceSession.from_pretrained(
... "espnet/some_model",
... trust_user_code=True,
... )
>>> text = session(audio_array)

property primary_input_key: str
Return the single configured input key.
- Raises: RuntimeError – If the session expects multiple input keys.
