espnet3.publication.inference_session.InferenceSession
class espnet3.publication.inference_session.InferenceSession(model: Any, *, input_key: str | Sequence[str] = 'speech', output_fn_path: str | None = None, output_fn=None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True, bundle_root: Path | None = None, bundle_metadata: Mapping[str, Any] | None = None, artifacts: Mapping[str, Any] | None = None)
Bases: object
User-facing inference wrapper for packaged ESPnet models.
This class exposes a small direct-inference API around packaged ESPnet backends such as espnet2.bin.asr_inference.Speech2Text. It is intended for use outside the stage runner, for example from a pixi shell session or a standalone Python environment after installing the model dependencies.
Construction paths:
- an inference config via from_config()
- a packaged model tag via from_pretrained()
- already resolved artifact paths via from_artifacts()
When enable_user_code=True, the bundle root and recipe-managed import directories such as src/ are added to sys.path before backend construction. This is meant for explicitly trusted recipe code bundled with the published model.
- Parameters:
  - model – Instantiated inference backend.
  - input_key – Input field name or names expected by the backend.
  - output_fn_path – Optional dotted output-function path compatible with the recipe output_fn(data=..., model_output=..., idx=...) contract.
  - output_fn – Optional already-imported output function. Use this instead of output_fn_path when the caller already has the function.
  - prefer_model_batch – Whether forward_batch() should try a single batched backend call before falling back to per-sample execution.
  - fallback_to_single_on_batch_error – Whether forward_batch() should transparently fall back to per-sample execution after a batched call fails.
  - bundle_root – Optional unpacked model bundle root.
  - bundle_metadata – Optional metadata loaded from meta.yaml.
  - artifacts – Optional resolved artifact mapping used to build the backend.
- Raises: ValueError – If both output_fn_path and output_fn are provided.
Notes
This wrapper does not require dataset objects. Single-sample inference accepts either a raw value for single-input models or a mapping that contains the configured input_key fields.
Examples
>>> session = InferenceSession.from_pretrained(
... "espnet/some_model",
... trust_user_code=True,
... )
>>> result = session(audio_array)
>>> batch = session.forward_batch([audio_a, audio_b])

Initialize the inference session.
forward(sample: Any, *, idx: Any = 0) → Any
Run inference for a single sample.
- Parameters:
  - sample – Either a raw input value for single-input models or a mapping containing the configured input key(s).
  - idx – Optional sample identifier forwarded to output_fn.
- Returns: Backend output, or the transformed output from output_fn.
- Return type: Any
- Raises:
  - KeyError – If a required input field is missing.
  - RuntimeError – If a scalar sample is used with multiple input keys.
forward_batch(samples: Sequence[Any], *, indices: Sequence[Any] | None = None, use_model_batch: bool | None = None, fallback_to_single_on_error: bool | None = None) → list[Any]
Run inference for a batch of samples.
- Parameters:
  - samples – Sequence of raw inputs or sample mappings.
  - indices – Optional per-sample identifiers forwarded to output_fn during fallback execution.
  - use_model_batch – Whether to try one batched backend call. When None, uses prefer_model_batch configured on the session.
  - fallback_to_single_on_error – Whether to fall back to per-sample execution after a failed batched call. When None, uses the session default.
- Returns: One output per sample.
- Return type: list[Any]
- Raises:
  - RuntimeError – If batched backend execution fails and fallback is disabled.
  - ValueError – If indices length does not match samples.
classmethod from_artifacts(artifacts: Mapping[str, Any], *, backend_class: str | type | None = None, input_key: str | Sequence[str] | None = None, output_fn_path: str | None = None, enable_user_code: bool | None = None, trust_user_code: bool = False, user_code_paths: Sequence[str] | None = None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True, **backend_kwargs: Any) → InferenceSession
Build a session from already resolved model artifact paths.
- Parameters:
  - artifacts – Mapping returned by a model downloader or unpacker.
  - backend_class – Optional backend class override. When omitted, this method first looks for an inference config in the bundle and instantiates config.model from there.
  - input_key – Optional input field override. When omitted and an inference config is available, config.input_key is used.
  - output_fn_path – Optional dotted recipe output-function path.
  - enable_user_code – Whether to activate bundled user code before backend construction.
  - trust_user_code – Explicit opt-in required when enable_user_code=True.
  - user_code_paths – Relative import roots under the bundle root.
  - prefer_model_batch – Whether forward_batch() should try a single batched backend call first.
  - fallback_to_single_on_batch_error – Whether failed batched calls should fall back to per-sample execution.
  - **backend_kwargs – Additional constructor kwargs forwarded to the backend class.
- Returns: Artifact-backed inference session.
- Return type: InferenceSession
- Raises:
  - RuntimeError – If user code was requested but the bundle root could not be inferred.
  - ValueError – If untrusted user code execution was requested.
classmethod from_config(inference_config: Mapping[str, Any] | DictConfig, *, enable_user_code: bool | None = None, trust_user_code: bool = False, user_code_paths: Sequence[str] | None = None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True) → InferenceSession
Build a session directly from an ESPnet3 inference config.
- Parameters:
  - inference_config – Inference config containing at least model and optionally input_key, output_fn, and recipe_dir.
  - enable_user_code – Whether to add recipe-local user code such as src/ to sys.path before backend construction.
  - trust_user_code – Explicit opt-in required when enable_user_code=True.
  - user_code_paths – Relative import roots under recipe_dir to add when user code is enabled. Defaults to ("src",).
  - prefer_model_batch – Whether batched backend execution should be attempted by default in forward_batch().
  - fallback_to_single_on_batch_error – Whether failed batched calls should fall back to per-sample execution.
- Returns: Config-backed inference session.
- Return type: InferenceSession
- Raises:
  - RuntimeError – If user code was requested but recipe_dir is unavailable.
  - ValueError – If untrusted user code execution was requested.
Notes
Backend construction reuses the same device-resolution logic as the stage inference provider.
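A minimal config sketch for from_config(). Only the top-level keys (model, input_key, recipe_dir) come from the documented contract; the Hydra-style `_target_` key and everything inside the model section are assumptions, and the paths and class names are hypothetical:

```python
# Hypothetical inference config accepted by from_config(); the model
# section's schema depends on the packaged recipe, not this sketch.
inference_config = {
    "model": {
        # Assumed instantiation target; the published bundle defines the real one.
        "_target_": "espnet2.bin.asr_inference.Speech2Text",
        "asr_train_config": "exp/asr_train/config.yaml",
        "asr_model_file": "exp/asr_train/valid.acc.ave.pth",
    },
    "input_key": "speech",           # field forward() reads from sample mappings
    "recipe_dir": "egs3/my_recipe",  # root for user_code_paths (default ("src",))
}
```

With enable_user_code and trust_user_code set, recipe_dir-relative roots such as src/ are added to sys.path before the backend is built.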
classmethod from_pretrained(model_tag: str, *, backend_class: str | type | None = None, downloader_class: str | type = 'espnet_model_zoo.downloader.ModelDownloader', input_key: str | Sequence[str] | None = None, output_fn_path: str | None = None, enable_user_code: bool | None = None, trust_user_code: bool = False, user_code_paths: Sequence[str] | None = None, prefer_model_batch: bool = False, fallback_to_single_on_batch_error: bool = True, **backend_kwargs: Any) → InferenceSession
Download a packaged model and build an inference session from it.
- Parameters:
  - model_tag – Pretrained model identifier understood by espnet_model_zoo.
  - backend_class – Optional backend class override. When omitted, the published inference config is used to instantiate model.
  - downloader_class – Dotted model-downloader class path or class object. Defaults to espnet_model_zoo.downloader.ModelDownloader.
  - input_key – Optional input field override.
  - output_fn_path – Optional dotted recipe output-function path.
  - enable_user_code – Whether to activate bundled user code such as src/ before backend construction.
  - trust_user_code – Explicit opt-in required when enable_user_code=True.
  - user_code_paths – Relative import roots under the unpacked bundle.
  - prefer_model_batch – Whether forward_batch() should try a single batched backend call first.
  - fallback_to_single_on_batch_error – Whether failed batched calls should fall back to per-sample execution.
  - **backend_kwargs – Additional constructor kwargs forwarded to the backend class.
- Returns: Downloaded inference session.
- Return type: InferenceSession
- Raises: Any exception raised by the configured downloader or backend constructor.
Examples
>>> session = InferenceSession.from_pretrained(
... "espnet/some_model",
... trust_user_code=True,
... )
>>> text = session(audio_array)

property primary_input_key: str
Return the single configured input key.
- Raises: RuntimeError – If the session expects multiple input keys.
