espnet3.publication.inference_model.InferenceModel

class espnet3.publication.inference_model.InferenceModel(inference_config: DictConfig)

Bases: object

User-facing inference wrapper for packaged ESPnet models.

This class is the public runtime API for a bundle produced by espnet3.utils.publish_utils.pack_model(). It sits on the publication side of the pipeline: stage runners produce the packed directory, then external callers use InferenceModel to reopen that directory and execute the bundled inference configuration without going back through run.py.

Internally the wrapper rebuilds the backend declared in conf/inference.yaml through InferenceProvider, normalizes sample inputs to match input_key, and optionally applies the recipe’s output_fn so the published model returns the same payload shape used by recipe inference.

The inference model can be built from:

a packaged model tag via from_pretrained()
a packed model directory via from_packed()

When bundled user code is enabled, the packed bundle root is added to sys.path before backend construction. This is meant for explicitly trusted recipe code bundled with the published model.

The constructor is usually reached through from_packed() or from_pretrained() and is not called directly. Those classmethods handle bundle lookup, config loading, and bundled-code trust checks before passing the resolved inference config here.

Notes

This wrapper does not require dataset objects. Single-sample inference accepts either a raw value for single-input models or a mapping that contains the configured input_key fields.

Examples

>>> model = InferenceModel.from_pretrained(
...     "espnet/some_model",
...     trust_user_code=True,
... )
>>> result = model(audio_array)
>>> batch = model.forward_batch([audio_a, audio_b])

Initialize the inference model from a resolved inference config.

Called by from_packed() and from_pretrained() after bundle discovery and trust checks are complete. This constructor instantiates the backend model, normalizes input_key into either a string or a list of strings, and loads the optional recipe output_fn.

Parameters:
- inference_config – Resolved inference config loaded from the packed bundle.
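
The input_key normalization mentioned above can be pictured with a small stand-alone sketch. The helper name normalize_input_key is hypothetical and is not part of espnet3. Whether the real constructor collapses a one-element list into a string is not stated here, so this sketch leaves lists as lists.

```python
from __future__ import annotations

from collections.abc import Iterable


def normalize_input_key(input_key: str | Iterable[str]) -> str | list[str]:
    """Hypothetical helper: coerce a config value into a str or a list of str."""
    # A plain string names a single model input and is kept unchanged.
    if isinstance(input_key, str):
        return input_key
    # Any other iterable (tuple, list, list-like config node) becomes a list of str.
    keys = [str(key) for key in input_key]
    if not keys:
        # Assumption: an empty key list is treated as a configuration error.
        raise ValueError("input_key must name at least one field")
    return keys
```

Keeping a bare string distinct from a list lets later code tell a single-input model from a multi-input one with a single isinstance check.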

forward(sample: Any, idx: Any = 0, **extra_kwargs: Any) → Any

Run inference for a single sample.

This is the main execution method used by __call__() and by forward_batch(). It normalizes the sample to match the configured model input signature, calls the instantiated backend, and then applies the optional recipe output_fn.

Parameters:
- sample – Either a raw input value for single-input models or a mapping containing the configured input key(s).
- idx – Optional sample identifier forwarded to output_fn.
- **extra_kwargs – Additional keyword arguments forwarded to the underlying model callable (e.g. beam_size for demo overrides).
Returns: Backend output, or the transformed output from output_fn.
Return type: Any
Raises:
- KeyError – If a required input field is missing.
- RuntimeError – If a scalar sample is used with multiple input keys.
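
The sample normalization and the two error conditions listed above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: build_model_inputs is a hypothetical name, and returning a keyword dictionary is only one plausible way to hand the fields to a backend.

```python
from collections.abc import Mapping
from typing import Any


def build_model_inputs(sample: Any, input_key) -> dict:
    """Hypothetical helper: map a sample onto the configured input key(s)."""
    # Treat a single string key and a list of keys uniformly.
    keys = [input_key] if isinstance(input_key, str) else list(input_key)

    if isinstance(sample, Mapping):
        # Mapping samples must carry every configured field.
        missing = [key for key in keys if key not in sample]
        if missing:
            raise KeyError(f"sample is missing input field(s): {missing}")
        return {key: sample[key] for key in keys}

    # A raw (non-mapping) value is only unambiguous for single-input models.
    if len(keys) != 1:
        raise RuntimeError(
            f"a raw sample cannot be matched to multiple input keys: {keys}"
        )
    return {keys[0]: sample}
```

With input_key set to "speech", a raw array and the mapping {"speech": array} both produce the same inputs, which is why the two call styles in the examples are interchangeable for single-input models.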

Examples

>>> model = InferenceModel.from_packed("/path/to/packed_model")
>>> result = model.forward(audio_array)

>>> result = model.forward(
...     {"speech": audio_array, "text": "prompt"},
...     idx="utt-0001",
... )

forward_batch(samples: Sequence[Any], indices: Sequence[Any] | None = None) → list[Any]

Run inference for a batch of samples.

This helper first tries the same batched execution path used by InferenceRunner, so published models can benefit from recipe backends that already support batched inputs. If that batched call fails, or if the returned value does not preserve the one-result-per-sample contract of InferenceModel, it falls back to per-sample forward() calls.

Parameters:
- samples – Sequence of raw inputs or sample mappings.
- indices – Optional per-sample identifiers forwarded to output_fn. Defaults to range(len(samples)).
Returns: One output per sample.
Return type: list[Any]
Raises:
- ValueError – If the length of indices does not match the length of samples.

Notes

The output list preserves input order. An empty samples sequence returns an empty list.
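
The try-batched-then-fall-back behavior described above can be sketched in isolation. The callables run_batched and run_single are hypothetical stand-ins for the batched backend path and for forward(). The check used here for the one-result-per-sample contract (a list or tuple of matching length) is an assumption, since the exact validation is not specified.

```python
from typing import Any, Callable, Optional, Sequence


def forward_batch_sketch(
    samples: Sequence[Any],
    run_batched: Callable[[Sequence[Any], Sequence[Any]], Any],
    run_single: Callable[[Any, Any], Any],
    indices: Optional[Sequence[Any]] = None,
) -> list:
    """Hypothetical sketch of batched inference with a per-sample fallback."""
    # Default identifiers are the sample positions.
    ids = list(range(len(samples))) if indices is None else list(indices)
    if len(ids) != len(samples):
        raise ValueError("indices and samples must have the same length")
    if len(samples) == 0:
        return []

    try:
        outputs = run_batched(samples, ids)
    except Exception:
        # Any batched failure triggers the per-sample path.
        outputs = None

    # Accept the batched result only if it yields one output per sample.
    if isinstance(outputs, (list, tuple)) and len(outputs) == len(samples):
        return list(outputs)

    # Fallback: one call per sample, preserving input order.
    return [run_single(sample, idx) for sample, idx in zip(samples, ids)]
```

The fallback trades throughput for robustness: a backend without batch support still returns correct results, only more slowly.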

Examples

>>> model = InferenceModel.from_packed("/path/to/packed_model")
>>> results = model.forward_batch([audio_a, audio_b])

classmethod from_packed(pack_dir: str | Path, trust_user_code: bool = False) → InferenceModel

Build an inference model from a packed model directory.

This is the main entry point for local publication bundles. It is called by external users, CI checks, and any runtime that already has an unpacked pack_model() output directory. The method validates the bundle layout, loads meta.yaml, finds yaml_files.inference_config, resolves the inference config against the bundle root, and then instantiates InferenceModel.

If the config references modules bundled alongside the model, the load is blocked unless trust_user_code=True. In that case the bundle root is inserted into sys.path and the config is reloaded so import-based objects resolve against the newly trusted code.
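
The trust gate described above could look roughly like the sketch below. The function enable_bundled_code and its needs_user_code flag are hypothetical names. How the library detects that a config references bundled modules is not covered, and inserting at position 0 of sys.path is an assumption, because the text only says the bundle root is inserted.

```python
import sys
from pathlib import Path


def enable_bundled_code(pack_dir, needs_user_code: bool, trust_user_code: bool = False) -> bool:
    """Hypothetical sketch: gate bundled-code imports behind an explicit opt-in."""
    if not needs_user_code:
        # Nothing to import from the bundle, so sys.path is left untouched.
        return False
    if not trust_user_code:
        raise ValueError(
            "the inference config references bundled user code; "
            "pass trust_user_code=True to allow importing it"
        )
    root = str(Path(pack_dir).resolve())
    if root not in sys.path:
        # Assumption: the bundle root is placed first so bundled modules resolve.
        sys.path.insert(0, root)
    return True
```

Because importing bundled modules executes arbitrary Python, the opt-in should only be used for bundles from a source the caller trusts.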

Parameters:
- pack_dir – Path to the output directory created by espnet3.utils.publish_utils.pack_model(). This directory must contain conf/inference.yaml and any files referenced by that config.
- trust_user_code – Set to True to allow importing bundled recipe code from the pack directory. Required when the inference config references modules shipped inside the bundle.
Returns: Inference model loaded from pack_model output.
Return type: InferenceModel
Raises:
- FileNotFoundError – If the bundle directory, meta.yaml, or the referenced inference config is missing.
- ValueError – If the config requires bundled user code but trust_user_code is False.

Notes

meta.yaml is treated as the source of truth for locating the packed inference config. The method does not assume that the config lives at a fixed path other than the metadata contract written by pack_model().
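
As a rough illustration of the metadata-driven lookup, the sketch below resolves the config path from already-parsed metadata. The helper name resolve_inference_config is hypothetical, and the sketch assumes yaml_files.inference_config holds a path relative to the bundle root. The metadata is passed in as a plain dict so the example needs no YAML parser.

```python
from pathlib import Path


def resolve_inference_config(pack_dir, meta: dict) -> Path:
    """Hypothetical sketch: locate the packed inference config via metadata."""
    root = Path(pack_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"pack directory not found: {root}")
    # Assumption: the entry is a path relative to the bundle root.
    relative = meta["yaml_files"]["inference_config"]
    config_path = root / relative
    if not config_path.is_file():
        raise FileNotFoundError(f"inference config not found: {config_path}")
    return config_path
```

Reading the location from metadata instead of hard-coding it means the loader keeps working if the packing step changes where it writes the config.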

Examples

>>> model = InferenceModel.from_packed("/path/to/packed_model")
>>> result = model(audio_array)

>>> model = InferenceModel.from_packed(
...     "/path/to/packed_model",
...     trust_user_code=True,
... )

classmethod from_pretrained(model_tag: str, trust_user_code: bool = False) → InferenceModel

Download a packaged model and build an inference model from it.

This is the remote-loading companion to from_packed(). It is called when the caller has an espnet_model_zoo tag rather than a local packed directory. The downloader fetches and unpacks the model assets first, then this method locates the unpacked bundle root and delegates to from_packed() for the actual config loading and backend construction.

Parameters:
- model_tag – Pretrained model identifier understood by espnet_model_zoo.
- trust_user_code – Forwarded to from_packed().
Returns: Downloaded inference model.
Return type: InferenceModel
Raises:
- RuntimeError – If the downloaded artifacts do not include an inference_config entry.

Notes

The downloader returns individual artifact paths. This method uses the downloaded inference_config path to recover the enclosing pack directory expected by from_packed().
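
One way to picture recovering the pack directory from a config path is to walk up the parent directories until a bundle marker is found. This is a sketch under assumptions, not the method's actual logic: find_pack_dir is a hypothetical name, and using meta.yaml as the marker is an inference from the bundle layout that from_packed() validates.

```python
from pathlib import Path


def find_pack_dir(inference_config_path, marker: str = "meta.yaml") -> Path:
    """Hypothetical sketch: find the bundle root that encloses a config file."""
    start = Path(inference_config_path).resolve().parent
    # Check the config's own directory first, then each ancestor in turn.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(
        f"no directory containing {marker} encloses {inference_config_path}"
    )
```

Searching upward avoids assuming a fixed number of directory levels between the config file and the bundle root.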

Examples

>>> model = InferenceModel.from_pretrained("espnet/some_model")
>>> text = model(audio_array)

>>> model = InferenceModel.from_pretrained(
...     "espnet/some_model",
...     trust_user_code=True,
... )

property primary_input_key : str

Return the single configured input key.