espnet3.publication.inference_model.InferenceModel
class espnet3.publication.inference_model.InferenceModel(inference_config: DictConfig)
Bases: object
User-facing inference wrapper for packaged ESPnet models.
This class is the public runtime API for a bundle produced by espnet3.utils.publish_utils.pack_model(). It sits on the publication side of the pipeline: stage runners produce the packed directory, then external callers use InferenceModel to reopen that directory and execute the bundled inference configuration without going back through run.py.
Internally the wrapper rebuilds the backend declared in conf/inference.yaml through InferenceProvider, normalizes sample inputs to match input_key, and optionally applies the recipe’s output_fn so the published model returns the same payload shape used by recipe inference.
The inference model can be built from:
- a packaged model tag via from_pretrained()
- a packed model directory via from_packed()
When bundled user code is enabled, the packed bundle root is added to sys.path before backend construction. This is meant for explicitly trusted recipe code bundled with the published model.
The constructor is usually reached through from_packed() or from_pretrained(), not called directly. Those classmethods handle bundle lookup, config loading, and bundled-code trust checks before passing the resolved inference config here.
Notes
This wrapper does not require dataset objects. Single-sample inference accepts either a raw value for single-input models or a mapping that contains the configured input_key fields.
Examples
>>> model = InferenceModel.from_pretrained(
... "espnet/some_model",
... trust_user_code=True,
... )
>>> result = model(audio_array)
>>> batch = model.forward_batch([audio_a, audio_b])
Initialize the inference model from a resolved inference config.
Called by from_packed() and from_pretrained() after bundle discovery and trust checks are complete. This constructor instantiates the backend model, normalizes input_key into either a string or a list of strings, and loads the optional recipe output_fn.
- Parameters: inference_config – Resolved inference config loaded from the packed bundle.
forward(sample: Any, idx: Any = 0, **extra_kwargs: Any) → Any
Run inference for a single sample.
This is the main execution method used by __call__() and by forward_batch(). It normalizes the sample to match the configured model input signature, calls the instantiated backend, and then applies the optional recipe output_fn.
- Parameters:
- sample – Either a raw input value for single-input models or a mapping containing the configured input key(s).
- idx – Optional sample identifier forwarded to output_fn.
- **extra_kwargs – Additional keyword arguments forwarded to the underlying model callable (e.g. beam_size for demo overrides).
- Returns: Backend output, or the transformed output from output_fn.
- Return type: Any
- Raises:
- KeyError – If a required input field is missing.
- RuntimeError – If a scalar sample is used with multiple input keys.
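The normalization step and the two error cases above can be illustrated with a minimal sketch. The helper name normalize_sample is hypothetical and is not part of the espnet3 API; dropping mapping fields that are not listed in input_key is also an assumption made for this illustration.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def normalize_sample(sample: Any, input_key: str | list[str]) -> dict[str, Any]:
    """Hypothetical helper: map a sample onto the configured input key(s)."""
    keys = [input_key] if isinstance(input_key, str) else list(input_key)

    if isinstance(sample, Mapping):
        # A mapping must provide every configured input field.
        missing = [key for key in keys if key not in sample]
        if missing:
            raise KeyError(f"sample is missing input field(s): {missing}")
        # Assumption: fields not named in input_key are dropped.
        return {key: sample[key] for key in keys}

    # A raw value is only unambiguous when exactly one input key is configured.
    if len(keys) != 1:
        raise RuntimeError(
            f"a raw sample cannot be used with multiple input keys: {keys}"
        )
    return {keys[0]: sample}
```

Under these assumptions, a raw value is wrapped under the single configured key, while a mapping is checked field by field before the backend is called.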
Examples
>>> model = InferenceModel.from_packed("/path/to/packed_model")
>>> result = model.forward(audio_array)
>>> result = model.forward(
... {"speech": audio_array, "text": "prompt"},
... idx="utt-0001",
... )
forward_batch(samples: Sequence[Any], indices: Sequence[Any] | None = None) → list[Any]
Run inference for a batch of samples.
This helper first tries the same batched execution path used by InferenceRunner, so published models can benefit from recipe backends that already support batched inputs. If that batched call fails, or if the returned value does not preserve the one-result-per-sample contract of InferenceModel, it falls back to per-sample forward() calls.
- Parameters:
- samples – Sequence of raw inputs or sample mappings.
- indices – Optional per-sample identifiers forwarded to output_fn. Defaults to range(len(samples)).
- Returns: One output per sample.
- Return type: list[Any]
- Raises: ValueError – If indices length does not match samples.
Notes
The output list preserves input order. An empty samples sequence returns an empty list.
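The try-batched-then-fall-back behavior can be sketched as follows. This is an illustration only: forward_batch_sketch, batched_fn, and single_fn are hypothetical names standing in for the real backend calls, and the sketch leaves out output_fn handling.

```python
from __future__ import annotations

from typing import Any, Callable, Sequence


def forward_batch_sketch(
    samples: Sequence[Any],
    batched_fn: Callable[[Sequence[Any]], Any],
    single_fn: Callable[[Any, Any], Any],
    indices: Sequence[Any] | None = None,
) -> list[Any]:
    """Hypothetical sketch of a batched call with per-sample fallback."""
    if indices is None:
        indices = range(len(samples))
    if len(indices) != len(samples):
        raise ValueError("indices length does not match samples")
    if len(samples) == 0:
        return []

    try:
        outputs = batched_fn(samples)
        # Accept the batched result only if it yields one output per sample.
        if isinstance(outputs, (list, tuple)) and len(outputs) == len(samples):
            return list(outputs)
    except Exception:
        pass  # Assumption: any batched failure triggers the fallback.

    # Fallback: run each sample on its own, preserving input order.
    return [single_fn(sample, idx) for sample, idx in zip(samples, indices)]
```

The length check is what protects the one-result-per-sample contract: a backend that returns a single merged object, or the wrong number of items, is treated the same as a failed batched call.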
Examples
>>> model = InferenceModel.from_packed("/path/to/packed_model")
>>> results = model.forward_batch([audio_a, audio_b])
classmethod from_packed(pack_dir: str | Path, trust_user_code: bool = False) → InferenceModel
Build an inference model from a packed model directory.
This is the main entry point for local publication bundles. It is called by external users, CI checks, and any runtime that already has an unpacked pack_model() output directory. The method validates the bundle layout, loads meta.yaml, finds yaml_files.inference_config, resolves the inference config against the bundle root, and then instantiates InferenceModel.
If the config references modules bundled alongside the model, the load is blocked unless trust_user_code=True. In that case the bundle root is inserted into sys.path and the config is reloaded so import-based objects resolve against the newly trusted code.
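The trust gate described above might look roughly like the sketch below. The function enable_bundled_code and its needs_user_code flag are hypothetical; how the real loader detects bundled modules is not shown here, and inserting at position 0 of sys.path is an assumption.

```python
import sys
from pathlib import Path


def enable_bundled_code(pack_dir, needs_user_code, trust_user_code=False):
    """Hypothetical sketch: return True when bundled code is enabled."""
    if not needs_user_code:
        # Nothing to import from the bundle, so sys.path is left untouched.
        return False
    if not trust_user_code:
        raise ValueError(
            "inference config references bundled user code; "
            "pass trust_user_code=True to allow importing it"
        )
    root = str(Path(pack_dir).resolve())
    if root not in sys.path:
        # Assumption: the bundle root takes priority over installed packages.
        sys.path.insert(0, root)
    return True
```

Because anything importable from the bundle root then runs with the caller's privileges, the flag should only be set for bundles from a trusted source.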
- Parameters:
- pack_dir – Path to the output directory created by espnet3.utils.publish_utils.pack_model(). This directory must contain conf/inference.yaml and any files referenced by that config.
- trust_user_code – Set to True to allow importing bundled recipe code from the pack directory. Required when the inference config references modules shipped inside the bundle.
- Returns: Inference model loaded from pack_model output.
- Return type: InferenceModel
- Raises:
- FileNotFoundError – If the bundle directory, meta.yaml, or the referenced inference config is missing.
- ValueError – If the config requires bundled user code but trust_user_code is False.
Notes
meta.yaml is treated as the source of truth for locating the packed inference config. The method does not assume that the config lives at a fixed path other than the metadata contract written by pack_model().
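A minimal sketch of the metadata-driven lookup is shown below. It assumes meta.yaml has already been parsed into a mapping (for example with a YAML loader) and that yaml_files.inference_config holds a path relative to the bundle root; locate_inference_config is a hypothetical name, not the library's function.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Mapping


def locate_inference_config(pack_dir: str | Path, meta: Mapping[str, Any]) -> Path:
    """Hypothetical sketch: resolve the inference config named by meta.yaml."""
    root = Path(pack_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"bundle directory not found: {root}")
    if not (root / "meta.yaml").is_file():
        raise FileNotFoundError(f"meta.yaml not found in: {root}")

    # Assumption: the entry is a path relative to the bundle root.
    relative = meta.get("yaml_files", {}).get("inference_config")
    if relative is None:
        # Assumption: a missing entry is reported like a missing file.
        raise FileNotFoundError("meta.yaml has no yaml_files.inference_config")

    config_path = root / relative
    if not config_path.is_file():
        raise FileNotFoundError(f"inference config not found: {config_path}")
    return config_path
```

Reading the location from metadata instead of hard-coding it means a bundle can move its config without breaking loaders that follow the same contract.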
Examples
>>> model = InferenceModel.from_packed("/path/to/packed_model")
>>> result = model(audio_array)
>>> model = InferenceModel.from_packed(
... "/path/to/packed_model",
... trust_user_code=True,
... )
classmethod from_pretrained(model_tag: str, trust_user_code: bool = False) → InferenceModel
Download a packaged model and build an inference model from it.
This is the remote-loading companion to from_packed(). It is called when the caller has an espnet_model_zoo tag rather than a local packed directory. The downloader fetches and unpacks the model assets first, then this method locates the unpacked bundle root and delegates to from_packed() for the actual config loading and backend construction.
- Parameters:
- model_tag – Pretrained model identifier understood by espnet_model_zoo.
- trust_user_code – Forwarded to from_packed().
- Returns: Downloaded inference model.
- Return type: InferenceModel
- Raises: RuntimeError – If the downloaded artifacts do not include an inference_config entry.
Notes
The downloader returns individual artifact paths. This method uses the downloaded inference_config path to recover the enclosing pack directory expected by from_packed().
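One way to recover the pack directory from an artifact path is sketched below. Walking up the parents until a meta.yaml is found is an assumption made for illustration; the actual method may use a fixed relative layout instead. The name recover_pack_dir and the plain-dict artifacts argument are hypothetical.

```python
from pathlib import Path


def recover_pack_dir(artifacts):
    """Hypothetical sketch: find the bundle root enclosing inference_config."""
    config = artifacts.get("inference_config")
    if config is None:
        raise RuntimeError("downloaded artifacts have no inference_config entry")

    config_path = Path(config).resolve()
    # Assumption: the bundle root is the nearest ancestor holding meta.yaml.
    for parent in config_path.parents:
        if (parent / "meta.yaml").is_file():
            return parent
    raise RuntimeError(f"no meta.yaml found above: {config_path}")
```

The returned directory is what a from_packed()-style loader would then receive as pack_dir.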
Examples
>>> model = InferenceModel.from_pretrained("espnet/some_model")
>>> text = model(audio_array)
>>> model = InferenceModel.from_pretrained(
... "espnet/some_model",
... trust_user_code=True,
... )
property primary_input_key : str
Return the single configured input key.
