espnet2.fileio.datadir_writer.DatadirWriter
class espnet2.fileio.datadir_writer.DatadirWriter(p: Path | str)
Bases: object
Writer class to create a Kaldi-like data directory.
This class facilitates the creation of a structured data directory similar to those used in Kaldi. It allows for writing key-value pairs representing utterance IDs and their corresponding audio file paths, as well as creating subdirectories for organizing the data.
####### Examples
>>> from espnet2.fileio.datadir_writer import DatadirWriter
>>> with DatadirWriter("output") as writer:
...     # output/sub.txt is created here
...     subwriter = writer["sub.txt"]
...     # Write "uttidA some/where/a.wav"
...     subwriter["uttidA"] = "some/where/a.wav"
...     subwriter["uttidB"] = "some/where/b.wav"
path
The path to the data directory being created.
- Type: Path
children
A dictionary holding references to child DatadirWriter instances.
- Type: dict
fd
Open text file object used for writing, or None if this writer represents a directory.
- Type: TextIOWrapper or None
has_children
Flag indicating if there are child DatadirWriter instances.
- Type: bool
keys
A set of keys that have been written to the current data directory or file.
- Type: set
Parameters: p (Union[Path, str]) – The path where the data directory should be created.
Raises: RuntimeError – If a key-value pair is written to a writer that already has child writers (a directory), or a child writer is requested from a writer that is already writing to a file.
NOTE
The __enter__ and __exit__ methods allow this class to be used as a context manager, so that close() is called automatically when the with block exits.
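The rule that a writer acts either as a directory or as a file, never both, can be illustrated with a minimal sketch. `MiniDatadirWriter` and `demo` below are hypothetical names for a simplified stand-in written for this page, not the ESPnet implementation. Opening the file lazily on the first write and the wording of the error messages are assumptions of the sketch.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional, Set, TextIO


class MiniDatadirWriter:
    """Hypothetical, simplified stand-in; not the ESPnet implementation."""

    def __init__(self, p):
        self.path = Path(p)
        self.children: Dict[str, "MiniDatadirWriter"] = {}
        self.fd: Optional[TextIO] = None
        self.has_children = False
        self.keys: Set[str] = set()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Leaving the with block always closes the writer.
        self.close()

    def __getitem__(self, key):
        # A writer that already opened a file cannot also act as a directory.
        if self.fd is not None:
            raise RuntimeError(f"{self.path} is a file, not a directory")
        if key not in self.children:
            self.children[key] = MiniDatadirWriter(self.path / key)
            self.has_children = True
        return self.children[key]

    def __setitem__(self, key, value):
        # A writer that already has children cannot also act as a file.
        if self.has_children:
            raise RuntimeError(f"{self.path} is a directory, not a file")
        if self.fd is None:
            # Assumption of this sketch: the file is opened on the first write.
            self.path.parent.mkdir(parents=True, exist_ok=True)
            self.fd = self.path.open("w", encoding="utf-8")
        self.keys.add(key)
        self.fd.write(f"{key} {value}\n")

    def close(self):
        for child in self.children.values():
            child.close()
        if self.fd is not None:
            self.fd.close()
            self.fd = None


def demo():
    """Write one entry, trigger both error cases, return (file text, errors)."""
    errors = []
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "output"
        with MiniDatadirWriter(root) as writer:
            sub = writer["sub.txt"]
            sub["uttidA"] = "some/where/a.wav"
            try:
                writer["uttidB"] = "some/where/b.wav"  # writer is a directory
            except RuntimeError as e:
                errors.append(str(e))
            try:
                sub["nested"]  # sub is already a file
            except RuntimeError as e:
                errors.append(str(e))
        text = (root / "sub.txt").read_text(encoding="utf-8")
    return text, errors
```

Calling `demo()` writes a single `uttidA some/where/a.wav` line and collects one error for each misuse, which mirrors the RuntimeError condition described above.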
close()
Close the data writer and all associated child writers.
This method is responsible for properly closing the current writer instance. If there are any child writers, it will recursively close each of them and will also check for key mismatches between siblings, issuing warnings if discrepancies are found. If the current writer has an open file descriptor, it will be closed as well.
Raises: RuntimeError – If there is an issue while closing the file descriptor or child writers.
####### Examples
>>> with DatadirWriter("output") as writer:
...     subwriter = writer["sub.txt"]
...     subwriter["uttidA"] = "some/where/a.wav"
...     # When exiting the context, close is called automatically.
NOTE
This method is automatically invoked when exiting the context manager. It is recommended not to call this method directly unless you are managing the writer lifecycle manually.
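The sibling key check that close() performs can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: `check_sibling_keys` is a hypothetical helper, each sibling is represented only by its set of written keys, and comparing each sibling with the one before it is an assumed strategy. The file names are ordinary Kaldi-style examples.

```python
import warnings
from typing import Dict, List, Set, Tuple


def check_sibling_keys(children: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Compare each sibling's keys with the previous sibling's; warn on mismatch."""
    mismatches = []
    prev_name = None
    for name, keys in children.items():
        if prev_name is not None and children[prev_name] != keys:
            warnings.warn(f"Ids are mismatching between {prev_name} and {name}")
            mismatches.append((prev_name, name))
        prev_name = name
    return mismatches


# Hypothetical sibling files and the utterance IDs written to each of them.
siblings = {
    "wav.scp": {"uttidA", "uttidB"},
    "text": {"uttidA"},  # uttidB is missing here
    "utt2spk": {"uttidA"},
}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = check_sibling_keys(siblings)
```

Here only the `wav.scp`/`text` pair differs, so a single warning is recorded. A mismatch produces a warning rather than an exception, so closing still completes.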