espnet2.gan_svs.avocodo.avocodo.MDCDConfig
class espnet2.gan_svs.avocodo.avocodo.MDCDConfig(h)
Bases: object
Configuration class for the Multi-band Discriminator (MDC).
This class holds the configuration parameters required for the Multi-band Discriminator, including PQMF parameters, filter sizes, kernel sizes, dilations, strides, band ranges, and segment size.
pqmf_params
Parameters for PQMF configuration.
- Type: List[Union[int, float]]
f_pqmf_params
Parameters for filtered PQMF configuration.
- Type: List[Union[int, float]]
filters
List of filter sizes for the discriminator.
- Type: List[List[int]]
kernel_sizes
List of kernel sizes for the convolutional layers.
- Type: List[List[int]]
dilations
List of dilations for the convolutional layers.
- Type: List[List[int]]
strides
List of strides for the convolutional layers.
- Type: List[List[int]]
band_ranges
List of ranges for each frequency band.
- Type: List[List[int]]
transpose
Indicates whether to transpose the input for each band.
- Type: List[bool]
segment_size
Size of segments to be processed.
- Type: int
Parameters: h (Dict[str, Any]) – A dictionary containing configuration parameters.
Examples
>>> from espnet2.gan_svs.avocodo.avocodo import MDCDConfig
>>> config = MDCDConfig({
... "pqmf_config": {"sbd": [16, 256, 0.03, 10.0],
... "fsbd": [64, 256, 0.1, 9.0]},
... "sbd_filters": [[64, 128, 256], [32, 64, 128]],
... "sbd_kernel_sizes": [[[3, 3, 3], [5, 5, 5]]],
... "sbd_dilations": [[[1, 2, 3], [1, 2, 3]]],
... "sbd_strides": [[1, 1, 1]],
... "sbd_band_ranges": [[0, 16]],
... "sbd_transpose": [False],
... "segment_size": 8192
... })
>>> print(config.filters)
[[64, 128, 256], [32, 64, 128]]
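The example above suggests how the keys of `h` relate to the attributes: the two PQMF entries appear to come from the nested `pqmf_config` dictionary, and the remaining attributes from the `sbd_`-prefixed keys plus `segment_size`. The following is a minimal sketch of that mapping, assuming it holds. `MDCDConfigSketch` is a hypothetical stand-in written for illustration, not the ESPnet class, and the key-to-attribute pairs are inferred from the example and the attribute names rather than confirmed against the source.

```python
from typing import Any, Dict


class MDCDConfigSketch:
    """Hypothetical stand-in for MDCDConfig; not the ESPnet implementation."""

    def __init__(self, h: Dict[str, Any]) -> None:
        # Assumed mapping, inferred from the example keys and attribute names.
        self.pqmf_params = h["pqmf_config"]["sbd"]
        self.f_pqmf_params = h["pqmf_config"]["fsbd"]
        self.filters = h["sbd_filters"]
        self.kernel_sizes = h["sbd_kernel_sizes"]
        self.dilations = h["sbd_dilations"]
        self.strides = h["sbd_strides"]
        self.band_ranges = h["sbd_band_ranges"]
        self.transpose = h["sbd_transpose"]
        self.segment_size = h["segment_size"]


# Same dictionary as in the example above.
h = {
    "pqmf_config": {
        "sbd": [16, 256, 0.03, 10.0],
        "fsbd": [64, 256, 0.1, 9.0],
    },
    "sbd_filters": [[64, 128, 256], [32, 64, 128]],
    "sbd_kernel_sizes": [[[3, 3, 3], [5, 5, 5]]],
    "sbd_dilations": [[[1, 2, 3], [1, 2, 3]]],
    "sbd_strides": [[1, 1, 1]],
    "sbd_band_ranges": [[0, 16]],
    "sbd_transpose": [False],
    "segment_size": 8192,
}

config = MDCDConfigSketch(h)
print(config.pqmf_params)   # [16, 256, 0.03, 10.0]
print(config.segment_size)  # 8192
```

Under this assumed mapping, plain dictionary indexing means a missing key raises `KeyError` at construction time, so an incomplete configuration would fail early instead of when the discriminator is built.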