> tts_models/pl/mai_female/vits is already downloaded.
> vocoder_models/universal/libri-tts/wavegrad is already downloaded.
> Using model: vits
> Setting up Audio Processor...
| > sample_rate:22050
| > resample:False
| > num_mels:80
| > log_func:np.log10
| > min_level_db:0
| > frame_shift_ms:None
| > frame_length_ms:None
| > ref_level_db:None
| > fft_size:1024
| > power:None
| > preemphasis:0.0
| > griffin_lim_iters:None
| > signal_norm:None
| > symmetric_norm:None
| > mel_fmin:0
| > mel_fmax:None
| > pitch_fmin:None
| > pitch_fmax:None
| > spec_gain:20.0
| > stft_pad_mode:reflect
| > max_norm:1.0
| > clip_norm:True
| > do_trim_silence:False
| > trim_db:60
| > do_sound_norm:False
| > do_amp_to_db_linear:True
| > do_amp_to_db_mel:True
| > do_rms_norm:False
| > db_level:None
| > stats_path:None
| > base:10
| > hop_length:256
| > win_length:1024
> initialization of speaker-embedding layers.
> initialization of language-embedding layers.
> Vocoder Model: wavegrad
> Text: Czesc, jestem syntezator mowy
> Text splitted to sentences.
['Czesc, jestem syntezator mowy']
Traceback (most recent call last):
  File "/usr/local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/TTS/bin/synthesize.py", line 439, in main
    reference_speaker_name=args.reference_speaker_idx,
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/synthesizer.py", line 393, in tts
    vocoder_input = self.vocoder_ap.normalize(mel_postnet_spec.T)
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/audio/processor.py", line 286, in normalize
    raise RuntimeError(" [!] Mean-Var stats does not match the given feature dimensions.")
RuntimeError: [!] Mean-Var stats does not match the given feature dimensions.
root@1affd8f1442e:/#
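As far as I can tell from the traceback, the error comes from a check that compares the feature dimension of the array passed to the vocoder's audio processor against its stored normalization statistics. Below is a minimal pure-Python sketch of that kind of check, only to illustrate my understanding. The function name `normalize_with_stats` and its parameters are hypothetical and this is not the actual code in `TTS/utils/audio/processor.py`; the default of 80 bands is taken from `num_mels:80` in the log above.

```python
def normalize_with_stats(spec, mean, std, num_mels=80):
    """Hypothetical helper, not part of the TTS library.

    Applies mean-variance normalization to a spectrogram given as a list
    of rows (one row per mel band). Raises if the number of rows does not
    match the number of stored statistics.
    """
    if len(spec) != num_mels or len(mean) != num_mels or len(std) != num_mels:
        raise RuntimeError(
            "Mean-Var stats do not match the given feature dimensions."
        )
    # Normalize each band with its own mean and standard deviation.
    return [
        [(value - m) / s for value in row]
        for row, m, s in zip(spec, mean, std)
    ]


if __name__ == "__main__":
    stats_mean = [0.0] * 80
    stats_std = [1.0] * 80
    # An input whose first dimension is not 80 triggers the same kind of error.
    wrong_shape = [[0.1, 0.2, 0.3]]
    try:
        normalize_with_stats(wrong_shape, stats_mean, stats_std)
    except RuntimeError as err:
        print("RuntimeError:", err)
```

If that reading is right, the array my command hands to the vocoder does not have the shape the vocoder's statistics expect, but I have not confirmed this against the library source.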
Command:
tts --text "Czesc, jestem syntezator mowy" --model_name "tts_models/pl/mai_female/vits" --vocoder_name "vocoder_models/universal/libri-tts/wavegrad" --out_path /output/
Environment:
My Python version is too new, so I run TTS in Docker. Here is my Dockerfile:
Dockerfile
output:
console output
nvidia-smi output
I followed the instructions from the README. Am I missing something?