TTS WebUI

Kokoro is a fast, lightweight TTS model with 82 million parameters. Despite its small size, its quality is comparable to that of larger models.

An expressive text-to-speech model with reference audio support for voice cloning.

Bark is a text-to-speech model.

Tortoise is a text-to-speech model.

Maha TTS is a text-to-speech model that supports many Indian languages.

A Fairseq-based text-to-speech model that supports 1000+ languages.

Multilingual TTS: natural, expressive speech synthesis in three languages (English, Chinese, and Japanese).