Local TTS model
Fish Speech
Fast, high-quality TTS with voice cloning and multilingual support. Optimized for real-time applications.
Apple Silicon ready
text-to-speech generation
10 languages
Apache 2.0
Quality: 9/10
Speed: 8.5/10
Model size: 3 GB
Voices: Unlimited cloning
Can Fish Speech run locally?
Fish Speech can generate speech locally for private voice workflows. Start with pip install fish-speech.
Listed license: Apache 2.0. Verify the upstream license and usage notes before shipping.
pip install fish-speech
streaming, realtime, cloning, multilingual
Audio profile
Best fit
Fish Speech is best for local voice cloning and expressive speech generation.
Hardware: GPU, Apple Silicon
Model details
Type: Local TTS model
Family: fish
Latency: ultra-low
Formats: pytorch, onnx
Languages: en, zh, ja, ko, fr, de, es, it, pt, ru
Context: Instant cloning
Install locally
01 Check runtime: Confirm the backend supports pytorch, onnx on your machine.
02 Install model: Use the upstream command or repository instructions.
03 Test locally: Run a short private audio prompt before moving into production workflows.
pip install fish-speech
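The "Check runtime" step can be approximated with a small probe before installing anything. The sketch below is not part of Fish Speech: `probe_runtime` is a hypothetical helper name, and it assumes the "onnx" format is served through the `onnxruntime` package, which this page does not confirm. It only reports whether the packages are importable and which accelerator PyTorch can see.

```python
import importlib.util


def probe_runtime():
    """Report which listed backends are importable and which device PyTorch sees.

    Hypothetical helper for illustration; not a Fish Speech API.
    """
    report = {
        "pytorch": importlib.util.find_spec("torch") is not None,
        # Assumption: ONNX inference would go through onnxruntime.
        "onnx": importlib.util.find_spec("onnxruntime") is not None,
        "device": "cpu",
    }
    if report["pytorch"]:
        try:
            import torch
        except ImportError:
            # The package is present but cannot be loaded (broken install).
            report["pytorch"] = False
            return report
        if torch.cuda.is_available():
            report["device"] = "cuda"
        elif getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
            # Metal Performance Shaders backend on Apple Silicon.
            report["device"] = "mps"
    return report


if __name__ == "__main__":
    print(probe_runtime())
```

A result of `"device": "cpu"` does not rule out running the model; it only means no GPU or Apple Silicon accelerator was detected by PyTorch.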
Good for
- text-to-speech generation
- Apple Silicon ready local workflows
- streaming, realtime, cloning
Watch before shipping
- Validate pronunciation, latency and artifacts with your own voice samples.
- Review the upstream license and acceptable-use notes.
- Benchmark on your target CPU, Apple Silicon or GPU setup.
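One way to approach the latency and benchmarking checks above is to measure the real-time factor (synthesis time divided by the duration of the audio produced) on the target machine. This is a minimal sketch under stated assumptions: `benchmark` and `fake_synthesize` are hypothetical names, and the `synthesize(text)` callable returning `(samples, sample_rate)` is an assumed interface, not the Fish Speech API. Replace the placeholder with a wrapper around whatever inference call the upstream repository documents.

```python
import statistics
import time


def benchmark(synthesize, text, runs=5):
    """Time a synthesize(text) callable that returns (samples, sample_rate).

    Returns the median wall-clock latency and the median real-time factor.
    A real-time factor below 1.0 means audio is generated faster than it plays.
    """
    latencies, rtfs = [], []
    for _ in range(runs):
        start = time.perf_counter()
        samples, sample_rate = synthesize(text)
        elapsed = time.perf_counter() - start
        audio_seconds = len(samples) / sample_rate
        if audio_seconds == 0:
            raise ValueError("synthesize returned no audio")
        latencies.append(elapsed)
        rtfs.append(elapsed / audio_seconds)
    return {
        "median_latency_s": statistics.median(latencies),
        "median_rtf": statistics.median(rtfs),
    }


def fake_synthesize(text):
    # Placeholder standing in for a real model call:
    # returns 50 ms of silence per character at 16 kHz.
    sample_rate = 16000
    return [0.0] * (len(text) * 800), sample_rate


if __name__ == "__main__":
    print(benchmark(fake_synthesize, "A short private test prompt."))
```

Timing numbers only cover speed. Pronunciation and artifacts still need listening tests with your own voice samples.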
Related TTS and speech models
Speech Research (SWivid)
F5-TTS v1.1
Local TTS model · Q 9.5 · Speed 9.2
Alibaba FunAudioLLM
CosyVoice 2
Local TTS model · Q 9.3 · Speed 8.8
OpenBMB
VoxCPM2
Local TTS model · Q 9.4 · Speed 8.3
OpenMOSS / MOSI.AI
MOSS-TTS-Nano
Local TTS model · Q 8.5 · Speed 9.7
hexgrad
Kokoro TTS
Local TTS model · Q 9.2 · Speed 9.8
Speech Research
F5-TTS
Local TTS model · Q 9.4 · Speed 9
Amphion Team
MaskGCT
Local TTS model · Q 9.4 · Speed 9
Alibaba Cloud (Qwen Team)
Qwen3 TTS
Local TTS model · Q 9.5 · Speed 8.5