Local TTS model
Higgs Audio v2
State-of-the-art expressive TTS built on an LLM backbone with an audio tokenizer. It generates natural multi-speaker dialogue, spontaneous laughter, whispers, and even background music. It is reported to beat ElevenLabs on MOS naturalness in several languages.
GPU recommended
text-to-speech generation
8 languages
Apache 2.0
Quality
9.7/10
Speed
7/10
Model size
5.5 GB
Voices
Zero-shot cloning + multi-speaker
Can Higgs Audio v2 run locally?
Higgs Audio v2 can generate speech locally for private voice workflows. Start with pip install higgs-audio.
Apache 2.0 license. Still verify upstream usage notes before shipping.
Tags: emotion, dialogue, cloning, streaming, multilingual
Audio profile
Best fit
Higgs Audio v2 is best for local voice cloning and expressive speech generation.
Hardware: GPU, Apple Silicon
Model details
Type
Local TTS model
Family
higgs
Latency
low
Formats
pytorch, safetensors
Languages
en, zh, fr, de, es, it, ja, ko
Context
LLM backbone with audio tokenizer, 3B params
Install locally
01 Check runtime: Confirm the backend supports pytorch and safetensors on your machine.
02 Install model: Use the upstream command or repository instructions.
03 Test locally: Run a short private audio prompt before moving into production workflows.
pip install higgs-audio
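The runtime check in step 01 can be sketched in a few lines of Python. This is a minimal sketch, not part of the upstream project: the helper name check_runtime is hypothetical, and it only checks that the torch and safetensors packages are importable and which PyTorch device is visible. It does not verify that Higgs Audio v2 itself will load.

```python
import importlib.util


def check_runtime(packages=("torch", "safetensors")):
    """Report which packages are importable, plus the best visible torch device.

    Hypothetical helper for illustration; not part of any upstream package.
    """
    # find_spec returns None for a top-level package that is not installed.
    report = {name: importlib.util.find_spec(name) is not None for name in packages}

    device = "cpu"
    if report.get("torch"):
        try:
            import torch

            if torch.cuda.is_available():
                device = "cuda"
            elif (
                getattr(torch.backends, "mps", None) is not None
                and torch.backends.mps.is_available()
            ):
                device = "mps"  # Apple Silicon GPU backend
        except ImportError:
            # Package is present but broken; treat it as unavailable.
            report["torch"] = False
    report["device"] = device
    return report


if __name__ == "__main__":
    for key, value in check_runtime().items():
        print(f"{key}: {value}")
```

If the report shows device: cpu on a machine that has a GPU, the PyTorch build is likely the CPU-only variant, which is worth fixing before installing the model.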
Good for
- text-to-speech generation
- local workflows where a GPU is available (recommended)
- emotion, dialogue, cloning
Watch before shipping
- Validate pronunciation, latency and artifacts with your own voice samples.
- Review the upstream license and acceptable-use notes.
- Benchmark on your target CPU, Apple Silicon or GPU setup.
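The benchmarking item above can be approached with a small timing harness. The sketch below is an illustration under stated assumptions: benchmark and fake_synthesize are hypothetical names, the synthesize callable stands in for whatever function wraps the model on your machine, and the 24 kHz sample rate is a placeholder rather than a confirmed property of Higgs Audio v2. It reports median wall-clock latency and the real-time factor (processing time divided by audio duration, where values below 1.0 mean faster than real time).

```python
import statistics
import time


def benchmark(synthesize, text, sample_rate, runs=3):
    """Time synthesize(text) and return median latency and real-time factor.

    synthesize is any callable that returns a sequence of audio samples.
    """
    latencies, rtfs = [], []
    for _ in range(runs):
        start = time.perf_counter()
        samples = synthesize(text)
        elapsed = time.perf_counter() - start

        if len(samples) == 0:
            raise ValueError("synthesize returned no audio samples")
        audio_seconds = len(samples) / sample_rate

        latencies.append(elapsed)
        rtfs.append(elapsed / audio_seconds)

    return {
        "median_latency_s": statistics.median(latencies),
        "median_rtf": statistics.median(rtfs),
    }


def fake_synthesize(text):
    # Placeholder for a real model call: 1200 silent samples per character,
    # which is 0.05 s per character at the assumed 24 kHz rate.
    return [0.0] * (len(text) * 1200)


if __name__ == "__main__":
    print(benchmark(fake_synthesize, "Hello from a local TTS test.", sample_rate=24000))
```

Swap fake_synthesize for your own wrapper and run it with the same prompts on each target device so the numbers are comparable. Discarding the first run is often sensible, since it may include model loading or kernel warm-up.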
Related TTS and speech models
StepFun
Step-Audio 2 Mini
Local TTS model · Q 9.3 · Speed 7.5
Alibaba FunAudioLLM
CosyVoice 2
Local TTS model · Q 9.3 · Speed 8.8
OpenBMB
VoxCPM2
Local TTS model · Q 9.4 · Speed 8.3
Nari Labs
Dia
Local TTS model · Q 9.3 · Speed 7
Microsoft Research
VibeVoice 1.5B
Local TTS model · Q 9.4 · Speed 6.5
Speech Research (SWivid)
F5-TTS v1.1
Local TTS model · Q 9.5 · Speed 9.2
Alibaba Cloud (Qwen Team)
Qwen3 TTS
Local TTS model · Q 9.5 · Speed 8.5
Zyphra
Zonos v0.1
Local TTS model · Q 9.5 · Speed 8.5