Local TTS model
GPT-SoVITS
Zero-shot voice-cloning TTS that combines GPT and SoVITS. It can clone a voice from about 5 seconds of reference audio and is widely used in the open-source community, with 40K+ GitHub stars.
Apple Silicon ready
text-to-speech generation
5 languages
MIT
Quality: 9.1/10
Speed: 7/10
Model size: 2 GB
Voices: Zero-shot cloning from 5 s of audio
Can GPT-SoVITS run locally?
GPT-SoVITS can generate speech locally for private voice workflows. Start with git clone https://github.com/RVC-Boss/GPT-SoVITS.
MIT license. Still verify upstream usage notes before shipping.
Upstream source
cloning, multilingual, emotion
Audio profile
Best fit
GPT-SoVITS is best for local voice cloning and expressive speech generation.
Hardware: GPU, Apple Silicon
Model details
Type: Local TTS model
Family: gptsovits
Latency: medium
Formats: PyTorch
Languages: en, zh, ja, ko, yue
Context: GPT + SoVITS hybrid
Install locally
01 Check runtime: Confirm the backend supports PyTorch on your machine.
02 Install model: Use the upstream command or repository instructions: git clone https://github.com/RVC-Boss/GPT-SoVITS
03 Test locally: Run a short private audio prompt before moving into production workflows.
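The runtime check in the first install step can be sketched in Python. This is a minimal sketch, assuming PyTorch is installed under its usual module name `torch`; the helper name `pick_device` is hypothetical and is not part of GPT-SoVITS. It only reports which PyTorch backend is usable (CUDA GPU, Apple Silicon MPS, or CPU) and does not load the model.

```python
import importlib.util
from typing import Optional


def pick_device() -> Optional[str]:
    """Return "cuda", "mps", or "cpu", or None if PyTorch is not installed.

    Hypothetical helper for illustration; not part of GPT-SoVITS.
    """
    # Avoid an ImportError so the check also works on a fresh machine.
    if importlib.util.find_spec("torch") is None:
        return None

    import torch

    # NVIDIA GPU path.
    if torch.cuda.is_available():
        return "cuda"

    # Apple Silicon path (the MPS backend exists in recent PyTorch releases).
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    # PyTorch is present but no accelerator was detected.
    return "cpu"


if __name__ == "__main__":
    device = pick_device()
    if device is None:
        print("PyTorch is not installed; install it before setting up the model.")
    else:
        print(f"PyTorch is available; preferred device: {device}")
```

A result of `cpu` does not mean the model cannot run, only that no GPU or Apple Silicon backend was detected, so expect slower generation.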
Good for
- text-to-speech generation
- Apple Silicon-ready local workflows
- cloning, multilingual, emotion
Watch before shipping
- Validate pronunciation, latency and artifacts with your own voice samples.
- Review the upstream license and acceptable-use notes.
- Benchmark on your target CPU, Apple Silicon or GPU setup.
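The benchmarking advice above can be sketched as a small timing harness. This is a minimal sketch under stated assumptions: `synthesize` is a hypothetical wrapper you would write around your own GPT-SoVITS setup (it is not an upstream API), and it is assumed to return the duration of the generated audio in seconds. `fake_synthesize` is a stand-in so the sketch runs without the model.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(synthesize: Callable[[str], float], text: str, runs: int = 5) -> Dict[str, float]:
    """Time a TTS callable and report latency and real-time factor.

    `synthesize` is a hypothetical wrapper around your own setup; it must
    return the length of the generated audio in seconds.
    """
    latencies = []
    audio_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        audio_seconds = synthesize(text)
        latencies.append(time.perf_counter() - start)

    median = statistics.median(latencies)
    return {
        "median_latency_s": median,
        "max_latency_s": max(latencies),
        # A real-time factor below 1 means audio is produced faster than it plays back.
        "real_time_factor": median / audio_seconds if audio_seconds > 0 else float("inf"),
    }


def fake_synthesize(text: str) -> float:
    """Stand-in for a real model call so the sketch runs anywhere."""
    time.sleep(0.01)  # pretend inference takes about 10 ms
    return 2.0        # pretend the clip is 2 seconds long


if __name__ == "__main__":
    report = benchmark(fake_synthesize, "A short private test sentence.", runs=3)
    for name, value in report.items():
        print(f"{name}: {value:.4f}")
```

Run the same harness on each target machine (CPU, Apple Silicon, GPU) with your own voice samples and sentences, since the numbers depend heavily on hardware and text length.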
Related TTS and speech models
Alibaba FunAudioLLM
CosyVoice 2
Local TTS model · Q 9.3 · Speed 8.8
OpenBMB
VoxCPM2
Local TTS model · Q 9.4 · Speed 8.3
Boson AI
Higgs Audio v2
Local TTS model · Q 9.7 · Speed 7
StepFun
Step-Audio 2 Mini
Local TTS model · Q 9.3 · Speed 7.5
Coqui Community
XTTS v3 (Community)
Local TTS model · Q 9.1 · Speed 7
Speech Research (SWivid)
F5-TTS v1.1
Local TTS model · Q 9.5 · Speed 9.2
Alibaba Cloud (Qwen Team)
Qwen3 TTS
Local TTS model · Q 9.5 · Speed 8.5
Zyphra
Zonos v0.1
Local TTS model · Q 9.5 · Speed 8.5