Local TTS model

XTTS v3 (Community)

Community-maintained successor to XTTS v2, continued after Coqui shut down. It offers more stable cloning, fewer artifacts, and support for 20 languages. It is intended as a drop-in replacement for existing XTTS pipelines, with better prosody.

Apple Silicon ready, text-to-speech generation, 20 languages, MPL 2.0
Quality: 9.1/10
Speed: 7/10
Model size: 2 GB
Voices: Unlimited via cloning (6s sample)
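
If the "6s sample" figure above is read as a rough minimum length for the reference clip (an assumption; the card does not say whether it is a minimum or a recommendation), a quick pre-check of the recording can be sketched with the Python standard library. The function names and the threshold constant are illustrative, and the check only handles PCM WAV files.

```python
import wave

# Assumption: treats the card's "6s sample" as a rough minimum duration.
MIN_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """True if the reference clip is at least `min_seconds` long."""
    return wav_duration_seconds(path) >= min_seconds
```

Calling `is_long_enough("my_voice.wav")` on a hypothetical recording gives a simple yes/no before the clip is handed to any cloning step.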

Can XTTS v3 (Community) run locally?

XTTS v3 (Community) can generate speech locally for private voice workflows. Start with pip install TTS-community.

MPL 2.0 license. Review upstream restrictions before commercial use.

cloning, multilingual, emotion

Audio profile

Quality: 9.1
Speed: 7
Local: 8.4

Best fit

XTTS v3 (Community) is best for local voice cloning and expressive speech generation.

Hardware: gpu, apple

Model details

Type: Local TTS model
Family: coqui
Latency: low
Formats: pytorch, onnx
Languages: en, es, fr, de, it, pt, pl, tr, ru, nl, cs, ar, zh, ja, hu, ko, hi, ro, sv, uk
Context: Community fork - active maintenance

Install locally

01 Check runtime: Confirm that a PyTorch or ONNX backend is available on your machine.
02 Install model: Use the upstream command or repository instructions.
03 Test locally: Run a short private audio prompt before moving into production workflows.

pip install TTS-community
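
The runtime check in step 01 can be sketched roughly as follows. This is a minimal sketch, not an official installer check: it only tests whether the relevant Python modules can be found and which device PyTorch reports. The module name `onnxruntime` is an assumption, since the card lists "onnx" without naming a runtime, and the helper names are illustrative.

```python
import importlib.util

# Import names for the two formats listed on this card.
# Assumption: "onnx" is served by the onnxruntime package.
BACKENDS = {"pytorch": "torch", "onnx": "onnxruntime"}


def available_backends(backends=BACKENDS):
    """Map each backend label to True if its module can be found."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in backends.items()
    }


def pick_device():
    """Return "cuda", "mps", or "cpu", using PyTorch if it is installed."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    for label, found in available_backends().items():
        print(f"{label}: {'found' if found else 'missing'}")
    print(f"device: {pick_device()}")
```

Running the script before installing the model shows whether either backend is present and whether a GPU or Apple Silicon device is visible to PyTorch.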

Good for

  • text-to-speech generation
  • Apple Silicon ready local workflows
  • cloning, multilingual, emotion

Watch before shipping

  • Validate pronunciation, latency and artifacts with your own voice samples.
  • Review the upstream license and acceptable-use notes.
  • Benchmark on your target CPU, Apple Silicon or GPU setup.
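
The benchmarking advice above can be sketched as a small real-time-factor harness (synthesis time divided by the duration of the audio produced; values below 1 mean faster than real time). The `synthesize` callable is hypothetical: it stands for whatever function wraps the model on your setup and is assumed to return `(samples, sample_rate)`. `fake_synthesize` is a stand-in so the sketch runs without a model, and its sample rate is arbitrary.

```python
import time
from statistics import median


def real_time_factor(synthesize, text, runs=3):
    """Median of (wall-clock synthesis time / audio duration) over `runs` calls.

    `synthesize` is a hypothetical callable: text -> (samples, sample_rate).
    """
    factors = []
    for _ in range(runs):
        start = time.perf_counter()
        samples, sample_rate = synthesize(text)
        elapsed = time.perf_counter() - start
        audio_seconds = len(samples) / sample_rate
        if audio_seconds <= 0:
            raise ValueError("synthesize returned no audio")
        factors.append(elapsed / audio_seconds)
    return median(factors)


def fake_synthesize(text):
    # Stand-in for a real model call: one second of silence.
    sample_rate = 24000  # arbitrary value chosen for the sketch
    return [0.0] * sample_rate, sample_rate


if __name__ == "__main__":
    rtf = real_time_factor(fake_synthesize, "A short private test sentence.")
    print(f"real-time factor: {rtf:.4f}")
```

Swapping `fake_synthesize` for a real wrapper and running the same text on CPU, Apple Silicon, and GPU gives comparable numbers for each target. Taking the median over a few runs reduces the effect of one-off warm-up costs.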
