Local TTS model

CosyVoice 2

Q: Can CosyVoice 2 run locally?

CosyVoice 2 is listed by LocalClaw as a local TTS option. Hardware fit depends on runtime, model size and backend support.

Industrial-grade multilingual TTS with streaming, voice cloning and emotion control. Exceptional Chinese + English quality. Used in production at Alibaba scale.

Apple Silicon ready, text-to-speech generation, 8 languages, Apache 2.0

Quality

9.3/10

Speed

8.8/10

Model size

2.4 GB

Voices

Zero-shot + cross-lingual cloning

Can CosyVoice 2 run locally?

CosyVoice 2 can generate speech locally for private voice workflows. Start with pip install cosyvoice.

Apache 2.0 license. Still verify upstream usage notes before shipping.

streaming, realtime, cloning, emotion, multilingual

Audio profile

Quality

9.3

Speed

8.8

Local

8.9

Best fit

CosyVoice 2 is best for local voice cloning and expressive speech generation.

Hardware: GPU, Apple Silicon

Model details

Type

Local TTS model

Family

cosyvoice

Latency

ultra-low

Formats

pytorch, onnx

Languages

en, zh, ja, ko, yue, fr, de, es

Context

Instruct mode with natural language

Install locally

Check runtime: Confirm the backend supports pytorch, onnx on your machine.

Install model: Use the upstream command or repository instructions.

Test locally: Run a short private audio prompt before moving into production workflows.

pip install cosyvoice
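
The "Check runtime" step can be sketched in a few lines of Python. This is a minimal sketch, not part of CosyVoice or LocalClaw: it assumes the two listed formats map to the `torch` and `onnxruntime` Python packages, and the helper names `find_backends` and `pick_torch_device` are hypothetical. It only reports what is importable and which device PyTorch sees; it does not load the model.

```python
import importlib.util
import platform


def find_backends(candidates=("torch", "onnxruntime")):
    """Map each package name to True if it can be imported on this machine.

    The default names are an assumption about which packages back the
    'pytorch' and 'onnx' formats listed on this page.
    """
    return {name: importlib.util.find_spec(name) is not None for name in candidates}


def pick_torch_device():
    """Return 'cuda', 'mps' or 'cpu' as reported by PyTorch, or None if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend (Apple Silicon) only exists in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("machine:", platform.machine())
    print("backends:", find_backends())
    print("torch device:", pick_torch_device())
```

If neither backend is reported as available, install the runtime first and follow the upstream repository instructions before attempting a test prompt.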

Good for

text-to-speech generation
Apple Silicon ready local workflows
streaming, realtime, cloning

Watch before shipping

Validate pronunciation, latency and artifacts with your own voice samples.
Review the upstream license and acceptable-use notes.
Benchmark on your target CPU, Apple Silicon or GPU setup.
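
One way to approach the benchmarking item above is to time repeated synthesis calls and compute a real-time factor (processing time divided by audio duration; below 1.0 means faster than real time). The sketch below is an illustration under stated assumptions: `synthesize` is a hypothetical callable that takes text and returns a sequence of audio samples, and `fake_synthesize` is a stand-in so the script runs without any model installed. Swap in your own wrapper around the actual TTS call and its real sample rate.

```python
import statistics
import time


def benchmark(synthesize, text, sample_rate, runs=5):
    """Time synthesize(text) over several runs.

    synthesize is a hypothetical callable returning a sequence of samples;
    sample_rate is the number of samples per second of the returned audio.
    """
    latencies = []
    rtfs = []
    for _ in range(runs):
        start = time.perf_counter()
        samples = synthesize(text)
        elapsed = time.perf_counter() - start
        audio_seconds = len(samples) / sample_rate
        latencies.append(elapsed)
        # Real-time factor: seconds of compute per second of generated audio.
        rtfs.append(elapsed / audio_seconds if audio_seconds > 0 else float("inf"))
    return {
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
        "median_rtf": statistics.median(rtfs),
    }


def fake_synthesize(text):
    """Stand-in for a real TTS call: 0.1 s of silence per character at 16 kHz."""
    time.sleep(0.001)
    return [0.0] * (1600 * len(text))


if __name__ == "__main__":
    print(benchmark(fake_synthesize, "hello", sample_rate=16000, runs=3))
```

Run it with prompts and voice samples that resemble your production input, on each target machine, and compare the numbers rather than relying on the catalog scores.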
