Local TTS model

LLaSA 3B

Q: Can LLaSA 3B run locally?

LLaSA 3B is listed by LocalClaw as a local TTS option. Hardware fit depends on runtime, model size and backend support.

LLaMA-based TTS that generates speech by pure next-token prediction, with no separate decoder. Quality scales with compute: the 3B variant is reported to match specialised state-of-the-art TTS systems on zero-shot cloning. Trained on 250K hours of Chinese and English speech.

GPU recommended, text-to-speech generation, 2 languages, CC-BY-NC 4.0


Quality

9.2/10

Speed

7/10

Model size

6.2 GB

Voices

Zero-shot cloning from any reference

Can LLaSA 3B run locally?

LLaSA 3B can generate speech locally for private voice workflows. Start with pip install llasa.

CC-BY-NC 4.0 license. Review upstream restrictions before commercial use.


cloning, streaming, realtime

Audio profile

Quality

9.2

Speed

Local

8.0

Best fit

LLaSA 3B is best for local voice cloning and expressive speech generation.

Hardware: GPU, Apple Silicon
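As a rough first check of hardware fit, the 6.2 GB model size listed on this page can be compared against the memory available on a given machine. The sketch below is illustrative only: the `fits_in_memory` helper and the 1.2 headroom factor are assumptions made for this example, not figures published for LLaSA 3B, and real usage depends on runtime, precision and backend.

```python
# Rough hardware-fit check based on the model size listed on this page.
MODEL_SIZE_GB = 6.2  # "Model size" as listed above


def fits_in_memory(available_gb: float,
                   model_gb: float = MODEL_SIZE_GB,
                   headroom: float = 1.2) -> bool:
    """Return True if memory covers the weights plus an assumed overhead.

    `headroom` is a hypothetical safety factor for activations and
    runtime buffers; tune it after measuring on your own setup.
    """
    return available_gb >= model_gb * headroom


# Example memory budgets (illustrative values, not recommendations).
for name, mem_gb in [("8 GB GPU", 8.0),
                     ("6 GB GPU", 6.0),
                     ("16 GB unified memory", 16.0)]:
    verdict = "likely fits" if fits_in_memory(mem_gb) else "likely too small"
    print(f"{name}: {verdict}")
```

Treat the result as a starting point only; measuring on the target machine is the reliable test.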

Model details

Type

Local TTS model

Family

llasa

Latency

low

Formats

pytorch, safetensors

Languages

en, zh

Context

LLaMA backbone + XCodec2 audio tokens

Install locally

Check runtime: Confirm the backend supports pytorch and safetensors on your machine.

Install model: Use the upstream command or repository instructions.

Test locally: Run a short private audio prompt before moving into production workflows.

pip install llasa
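The "Check runtime" step can be sketched in a few lines of Python. This is a minimal example and assumes the formats listed above map to the `torch` and `safetensors` Python packages. The helper names `missing_packages` and `pick_device` are made up for illustration and are not part of any LLaSA tooling.

```python
import importlib.util


def missing_packages(required=("torch", "safetensors")):
    """Return the names of required packages that cannot be imported."""
    return [name for name in required
            if importlib.util.find_spec(name) is None]


def pick_device():
    """Return 'cuda', 'mps' or 'cpu', or None if PyTorch is not installed."""
    if missing_packages(("torch",)):
        return None
    import torch  # imported lazily so the check works without PyTorch
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend (Apple Silicon) only exists in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("Runtime packages found; suggested device:", pick_device())
```

If the script reports `cpu` on a machine with a GPU, the installed PyTorch build probably lacks the matching backend.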

Good for

text-to-speech generation
GPU recommended local workflows
cloning, streaming, realtime

Watch before shipping

Validate pronunciation, latency and artifacts with your own voice samples.
Review the upstream license and acceptable-use notes.
Benchmark on your target CPU, Apple Silicon or GPU setup.
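One common way to benchmark latency is the real-time factor (RTF): time spent generating divided by the duration of the audio produced. The sketch below shows the measurement only. `fake_synthesize` is a placeholder standing in for whatever synthesis call your runtime exposes, and the 16 kHz sample rate is an assumption for this example, not a confirmed property of the model output.

```python
import time


def real_time_factor(synthesize, text, sample_rate):
    """Return generation time divided by audio duration for one prompt."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    if len(samples) == 0:
        raise ValueError("synthesize() returned no audio samples")
    audio_seconds = len(samples) / sample_rate
    return elapsed / audio_seconds


def fake_synthesize(text):
    """Placeholder: returns 0.1 s of silence per character at 16 kHz."""
    return [0.0] * (len(text) * 1600)


SAMPLE_RATE = 16000  # assumed for this sketch
rtf = real_time_factor(fake_synthesize, "Hello from a local TTS test.",
                       SAMPLE_RATE)
print(f"RTF: {rtf:.4f} (below 1.0 means faster than real time)")
```

To benchmark a real setup, replace `fake_synthesize` with your own synthesis function and run it on several prompts of different lengths, discarding the first warm-up run.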
