Local TTS model

MaskGCT

Q: Can MaskGCT run locally?

MaskGCT is listed by LocalClaw as a local TTS option. Hardware fit depends on runtime, model size and backend support.

Fully non-autoregressive TTS that needs no explicit text-to-phone alignment. The listing claims human parity on naturalness and speaker-similarity metrics, along with fast inference.

Apple Silicon ready, text-to-speech generation, 2 languages, MIT

Quality: 9.4/10

Speed: 9/10

Model size: 2.8 GB

Voices: Reference-based cloning

Can MaskGCT run locally?

MaskGCT can generate speech locally for private voice workflows. Start with pip install maskgct.

MIT license. Still verify upstream usage notes before shipping.

cloning, realtime, streaming

Audio profile

Quality

9.4

Speed

Local

9.1

Best fit

MaskGCT is best for local voice cloning and expressive speech generation.

Hardware: gpu, apple

Model details

Type: Local TTS model

Family: maskgct

Latency: ultra-low

Formats: pytorch, safetensors

Languages: en, zh

Context: Non-autoregressive, human parity

Install locally

Check runtime: Confirm the backend supports pytorch and safetensors on your machine.

Install model: Use the upstream command or repository instructions.

Test locally: Run a short private audio prompt before moving into production workflows.

pip install maskgct
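
The "Check runtime" step can be sketched in Python. This is a minimal sketch, not part of MaskGCT or LocalClaw: the helper name `check_runtime` is hypothetical, and it assumes the relevant packages are importable as `torch` and `safetensors`. It only reports what is installed and which device PyTorch would pick; it does not load the model.

```python
import importlib.util
import platform


def check_runtime(packages=("torch", "safetensors")):
    """Report which packages are installed and which device PyTorch would use.

    Hypothetical helper for illustration; not part of MaskGCT.
    """
    report = {"python": platform.python_version(), "machine": platform.machine()}
    for name in packages:
        # find_spec locates a top-level package without importing it
        report[name] = importlib.util.find_spec(name) is not None

    device = "unknown"
    if report.get("torch"):
        try:
            import torch

            # Older PyTorch builds have no MPS backend, so look it up defensively
            mps = getattr(torch.backends, "mps", None)
            if torch.cuda.is_available():
                device = "cuda"
            elif mps is not None and mps.is_available():
                device = "mps"
            else:
                device = "cpu"
        except Exception:
            # A broken install can fail at import time; leave the device unknown
            device = "unknown"
    report["device"] = device
    return report


if __name__ == "__main__":
    for key, value in check_runtime().items():
        print(f"{key}: {value}")
```

A result of `device: mps` would indicate an Apple Silicon GPU path and `device: cuda` an NVIDIA GPU path. Whether MaskGCT itself runs well on either still has to be confirmed against the upstream instructions.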

Good for

text-to-speech generation
Apple Silicon ready local workflows
cloning, realtime, streaming

Watch before shipping

Validate pronunciation, latency and artifacts with your own voice samples.
Review the upstream license and acceptable-use notes.
Benchmark on your target CPU, Apple Silicon or GPU setup.
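
One way to approach the latency and benchmarking checks above is a small timing harness. The sketch below assumes a hypothetical `synthesize` callable that takes text and returns a sequence of audio samples; that is a stand-in, not MaskGCT's actual API, and the sample rate is whatever your setup produces. It reports median latency and the real-time factor (generation time divided by audio duration).

```python
import statistics
import time


def benchmark(synthesize, text, sample_rate, runs=5, warmup=1):
    """Time a TTS callable and report latency and real-time factor (RTF).

    `synthesize` is a hypothetical stand-in: any callable that takes text and
    returns a sequence of audio samples. It is not MaskGCT's actual API.
    """
    for _ in range(warmup):
        synthesize(text)  # warm-up runs absorb one-time loading and caching

    latencies, durations = [], []
    for _ in range(runs):
        start = time.perf_counter()
        samples = synthesize(text)
        latencies.append(time.perf_counter() - start)
        durations.append(len(samples) / sample_rate)

    median_latency = statistics.median(latencies)
    median_duration = statistics.median(durations)
    return {
        "median_latency_s": median_latency,
        "audio_duration_s": median_duration,
        # RTF below 1.0 means audio is generated faster than it plays back
        "rtf": median_latency / median_duration if median_duration else float("inf"),
    }


if __name__ == "__main__":
    def fake_synthesize(text):
        # Placeholder that pretends to produce one second of silence at 24 kHz
        time.sleep(0.01)
        return [0.0] * 24000

    print(benchmark(fake_synthesize, "Hello from a local model.", sample_rate=24000))
```

Run it once per target machine (CPU, Apple Silicon, GPU) with the same text and reference clip so the numbers are comparable. On a GPU backend, make sure the callable returns only after the audio has actually been computed, otherwise the measured latency will be too low.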
