Local ASR model

Kyutai STT 2.6B

Q: Can Kyutai STT 2.6B run locally?

Kyutai STT 2.6B is listed by LocalClaw as a local ASR option. Hardware fit depends on runtime, model size and backend support.

Production-grade streaming ASR from Kyutai (makers of Moshi). Delay-streaming transformer with 500ms latency, word-level timestamps, speaker diarization. Top of Open ASR Leaderboard for real-time French + English.

Apple Silicon ready · speech-to-text transcription · 2 languages · CC-BY-4.0

Quality: 9.4/10
Speed: 9.5/10
Model size: 2.7 GB
Voices: N/A (ASR: outputs text + timestamps)

Can Kyutai STT 2.6B run locally?

Kyutai STT 2.6B can run locally for offline speech-to-text. Start with pip install moshi.

CC-BY-4.0 license. Review upstream restrictions before commercial use.

streaming · realtime · low-latency · multilingual

Audio profile

Quality: 9.4
Speed: 9.5
Local: 9.4

Best fit

Kyutai STT 2.6B is best for offline transcription, speech indexing and local voice pipelines.

Hardware: GPU, CPU, Apple Silicon
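For the speech-indexing use case, word-level timestamps can be turned into a simple lookup table. The following is a minimal sketch, not the output format of any particular runtime: the `Word` record and `build_index` helper are hypothetical names introduced for illustration, and the sample words are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """Hypothetical record for one transcribed word."""
    text: str
    start: float  # seconds from the start of the audio


def build_index(words):
    """Map each lowercased word to the list of times at which it was spoken."""
    index = defaultdict(list)
    for w in words:
        # Normalize case and strip surrounding punctuation so "world," matches "world".
        key = w.text.lower().strip(".,!?;:\"'")
        if key:
            index[key].append(w.start)
    return dict(index)


if __name__ == "__main__":
    # Made-up sample data standing in for real ASR output.
    sample = [Word("Hello", 0.42), Word("world,", 0.80), Word("hello", 2.10)]
    print(build_index(sample))
```

Looking up a word in the resulting dictionary gives the offsets to seek to in the original recording.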

Model details

Type: Local ASR model
Family: Kyutai
Latency: ultra-low
Formats: pytorch, safetensors, mlx
Languages: en, fr
Context: Delay-streaming transformer, 500 ms latency

Install locally

Check runtime: Confirm that your machine has a backend for one of the listed formats (pytorch, safetensors or mlx).

Install model: Use the upstream command or repository instructions.

Test locally: Transcribe a short private audio clip before moving into production workflows.

pip install moshi
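The runtime check can be scripted before installing anything large. This is a minimal sketch that only tests whether Python packages are importable; it assumes the import names `torch`, `safetensors` and `mlx` correspond to the formats listed on this page, and it does not verify that the model itself loads.

```python
import importlib.util

# Assumed import names for the formats listed on this page.
# "mlx" is only expected to be present on Apple Silicon machines.
BACKENDS = ("torch", "safetensors", "mlx")


def available_backends(names=BACKENDS):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in available_backends().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

A missing entry is not necessarily a problem: only the backend you plan to run on needs to be present.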

Good for

speech-to-text transcription
Apple Silicon ready local workflows
streaming, realtime, low-latency

Watch before shipping

Validate transcription accuracy, latency and timestamps with your own audio samples.
Review the upstream license and acceptable-use notes.
Benchmark on your target CPU, Apple Silicon or GPU setup.
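One common way to benchmark an ASR setup is the real-time factor: processing time divided by audio duration, where a value below 1.0 means the model keeps up with live audio. The sketch below assumes you wrap your own setup in a callable; `fake_transcribe` is a hypothetical stand-in and is not part of any real library.

```python
import time


def real_time_factor(transcribe, audio, audio_seconds, clock=time.perf_counter):
    """Return (processing_seconds / audio_seconds, transcript) for one call."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = clock()
    text = transcribe(audio)
    elapsed = clock() - start
    return elapsed / audio_seconds, text


def fake_transcribe(audio):
    """Hypothetical stand-in; replace with a call into your local ASR setup."""
    return "hello world"


if __name__ == "__main__":
    # One second of placeholder bytes; real audio would come from a file or microphone.
    rtf, text = real_time_factor(fake_transcribe, b"\x00" * 32000, audio_seconds=1.0)
    print(f"RTF={rtf:.4f} text={text!r}")
```

Run it several times on each target machine and compare the figures, since the first call often includes one-off model loading and warm-up costs.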

Related TTS and speech models

Alibaba Cloud (Qwen Team) Qwen3-ASR · Local ASR model · Q 9.5 · Speed 9
OpenAI Whisper v3 Turbo · Local ASR model · Q 9.1 · Speed 9.5
NVIDIA Parakeet TDT 0.6B v2 · Local ASR model · Q 9.4 · Speed 10
NVIDIA Canary 1B v2 · Local ASR model · Q 9.3 · Speed 9
IBM Granite Team Granite Speech 4.1 2B · Local ASR model · Q 9.2 · Speed 8
Microsoft Research VibeVoice ASR · Local ASR model · Q 9.3 · Speed 7.5
Cohere Cohere Transcribe 03-2026 · Local ASR model · Q 9 · Speed 8
hexgrad Kokoro TTS · Local TTS model · Q 9.2 · Speed 9.8