Local ASR model

Parakeet TDT 0.6B v2

NVIDIA's lightweight state-of-the-art ASR model: 0.6B parameters, listed as #1 on the Open ASR Leaderboard for English. TDT (Token-and-Duration Transducer) decoding makes it a reported 50× faster than Whisper Large v3 on GPU. Supports real-time streaming with word-level timestamps.
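
To illustrate why predicting durations can cut decoding work, here is a toy sketch of greedy Token-and-Duration decoding. It is not NeMo's implementation: `joint` is a hypothetical stand-in for the model's joint network, assumed to return a token id and a non-negative number of frames to advance, and `BLANK` and `max_symbols_per_frame` are illustrative names.

```python
from typing import Callable, List, Tuple

BLANK = 0  # assumed id of the blank symbol in this toy vocabulary

# Hypothetical joint network: (frame index, tokens so far) -> (token id, frames to advance)
JointFn = Callable[[int, List[int]], Tuple[int, int]]


def tdt_greedy_decode(
    num_frames: int,
    joint: JointFn,
    max_symbols_per_frame: int = 10,
) -> List[Tuple[int, int]]:
    """Return (token, frame) pairs, jumping ahead by the predicted duration."""
    emitted: List[Tuple[int, int]] = []
    t = 0
    symbols_on_frame = 0
    while t < num_frames:
        history = [tok for tok, _ in emitted]
        token, duration = joint(t, history)
        if token != BLANK:
            emitted.append((token, t))
            symbols_on_frame += 1
        # Force progress if a blank predicts zero duration, or if too many
        # symbols were emitted without leaving the current frame.
        stuck = token == BLANK or symbols_on_frame >= max_symbols_per_frame
        if duration == 0 and stuck:
            duration = 1
        if duration > 0:
            symbols_on_frame = 0
        t += duration  # frames in between are never scored
    return emitted
```

Because the loop advances by the predicted duration rather than one frame at a time, the joint network is evaluated on fewer frames than in a frame-by-frame transducer loop.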

Apple Silicon ready · speech-to-text transcription · 1 language · CC-BY-4.0

Quality: 9.4/10
Speed: 10/10
Model size: 1.1 GB
Voices: N/A (ASR: outputs text + word timestamps)

Can Parakeet TDT 0.6B v2 run locally?

Parakeet TDT 0.6B v2 can run locally for offline speech-to-text. Start with pip install nemo_toolkit[asr].
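
Once the toolkit is installed, a transcription call might look like the minimal sketch below. It assumes the checkpoint is published under the Hugging Face id `nvidia/parakeet-tdt-0.6b-v2` and that `transcribe(..., timestamps=True)` returns hypotheses with a `timestamp["word"]` list of `word`/`start`/`end` entries, as described upstream; verify both against your NeMo version. The helper names `format_words` and `transcribe_with_timestamps` are illustrative only.

```python
from typing import Dict, List

# Assumed Hugging Face model id; check the upstream model card.
MODEL_NAME = "nvidia/parakeet-tdt-0.6b-v2"


def format_words(words: List[Dict]) -> List[str]:
    """Render word-level timestamps as 'start - end word' lines."""
    return [f"{w['start']:.2f}s - {w['end']:.2f}s {w['word']}" for w in words]


def transcribe_with_timestamps(path: str) -> List[str]:
    """Load the model, transcribe one audio file, return formatted word timings."""
    # Imported lazily because NeMo is a heavy dependency.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=MODEL_NAME)
    result = model.transcribe([path], timestamps=True)[0]
    print(result.text)
    # Field names below are an assumption based on the upstream model card.
    return format_words(result.timestamp["word"])
```

Calling `transcribe_with_timestamps("sample.wav")` on a short mono WAV file (the file name is a placeholder) downloads the checkpoint on first use, so expect the first run to take longer.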

CC-BY-4.0 license. Review upstream restrictions before commercial use.

streaming, realtime, low-latency

Audio profile

Quality: 9.4
Speed: 10
Local: 9.8

Best fit

Parakeet TDT 0.6B v2 is best for offline transcription, speech indexing and local voice pipelines.

Hardware: GPU, CPU, Apple Silicon

Model details

Type: Local ASR model
Family: parakeet
Latency: ultra-low
Formats: nemo, onnx
Languages: en
Context: TDT decoder, RNN-T architecture, 0.6B params

Install locally

01 Check runtime: Confirm the backend supports the nemo or onnx format on your machine.
02 Install model: Use the upstream command or repository instructions.
03 Test locally: Run a short private audio clip before moving into production workflows.

pip install nemo_toolkit[asr]
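
The runtime check in step 01 can be sketched as a small script. This is one possible approach, not an official tool: it assumes `onnxruntime` is the package you would use for the onnx format and that PyTorch is the backend for the nemo format. A `True` value only means the package or device is visible to Python, not that the model is guaranteed to run well on it.

```python
import importlib.util
from typing import Dict


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def check_runtime() -> Dict[str, bool]:
    """Report which assumed backends and accelerators are visible."""
    report = {
        "nemo": module_available("nemo"),
        "onnxruntime": module_available("onnxruntime"),  # assumed ONNX backend
        "torch": module_available("torch"),
        "cuda": False,
        "mps": False,
    }
    if report["torch"]:
        try:
            import torch

            report["cuda"] = bool(torch.cuda.is_available())
            # torch.backends.mps is missing on older PyTorch builds.
            mps = getattr(torch.backends, "mps", None)
            report["mps"] = bool(mps is not None and mps.is_available())
        except ImportError:
            report["torch"] = False
    return report
```

Printing `check_runtime()` before installing the model gives a quick view of what is missing.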

Good for

  • speech-to-text transcription
  • Apple Silicon ready local workflows
  • streaming, realtime, low-latency

Watch before shipping

  • Validate transcription accuracy, latency and timestamp alignment with your own audio samples.
  • Review the upstream license and acceptable-use notes.
  • Benchmark on your target CPU, Apple Silicon or GPU setup.
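
For the benchmarking item above, one common measure is the real-time factor (processing time divided by audio duration; below 1.0 means faster than real time). The sketch below assumes uncompressed WAV input and takes any `transcribe` callable you supply; the function names are illustrative and not part of any library.

```python
import time
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Length of a WAV file in seconds, read from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(
    transcribe: Callable[[str], object],
    path: str,
    runs: int = 3,
) -> float:
    """Best-of-N processing time divided by audio duration."""
    duration = wav_duration_seconds(path)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(path)
        timings.append(time.perf_counter() - start)
    # Taking the minimum reduces noise from warm-up and background load.
    return min(timings) / duration
```

Pass in a function that wraps your loaded model, and run it once beforehand so that model loading time is not counted in the measurement.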
