Local ASR model

Canary 1B v2

NVIDIA multilingual ASR + speech translation in a single model. 25 European languages, bidirectional EN↔XX translation. Tops Open ASR Leaderboard multilingual category. Word-level timestamps, punctuation & capitalization.

Apple Silicon ready, speech-to-text transcription, 25 languages, CC-BY-4.0

Quality: 9.3/10
Speed: 9/10
Model size: 2 GB
Voices: N/A (ASR + translation: outputs text)

Can Canary 1B v2 run locally?

Canary 1B v2 can run locally for offline speech-to-text. Start with pip install "nemo_toolkit[asr]" (the quotes keep shells such as zsh, the macOS default, from expanding the square brackets).

CC-BY-4.0 license. Review upstream restrictions before commercial use.

streaming, multilingual, realtime

Audio profile

Quality: 9.3
Speed: 9
Local: 9.2

Best fit

Canary 1B v2 is best for offline transcription, speech indexing and local voice pipelines.
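
For speech indexing, word-level timestamps can be folded into a simple lookup table. The sketch below is illustrative only: it assumes the transcription step has already produced a list of dicts with "word" and "start" keys, which is a hypothetical shape and not a documented output format, so adapt the field names to whatever your backend returns.

```python
from collections import defaultdict


def build_word_index(words):
    """Map each lowercased word to the start times (seconds) where it occurs.

    `words` is assumed to be a list of dicts with "word" and "start" keys.
    This shape is hypothetical; adapt it to your backend's actual output.
    """
    index = defaultdict(list)
    for entry in words:
        # Strip surrounding punctuation so "Hello," and "hello" share a key.
        key = entry["word"].strip(".,!?;:\"'()").lower()
        if key:
            index[key].append(entry["start"])
    return dict(index)


# Hand-written sample data standing in for real word-level timestamps.
sample = [
    {"word": "Hello,", "start": 0.0, "end": 0.4},
    {"word": "world.", "start": 0.5, "end": 0.9},
    {"word": "hello", "start": 1.2, "end": 1.6},
]
index = build_word_index(sample)
print(index["hello"])  # [0.0, 1.2]
```

A lookup such as index["hello"] then gives the offsets to seek to in the source audio.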

Hardware: gpu, apple

Model details

Type: Local ASR model
Family: canary
Latency: low
Formats: nemo
Languages: en, de, es, fr, it, pt, pl, nl, sv, fi, da, cs, hu, ro, bg, hr, sk, sl, et, lv, lt, el, mt, ga
Context: ASR + speech translation, 25 languages, 1B params
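
The language list and the EN↔XX translation note can be turned into a small pre-flight check before audio is sent to the model. This is a sketch under stated assumptions: the code set is copied from the Languages field above as listed, and the helper name task_for is hypothetical, not part of any library.

```python
# Language codes copied from the "Languages" field of this listing.
SUPPORTED = {
    "en", "de", "es", "fr", "it", "pt", "pl", "nl", "sv", "fi", "da", "cs",
    "hu", "ro", "bg", "hr", "sk", "sl", "et", "lv", "lt", "el", "mt", "ga",
}


def task_for(source_lang, target_lang):
    """Classify a language pair as "asr" or "translation", or raise ValueError.

    Hypothetical helper: same language on both sides means transcription;
    translation is only accepted when English is on one side (EN<->XX).
    """
    for code in (source_lang, target_lang):
        if code not in SUPPORTED:
            raise ValueError(f"unsupported language code: {code}")
    if source_lang == target_lang:
        return "asr"
    if "en" in (source_lang, target_lang):
        return "translation"
    raise ValueError("translation requires English as source or target")


print(task_for("de", "de"))  # asr
print(task_for("fr", "en"))  # translation
```

Rejecting a pair such as de to fr up front is cheaper than discovering the limitation after loading a 2 GB model.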

Install locally

01 Check runtime: Confirm the backend supports nemo on your machine.
02 Install model: Use the upstream command or repository instructions.
03 Test locally: Run a short private audio prompt before moving into production workflows.
pip install "nemo_toolkit[asr]"
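
Steps 01 and 03 can be sketched in Python as follows. Treat this as a starting point and not the upstream procedure: the model identifier "nvidia/canary-1b-v2" and the source_lang / target_lang keyword arguments are assumptions based on the upstream model card and do not appear in this listing, so check them against the repository instructions. The NeMo import is deferred so that the runtime check works even when NeMo is missing.

```python
import importlib.util


def nemo_available():
    """Step 01: return True if the `nemo` package is importable here."""
    return importlib.util.find_spec("nemo") is not None


def transcribe_sample(path, source_lang="en", target_lang="en"):
    """Step 03: transcribe one short audio file and return its text."""
    if not nemo_available():
        raise RuntimeError(
            'NeMo is not installed; run: pip install "nemo_toolkit[asr]"'
        )
    # Imported lazily so the check above can run without NeMo present.
    from nemo.collections.asr.models import ASRModel

    # Assumption: model id and keyword arguments follow the upstream model card.
    model = ASRModel.from_pretrained(model_name="nvidia/canary-1b-v2")
    outputs = model.transcribe(
        [path], source_lang=source_lang, target_lang=target_lang
    )
    return outputs[0].text


print("NeMo available:", nemo_available())
```

Once the check prints True, calling transcribe_sample on a short private WAV file gives a first end-to-end test; the first call downloads the model weights.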

Good for

  • speech-to-text transcription
  • Apple Silicon ready local workflows
  • streaming, multilingual, realtime

Watch before shipping

  • Validate transcription accuracy, latency and recognition errors with your own audio samples.
  • Review the upstream license and acceptable-use notes.
  • Benchmark on your target CPU, Apple Silicon or GPU setup.
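
One common way to benchmark on target hardware is the real-time factor (RTF): processing time divided by audio duration, where values below 1 mean faster than real time. The sketch below is backend-agnostic; real_time_factor and the stand-in fake_transcribe are hypothetical names, and in practice you would pass your own transcription callable and the true clip length.

```python
import time


def real_time_factor(transcribe_fn, audio_path, audio_seconds, runs=3):
    """Return the best-of-`runs` processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe_fn(audio_path)
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to warm-up and background noise.
    return min(timings) / audio_seconds


def fake_transcribe(path):
    """Stand-in for a real model call; it only sleeps briefly."""
    time.sleep(0.01)
    return ""


rtf = real_time_factor(fake_transcribe, "sample.wav", audio_seconds=10.0)
print(f"RTF: {rtf:.4f}")
```

Run it separately on each CPU, Apple Silicon or GPU target, with clips of the length you expect in production, since RTF can vary with clip duration and batch size.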
