Open-weight local LLM

Command R (35B)

Cohere optimized for RAG and long-context. 128K context. Excellent retrieval-augmented generation. 354K downloads.

32 GB power user 24 GB RAM Q4_K_M RAG applications
Parameters
35B
Minimum RAM
24 GB
Model size
20 GB
Quantization
Q4_K_M

Can Command R (35B) run locally?

Command R (35B) belongs on 32 GB machines when you want stronger quality without jumping to server hardware.

Search for command-r-35b in LM Studio or another GGUF-compatible runtime.

chatgeneralpower

Install path

01
Check RAM fitMinimum 24 GB RAM. Start with the Q4_K_M quant.
02
Load the modelSearch command-r-35b in LM Studio.
03
Control locallyUse LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.

Strengths

  • 128K context
  • Optimized for RAG
  • 10 languages
  • Excellent retrieval-augmented generation

Limitations

  • Non-commercial license
  • Needs 24GB+ RAM
  • Superseded by Command A

Best use cases

  • RAG applications
  • Long document QA
  • Multilingual search
  • Enterprise chatbot

Capability profile

speed
4
quality
8
coding
7
reasoning
8

Technical notes

Developer
Cohere
License
CC-BY-NC 4.0
Context window
131,072 tokens
Architecture
Transformer optimized for RAG

This model fits these next steps

Hardware fit is based on LocalClaw's RAM tier, model size and quantization metadata. Always leave memory headroom for your OS and runtime.

Similar models to compare

Where to go next