Open-weight local LLM
Qwen 3.5 MoE (122B/10B active)
Large MoE model with only 10B active params. 60% cheaper to run than Qwen3-Max. 256K context. Top-tier reasoning, coding and multilingual. Hybrid think/non-think. Apache 2.0.
Large-memory workstation
80 GB RAM
Q4_K_M
Maximum quality AI tasks on local hardware
Parameters
122B (10B active)
Minimum RAM
80 GB
Model size
65 GB
Quantization
Q4_K_M
Can Qwen 3.5 MoE (122B/10B active) run locally?
Qwen 3.5 MoE (122B/10B active) needs a serious workstation with large unified memory or high VRAM.
Search for qwen3.5-122b-a10b in LM Studio or another GGUF-compatible runtime.
lmstudio-community/Qwen3.5-122B-A10B-GGUFchatcodereasoningqualitypower
Install path
01
Check RAM fitMinimum 80 GB RAM. Start with the Q4_K_M quant.02
Load the modelSearch qwen3.5-122b-a10b in LM Studio.03
Control locallyUse LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.Strengths
- 122B total params with only 10B active — 60% cheaper to run than Qwen3-Max
- 256K context window
- Top-tier reasoning, coding and multilingual quality
- Hybrid thinking mode
- Strong code generation rivaling specialized code models
- Apache 2.0 fully commercial
Limitations
- Requires ~80GB RAM (multi-GPU or Mac Pro/Studio Ultra)
- MoE loading overhead
- Files are 65GB+ even quantized
- Primarily for enthusiasts with serious hardware
Best use cases
- Maximum quality AI tasks on local hardware
- Complex multi-step reasoning chains
- Enterprise-grade code generation
- Large codebase analysis (256K context)
- Multilingual professional tasks
- Research requiring frontier-level quality
Capability profile
Technical notes
This model fits these next steps
Hardware fit is based on LocalClaw's RAM tier, model size and quantization metadata. Always leave memory headroom for your OS and runtime.