Open-weight local LLM

Qwen 3 MoE (235B/22B active)

Mixture of Experts behemoth. Only 22B params active at once = fast despite massive size. Top-tier.

Large-memory workstation 96 GB RAM Q4_K_M Maximum quality AI
Parameters
235B (22B active)
Minimum RAM
96 GB
Model size
80 GB
Quantization
Q4_K_M

Can Qwen 3 MoE (235B/22B active) run locally?

Qwen 3 MoE (235B/22B active) needs a serious workstation with large unified memory or high VRAM.

Search for qwen3-235b-a22b in LM Studio or another GGUF-compatible runtime.

chatcodereasoningquality

Install path

01
Check RAM fitMinimum 96 GB RAM. Start with the Q4_K_M quant.
02
Load the modelSearch qwen3-235b-a22b in LM Studio.
03
Control locallyUse LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.

Strengths

  • Only 22B active despite 235B total
  • Fast for its power level
  • Apache 2.0
  • Top-tier quality

Limitations

  • Requires 96GB+ RAM
  • Complex MoE deployment
  • Very large files

Best use cases

  • Maximum quality AI
  • Enterprise deployment
  • Research
  • Complex reasoning

Capability profile

speed
3
quality
10
coding
10
reasoning
10

Technical notes

Developer
Alibaba Cloud (Qwen Team)
License
Apache 2.0
Context window
131,072 tokens
Architecture
Mixture of Experts — 235B total, 22B active per token

This model fits these next steps

Hardware fit is based on LocalClaw's RAM tier, model size and quantization metadata. Always leave memory headroom for your OS and runtime.

Similar models to compare

Where to go next