Open-weight MoE

DeepSeek V3.1 (671B MoE)

Hybrid thinking/non-thinking model. The full 671B-parameter MoE targets maximum quality, with about 37B parameters active per token at inference. A significant step up from V3.0. Requires server-grade hardware. MIT licensed.

Server-grade | 512 GB RAM | Q4_K_M | Maximum quality outputs
Parameters: 671B (37B active, MoE)
Minimum RAM: 512 GB
Model size: 360 GB
Quantization: Q4_K_M
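
The listed model size and parameter count can be sanity-checked with a little arithmetic. The sketch below estimates the average number of bits stored per weight. It assumes 1 GB = 10^9 bytes and that the 360 GB figure covers all 671B weights; the function name is illustrative and not part of any tool.

```python
def bits_per_weight(model_size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 10**9 bytes."""
    total_bits = model_size_gb * 1e9 * 8
    total_params = params_billion * 1e9
    return total_bits / total_params


# Figures from this page: 360 GB model size, 671B total parameters.
bpw = bits_per_weight(360, 671)
print(f"{bpw:.2f} bits per weight")  # prints "4.29 bits per weight"
```

A result a little above 4 bits per weight is in the range one would expect from a 4-bit quantization.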

Can DeepSeek V3.1 (671B MoE) run locally?

DeepSeek V3.1 (671B MoE) needs server-grade hardware to run locally. Keep it as a comparison point unless you have very large unified memory, multiple GPUs, or access to remote inference.

Search for deepseek-v3.1 in LM Studio or another GGUF-compatible runtime.

chat, reasoning, quality

Install path

01 Check RAM fit: Minimum 512 GB RAM. Start with the Q4_K_M quant.
02 Load the model: Search for deepseek-v3.1 in LM Studio.
03 Control locally: Use LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.
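
The RAM check in step 01 can be sketched as a small calculation. This is a minimal illustration, not LocalClaw's actual logic: the function name and the 32 GB default headroom are assumptions chosen for the example, and the right headroom depends on your OS, runtime and context length.

```python
def ram_fit(total_ram_gb: float, model_size_gb: float, headroom_gb: float = 32.0) -> dict:
    """Report whether a model fits in RAM with headroom left over.

    headroom_gb is a hypothetical default for the OS and runtime,
    not a value published by LocalClaw.
    """
    free_after_load = total_ram_gb - model_size_gb
    return {
        "free_after_load_gb": free_after_load,
        "fits": free_after_load >= headroom_gb,
    }


# Figures from this page: 512 GB minimum RAM, 360 GB model at Q4_K_M.
print(ram_fit(512, 360))  # 152 GB left after loading, fits
print(ram_fit(384, 360))  # 24 GB left after loading, below the assumed headroom
```

With the page's figures, a 512 GB machine keeps 152 GB free after loading the weights, while a 384 GB machine would technically hold the file but leave too little room under this assumed headroom.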

Strengths

  • Hybrid thinking/non-thinking mode
  • Only 37B active parameters despite 671B total
  • Top-tier quality
  • Among the strongest open-weight models released to date

Limitations

  • Requires 512 GB+ RAM for the full model
  • Server-grade hardware only
  • Complex setup

Best use cases

  • Maximum quality outputs
  • Research
  • Enterprise deployment
  • Frontier AI tasks

Capability profile

speed: 1
quality: 10
coding: 10
reasoning: 10

Technical notes

Developer: DeepSeek AI
License: MIT
Context window: 131,072 tokens
Architecture: Mixture of Experts (MoE), 671B total, ~37B active
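
To put the architecture figures in proportion, the short sketch below computes the share of weights active per token and restates the context window in units of 1,024 tokens. The constants come from this page; the variable and function names are illustrative.

```python
TOTAL_PARAMS_B = 671       # total parameters, in billions
ACTIVE_PARAMS_B = 37       # approximate parameters active per token, in billions
CONTEXT_TOKENS = 131_072   # context window from the technical notes


def active_fraction(active_b: float, total_b: float) -> float:
    """Fraction of the model's weights used for each token."""
    return active_b / total_b


frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {frac:.1%} of weights")            # 5.5%
print(f"Context window: {CONTEXT_TOKENS // 1024}K tokens")   # 128K
```

Only about 5.5% of the weights are used for any one token, but all 671B still have to be loaded, which is why the RAM requirement tracks the total model size and not the active size.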


Hardware fit is based on LocalClaw's RAM tier, model size and quantization metadata. Always leave memory headroom for your OS and runtime.
