Open-weight local LLM

LFM2.5-8B-A1B

Liquid AI hybrid model built for on-device assistants. 8.3B total / 1.5B active, 128K context, tool use, GGUF, ONNX, MLX, llama.cpp and LM Studio support. Open-weight under LFM 1.0.

Laptop ready 8 GB RAM Q4_K_M On-device personal assistant
Parameters
8.3B (1.5B active)
Minimum RAM
8 GB
Model size
5.2 GB
Quantization
Q4_K_M

Can LFM2.5-8B-A1B run locally?

LFM2.5-8B-A1B is a good fit for normal laptops and compact desktops with 8 GB RAM or more.

Search for lfm2.5-8b-a1b in LM Studio or another GGUF-compatible runtime.

chatcodereasoningspeedstandardgeneral

Install path

01
Check RAM fitMinimum 8 GB RAM. Start with the Q4_K_M quant.
02
Load the modelSearch lfm2.5-8b-a1b in LM Studio.
03
Control locallyUse LocalClaw to manage models, agents, chat, channels and scheduled OpenClaw work.

Strengths

  • Designed specifically for on-device personal assistants and local agent workflows
  • Only 1.5B active parameters at inference despite 8.3B total parameters
  • 128K context window for long local sessions and document-heavy prompts
  • Day-one GGUF, ONNX, MLX, llama.cpp and LM Studio support
  • Strong fit for structured outputs, tool use and lightweight agentic tasks
  • Runs on mainstream 8-16 GB machines with quantized weights

Limitations

  • LFM 1.0 is a custom open-weight license, not Apache 2.0
  • Liquid AI notes it is not the best fit for heavy programming or knowledge-heavy QA without retrieval
  • Hybrid architecture may need recent runtimes for best performance
  • Still a small active-parameter model; larger 20B-30B class models can beat it on raw quality

Best use cases

  • On-device personal assistant
  • Local OpenClaw agents with tool calls
  • Structured output workflows
  • Fast multilingual chat on laptops
  • Long-context local note and document workflows
  • Apple Silicon inference through MLX

Capability profile

speed
9
quality
8
coding
8
reasoning
8

Technical notes

Developer
Liquid AI
License
LFM 1.0
Context window
128,000 tokens
Architecture
Hybrid Liquid Foundation Model with 8.3B total parameters and 1.5B active parameters. The model mixes double-gated LIV convolution layers with grouped-query attention.

This model fits these next steps

Hardware fit is based on LocalClaw's RAM tier, model size and quantization metadata. Always leave memory headroom for your OS and runtime.

Similar models to compare

Where to go next