Issue · August 2026

Small open LLMs, ranked by what you'll actually do with them.

Curated picks for chat, code, agents, RAG, vision, and on-device. Plus fine-tuning notes.


Best small models, by task

General chat · 13 ranked

All-purpose models for chatbots and assistants. Instruction-following and license trump benchmarks.

Top pick · Qwen3.6-27B

Coding · 9 ranked

Code completion, generation, and review. Small specialists now beat generalist 70Bs.

Top pick · Qwen3-Coder-Next

Reasoning · 8 ranked

Models that think before answering. Small specialists nearly match frontier-scale models on math and logic.

Top pick · Phi-4 Reasoning 14B

RAG / long context · 9 ranked

Models that hold up when stuffed with retrieved context. The dominant small-LLM use case in production.

Top pick · Qwen3.6-27B

Agents & function calling · 9 ranked

Models that emit clean tool calls and recover from errors gracefully.

Top pick · Qwen3-Coder-Next

On-device / mobile · 5 ranked

Small enough to run on a phone, laptop, or embedded device. The tier to watch is 1-4B effective parameters.

Top pick · Phi-4-mini 3.8B

Vision-language · 6 ranked

Models that take images alongside text. Native multimodal pretraining is the 2026 default.

Top pick · Qwen2.5-VL 7B Instruct

Multilingual · 6 ranked

Models that work outside English without falling off a cliff. Tokenizer choice matters most.

Top pick · Gemma 4 31B

Structured extraction · 5 ranked

Models that turn messy text into clean JSON. Half of production LLM workloads come down to this.

Top pick · Qwen3.5-9B

Recently tracked models

Stats refreshed 3m ago from Hugging Face.