-
Qwen3.6-27B
- 27.0B · Apache 2.0
Flagship-level coding in a 27B dense footprint. SWE-Bench Verified 77.2%, Terminal-Bench 2.0 59.3% (matches Claude Opus 4.5). 262K native context, multimodal, Apache 2.0.
-
Gemma 4 31B
- 31.0B · Apache 2.0
31B dense, Apache 2.0, 256K context, multimodal. AIME 2026 89.2%, Codeforces Elo 2150: leads open dense models in its size class for math and competitive programming. Bridges 'serious work' and 'fits on a 24-48GB GPU'.
-
Gemma 4 E4B
- 4.0B · Apache 2.0
Native multimodal (text, image, video, audio) at edge sizes. Apache 2.0. ~4B effective inference footprint built to preserve RAM and battery on consumer devices.
-
Qwen3.5-9B
- 9.0B · Apache 2.0
Native multimodal at the 9B mark. 262K context (1M with YaRN). Apache 2.0. Early-fusion training rolls vision into the base model rather than bolting on a separate encoder.
-
Qwen3-Coder-Next
- 3.0B active (80B total) · Apache 2.0
MoE coder built for agentic workflows. 3B active / 80B total. >70% on SWE-Bench Verified with the SWE-Agent scaffold. 256K native context. Apache 2.0.
-
Nemotron 3 Nano 30B-A3B
- 3.5B active (30B total) · NVIDIA Nemotron Open Model License
Hybrid Mamba2-Transformer-MoE: 3.5B active out of 30B total, 256K default context (1M max). Trained from scratch on 25T tokens. Strong agentic and tool-calling post-training.
-
gpt-oss-20b
- 3.6B active (21B total) · Apache 2.0
OpenAI's small open-weight model. 21B total / 3.6B active MoE, runs in 16GB at MXFP4. Configurable reasoning effort (low/medium/high). Matches o3-mini on common reasoning evals.
-
Mistral Small 3.2 24B
- 24.0B · Apache 2.0
Apache 2.0 mid-size all-rounder. ~81% MMLU and 3x faster than Llama 3.3 70B at similar quality. 128K context. Vision support was added in the 3.x line.
-
Phi-4 Reasoning 14B
- 14.0B · MIT
Punches above its weight on reasoning. Beats DeepSeek-R1-Distill-Llama-70B on AIME and GPQA despite being 5x smaller. Comparable to full DeepSeek-R1 (671B) on AIME 2025. MIT license.
-
Qwen3-8B Instruct
- 8.2B · Apache 2.0
Strong all-rounder in the 7-8B class. Apache 2.0. 32K native context, 131K with YaRN. Hybrid 'thinking' mode you can toggle per request.
-
Phi-4-mini 3.8B
- 3.8B · MIT
MIT license, 67% MMLU at 3.8B. Inherits the Phi reasoning lineage in a small footprint. 128K context, 200K-token vocabulary for multilingual support. Function-calling support.
-
Qwen2.5-VL 7B Instruct
- 7.6B · Apache 2.0
Vision-language specialist at 7B. Beats Llama 3.2-Vision 11B on MMMU (58.6), MathVista (68.2), DocVQA (95.7). Apache 2.0. Supports variable resolution and aspect ratio, plus video frame input.
-
Llama 3.2 3B Instruct
- 3.2B · Llama 3.2 Community
Meta's mobile-targeted small model. Largest ecosystem in this size class. 128K context. Solid baseline for on-device assistants where ecosystem maturity matters.
-
Llama 3.1 8B Instruct
- 8.0B · Llama 3.1 Community
The ecosystem baseline. Largest community of fine-tunes, quantizations, and inference-engine support of any open small model. Predictable in production.
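
Several entries above make memory claims ('fits on a 24-48GB GPU', 'runs in 16GB at MXFP4'). A rough way to sanity-check such claims is to multiply the total parameter count by the bits stored per weight. The sketch below is an approximation under stated assumptions: it counts weights only (no KV cache, activations, or runtime overhead) and treats every weight as stored at one precision. The function and variable names are illustrative, not part of any library.

```python
# Rough weight-memory estimate: total parameters x bits per weight.
# Assumption: ignores KV cache, activations, and runtime overhead,
# and treats every weight as stored at the same precision.

def weight_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in decimal gigabytes."""
    return total_params_billion * bits_per_weight / 8

# MXFP4 stores 4-bit values plus one 8-bit shared scale per 32-value block,
# so roughly 4 + 8/32 = 4.25 bits per weight.
MXFP4_BITS = 4 + 8 / 32

# (name, total parameters in billions), taken from the list above.
MODELS = [
    ("gpt-oss-20b", 21.0),
    ("Qwen3.6-27B", 27.0),
    ("Gemma 4 31B", 31.0),
    ("Qwen3-Coder-Next", 80.0),
]

def report() -> list[tuple[str, float, float]]:
    """Return (name, GB at 16-bit, GB at MXFP4-like 4.25-bit) per model."""
    return [(n, weight_gb(p, 16), weight_gb(p, MXFP4_BITS)) for n, p in MODELS]

if __name__ == "__main__":
    for name, gb16, gb4 in report():
        print(f"{name:18s} 16-bit: {gb16:6.1f} GB   ~4-bit: {gb4:6.1f} GB")
```

For the MoE entries, the total parameter count (21B, 30B, 80B) is what drives memory; the active count (3-3.6B) mainly affects per-token compute. By this estimate gpt-oss-20b needs about 11 GB for weights at 4.25 bits, which is consistent with the 16GB claim once overhead is added. Real deployments may keep some tensors at higher precision, so actual figures are likely somewhat higher.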