Models / Gemma
Gemma 4 31B
Strengths
31B dense, Apache 2.0, 256K context, multimodal. AIME 2026 89.2%, Codeforces Elo 2150 - leads open dense models in its size class for math and competitive programming. Bridges 'serious work' and 'fits on a 24-48GB GPU'.
Weaknesses
Needs serious VRAM. Quantizes to fit on 24GB, but bf16 wants far more. Larger than 'small' depending on where you draw the line - included here as the largest dense Gemma 4 variant under 50B.
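The VRAM point can be sanity-checked with back-of-envelope math. A minimal sketch, assuming exactly 31 billion parameters and counting weights only (it ignores KV cache, activations, and framework overhead, which add several GB on top):

```python
def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight memory in GiB for a given parameter count and precision."""
    return params * bytes_per_param / 1024**3

PARAMS = 31e9  # assumption: 31B dense parameters

for name, bpp in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{name}: ~{weight_gb(PARAMS, bpp):.0f} GB")
# bf16 weights alone land near ~58 GB (multi-GPU or 80GB-class territory),
# int8 near ~29 GB, and int4 near ~14 GB - which is why 4-bit quants
# fit on a 24GB card with room left for the KV cache.
```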
Gemma 4 31B is the dense flagship of Google DeepMind's April 2026 release. It's the model to pick when you want frontier-adjacent capability under Apache 2.0 and can spare a 24-48GB GPU for it.
The 256K context is the practical highlight. Most open dense models in this class top out at 128K; doubling that without quality degrading matters for repo-scale code and long-document workflows.
When to pick it
- Strongest dense open model that fits on a single high-end GPU.
- Long context (256K) for repo-scale or document-scale tasks.
- License clarity: Apache 2.0 with no enterprise carve-outs.
When to skip it
- Hardware-constrained: Gemma 4 E4B or Qwen3-8B.
- Pure text reasoning: Phi-4 Reasoning 14B may match it at half the footprint.