Models / Gemma
Gemma 4 31B
Strengths
31B dense, Apache 2.0, 256K context, multimodal. AIME 2026 89.2%, Codeforces Elo 2150 - leads open dense models in its size class for math and competitive programming. Bridges 'serious work' and 'fits on a 24-48GB GPU'.
Weaknesses
Needs serious VRAM. Quantizes to fit on 24GB, but bf16 wants far more. Larger than 'small' depending on where you draw the line - included here as the largest dense Gemma 4 variant under 50B.
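The VRAM point can be sanity-checked with back-of-envelope math. A minimal sketch, assuming exactly 31 billion parameters and counting weights only (it ignores KV cache, activations, and framework overhead, which add several GB on top):

```python
def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight memory in GiB for a given parameter count and precision."""
    return params * bytes_per_param / 1024**3

PARAMS = 31e9  # assumption: 31B dense parameters

for name, bpp in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{name}: ~{weight_gb(PARAMS, bpp):.0f} GB")
# bf16 weights alone land near ~58 GB (multi-GPU or 80GB-class territory),
# int8 near ~29 GB, and int4 near ~14 GB - which is why 4-bit quants
# fit on a 24GB card with room left for the KV cache.
```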
Gemma 4 31B is the dense flagship of Google DeepMind's April 2026 release. It's the model to pick when you want frontier-adjacent capability under Apache 2.0 and can spare a 24-48GB GPU for it.
The 256K context is the practical highlight. Most open dense models in this class top out at 128K; doubling that without quality degrading matters for repo-scale code and long-document workflows.
When to pick it
- Strongest dense open model that fits on a single high-end GPU.
- Long context (256K) for repo-scale or document-scale tasks.
- License clarity: Apache 2.0 with no enterprise carve-outs.
When to skip it
- Hardware-constrained: Gemma 4 E4B or Qwen3-8B.
- Pure text reasoning: Phi-4 Reasoning 14B may match it at half the footprint.