
Gemma 4 E4B

google/gemma-4-e4b-it

Tags: general-chat · on-device · vision · multilingual · gpu-8gb · gpu-16gb · gpu-24gb · gpu-48gb · apple-silicon-16gb · apple-silicon-32gb · cpu-16gb · cpu-32gb · datacenter
Parameters: 4.0B
Family: Gemma
License: Apache 2.0
Context length: 131,072 tokens
Languages: en, multi
Modalities: text, image, video, audio
Released: 2026-04-02
HF downloads (30d): 5,214,452

Strengths

Native multimodality (text, image, video, audio) at an edge-friendly size. Apache 2.0 licensed. A ~4B effective inference footprint designed to conserve RAM and battery on consumer devices.

Weaknesses

The 4B class can't match 8B+ models on hard reasoning or long-form writing. Audio and video support is best-effort, not a substitute for purpose-built pipelines.

Gemma 4 E4B is Google's edge-tier release in the April 2026 Gemma 4 family. The "E" denotes effective parameters: architectural tricks keep the inference footprint at ~4B while drawing on more total parameters during training. Translation: better quality than a vanilla 4B for the same memory budget.
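As a back-of-envelope check on what "~4B effective" buys you, here is the weight-only memory math at common precisions. This ignores KV cache, activations, and runtime overhead, so real usage runs higher:

    # Weight-only memory estimate for a ~4B-parameter model.
    # KV cache, activations, and runtime overhead are excluded.
    PARAMS = 4.0e9  # effective inference-time parameter count from the card

    for name, bytes_per_param in [("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
        gib = PARAMS * bytes_per_param / 2**30
        print(f"{name:>9}: ~{gib:.2f} GiB of weights")

That works out to roughly 7.5 GiB at fp16, 3.7 GiB at int8, and under 2 GiB at int4, which is what makes the gpu-8gb and apple-silicon-16gb tags plausible.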

The headline change versus Gemma 3 is native multimodality: E4B accepts text, images, video, and audio out of the box rather than through bolted-on adapters.
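A minimal sketch of what that looks like in practice, assuming the checkpoint ships with standard transformers support. The AutoModelForImageTextToText class and the chat-template message format here follow how recent Gemma releases are served; treat them as assumptions, not confirmed API for this model:

    # Sketch: multimodal chat with the instruction-tuned checkpoint.
    # Assumes standard AutoProcessor / chat-template support in transformers;
    # the model class is an assumption based on recent Gemma releases.
    from transformers import AutoProcessor, AutoModelForImageTextToText

    model_id = "google/gemma-4-e4b-it"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

    messages = [{
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/screenshot.png"},
            {"type": "text", "text": "What error does this screenshot show?"},
        ],
    }]

    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to(model.device)

    out = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, not the prompt.
    print(processor.decode(out[0][inputs["input_ids"].shape[-1]:],
                           skip_special_tokens=True))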

When to pick it

  • On-device or edge assistants where every megabyte matters (see the 4-bit loading sketch after this list).
  • You need a small model that natively handles screenshots, photos, or short audio clips.
  • You want Apache 2.0 licensing with no commercial caveats.
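For the tightest budgets, a hypothetical 4-bit load via bitsandbytes. This path is CUDA-only; Apple-silicon and CPU deployments would typically run a GGUF or MLX conversion instead, and whether this checkpoint quantizes cleanly is an assumption:

    # Hypothetical 4-bit load for tight memory budgets (CUDA GPUs only).
    from transformers import AutoModelForImageTextToText, BitsAndBytesConfig

    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype="bfloat16",  # string form maps to torch.bfloat16
    )
    model = AutoModelForImageTextToText.from_pretrained(
        "google/gemma-4-e4b-it",
        quantization_config=quant,
        device_map="auto",
    )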

When to skip it

  • You have GPU room for an 8B+ model: quality scales, and Qwen3-8B will outperform on most text tasks.
  • Heavy multimodal needs (long video, complex visual reasoning): the 26B/31B Gemma 4 variants fit better.