Best small LLMs for multilingual

Models that work outside English without falling off a cliff. Tokenizer choice matters most.

"Multilingual" is a slippery label. Most LLMs claim it; the question is which languages, at what quality, and whether the tokenizer pays its rent for the languages you serve.

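Tokenizer cost is the underrated determinant. CJK-leaning tokenizers (Qwen) pack Chinese into roughly half the tokens that English-first models need for the same text. The reverse also holds: a tokenizer tuned for one script family tends to spend more tokens on the others. For multi-region products, where you pay and budget context per token, this drives economic decisions.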
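To make the cost argument concrete, here is a minimal sketch of how one might measure tokens-per-byte and turn two token counts into a relative cost figure. The `encode` argument is any callable that maps text to a sequence of tokens. The `toy_encode` function is a hypothetical stand-in (fixed 4-character chunks), not a real model tokenizer, so swap in the tokenizer you actually deploy before drawing conclusions.

```python
from typing import Callable, Sequence


def tokens_per_byte(encode: Callable[[str], Sequence], text: str) -> float:
    """Tokens emitted per UTF-8 byte of input; lower means cheaper text."""
    n_bytes = len(text.encode("utf-8"))
    if n_bytes == 0:
        raise ValueError("text must be non-empty")
    return len(encode(text)) / n_bytes


def relative_cost(tokens_a: int, tokens_b: int) -> float:
    """How many times more tokens tokenizer A spends than B on the same text."""
    if tokens_b <= 0:
        raise ValueError("tokens_b must be positive")
    return tokens_a / tokens_b


def toy_encode(text: str) -> list[str]:
    """Hypothetical stand-in tokenizer: 4-character chunks. Illustration only."""
    return [text[i:i + 4] for i in range(0, len(text), 4)]


if __name__ == "__main__":
    samples = {
        "en": "The quick brown fox jumps over the lazy dog.",
        "zh": "敏捷的棕色狐狸跳过了懒狗。",
    }
    for lang, text in samples.items():
        print(lang, round(tokens_per_byte(toy_encode, text), 3))
    # If tokenizer A needs 200 tokens where tokenizer B needs 100,
    # A costs twice as much per request at the same per-token price.
    print(relative_cost(200, 100))
```

Measuring per byte rather than per character keeps the comparison fair across scripts, since a CJK character usually occupies three UTF-8 bytes where a Latin letter occupies one.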

What we look for

  • Quality breadth on MMMLU and MGSM, watching the worst-performing languages, not just the average.
  • Tokenizer efficiency - tokens-per-byte for the languages you ship.
  • Code-switching - real users mix languages mid-message.
  • Localized refusals - safety tuning often skips non-English data.
  • Script coverage - RTL, Indic, CJK rendering all expose subtle issues.
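The first criterion, judging a model by its weakest language rather than its mean, can be sketched as below. The helper name `summarize_scores` is hypothetical, and the numbers in the example are made-up placeholders, not benchmark results.

```python
from statistics import mean


def summarize_scores(scores: dict[str, float]) -> dict[str, object]:
    """Report the average alongside the weakest language and the gap between them."""
    if not scores:
        raise ValueError("scores must not be empty")
    worst_lang = min(scores, key=scores.get)
    avg = mean(list(scores.values()))
    return {
        "average": avg,
        "worst_language": worst_lang,
        "worst_score": scores[worst_lang],
        "gap": avg - scores[worst_lang],
    }


if __name__ == "__main__":
    # Placeholder accuracies keyed by language code, for illustration only.
    placeholder = {"en": 0.80, "de": 0.75, "sw": 0.40, "ja": 0.65}
    print(summarize_scores(placeholder))
```

A large gap between the average and the worst score is the signal that an average alone would hide.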

Ranked for products serving 5+ languages.

Picks

  1. Gemma 4 31B · 31.0B · Apache 2.0

    31B dense, Apache 2.0, 256K context, multimodal. AIME 2026 89.2%, Codeforces Elo 2150 - leads open dense models in its size class for math and competitive programming. Bridges 'serious work' and 'fits on a 24-48GB GPU'.

  2. Gemma 4 E4B · 4.0B · Apache 2.0

    Native multimodal (text, image, video, audio) at edge sizes. Apache 2.0. ~4B effective inference footprint, built to preserve RAM and battery on consumer devices.

  3. Mistral Small 3.2 24B · 24.0B · Apache 2.0

    Apache 2.0 mid-size all-rounder. ~81% MMLU at 150 tokens/s, 3x faster than Llama 3.3 70B at similar quality. 128K context. Vision support was added in the 3.x line.

  4. Qwen3-8B Instruct · 8.2B · Apache 2.0

    Strong all-rounder in the 7-8B class. Apache 2.0. 32K native context, 131K with YaRN. Hybrid 'thinking' mode you can toggle per request.

  5. Qwen2.5-VL 7B Instruct · 7.6B · Apache 2.0

    Vision-language specialist at 7B. Beats Llama 3.2-Vision 11B on MMMU (58.6), MathVista (68.2), DocVQA (95.7). Apache 2.0. Supports variable resolutions and aspect ratios, plus video frames as input.
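The two context figures in the Qwen3-8B entry above are related by a YaRN scaling factor: 32,768 native positions times a factor of 4 gives 131,072. The sketch below shows that arithmetic and builds a `rope_scaling` fragment in the shape commonly seen in Hugging Face-style model configs. The exact keys are an assumption here, and the function names are hypothetical, so check the model card of the checkpoint you deploy before relying on them.

```python
def yarn_context(native_ctx: int, factor: float) -> int:
    """Extended context length implied by a YaRN scaling factor."""
    if native_ctx <= 0 or factor < 1:
        raise ValueError("native_ctx must be positive and factor at least 1")
    return int(native_ctx * factor)


def rope_scaling_config(native_ctx: int, factor: float) -> dict:
    """Config fragment for YaRN scaling. Key names are assumed, not verified here."""
    return {
        "rope_scaling": {
            "rope_type": "yarn",
            "factor": factor,
            "original_max_position_embeddings": native_ctx,
        }
    }


if __name__ == "__main__":
    print(yarn_context(32768, 4.0))
    print(rope_scaling_config(32768, 4.0))
```

Scaling like this is typically worth enabling only when prompts actually exceed the native window.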