Categories - AI Aggregator

All-purpose models for chatbots and assistants. Instruction-following and licensing matter more than benchmark scores.

Top pick · Qwen3.6-27B

Code completion, generation, and review. Small specialists now beat generalist 70Bs.

Top pick · Qwen3-Coder-Next

Models that think before answering. Small specialists nearly match frontier-scale models on math and logic.

Top pick · Phi-4 Reasoning 14B

Models that hold up when stuffed with retrieved context. The dominant small-LLM use case in production.

Top pick · Qwen3.6-27B

Models that emit clean tool calls and recover from errors gracefully.

Top pick · Qwen3-Coder-Next

Small enough to run on a phone, laptop, or embedded device. The 1-4B effective-parameter tier is the interesting one.

Top pick · Phi-4-mini 3.8B

Models that take images alongside text. Native multimodal pretraining is the 2026 default.

Top pick · Qwen2.5-VL 7B Instruct

Models that work outside English without falling off a cliff. Tokenizer choice matters most.

Top pick · Gemma 4 31B

Models that turn messy text into clean JSON. Half of production LLM workloads are this kind of extraction.

Top pick · Qwen3.5-9B