Qwen3.5: Difference between revisions

From Akripedia

Revision as of 09:03, 7 April 2026

Qwen 3.5
Developer: Alibaba Cloud
Release Date: February 15, 2026
Model Sizes: 0.8B, 2B, 4B, 9B, 27B (dense), 35B-A3B (MoE), 122B-A10B (MoE), 397B-A17B (MoE)
Architecture: Decoder-only Transformer
Modality: Image-Text-to-Text
Thinking: Yes (toggleable)
Context Length: 262,144 (up to 1M via API)
License: Apache 2.0
Languages: 201 languages and dialects
Hugging Face: Qwen 3.5
Paper: Link

Qwen3.5 is a series of open-weight, native vision-language foundation models developed by Alibaba and released on February 15, 2026.[1] It is built on a hybrid architecture that combines linear attention via Gated Delta Networks with sparse Mixture of Experts. The models support 201 languages and dialects, up from 119 in the earlier Qwen3 series.
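
The size labels in the infobox can be read programmatically. The following is a minimal sketch, assuming that the "A" suffix in a label such as 397B-A17B gives the number of parameters activated per token, as in the naming of earlier Qwen Mixture-of-Experts releases. The helper names `parse_size` and `active_share` are hypothetical, and the figures are the nominal, rounded values from the labels rather than exact parameter counts.

```python
import re

# Size labels copied from the infobox.
SIZES = ["0.8B", "2B", "4B", "9B", "27B", "35B-A3B", "122B-A10B", "397B-A17B"]

# Assumed label shape: "<total>B" for dense models,
# "<total>B-A<active>B" for Mixture-of-Experts models.
LABEL = re.compile(r"^(\d+(?:\.\d+)?)B(?:-A(\d+(?:\.\d+)?)B)?$")


def parse_size(label):
    """Return (total_billions, active_billions) for one size label."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized size label: {label!r}")
    total = float(match.group(1))
    # A label without an "A" part is treated as dense: active == total.
    active = float(match.group(2)) if match.group(2) else total
    return total, active


def active_share(label):
    """Fraction of the nominal parameter count that is active per token."""
    total, active = parse_size(label)
    return active / total


if __name__ == "__main__":
    for label in SIZES:
        total, active = parse_size(label)
        print(f"{label:>10}: {active:g}B of {total:g}B active ({active / total:.1%})")
```

Under this reading, the flagship activates only a small share of its nominal parameters per token, which is the point of a sparse Mixture of Experts.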

Benchmarks

Results for the flagship 397B-A17B model and the small 9B, 4B, and 2B models, with Claude Opus 4.6 shown for comparison. Entries marked -- have no reported result.

Benchmark           Category      397B-A17B  9B    4B    2B    Claude Opus 4.6
GPQA Diamond[2]     Science       89.3       80.6  77.1  --    89.6
SWE-bench Verified  Coding        76.4       --    --    --    80.8
MMMU-Pro            Multimodal    79.0       70.1  66.3  50.3  73.9
MMMLU               Multilingual  88.5       81.2  76.1  63.1  91.1
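
To make the comparison in the table easier to read, the gaps between a Qwen3.5 model and Claude Opus 4.6 can be computed directly. This is a small sketch using only the scores listed above. The `SCORES` layout and the `gap` helper are illustrative names, and `None` stands in for a "--" entry.

```python
# Scores copied from the table; None marks a "--" entry.
SCORES = {
    "GPQA Diamond": {"397B-A17B": 89.3, "9B": 80.6, "4B": 77.1, "2B": None, "Claude Opus 4.6": 89.6},
    "SWE-bench Verified": {"397B-A17B": 76.4, "9B": None, "4B": None, "2B": None, "Claude Opus 4.6": 80.8},
    "MMMU-Pro": {"397B-A17B": 79.0, "9B": 70.1, "4B": 66.3, "2B": 50.3, "Claude Opus 4.6": 73.9},
    "MMMLU": {"397B-A17B": 88.5, "9B": 81.2, "4B": 76.1, "2B": 63.1, "Claude Opus 4.6": 91.1},
}


def gap(benchmark, model, baseline="Claude Opus 4.6"):
    """Score difference in points (model minus baseline), or None if a score is missing."""
    row = SCORES[benchmark]
    if row[model] is None or row[baseline] is None:
        return None
    # Round to one decimal to match the precision of the table.
    return round(row[model] - row[baseline], 1)


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: {gap(name, '397B-A17B')}")
```

A positive value means the Qwen3.5 model scores higher than the baseline on that benchmark. On these numbers the flagship leads only on MMMU-Pro and trails on the other three.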


References

  1. Qwen3.5: Towards Native Multimodal Agents. Qwen Team, February 2026.
  2. GPQA Diamond Benchmark Leaderboard: Results. Artificial Analysis, April 2026.