
Qwen

Qwen dense, hybrid, and multimodal families with frequent open checkpoint releases.


Series: Qwen 3.6

Series: Qwen 3.5

Qwen 3.5 397B A17B

Largest Qwen3.5 release in the registry, combining the hybrid multimodal stack with a very large MoE parameter pool and a much smaller active token path.

397B total • 17B active • 262,144 context • 2 KV heads
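
The spec line above is the GPU-fit story in miniature: the 397B total drives weight storage, while the 17B active path drives per-token compute. A minimal sketch of the weight-memory side, with illustrative (not measured) bytes-per-parameter values for common quantizations:

```python
# Rough weight-memory estimate for an MoE checkpoint: storage scales with
# TOTAL parameters, while per-token compute scales with ACTIVE parameters.
def weight_gib(total_params_b: float, bytes_per_param: float) -> float:
    """GiB to hold the weights alone, ignoring KV cache and activations."""
    return total_params_b * 1e9 * bytes_per_param / 2**30

for label, bpp in [("bf16", 2.0), ("fp8", 1.0), ("int4", 0.5)]:
    print(f"397B total @ {label}: {weight_gib(397, bpp):,.0f} GiB")
# bf16 ~739 GiB, fp8 ~370 GiB, int4 ~185 GiB -- the 17B active figure
# governs per-token matmul cost, not how much memory the weights occupy.
```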

Qwen 3.5 122B A10B

High-capacity Qwen3.5 MoE release for users who want the family’s hybrid multimodal architecture at a much larger scale without paying dense 122B compute per token.

122B total • 10B active • 262,144 context • 2 KV heads

Qwen 3.5 35B A3B

Hybrid multimodal Qwen3.5 MoE checkpoint with a 35B total parameter pool and a much smaller active path for lower compute than a dense model of similar capacity.

35B total • 3B active • 262,144 context • 2 KV heads

Qwen 3.5 27B

Large dense Qwen3.5 release that keeps the hybrid multimodal stack but pushes into a much heavier single-model serving class than the 9B tier.

27B dense • 262,144 context • 4 KV heads

Qwen 3.5 9B

Largest Qwen3.5 release in this batch that still targets single-GPU text serving, pairing a 9B language model with a resident multimodal stack that brings the loaded footprint to roughly 10B parameters.

10B dense • 262,144 context • 4 KV heads
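
The context and KV-head figures on these cards feed directly into KV-cache sizing. A rough per-request estimate for this card's "262,144 context • 4 KV heads" line follows; head_dim and num_layers are not listed on the card, so the values below are assumptions chosen only to make the arithmetic concrete:

```python
# Per-request KV-cache estimate from "262,144 context • 4 KV heads".
# head_dim and num_layers are ASSUMED values, not taken from the card.
def kv_cache_gib(context: int, kv_heads: int, head_dim: int,
                 num_layers: int, bytes_per_elem: int = 2) -> float:
    # 2x for the separate K and V tensors stored per layer.
    return 2 * context * kv_heads * head_dim * num_layers * bytes_per_elem / 2**30

print(kv_cache_gib(context=262_144, kv_heads=4, head_dim=128, num_layers=40))
# -> 20.0 GiB at fp16 for one full-context request; halve for fp8 KV cache.
```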

Qwen 3.5 4B

Mid-sized Qwen3.5 checkpoint whose resident multimodal stack adds proportionally more to the 4B language model's footprint, yet remains practical for careful single-GPU text-only serving.

5B dense • 262,144 context • 4 KV heads

Qwen 3.5 2B

Small hybrid Qwen3.5 release for developers who want longer context and native multimodal training heritage without a large single-card footprint.

2B dense • 262,144 context • 2 KV heads

Qwen 3.5 0.8B

Compact Qwen3.5 checkpoint with a hybrid text-plus-vision stack and a small resident footprint for text-only local experimentation.

900M dense • 262,144 context • 2 KV heads

Series: Qwen 3

Qwen 3 235B A22B

Largest Qwen3 MoE release with 235B total parameters and 22B activated parameters, aimed at frontier-scale open reasoning and agent use.

235B total • 22B active • 131,072 context • 4 KV heads
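
At this scale the question shifts from single-GPU fit to node-level fit. A hedged back-of-envelope check, treating weights alone and ignoring KV cache, activations, and runtime overhead; the per-GPU memory and quantization numbers are illustrative:

```python
# Sketch: do the 235B-total weights fit across N identical GPUs under
# tensor/expert parallelism? Treat the answer as a floor, not a plan.
def fits(total_params_b: float, bytes_per_param: float,
         num_gpus: int, gpu_gib: float, headroom: float = 0.9) -> bool:
    weights_gib = total_params_b * 1e9 * bytes_per_param / 2**30
    return weights_gib <= num_gpus * gpu_gib * headroom

# e.g. fp8 weights on eight 80 GiB cards (illustrative numbers):
print(fits(235, 1.0, num_gpus=8, gpu_gib=80))  # True: ~219 GiB vs ~576 GiB
```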

Qwen 3 32B

Largest dense Qwen3 release for high-capacity reasoning, agent, and multilingual assistant workloads with switchable thinking modes.

32.8B dense • 131,072 context • 8 KV heads

Qwen 3 30B A3B

Qwen3 MoE release with 30.5B total parameters and 3.3B active parameters, built for lower active compute than a comparable dense model.

30.5B total • 3.3B active • 131,072 context • 4 KV heads

Qwen 3 30B A3B Instruct 2507

Non-thinking Qwen3 MoE update with stronger general capabilities, better alignment, and native 256K context support.

30.5B total • 3.3B active • 262,144 context • 4 KV heads
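
One way to exercise that native 256K context is vLLM's offline API. The repo id below is inferred from the card name and the context length from the spec line; verify both against the actual checkpoint before relying on this sketch:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3-30B-A3B-Instruct-2507",  # assumed HF repo id
    max_model_len=262_144,                     # native context per the card
)
out = llm.generate(["Summarize this document:"], SamplingParams(max_tokens=128))
print(out[0].outputs[0].text)
```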

Qwen 3 14B

Dense Qwen3 release for higher-capacity reasoning, agent, and multilingual assistant workloads with switchable thinking modes.

14.8B dense • 131,072 context • 8 KV heads

Qwen 3 8B

Dense Qwen3 release for stronger general-purpose reasoning, agent, and multilingual assistant use with switchable thinking modes.

8.2B dense • 131,072 context • 8 KV heads

Qwen 3 4B

Dense Qwen3 release with switchable thinking modes, stronger reasoning, and 131K extended-context support through YaRN.

4B dense • 131,072 context • 8 KV heads
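
The 131K figure on this card comes from YaRN RoPE scaling over a 32,768-token native window, per the description above. A sketch of the transformers-style rope_scaling override that Qwen3 model cards describe; the repo id and exact keys are assumptions to confirm against the checkpoint's config:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Qwen/Qwen3-4B")  # assumed HF repo id
cfg.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                              # 32,768 x 4 = 131,072
    "original_max_position_embeddings": 32_768,
}
cfg.max_position_embeddings = 131_072
```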

Qwen 3 4B Thinking 2507

Qwen3 update focused on deeper reasoning and longer native context, tuned specifically for more complex thinking-heavy workloads.

4B dense • 262,144 context • 8 KV heads

Qwen 3 1.7B

Small dense Qwen3 release for lightweight reasoning, agent, and multilingual assistant use with switchable thinking modes.

1.7B dense • 32,768 context • 8 KV heads

Qwen 3 0.6B

Smallest dense Qwen3 release with switchable thinking and non-thinking modes in a very light deployment footprint.

600M dense • 32,768 context • 8 KV heads

Series: Qwen 2.5