Model notes
Qwen2.5 14B
Mid-sized Qwen model with strong long-context behavior and a practical fit for 24 to 80 GB cards.
14.7B dense • 131,072 context • 8 KV heads
Architecture
Model spec
Architecture: dense transformer (grouped-query attention)
Total params: 14.7B
Active params: 14.7B (dense; all parameters active per token)
Layers: 48
Hidden size: 5,120
Attention heads: 40 (query)
KV heads: 8
KV-bearing layers: 48
Context length: 131,072 tokens
Modality: text in / text out
License: Apache 2.0
Research highlight
A scaled-up Qwen long-context stack built on grouped-query attention, pairing long-context behavior with the broad generality of a dense model.
Memory note
The jump from 7B to 14B is mostly resident weight memory; the KV cache stays comparatively controlled because grouped-query attention caches only 8 KV heads rather than all 40 query heads.
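The effect of the grouped KV heads can be sanity-checked with a back-of-the-envelope calculation. A minimal sketch, assuming the dimensions from the spec above (48 layers, 8 KV heads, head dim 5,120 / 40 = 128) and a BF16 cache:

```python
# Back-of-the-envelope KV-cache sizing for Qwen2.5-14B.
LAYERS = 48
KV_HEADS = 8            # grouped-query attention: 8 KV heads vs 40 query heads
HEAD_DIM = 5120 // 40   # hidden size / query heads = 128
BYTES = 2               # BF16

# K and V tensors per token, summed across all KV-bearing layers.
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES
print(f"{kv_per_token / 1024:.0f} KiB per token")        # 192 KiB

# Full 131,072-token context, batch size 1.
full_ctx = kv_per_token * 131_072
print(f"{full_ctx / 2**30:.0f} GiB at full context")     # 24 GiB
```

Without grouping (40 KV heads instead of 8), the same full-context cache would be 5x larger, which is what keeps the 7B-to-14B jump dominated by weights rather than cache.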
Checkpoints
Official profiles
Official BF16 checkpoint
The official Qwen2.5-14B-Instruct checkpoint repository is about 29.6 GB on Hugging Face.
Official GPTQ 4-bit checkpoint
The official Qwen2.5-14B-Instruct-GPTQ-Int4 checkpoint repository is about 10 GB on Hugging Face.
Official AWQ 4-bit checkpoint
The official Qwen2.5-14B-Instruct-AWQ checkpoint repository is about 10 GB on Hugging Face.
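The repository sizes above line up with a simple bytes-per-parameter estimate. A rough sketch, assuming 2 bytes/weight for BF16 and roughly 0.5 bytes/weight for 4-bit before quantization overhead:

```python
PARAMS = 14.7e9  # total parameters, dense

# BF16: 2 bytes per weight.
bf16_gb = PARAMS * 2 / 1e9
print(f"BF16: ~{bf16_gb:.1f} GB")   # ~29.4 GB, close to the 29.6 GB repo

# 4-bit: 0.5 bytes per weight, before overhead.
int4_gb = PARAMS * 0.5 / 1e9
print(f"Int4: ~{int4_gb:.1f} GB raw")  # ~7.4 GB
```

The raw 4-bit figure undershoots the ~10 GB repositories because GPTQ and AWQ checkpoints also store quantization scales and zero points and typically keep some tensors (such as embeddings) in higher precision.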
Sources