Model notes
Gemma 2 27B
Larger Gemma model that accepts a shorter native context window in exchange for more capacity per token.
27B dense • 8,192 context • 16 KV heads
Architecture

Model spec
Architecture: Dense decoder-only transformer
Total params: 27B
Active params: 27B (dense; all parameters active per token)
Layers: 46
Hidden size: 4,608
Attention heads: 32
KV heads: 16
KV-bearing layers: 46
Context length: 8,192 tokens
Modality: Text
License: Gemma terms
Why it matters

Research highlight: Scaled Gemma dense architecture with more capacity per token than the 9B variant.

Memory note: Because the context window is shorter, most VRAM pressure comes from resident weights rather than cache growth.
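The spec numbers above make this claim easy to sanity-check. A back-of-envelope sketch, assuming BF16 storage (2 bytes per element) and a head dimension derived as hidden size divided by attention heads (the released checkpoint may pin `head_dim` explicitly, so treat this as an estimate):

```python
# Rough VRAM estimate for Gemma 2 27B using the spec table above.
# Assumptions (not from the card): bf16 = 2 bytes/element,
# head_dim = hidden_size // num_attention_heads.

GIB = 1024 ** 3

total_params = 27e9
hidden_size = 4608
num_attention_heads = 32
num_kv_heads = 16
kv_layers = 46
context_len = 8192
bytes_per_elem = 2  # bfloat16

head_dim = hidden_size // num_attention_heads  # 144 under this assumption

# Resident weights dominate: every parameter is held in memory.
weights_bytes = total_params * bytes_per_elem

# KV cache: K and V tensors (factor 2) per KV head, per layer, per token.
kv_bytes_per_token = 2 * num_kv_heads * head_dim * kv_layers * bytes_per_elem
kv_bytes_full = kv_bytes_per_token * context_len

print(f"weights:  {weights_bytes / GIB:.1f} GiB")
print(f"KV cache: {kv_bytes_full / GIB:.2f} GiB at {context_len:,} tokens")
```

Even with the cache filled to the full 8,192-token window, it stays a small fraction of the weight footprint, which is why the memory note above attributes VRAM pressure to resident weights.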
Checkpoints

Official profiles

BF16 checkpoint: Google's official Gemma 2 27B Instruct release is exported in bfloat16.
Runtimes: vLLM • Transformers
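For the vLLM runtime listed above, serving the BF16 checkpoint is a one-line command. A minimal sketch, assuming the Hugging Face repo id `google/gemma-2-27b-it` (the card does not name the id) and standard vLLM flags:

```shell
# Hedged sketch: serve the official BF16 instruct checkpoint with vLLM.
# --max-model-len matches the 8,192-token native context from the spec.
vllm serve google/gemma-2-27b-it \
  --dtype bfloat16 \
  --max-model-len 8192
```

This is a configuration fragment, not a tuned deployment; adjust tensor parallelism and memory flags to your hardware.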