Model notes
GPT-OSS 20B
Smaller GPT-OSS reasoning checkpoint with a routed MoE stack, 128K context, and a relatively light active path.
21B total • 3.6B active • 128,000 context • 8 KV heads
Architecture
Model spec
Architecture: Mixture-of-experts transformer (routed experts, grouped-query attention, alternating full and sliding-window layers)
Total params: 21B
Active params: 3.6B
Layers: 24
Hidden size: 2880
Attention heads: 64
KV heads: 8
KV-bearing layers: 24
Context length: 128,000 tokens
Modality: Text
License: Apache 2.0
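To show how the attention rows of this spec translate into memory pressure, here is a minimal sketch of the KV-cache cost per token. The head dimension of 64 and the BF16 (2-byte) cache precision are assumptions not listed in the table, and the result ignores the savings from sliding-window layers, so treat it as an upper bound.

```python
# Rough KV-cache estimate from the spec above.
# head_dim = 64 and a BF16 (2-byte) cache are assumptions, not table values.
kv_layers = 24       # KV-bearing layers
kv_heads = 8         # KV heads (grouped-query attention)
head_dim = 64        # assumed head dimension
bytes_per_value = 2  # BF16 cache

# K and V each store kv_heads * head_dim values per layer per token.
# Sliding-window layers cap their cache at the window length, so this
# overstates the cost at long contexts.
kv_bytes_per_token = 2 * kv_layers * kv_heads * head_dim * bytes_per_value
print(f"{kv_bytes_per_token / 1024:.0f} KiB of KV cache per token")  # ~48 KiB
```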
Why it matters
Research highlight
Each MoE block has 32 experts with top-4 routing, and the stack alternates full and sliding-window attention to keep long-context reasoning practical.
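A minimal sketch of that routing step, assuming a standard top-k router: score all 32 experts per token, keep the best 4, and renormalize their weights before mixing expert outputs. The function name and shapes are illustrative, not GPT-OSS's actual implementation.

```python
import torch

def route_top4(router_logits: torch.Tensor, k: int = 4):
    """Pick the top-k experts per token and renormalize their weights.

    router_logits: [num_tokens, num_experts] scores from the router.
    Returns (weights, indices), each shaped [num_tokens, k].
    """
    topk_logits, topk_idx = router_logits.topk(k, dim=-1)  # keep the best k experts
    weights = torch.softmax(topk_logits, dim=-1)           # renormalize over the kept k
    return weights, topk_idx

# Toy usage: 5 tokens routed over 32 experts, 4 active per token.
logits = torch.randn(5, 32)
w, idx = route_top4(logits)
print(idx.shape, w.sum(dim=-1))  # torch.Size([5, 4]), weights sum to 1 per token
```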
Why memory behaves this way
Memory note
More than 90% of GPT-OSS 20B's parameters sit in MoE weights quantized to MXFP4, while the remaining shared weights stay in BF16.
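A back-of-the-envelope check on that split, assuming MXFP4 costs about 4.25 bits per weight (4-bit values plus per-block scales) and reading the "more than 90%" figure as roughly 0.9; both numbers are assumptions rather than published values.

```python
# Rough resident-size estimate for gpt-oss-20b under the mixed MXFP4 + BF16 layout.
# The MoE fraction (~0.9) and the ~4.25 bits/weight cost of MXFP4 are assumptions.
total_params = 21e9
moe_fraction = 0.90   # ">90%" of parameters live in MoE weights
mxfp4_bits = 4.25     # 4-bit values plus shared per-block scales
bf16_bits = 16

moe_bytes = total_params * moe_fraction * mxfp4_bits / 8
shared_bytes = total_params * (1 - moe_fraction) * bf16_bits / 8
total_gib = (moe_bytes + shared_bytes) / 2**30
print(f"~{total_gib:.1f} GiB resident")  # ~13.3 GiB, near the published 12.8 GiB
```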
Checkpoints
Official profiles
Mixed MXFP4 + BF16 checkpoint
BF16 checkpoint
OpenAI's GPT-OSS model card lists a 12.8 GiB checkpoint for gpt-oss-20b, and the estimator uses that published size directly as the resident weight footprint of the mixed MXFP4 + BF16 profile.
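To illustrate how that published figure feeds a simple memory estimate, the sketch below adds a BF16 KV cache (using the per-token figure derived from the spec above, with the same assumed head dimension of 64) to the 12.8 GiB resident checkpoint. It deliberately ignores activations, runtime buffers, and the savings from sliding-window layers, so it is a rough floor-plus-cache figure, not a definitive requirement.

```python
# Hedged end-to-end estimate: resident weights + BF16 KV cache, ignoring
# activations, runtime buffers, and sliding-window savings.
resident_weights_gib = 12.8               # published mixed MXFP4 + BF16 checkpoint
kv_bytes_per_token = 2 * 24 * 8 * 64 * 2  # K+V * layers * kv_heads * head_dim (assumed 64) * BF16

def estimate_gib(context_tokens: int, batch: int = 1) -> float:
    kv_gib = batch * context_tokens * kv_bytes_per_token / 2**30
    return resident_weights_gib + kv_gib

print(f"{estimate_gib(8_192):.1f} GiB at 8K context")      # ~13.2 GiB
print(f"{estimate_gib(131_072):.1f} GiB at full context")  # ~18.8 GiB
```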
Sources
OpenAI GPT-OSS model card (gpt-oss-20b)