Model notes

OpenReasoning Nemotron 1.5B

Small dense Nemotron reasoning model built on the Qwen2.5 1.5B geometry, aimed at strong math and code behavior on modest hardware.

1.5B dense • 32,768 context • 2 KV heads

Architecture

Model spec

Architecture

Dense decoder-only transformer

Total params

1.5B

Active params

Dense model

Layers

28

Hidden size

1,536

Attention heads

12

KV heads

2

KV-bearing layers

28

Context length

32,768

Modality

Text

License

CC-BY-4.0 + Apache 2.0
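The spec numbers above are enough for a back-of-the-envelope weight-memory estimate. A minimal sketch, assuming a round 1.5e9 parameter count (the real count is slightly higher) and BF16 storage at 2 bytes per parameter:

```python
# Rough resident-weight footprint for a dense BF16 checkpoint.
# Assumes exactly 1.5e9 parameters; the shipped checkpoint is a bit larger.
params = 1.5e9
bytes_per_param = 2  # BF16

weight_bytes = params * bytes_per_param
print(f"{weight_bytes / 2**30:.2f} GiB")  # ~2.79 GiB resident weights
```

This is why resident weights dominate at small context lengths: roughly 3 GB before any KV cache or activation overhead.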

Why it matters

Why memory behaves this way

Research highlight

NVIDIA post-trains the Qwen2.5 1.5B base for reasoning while keeping the dense grouped-query architecture intact, so the memory geometry stays predictable.

Memory note

This behaves like a classic dense Qwen2.5-style checkpoint where resident weights dominate and KV cache follows the standard grouped-attention path.
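The grouped-attention path can be sketched directly from the spec card: 28 KV-bearing layers, 2 KV heads, and a head dimension of 1536 / 12 = 128. A minimal per-token KV-cache calculation, assuming a BF16 cache (2 bytes per element):

```python
# KV cache per token = 2 (K and V) x layers x kv_heads x head_dim x dtype bytes.
# Geometry from the spec card; BF16 cache is an assumption.
layers, kv_heads = 28, 2
hidden, heads = 1536, 12
head_dim = hidden // heads  # 128
dtype_bytes = 2             # BF16

per_token = 2 * layers * kv_heads * head_dim * dtype_bytes
print(per_token)            # 28672 bytes, i.e. 28 KiB per token

full_ctx = per_token * 32768
print(full_ctx / 2**30)     # 0.875 GiB at the full 32K window
```

With only 2 KV heads against 12 query heads, the cache stays under 1 GiB even at the full context window, which is why the memory geometry is described as predictable.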

Checkpoints

Official profiles

Official BF16 checkpoint

BF16 checkpoint

Current

NVIDIA ships OpenReasoning-Nemotron-1.5B in Hugging Face Transformers format, and the v1 calculator models it as a standard dense Qwen2.5-derived checkpoint across the supported runtimes.

vLLM • Transformers
Open checkpoint
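Since the checkpoint ships in standard Transformers format, serving it is the usual one-liner. A hedged sketch, assuming a stock vLLM install; the model ID is the official Hugging Face repo name:

```shell
# Serve the official checkpoint with vLLM at the full 32K context window.
vllm serve nvidia/OpenReasoning-Nemotron-1.5B --max-model-len 32768
```

Lowering --max-model-len is the simplest lever for trimming the pre-allocated KV-cache budget on smaller GPUs.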
