Model notes

OpenReasoning Nemotron 1.5B

Small dense Nemotron reasoning model built on the Qwen2.5 1.5B geometry, aimed at strong math and code behavior on modest hardware.

1.5B dense • 32,768 context • 2 KV heads

Architecture

Model spec

Architecture

Dense decoder-only transformer

Total params

1.5B

Active params

Dense model

Layers

28

Hidden size

1,536

Attention heads

12

KV heads

2

KV-bearing layers

28

Context length

32,768

Modality

Text

License

CC-BY-4.0 + Apache 2.0
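The spec numbers above are enough for a back-of-the-envelope weight-memory estimate. A minimal sketch, assuming a round 1.5e9 parameter count (the real count is slightly higher) and BF16 storage at 2 bytes per parameter:

```python
# Rough resident-weight footprint for a dense BF16 checkpoint.
# Assumes exactly 1.5e9 parameters; the shipped checkpoint is a bit larger.
params = 1.5e9
bytes_per_param = 2  # BF16

weight_bytes = params * bytes_per_param
print(f"{weight_bytes / 2**30:.2f} GiB")  # ~2.79 GiB resident weights
```

This is why resident weights dominate at small context lengths: roughly 3 GB before any KV cache or activation overhead.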

Why it matters

Why memory behaves this way

Research highlight

NVIDIA post-trains the Qwen2.5 1.5B base for reasoning while keeping the dense grouped-query architecture intact, so the memory geometry stays predictable.

Memory note

This behaves like a classic dense Qwen2.5-style checkpoint where resident weights dominate and KV cache follows the standard grouped-attention path.
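The grouped-attention path can be sketched directly from the spec card: 28 KV-bearing layers, 2 KV heads, and a head dimension of 1536 / 12 = 128. A minimal per-token KV-cache calculation, assuming a BF16 cache (2 bytes per element):

```python
# KV cache per token = 2 (K and V) x layers x kv_heads x head_dim x dtype bytes.
# Geometry from the spec card; BF16 cache is an assumption.
layers, kv_heads = 28, 2
hidden, heads = 1536, 12
head_dim = hidden // heads  # 128
dtype_bytes = 2             # BF16

per_token = 2 * layers * kv_heads * head_dim * dtype_bytes
print(per_token)            # 28672 bytes, i.e. 28 KiB per token

full_ctx = per_token * 32768
print(full_ctx / 2**30)     # 0.875 GiB at the full 32K window
```

With only 2 KV heads against 12 query heads, the cache stays under 1 GiB even at the full context window, which is why the memory geometry is described as predictable.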

Checkpoints

Official profiles

Official BF16 checkpoint

BF16 checkpoint

Current

NVIDIA ships OpenReasoning-Nemotron-1.5B in Hugging Face Transformers format, and the v1 calculator models it as a standard dense Qwen2.5-derived checkpoint across the supported runtimes.

vLLM • Transformers
Open checkpoint
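Since the checkpoint ships in standard Transformers format, serving it is the usual one-liner. A hedged sketch, assuming a stock vLLM install; the model ID is the official Hugging Face repo name:

```shell
# Serve the official checkpoint with vLLM at the full 32K context window.
vllm serve nvidia/OpenReasoning-Nemotron-1.5B --max-model-len 32768
```

Lowering --max-model-len is the simplest lever for trimming the pre-allocated KV-cache budget on smaller GPUs.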
