Company
Dense and MoE Mistral-family models commonly used in open inference stacks.
Start here
Mixtral
Sparse mixture-of-experts (MoE) model: each token is routed to only a pair of experts per layer, so per-token compute tracks the 12.9B active parameters, while VRAM must still hold all 46.7B resident weights.
46.7B total • 12.9B active • 32,768-token context • 8 KV heads
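One way to make the compute/VRAM asymmetry concrete is to price the weights at different bit widths. A minimal sketch, using only the parameter counts from the spec line above; the 4-bit figure is an assumed quantization width for illustration, not part of the card:

```python
def weight_vram_gib(total_params_b: float, bits_per_param: float) -> float:
    """GiB needed just to hold the weights at a given bit width."""
    return total_params_b * 1e9 * bits_per_param / 8 / 2**30

TOTAL_B, ACTIVE_B = 46.7, 12.9  # from the card: resident vs per-token-active params

print(f"fp16, all experts resident: {weight_vram_gib(TOTAL_B, 16):.1f} GiB")   # ~87.0
print(f"4-bit quantized (assumed):  {weight_vram_gib(TOTAL_B, 4):.1f} GiB")    # ~21.7
print(f"fp16, active params only:   {weight_vram_gib(ACTIVE_B, 16):.1f} GiB")  # ~24.0
```

The last line is the asymmetry in one number: per-token arithmetic scales with the ~12.9B active parameters, but memory sizing must budget for the full 46.7B.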
Series
Long-context dense Mistral checkpoint that remains practical on a single 24 GB card with quantization.
12.2B dense • 128,000-token context • 8 KV heads
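With 8 KV heads (grouped-query attention), the KV cache rather than the weights dominates memory at long context. A rough estimate under stated assumptions: the 8 KV heads and 128,000-token context come from the spec line, but the layer count (40) and head dimension (128) are assumed values for a ~12B Mistral-style model, not taken from the card:

```python
def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 ctx_tokens: int, bytes_per_elem: int = 2) -> float:
    """GiB for the K and V caches across all layers at a given context length."""
    # factor 2: one K tensor and one V tensor per layer
    elems = 2 * layers * kv_heads * head_dim * ctx_tokens
    return elems * bytes_per_elem / 2**30

# 40 layers and head_dim=128 are assumptions; 8 KV heads and 128k are from the card.
print(f"fp16 KV cache at 128k tokens: {kv_cache_gib(40, 8, 128, 128_000):.1f} GiB")  # ~19.5
print(f"fp16 KV cache at 8k tokens:   {kv_cache_gib(40, 8, 128, 8_000):.2f} GiB")    # ~1.22
```

This is why "practical on a 24 GB card" holds for quantized weights at moderate context, while the full 128k window generally also needs a quantized or paged KV cache.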