Company
Dense and MoE Mistral-family models commonly used in open inference stacks.
Start here
Mixtral
Sparse mixture-of-experts (MoE) model: each token is routed to only a pair of experts per layer, so per-token compute tracks the 12.9B active parameters, while VRAM must still hold all 46.7B resident weights.
46.7B total • 12.9B active • 32,768-token context • 8 KV heads
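One way to make the compute/VRAM asymmetry concrete is to price the weights at different bit widths. A minimal sketch, using only the parameter counts from the spec line above; the 4-bit figure is an assumed quantization width for illustration, not part of the card:

```python
def weight_vram_gib(total_params_b: float, bits_per_param: float) -> float:
    """GiB needed just to hold the weights at a given bit width."""
    return total_params_b * 1e9 * bits_per_param / 8 / 2**30

TOTAL_B, ACTIVE_B = 46.7, 12.9  # from the card: resident vs per-token-active params

print(f"fp16, all experts resident: {weight_vram_gib(TOTAL_B, 16):.1f} GiB")   # ~87.0
print(f"4-bit quantized (assumed):  {weight_vram_gib(TOTAL_B, 4):.1f} GiB")    # ~21.7
print(f"fp16, active params only:   {weight_vram_gib(ACTIVE_B, 16):.1f} GiB")  # ~24.0
```

The last line is the asymmetry in one number: per-token arithmetic scales with the ~12.9B active parameters, but memory sizing must budget for the full 46.7B.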
Series
Long-context dense Mistral checkpoint that remains practical on a single 24 GB card with quantization.
12.2B dense • 128,000-token context • 8 KV heads
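With 8 KV heads (grouped-query attention), the KV cache rather than the weights dominates memory at long context. A rough estimate under stated assumptions: the 8 KV heads and 128,000-token context come from the spec line, but the layer count (40) and head dimension (128) are assumed values for a ~12B Mistral-style model, not taken from the card:

```python
def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 ctx_tokens: int, bytes_per_elem: int = 2) -> float:
    """GiB for the K and V caches across all layers at a given context length."""
    # factor 2: one K tensor and one V tensor per layer
    elems = 2 * layers * kv_heads * head_dim * ctx_tokens
    return elems * bytes_per_elem / 2**30

# 40 layers and head_dim=128 are assumptions; 8 KV heads and 128k are from the card.
print(f"fp16 KV cache at 128k tokens: {kv_cache_gib(40, 8, 128, 128_000):.1f} GiB")  # ~19.5
print(f"fp16 KV cache at 8k tokens:   {kv_cache_gib(40, 8, 128, 8_000):.2f} GiB")    # ~1.22
```

This is why "practical on a 24 GB card" holds for quantized weights at moderate context, while the full 128k window generally also needs a quantized or paged KV cache.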