Open-source model catalog

Filter open models by size, memory requirement, and license. We deploy any of them privately, on your hardware.

Model	Params	Min VRAM	License	Best for
Phi-3.5 mini	3.8B	4 GB	MIT	Edge & on-device
Mistral 7B	7B	6 GB	Apache 2.0	Fast assistants
Llama 3.1 8B	8B	8 GB	Llama 3.1	Copilots & chat
Qwen 2.5 7B	7B	8 GB	Apache 2.0	Multilingual chat
Gemma 2 9B	9B	10 GB	Gemma	Lightweight tasks
Gemma 2 27B	27B	20 GB	Gemma	Balanced quality
Qwen 2.5 32B	32B	24 GB	Apache 2.0	Reasoning & analysis
Mixtral 8x7B	47B MoE	24 GB	Apache 2.0	Throughput & RAG
Llama 3.3 70B	70B	40 GB	Llama 3.3	High-quality reasoning
Qwen 2.5 72B	72B	40 GB	Qwen	Complex tasks
Mixtral 8x22B	141B MoE	80 GB	Apache 2.0	Enterprise workloads
Llama 3.1 405B	405B	Multi-GPU	Llama 3.1	Maximum quality
DeepSeek R1	671B MoE	Multi-GPU	MIT	Frontier reasoning
Falcon 180B	180B	Multi-GPU	Falcon	Large-scale serving

Indicative minimum VRAM for quantized inference; we size the exact hardware per use case.
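
As a rough illustration of how the table can be filtered, here is a minimal Python sketch. It is not the catalog's actual implementation: the `Model` record, `filter_models`, and `weight_gb` are hypothetical names introduced for this example. The rows are copied from the table, with "Multi-GPU" stored as `None`. The `weight_gb` helper applies the common weights-only rule of thumb (parameters × bits ÷ 8) and ignores KV cache and runtime overhead, so it will not reproduce the table's indicative figures.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Model:
    name: str
    params_b: float             # total parameters, in billions
    min_vram_gb: Optional[int]  # None stands for "Multi-GPU" in the table
    license: str


# Rows copied from the table above.
CATALOG = [
    Model("Phi-3.5 mini", 3.8, 4, "MIT"),
    Model("Mistral 7B", 7, 6, "Apache 2.0"),
    Model("Llama 3.1 8B", 8, 8, "Llama 3.1"),
    Model("Qwen 2.5 7B", 7, 8, "Apache 2.0"),
    Model("Gemma 2 9B", 9, 10, "Gemma"),
    Model("Gemma 2 27B", 27, 20, "Gemma"),
    Model("Qwen 2.5 32B", 32, 24, "Apache 2.0"),
    Model("Mixtral 8x7B", 47, 24, "Apache 2.0"),
    Model("Llama 3.3 70B", 70, 40, "Llama 3.3"),
    Model("Qwen 2.5 72B", 72, 40, "Qwen"),
    Model("Mixtral 8x22B", 141, 80, "Apache 2.0"),
    Model("Llama 3.1 405B", 405, None, "Llama 3.1"),
    Model("DeepSeek R1", 671, None, "MIT"),
    Model("Falcon 180B", 180, None, "Falcon"),
]


def filter_models(
    catalog: Iterable[Model],
    max_vram_gb: Optional[int] = None,
    licenses: Optional[Set[str]] = None,
    max_params_b: Optional[float] = None,
) -> List[Model]:
    """Return the models that satisfy every filter that was supplied."""
    selected = []
    for m in catalog:
        # Multi-GPU models (min_vram_gb is None) never fit a single-card budget.
        if max_vram_gb is not None and (m.min_vram_gb is None or m.min_vram_gb > max_vram_gb):
            continue
        if licenses is not None and m.license not in licenses:
            continue
        if max_params_b is not None and m.params_b > max_params_b:
            continue
        selected.append(m)
    return selected


def weight_gb(params_b: float, bits: int = 4) -> float:
    """Weights-only footprint in GB; excludes KV cache and runtime overhead."""
    return params_b * bits / 8


if __name__ == "__main__":
    fits = filter_models(CATALOG, max_vram_gb=24, licenses={"Apache 2.0", "MIT"})
    print([m.name for m in fits])
    # → ['Phi-3.5 mini', 'Mistral 7B', 'Qwen 2.5 7B', 'Qwen 2.5 32B', 'Mixtral 8x7B']
    print(weight_gb(70))  # → 35.0 (4-bit weights alone for a 70B model)
```

With a 24 GB budget and only permissively licensed models, the filter leaves five candidates. The weights-only estimate for a 70B model at 4 bits (35 GB) sits below the table's 40 GB, which is consistent with the table leaving headroom beyond the weights, though the exact sizing method is not stated here.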