
Open-source model catalog

Filter open models by size, memory requirements, and license. We can deploy any of them privately on your hardware.

| Model | Params | Min VRAM | License | Best for |
| --- | --- | --- | --- | --- |
| Phi-3.5 mini | 3.8B | 4 GB | MIT | Edge & on-device |
| Mistral 7B | 7B | 6 GB | Apache 2.0 | Fast assistants |
| Llama 3.1 8B | 8B | 8 GB | Llama 3.1 | Copilots & chat |
| Qwen 2.5 7B | 7B | 8 GB | Apache 2.0 | Multilingual chat |
| Gemma 2 9B | 9B | 10 GB | Gemma | Lightweight tasks |
| Gemma 2 27B | 27B | 20 GB | Gemma | Balanced quality |
| Qwen 2.5 32B | 32B | 24 GB | Apache 2.0 | Reasoning & analysis |
| Mixtral 8x7B | 47B MoE | 24 GB | Apache 2.0 | Throughput & RAG |
| Llama 3.3 70B | 70B | 40 GB | Llama 3.3 | High-quality reasoning |
| Qwen 2.5 72B | 72B | 40 GB | Qwen | Complex tasks |
| Mixtral 8x22B | 141B MoE | 80 GB | Apache 2.0 | Enterprise workloads |
| Llama 3.1 405B | 405B | Multi-GPU | Llama 3.1 | Maximum quality |
| DeepSeek R1 | 671B MoE | Multi-GPU | MIT | Frontier reasoning |
| Falcon 180B | 180B | Multi-GPU | Falcon | Large-scale serving |

Indicative minimum VRAM for quantized inference; we size the exact hardware per use case.
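
As a rough illustration of where figures like these come from, the sketch below estimates the memory needed for quantized weights from the parameter count. The function name, the 4-bit default, and the 1.2 overhead factor are assumptions made for this example, not values stated in the catalog, and the result is not meant to reproduce the table: the listed minimums also depend on context length, batch size, and the runtime used.

```python
def estimate_weight_vram_gb(
    params_billions: float,
    bits_per_weight: int = 4,   # assumed quantization level (hypothetical default)
    overhead: float = 1.2,      # assumed headroom for KV cache and activations
) -> float:
    """Back-of-the-envelope VRAM estimate in GB (decimal) for quantized inference."""
    # 1e9 parameters * (bits / 8) bytes per parameter = that many GB of weights.
    weight_gb = params_billions * bits_per_weight / 8
    return weight_gb * overhead


if __name__ == "__main__":
    # For MoE models, pass the total parameter count: all experts are held in memory.
    for name, params in [("Mistral 7B", 7), ("Llama 3.3 70B", 70), ("Mixtral 8x7B", 47)]:
        print(f"{name}: ~{estimate_weight_vram_gb(params):.1f} GB")
```

An estimate like this is only a starting point for filtering the catalog against a GPU budget; actual sizing still needs to account for the specific use case.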