The frontier has shifted. Open-weight models now match or exceed closed proprietary systems on most inference tasks. The question is no longer capability — it's deployment: which models run on the hardware you actually have.
Quantisation trades a small accuracy penalty for dramatic reductions in VRAM: an INT4-quantised 70B model runs on a single 48GB workstation GPU, and a 7B INT4 model fits in 4GB — the VRAM of a gaming laptop. The column headers below are GPUs you can actually buy.
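The VRAM figures in the table follow directly from parameter count times bytes per weight. A minimal sketch of that arithmetic (weights only — the KV cache and activations add more on top):

```python
def vram_gb(params_billions: float, bits_per_weight: int) -> float:
    """Estimate weight-only VRAM in GB: parameters x bytes per weight."""
    return params_billions * bits_per_weight / 8

# Matches the table: 7B FP16 needs 14GB, INT4 quarters that to 3.5GB.
print(vram_gb(7, 16))   # 14.0
print(vram_gb(7, 4))    # 3.5
print(vram_gb(70, 4))   # 35.0
```

This is why INT4 is the pivotal precision in the table: it cuts FP16 requirements by 4×, which is exactly the jump from "datacenter only" to "single workstation GPU" for a 70B model.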
| Model size | Precision | VRAM needed | GTX 1650 (4GB) | RTX 3070 (8GB) | Arc A770 (16GB) | RTX 3090 (24GB) | RTX 6000 Ada (48GB) | A100 SXM (80GB) |
|---|---|---|---|---|---|---|---|---|
| 1B–3B | FP16 | 2–6GB | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 7B | FP16 | 14GB | ✗ | ✗ | ~ | ✓ | ✓ | ✓ |
| 7B | INT4 | 3.5GB | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 13B | FP16 | 26GB | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| 13B | INT4 | 6.5GB | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 70B | FP16 | 140GB | ✗ | ✗ | ✗ | ✗ | ✗ | ~×2 |
| 70B | INT4 | 35GB | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| 70B | INT2 | 17GB | ✗ | ✗ | ✗ | ~ | ✓ | ✓ |
| 405B | INT4 | 202GB | ✗ | ✗ | ✗ | ✗ | ✗ | ~×3 |

✓ fits · ~ borderline (expect offloading or a tight squeeze) · ~×N needs N GPUs of that class. Figures are weights only; the KV cache adds more at long context.

CPU only is also an option: a 7B INT4 model runs in system RAM on any modern CPU via llama.cpp at roughly 5 tok/s. No GPU required — even an older server like a Dell R740 is viable today.
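The CPU-only path above is a one-liner with llama.cpp's `llama-cli`. A minimal sketch — the GGUF filename is a placeholder, and the thread count should match your physical cores:

```shell
# CPU-only inference of a 7B INT4 (Q4) model with llama.cpp.
# ./model-7b-q4_k_m.gguf is a placeholder path to your quantised weights.
./llama-cli \
  -m ./model-7b-q4_k_m.gguf \
  -p "Explain quantisation in one sentence." \
  -n 128 \
  -t 8     # -n: max tokens to generate, -t: CPU threads
```

On a modern desktop CPU this lands in the same ballpark as the ~5 tok/s cited above; server-class machines with more memory bandwidth do somewhat better.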