// cosmogenic.org — reference architecture

Compute AI Building Blocks

The full stack from silicon to inference. Each layer below lists its properties, its edge vs hyperscale profile, and its role in a federated compute architecture; a data-model sketch follows the stack.

L8 — Application (User Interface & API Gateway) [BOTH]: REST · WebSocket · gRPC
L7 — Model (Foundation Model / Fine-Tune) [EDGE-VIABLE]: 7B → 405B params
L6 — Runtime (Inference Engine) [EDGE-VIABLE]: vLLM · llama.cpp · TensorRT
L5 — Compute (GPU / NPU / CPU Fabric) [CRITICAL]: FP16 · BF16 · INT8 · INT4
L4 — Memory (HBM / VRAM / DRAM Hierarchy) [BOTTLENECK]: HBM3 · 3.35TB/s
L3 — Storage (Model Weights & KV Cache) [EDGE-VIABLE]: NVMe · S3 · Ceph
L2 — Network (Interconnect & Federation Fabric) [EDGE-VIABLE]: InfiniBand · Ethernet · 5G
L1 — Infrastructure (Power · Cooling · Physical) [LEVERAGE POINT]: PUE · Tidal · Solar · Grid
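Rendered as data, the stack is just eight records with a shared shape. A minimal TypeScript sketch of that shape (field names and the `STACK` constant are illustrative, not the cosmogenic.org source):

```typescript
// Illustrative data model for the eight-layer stack. Each layer carries
// the same fields the listing above shows: id, block name, role,
// deployment profile, and representative technologies.

type Profile =
  | "BOTH" | "EDGE-VIABLE" | "CRITICAL" | "BOTTLENECK" | "LEVERAGE POINT";

interface Layer {
  id: number;       // L1 (infrastructure) … L8 (application)
  block: string;    // building-block name
  role: string;     // what the layer does
  profile: Profile; // edge vs hyperscale viability tag
  specs: string[];  // representative technologies / formats
}

const STACK: Layer[] = [
  { id: 8, block: "Application", role: "User Interface & API Gateway",
    profile: "BOTH", specs: ["REST", "WebSocket", "gRPC"] },
  { id: 6, block: "Runtime", role: "Inference Engine",
    profile: "EDGE-VIABLE", specs: ["vLLM", "llama.cpp", "TensorRT"] },
  { id: 1, block: "Infrastructure", role: "Power · Cooling · Physical",
    profile: "LEVERAGE POINT", specs: ["PUE", "Tidal", "Solar", "Grid"] },
  // …remaining layers follow the same shape
];

// Inspecting a layer is then a lookup:
const runtime = STACK.find((l) => l.id === 6);
console.log(runtime?.role); // "Inference Engine"
```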


Node Topology // Right-sized compute spectrum
EDGE — R740
Rack Server Node
2× Xeon / 5060 GPU
128GB RAM · 4TB NVMe
Solar + Grid hybrid
The Garioch, Aberdeenshire
EDGE — RPi
Raspberry Pi Node
ARM Cortex-A72
8GB RAM · USB SSD
Low-power inference
Quantised 3B–7B models
EDGE — OIL PLATFORM
North Sea Ruggedised
Tampnet fibre backbone
350+ platform coverage
Subsea routing proven
Harsh environment rated
HYBRID — TIDAL
Renewable Anchor Node
MeyGen / SIMEC power
Pentland Firth capacity
2GW Peterhead HVDC
Grid-forming compute
HYBRID — MYCELIUM
Data Centre + Mushrooms
Server waste heat → grow rooms
Medicinal cultivation
Dual revenue stream
Circular economy model
HYPERSCALE (reference)
AWS / Azure / GCP
Centralised extraction
A100/H100 clusters
Economic leakage model
Single jurisdiction risk
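The spectrum above is ultimately a memory question: a node is right-sized when the quantised weights fit its RAM. A minimal sketch of that check, assuming a flat ~20% overhead for KV cache and runtime buffers (an assumption, not a measured figure):

```typescript
// Hedged sketch: can a node hold a quantised model's weights in RAM?
// Weights occupy roughly params × bits-per-weight / 8 bytes.

function weightsGiB(paramsB: number, bitsPerWeight: number): number {
  return (paramsB * 1e9 * bitsPerWeight) / 8 / 1024 ** 3;
}

function fits(paramsB: number, bitsPerWeight: number, ramGiB: number): boolean {
  const overhead = 1.2; // KV cache + runtime buffers (assumed, not measured)
  return weightsGiB(paramsB, bitsPerWeight) * overhead <= ramGiB;
}

// Raspberry Pi node: 8GB RAM, 7B model at 4-bit ≈ 3.3 GiB weights → fits.
console.log(fits(7, 4, 8));     // true
// R740 node: 128GB RAM, 70B model at 4-bit ≈ 32.6 GiB weights → fits.
console.log(fits(70, 4, 128));  // true
// 405B at 4-bit ≈ 189 GiB → hyperscale (or multi-node) territory.
console.log(fits(405, 4, 128)); // false
```

By this estimate the Pi card's "Quantised 3B–7B models" claim and the R740's viability for mid-size open weights both check out, while frontier-scale models stay in the hyperscale column.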
Edge vs Hyperscale // Key differentiators per block
Building Block: Edge Advantage
Infrastructure / Power: Tidal, solar, waste-heat re-use — lowest PUE possible
Network / Interconnect: Existing fibre (North Sea proven), no new builds needed
Storage: Local NVMe serves 7B–70B weights without round-trips
Memory Bandwidth: GPU VRAM scales linearly across nodes — rack servers viable for inference
Compute (GPU fabric): Consumer / prosumer GPUs sufficient for quantised inference
Inference Runtime: llama.cpp / Ollama run on CPU — no GPU required for small models
Model (Foundation): Open-weight models (Llama, Mistral, Phi) run edge-deployed
Data Sovereignty: Computation stays within jurisdiction — economic value stays local
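Why Memory is tagged BOTTLENECK in the stack: single-stream decode streams roughly every weight once per token, so throughput is capped near bandwidth ÷ weight bytes. A back-of-envelope sketch (the HBM3 figure comes from the stack above; the other bandwidth values are assumptions for illustration):

```typescript
// Decode-throughput ceiling: tokens/s ≈ memory bandwidth / bytes of
// weights streamed per token. Ignores KV-cache traffic and batching,
// so real numbers land below these ceilings.

function tokensPerSecCeiling(
  paramsB: number, bitsPerWeight: number, bwGBps: number,
): number {
  const weightBytes = (paramsB * 1e9 * bitsPerWeight) / 8; // per token
  return (bwGBps * 1e9) / weightBytes;
}

// 7B model, 4-bit quantised (~3.5 GB of weights):
console.log(tokensPerSecCeiling(7, 4, 3350)); // HBM3, 3.35TB/s → ~957 tok/s
console.log(tokensPerSecCeiling(7, 4, 400));  // prosumer GPU, ~400 GB/s (assumed) → ~114 tok/s
console.log(tokensPerSecCeiling(7, 4, 8));    // Pi-class LPDDR, ~8 GB/s (assumed) → ~2.3 tok/s
```

The linear-scaling claim in the Memory Bandwidth row follows directly: doubling aggregate VRAM bandwidth doubles the ceiling, which is why modest rack GPUs remain viable for single-user edge inference.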